Thursday, November 12, 2015

ANOVA and Post Hoc Tests

Pictures are great but they're not quite precise enough to tell us if there's a statistically significant difference between groups of data.

Which brings us to the ANOVA (analysis of variance) test.  ANOVA requires a categorical explanatory variable and a quantitative response variable, which is perfect for my research questions (I mean, my Mars Missions).

Mission 1:  Determine if there's an association between the number of visible layers in a crater and its latitude.  (Or, in hypothesis testing lingo, the alternative hypothesis is that craters grouped by their number of layers are, on average, found at different latitudes.  The null hypothesis is that there's no difference in mean latitude between the groups.)

ANOVA code in SAS
Running the code gives the following results:

The ANOVA Procedure
Dependent Variable: LATITUDE_CIRCLE_IMAGE Crater Latitude (degrees)

Class Level Information
Class            Levels    Values
NUMBER_LAYERS    6         0 1 2 3 4 5

Number of Observations Read    384343
Number of Observations Used    384343

Source             DF        Sum of Squares    Mean Square    F Value    Pr > F
Model              5         2004589.2         400917.8       356.57     <.0001
Error              384337    432133768.4       1124.4
Corrected Total    384342    434138357.6

R-Square    Coeff Var    Root MSE    LATITUDE_CIRCLE_IMAGE Mean
0.004617    -465.7665    33.53150    -7.199209

Source           DF    Anova SS       Mean Square    F Value    Pr > F
NUMBER_LAYERS    5     2004589.224    400917.845     356.57     <.0001

And within that giant amount of information, there's one major point:  the p-value (which is listed as Pr > F in the table) is less than 0.0001, so the null hypothesis can be rejected.  In other words, the craters have been grouped into 6 groups (0 layers, 1 layer, and so on up to 5 layers) and for each group, a mean latitude was calculated.  Since the p-value is so low, we can conclude that the mean latitudes for the groups are not all equal to each other.  Which is sorta useful to know but it'd be really great to know which groups are different.
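If you want to convince yourself that the table hangs together, the F value and R-Square can be rebuilt from the sums of squares and degrees of freedom.  Here's a minimal sketch in Python (not the SAS code I ran, just arithmetic on the numbers printed above; the function name anova_summary is made up for this example):

```python
# Values copied from the Mission 1 ANOVA table (latitude by NUMBER_LAYERS).
ss_model, df_model = 2004589.2, 5
ss_error, df_error = 432133768.4, 384337


def anova_summary(ss_model, df_model, ss_error, df_error):
    """Rebuild mean squares, F, and R-square from sums of squares and DF."""
    ms_model = ss_model / df_model   # Mean Square for the Model row
    ms_error = ss_error / df_error   # Mean Square for the Error row
    return {
        "ms_model": ms_model,
        "ms_error": ms_error,
        "f_value": ms_model / ms_error,                 # F = MS model / MS error
        "r_square": ss_model / (ss_model + ss_error),   # share of variation explained
    }


summary = anova_summary(ss_model, df_model, ss_error, df_error)
for name, value in summary.items():
    print(f"{name}: {value:.6g}")
```

The results should agree with the table's F Value and R-Square up to rounding.  This doesn't reproduce the p-value; that would need the F distribution, which I've left out to keep the sketch dependency-free.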

That's where the Duncan Test (a Post Hoc test) comes in.  The results from the Duncan Test are shown below.

Means with the same letter are not significantly different.

Duncan Grouping    Mean      N         NUMBER_LAYERS
     A             12.680    3435      2
B    A             2.390     739       3
B    A             0.671     5         5
B    A             -1.497    15467     1
B    A             -2.094    85        4
B                  -7.649    364612    0

This table indicates that the craters with 1, 2, 3, 4, and 5 layers do not have statistically significant differences in their mean latitudes.  The same goes for the craters with 0, 1, 3, 4, and 5 layers.  However, craters with no visible layers do have a different latitude (on average) than craters with 2 layers, which makes a lot of sense, given that this is the graph of latitude vs number of layers:

Probably should have included this one in my last post, too.  Oh well. Better late than never.
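The rule for reading the Duncan table ("means with the same letter are not significantly different") can be turned around: two groups differ significantly only if they share no letter.  Here's a small Python sketch of that rule, with the letters typed in by hand from the Mission 1 table (the function name is made up for this example, and this is not part of the SAS run):

```python
from itertools import combinations

# Letters assigned to each NUMBER_LAYERS group in the Mission 1 Duncan table.
duncan_letters = {2: "A", 3: "AB", 5: "AB", 1: "AB", 4: "AB", 0: "B"}


def significantly_different_pairs(letters):
    """Return the pairs of groups that share no Duncan letter."""
    return [
        (a, b)
        for a, b in combinations(sorted(letters), 2)
        if not set(letters[a]) & set(letters[b])
    ]


print(significantly_different_pairs(duncan_letters))
```

For the latitude table, the only pair that comes back is 0 layers vs 2 layers, which matches the reading above.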


Mission 2: Determine if there's an association between the number of layers and the morphology of the ejecta.

For this mission, my alternative hypothesis is that the morphologies (SLE, DLE, MLE, Pd, and Rd) don't all have the same mean number of visible layers.  The null hypothesis is that the mean number of layers is the same for every morphology.

More ANOVA code
Here are the results of the ANOVA test:

The ANOVA Procedure

Class Level Information
Class    Levels    Values
morph    5         DLE MLE Pd Rd SLE

Number of Observations Read    384343
Number of Observations Used    44625

Source             DF       Sum of Squares    Mean Square    F Value    Pr > F
Model              4        17456.69025       4364.17256     32610.8    <.0001
Error              44620    5971.31679        0.13383
Corrected Total    44624    23428.00704

R-Square    Coeff Var    Root MSE    NUMBER_LAYERS Mean
0.745121    65.51155     0.365822    0.558409

Source    DF    Anova SS       Mean Square    F Value    Pr > F
morph     4     17456.69025    4364.17256     32610.8    <.0001

This one also has a p-value less than 0.0001, so the null hypothesis can be rejected (i.e., the morphologies don't all have the same mean number of layers).  But, again, it'd be nice to know which ones are significantly different from each other and so we run another post hoc test:

Means with the same letter are not significantly different.

Duncan Grouping    Mean      N        morph
A                  3.1102    581      MLE
B                  1.9950    2777     DLE
C                  1.0033    14196    SLE
D                  0.1230    27069    Rd
D                  0.0000    2        Pd

Three types (MLE, DLE, and SLE) are in their own little groups: each one has a mean number of layers that's significantly different from every other type.  The other two types (Rd and Pd) have means that are not significantly different from each other (though with only 2 Pd craters, there isn't much to compare).  So the number of layers is associated with morphology overall, but it doesn't distinguish between the Rd and Pd groups.
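One quick sanity check on the Duncan table: the group sizes should add up to the number of observations used, and the N-weighted average of the group means should land on the overall NUMBER_LAYERS mean from the ANOVA output.  A minimal Python sketch, with the means and counts typed in from the table (so they carry the table's rounding):

```python
# (mean number of layers, N) per morphology, copied from the Duncan table.
groups = {
    "MLE": (3.1102, 581),
    "DLE": (1.9950, 2777),
    "SLE": (1.0033, 14196),
    "Rd": (0.1230, 27069),
    "Pd": (0.0000, 2),
}

# Total craters that had a morphology label and went into the test.
total_n = sum(n for _, n in groups.values())

# Weight each group mean by its size to recover the overall mean.
overall_mean = sum(mean * n for mean, n in groups.values()) / total_n

print(f"total N: {total_n}")
print(f"overall mean layers: {overall_mean:.4f}")
```

The total should match Number of Observations Used, and the weighted mean should agree with the reported NUMBER_LAYERS Mean to within the rounding of the group means.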

Bonus Mission:  Determine if there's an association between the number of layers and the depth of a crater.

Alternative hypothesis:  Craters grouped by their number of visible layers will have different average depths.  (Null:  the depths won't be different, regardless of how many layers are visible).

The final bit of ANOVA code
And the results:


The ANOVA Procedure
Dependent Variable: DEPTH_RIMFLOOR_TOPOG Crater Depth (km)

Class Level Information
Class            Levels    Values
NUMBER_LAYERS    6         0 1 2 3 4 5

Number of Observations Read    384343
Number of Observations Used    384343

Source             DF        Sum of Squares    Mean Square    F Value    Pr > F
Model              5         3714.16500        742.83300      18850.4    <.0001
Error              384337    15145.50177       0.03941
Corrected Total    384342    18859.66676

R-Square    Coeff Var    Root MSE    DEPTH_RIMFLOOR_TOPOG Mean
0.196937    261.7589     0.198512    0.075838

Source           DF    Anova SS       Mean Square    F Value    Pr > F
NUMBER_LAYERS    5     3714.164997    742.832999     18850.4    <.0001

Once again, the p-value is less than 0.0001, so there's at least one group that's got a different average depth from the others.

Here are the post hoc test results:

Means with the same letter are not significantly different.

Duncan Grouping    Mean       N         NUMBER_LAYERS
A                  1.59400    5         5
B                  1.33776    85        4
C                  1.04597    739       3
D                  0.55734    3435      2
E                  0.42667    15467     1
F                  0.05414    364612    0

This is a pretty cool little table because it says that all of the groups have different mean depths.  That means that craters with 0 layers have a different mean depth than craters with 1 layer and that both of those groups have different mean depths than craters with 2 layers (and so on).  It's not an incredibly surprising result but it's still neat to see that it's true.

In the next post... I have no idea what we'll be doing with the data but I'm sure it'll be fun.
