Monday, October 19, 2015

Exploratory Data Analysis

This week, I've actually taken the data (all 384,343 rows) from the Mars Crater dataset and done various things to it.


My lovely code.  I wrote nested if statements in two different ways, though, and I'm not sure how I feel about that.
I included comments (in green) to describe what my code is doing but here's another breakdown.
First, I sorted it by crater ID (because every crater has its own unique ID number). Then I created a set of ranges for the latitudes (because 18.100 degrees and 18.105 degrees really aren't that different when we're dealing with 180 degrees overall). Each range covers 10 degrees. I also had it make a similar set of ranges for the crater depth.  Morphology was different - it's categorical data - but I was only interested in the first structure, so I put some code in to pull that out.  After all of that rearranging, I had the program make frequency distributions for all of the variables I'm interested in (see my Mars Missions post for the entire backstory).
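For anyone who wants to try the same steps, here's a minimal Python sketch of the binning and counting described above. It is only an illustration, not the program from the screenshot: the function names lat_range and frequency_table and the sample latitudes are made up, and it assumes ranges that are open on the left and closed on the right, like (10,20].

```python
import math
from collections import Counter

def lat_range(lat, width=10):
    """Return the (lo, hi] range, as a tuple, that contains a latitude in degrees."""
    hi = math.ceil(lat / width) * width
    if hi == -90:          # a latitude of exactly -90 goes into the first range
        hi += width
    return (hi - width, hi)

def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative frequency, cumulative percent)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        rows.append((value,
                     counts[value],
                     round(100 * counts[value] / total, 2),
                     running,
                     round(100 * running / total, 2)))
    return rows

# Made-up sample latitudes, just to exercise the functions.
sample_lats = [18.100, 18.105, -25.3, -20.0, 0.0, 7.2]
table = frequency_table(lat_range(lat) for lat in sample_lats)
for row in table:
    print(row)
```

Both 18.100 and 18.105 land in the same (10, 20) range, which is the whole point of grouping the latitudes before counting them.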

Let's look at each frequency distribution separately.  First up, latitude.

Crater Latitude (degrees)
lat_range        Frequency   Percent   Cumulative Frequency   Cumulative Percent
01. (-90,-80]          631      0.16                    631                 0.16
02. (-80,-70]         6984      1.82                   7615                 1.98
03. (-70,-60]        13527      3.52                  21142                 5.50
04. (-60,-50]        18758      4.88                  39900                10.38
05. (-50,-40]        25396      6.61                  65296                16.99
06. (-40,-30]        34577      9.00                  99873                25.99
07. (-30,-20]        46504     12.10                 146377                38.08
08. (-20,-10]        46158     12.01                 192535                50.09
09. (-10,0]          40921     10.65                 233456                60.74
10. (0,10]           32362      8.42                 265818                69.16
11. (10,20]          30411      7.91                 296229                77.07
12. (20,30]          28990      7.54                 325219                84.62
13. (30,40]          23365      6.08                 348584                90.70
14. (40,50]          14160      3.68                 362744                94.38
15. (50,60]          10801      2.81                 373545                97.19
16. (60,70]           7974      2.07                 381519                99.27
17. (70,80]           2780      0.72                 384299                99.99
18. (80,90]             44      0.01                 384343               100.00

This table counts how many craters can be found in each latitude range.  From the table, we can see that there are far more craters in the -50 to 30 degree ranges (74.24%) than there are as we get closer to the poles.  In other words, just 44% of the latitude span (80 of the 180 degrees) contains 74% of the craters.  In fact, 24.11% of craters fall within only 20 degrees of latitude (between -30 and -10 degrees), just south of the equator.  However, we should keep in mind that the latitude ranges closer to the equator also cover more surface area than the ones closer to the poles (and more area = more places for craters to exist).
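To put a rough number on that area caveat: on a sphere, the share of the surface between two latitudes is (sin(hi) - sin(lo)) / 2. Here's a small Python sketch of that formula. It treats Mars as a perfect sphere, which is a simplifying assumption, and band_area_fraction is just a name I picked for illustration.

```python
import math

def band_area_fraction(lat_lo, lat_hi):
    """Fraction of a sphere's surface lying between two latitudes (in degrees)."""
    return (math.sin(math.radians(lat_hi)) - math.sin(math.radians(lat_lo))) / 2

# Assumption: Mars is treated as a perfect sphere.
equatorial = band_area_fraction(-50, 30)   # the band holding 74.24% of craters
polar = band_area_fraction(80, 90)         # the northernmost 10-degree range

print(f"-50 to 30 degrees: {equatorial:.1%} of the surface")
print(f"80 to 90 degrees:  {polar:.1%} of the surface")
```

Under that assumption, the -50 to 30 degree band covers about 63% of the surface, so its 74% share of the craters is still higher than its share of the area, but the gap is much smaller than the 44% figure suggests.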

Next we have the number of layers:

Number of Layers
NUMBER_LAYERS    Frequency   Percent   Cumulative Frequency   Cumulative Percent
0                   364612     94.87                 364612                94.87
1                    15467      4.02                 380079                98.89
2                     3435      0.89                 383514                99.78
3                      739      0.19                 384253                99.98
4                       85      0.02                 384338               100.00
5                        5      0.00                 384343               100.00

This table tells us how many layers can be seen in each crater.  Almost all of the craters (94.87%) have no observable layers, followed by a mere 4.02% with 1 layer.  As the number of layers increases, the number of craters drastically decreases.  No crater in the dataset has 6 or more layers.

Next up is crater depth:

Crater Depth (km)
depth            Frequency   Percent   Cumulative Frequency   Cumulative Percent
[-0.5,0)                10      0.00                     10                 0.00
[0,0.5)             363592     94.60                 363602                94.60
[0.5,1)              15803      4.11                 379405                98.72
[1,1.5)               3567      0.93                 382972                99.64
[1.5,2)               1039      0.27                 384011                99.91
[2,2.5)                265      0.07                 384276                99.98
[2.5,3)                 51      0.01                 384327               100.00
[3,3.5)                  9      0.00                 384336               100.00
[3.5,4)                  3      0.00                 384339               100.00
[4,4.5)                  1      0.00                 384340               100.00
[4.5,5)                  3      0.00                 384343               100.00

The crater depths are grouped into ranges of 0.5 km.  Negative numbers indicate that the crater actually rises above the surrounding ground (and I'll talk about that more after the next table).  The thing that stands out is that most craters (94.6%) have depths between 0 and 0.5 km.  (I also want to point out that I used interval notation to describe the ranges, so [0,0.5) includes craters that have a depth of 0 km but not craters that have a depth of 0.5 km.  And obviously it includes everything in between 0 and 0.5 km.)  And 98.72% of the craters are less than 1 km deep.  The deepest craters in this dataset are nearing 5 km (3 miles!) deep, but there are only 3 of them (out of 384,343 craters).
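The interval notation is easier to see in code. Here's a minimal Python sketch of half-open [lo, hi) depth ranges, where the lower edge is included and the upper edge is not. The names depth_range and label are made up for illustration, and this is not the code behind the table.

```python
import math

def depth_range(depth_km, width=0.5):
    """Return the [lo, hi) range, as a tuple, that contains a depth in km."""
    lo = math.floor(depth_km / width) * width
    return (lo, lo + width)

def label(rng):
    """Format a range in interval notation, e.g. (0.0, 0.5) -> '[0,0.5)'."""
    return f"[{rng[0]:g},{rng[1]:g})"

# Made-up depths: the edges show which range a boundary value falls into.
for depth in (0.0, 0.4999, 0.5, -0.1, 4.95):
    print(depth, "->", label(depth_range(depth)))
```

A depth of exactly 0.5 km lands in [0.5,1) rather than [0,0.5), and a slightly negative depth lands in [-0.5,0), matching the first row of the table.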

And finally, we have the morphology table:

Morphology of Innermost Ejecta
morph            Frequency   Percent   Cumulative Frequency   Cumulative Percent
DLE                   2777      6.22                   2777                 6.22
MLE                    581      1.30                   3358                 7.52
Pd                       2      0.00                   3360                 7.53
Rd                   27069     60.66                  30429                68.19
SLE                  14196     31.81                  44625               100.00
Frequency Missing = 339718

Craters can have a lot of different shapes and structures and so scientists use codes (instead of words) to describe them.  For my data, I looked at the primary structure of the innermost ejecta (in other words, I only looked at the first code for each crater).  Because these codes and terms are probably unfamiliar (they were to me, anyway), I've got another post just about morphology that gives descriptions and pictures for each type.

But, here's the TL;DR:

SLE = Single-layer ejecta
DLE = Double-layer ejecta
MLE = Multiple-layer ejecta
Pd = Pedestal
Rd = Radial

Getting back to the morphology table, we can see that most craters that have been classified are of the radial type (60.66%), followed by single-layer ejecta (31.81%).  There are not many pedestal craters, although it should be noted that this table only looks at the primary morphology, and pedestal craters may also be listed as SLEPd (single-layer ejecta, pedestal type), where the ejecta closest to the center of the crater is of the SLE type and then, moving farther out, it becomes a pedestal type.  I should also point out that most of the craters (88%) are actually missing a description of their morphology.
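Here's a rough Python sketch of what "only looking at the first code" can mean, plus the arithmetic behind the 88% figure. The exact format of the morphology field isn't shown in this post, so the sketch assumes that a combined code simply starts with one of the five primary codes (as in SLEPd); primary_morph and PRIMARY_CODES are hypothetical names.

```python
# Assumption: combined codes begin with one of these primary codes.
PRIMARY_CODES = ("SLE", "DLE", "MLE", "Pd", "Rd")

def primary_morph(code):
    """Return the leading primary code of a morphology string, or None if missing."""
    if not code or not code.strip():
        return None
    code = code.strip()
    for prefix in PRIMARY_CODES:
        if code.startswith(prefix):
            return prefix
    return None  # unrecognized code: treated as missing in this sketch

# Counts taken from the morphology table above.
classified = 44625
missing = 339718
missing_pct = 100 * missing / (classified + missing)

print(primary_morph("SLEPd"))                  # SLE
print(f"{missing_pct:.2f}% missing")           # 88.39% missing
```

Note that the percentages in the morphology table are out of the 44,625 classified craters, not out of all 384,343.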

And that's it for now.  In my next post, I'll make some graphs to give a clearer picture of the data.

Crater Morphology

Craters can take on a lot of different shapes and, to make it easier to talk about them and to analyze them, scientists have developed a set of keywords and codes.  In this post, we'll go through some of the common ones (and by "common," I mean the ones I'm looking at in my analysis):
  • SLE (Single-Layer Ejecta): Craters that have one layer of ejecta. 
A and B both show SLE craters. [1]

  • DLE (Double-Layer Ejecta): Craters that have two superimposed layers of ejecta.
C is a DLE crater.  Both layers are easily seen on the right side of the crater. [1]


  • MLE (Multiple-Layer Ejecta): Craters that have more than two layers of ejecta.

Part of an MLE crater. [2]

  • Pd (Pedestal):  Craters that have a raised layer of ejecta around them.  The ejecta makes it look like the crater is sitting on a pedestal.
The most perfect example ever of a pedestal crater [3]

  • Rd (Radial): Craters whose ejecta make a radial pattern, like spokes on a wheel.
Poona, a radial crater [4]


References

 [1] Robbins, S.J. (2011) "Planetary Surface Properties, Cratering Physics, and the Volcanic History of Mars from a New Global Martian Crater Database" Ph.D. Thesis, University of Colorado at Boulder.

 [2] Barlow, Nadine G. "Impact craters in the northern hemisphere of Mars: Layered ejecta and central pit characteristics." Meteoritics & Planetary Science 41.10 (2006): 1425-1436.

 [3] "ESP 037528 2350pedestal" by Jim Secosky modified NASA image. - http://hirise.lpl.arizona.edu/ESP_037528_2350. Licensed under Public Domain via Commons - https://commons.wikimedia.org/wiki/File:ESP_037528_2350pedestal.jpg#/media/File:ESP_037528_2350pedestal.jpg

 [4] Mutch, Thomas A., et al. "The Geology of Mars." Princeton, NJ: Princeton University Press, 1976. 409 p.