In this post, we’ll follow on to one of the ideas from the last post: that statistics and statistical analysis are important tools for coping with measurements, especially when measurements accumulate in sufficiently large amounts with varying values. When that is the case, the measurements are considered “raw data”. Having at least a conceptual understanding of the way that statistical analysis is used to reduce volumes of data to comprehensible information must be considered a critical skill in dealing with our modern world.
My intention is not to teach statistics or statistical analysis: I used these tools a little in my professional life, but my use was limited, fortunately. I have read several books on statistics as well as listened to Dr. Michael Starbird’s two DVD lecture series. I have listed them at the end of this post so that you may read them, consult them or ignore them. I am hardly an expert, but I do appreciate being able to understand what is meant when I hear statistics being used or misused, and I know enough to ask reasonable questions about hypotheses, data, sampling and drawing conclusions. Does this mean that I am equipped to cope well with our modern world? Maybe so, maybe no. There’s no fool like the one convinced that he/she cannot be fooled.
In the 12th lecture in Meaning from Data, Dr. Starbird comments: “Statistics is all about coming to conclusions that are not certain.” There is much about life that is uncertain, which is, of course, a truism but is not trivial. Statistics are used to assess future risk. Assessing risk, either for insurance purposes or building a nuclear power plant, is always an uncertain prospect, and when it is inadequately done, as has been made horribly clear in the case of the massive Japanese earthquake and tsunami, the consequences are potentially lethal. Not only is the future uncertain but the past is too: what happened in the past is the subject of endless books and articles on history, archeology, paleontology, etc., not all of which agree. Describing what happened in crimes is uncertain: memories of witnesses, if there are any, are often vague, and a large portion of the justice system would be rendered unnecessary if the past could be easily reconstructed (and our prisons even more over-filled?). The kind of certainty that Sherlock Holmes provides is exactly what you would expect of a work of fiction, nice, but illusory.
Since uncertainty is a given in the human condition, is it the environment to which humans have adapted by using measurement to find certainty? It does seem as if the driving force behind much of physical science from the 16th century until the beginning of the 20th century was to finally establish the certainty of the clockwork universe implicit in the work of Isaac Newton, a universe that Newton attributed to the ‘Creator’.
This clockwork universe started falling apart even before the work of Charles Darwin. The dissolution received a push from The Origin of Species, since it began to look less and less likely that evolution was under the direct control of a creator, or at least, that evolution could have occurred in ways that required a creator. But the apparent collapse of the clockwork universe was completed by Einstein’s relativity theories, the special and the general, and then by the work of the group associated with Niels Bohr. I say apparent since both relativity and quantum mechanics have carefully defined realms where they are effective models of behavior: relativity, when dealing with velocities that approach the speed of light; and quantum mechanics, the realm of atomic and sub-atomic matter. Newtonian mechanics explains most of what we experience in the macroscopic world pretty well, and is still used for the calculations guiding most space shots, though relativity figures into the way that the Global Positioning System works – but more on GPS in a later post.
Since statistics has been on my mind, I realize that the way that I think about the realms of quantum mechanics, relativity and Newtonian mechanics can be represented as a picture: using a standard bell curve, the normal distribution curve from statistical analysis, with two lines marking the extreme ends, the green is where quantum mechanics operates, the red represents the Newtonian world that is most of our day-to-day experience, and the yellow is the realm of relativity. The following picture is a picture only and was generated without reference to any actual data. Therefore, the green and yellow areas might have to be smaller, the red might need to be broader, but this representation was done to clarify a concept, and not to provide actual representations of the realms. Pretty, but like any model, it has lots of inaccuracies: the foundation for the red can be found in both the green and the yellow.
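For readers who like to see numbers, here is a small sketch of how the three regions of such a picture could be sized. It is only an illustration: the cut lines at 2 standard deviations on either side of the mean are my assumption, chosen for convenience, and like the picture they are not based on any actual data about the three realms. The function names are made up for this example.

```python
import math


def normal_cdf(z):
    """Area under the standard bell curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def region_areas(cut):
    """Split the bell curve at -cut and +cut (in standard deviations).

    Returns the areas of the left tail, the middle, and the right tail.
    The cut position is an assumption made for illustration only.
    """
    left = normal_cdf(-cut)
    right = 1.0 - normal_cdf(cut)
    middle = 1.0 - left - right
    return left, middle, right


left, middle, right = region_areas(2.0)
print(f"left tail (green):  {left:.4f}")
print(f"middle (red):       {middle:.4f}")
print(f"right tail (yellow): {right:.4f}")
```

With the cut lines at 2 standard deviations, the middle region holds roughly 95% of the area and each tail a little over 2%. Moving the cut lines outward shrinks the tails, which is the adjustment the caveat about the green and yellow areas has in mind.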
One concept that is part of the Copenhagen interpretation of quantum mechanics is Heisenberg’s famous uncertainty principle. The uncertainty principle has a very precise meaning in physics: the more precisely you measure the position of an electron, one of the sub-atomic particles, the less precisely you can know its momentum, and conversely, the more precisely you measure its momentum, the less precisely you can know its position. The two quantities, position and momentum, cannot both be measured with arbitrary precision at the same time. Because the only way to measure them is to use photons (like light but including the full spectrum of electromagnetic radiation) and the energy of short-wavelength photons is comparable to the energy of an electron, “The more an observer trie[s] to extract information about the electron’s position, the less it [is] possible to know about its momentum, and vice versa.”1
This represents the limit of measurement: anything that is the size of wavelengths2, whether light wavelengths or x-ray or gamma ray wavelengths, the last two being the shortest, can’t be measured with precision. To observe an object using photons, one must “bounce” the photons off the object, and then view the photons. If one tries to bounce a photon off an electron, sometimes it will be absorbed, sometimes it will knock the electron out of orbit, sometimes it might be reflected. But anything that can reflect the photon without being substantially affected by the interaction can be measured. In terms of the dimensions that this deals with, the uncertainty in the momentum of the electron multiplied by the uncertainty in its position can never be smaller than one half of the reduced Planck constant, the value of which is about 1.0546 x 10 to the minus 34th power joule-seconds. Things just don’t get much smaller.
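To get a feel for the sizes involved, the relationship can be turned into a few lines of arithmetic. This is only a rough sketch, not a physics calculation to rely on: it takes the limit (uncertainty in momentum times uncertainty in position is at least half the reduced Planck constant) and asks how uncertain the momentum must be for a given uncertainty in position. The position uncertainty of 1 x 10 to the minus 10th power meters, roughly the width of an atom, is my assumption for illustration, and the function name is made up for this example.

```python
# Reduced Planck constant in joule-seconds and electron mass in kilograms.
HBAR = 1.054571817e-34
ELECTRON_MASS = 9.1093837015e-31


def min_momentum_uncertainty(position_uncertainty_m):
    """Smallest momentum uncertainty (kg*m/s) allowed for a given
    position uncertainty (meters): hbar / (2 * delta_x)."""
    if position_uncertainty_m <= 0:
        raise ValueError("position uncertainty must be positive")
    return HBAR / (2.0 * position_uncertainty_m)


# Assumed for illustration: the electron is located to within about
# the width of an atom.
atom_width_m = 1e-10

delta_p = min_momentum_uncertainty(atom_width_m)
delta_v = delta_p / ELECTRON_MASS  # corresponding speed uncertainty, m/s

print(f"minimum momentum uncertainty: {delta_p:.3e} kg*m/s")
print(f"corresponding speed uncertainty: {delta_v:.3e} m/s")
```

Under these assumptions the speed of the electron is uncertain by several hundred kilometers per second. Run the same arithmetic for something as heavy as a billiard ball and the speed uncertainty becomes far too small to notice, which is one way to see why the principle matters at sub-atomic sizes and not in everyday experience.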
To use a particle analogy, if one tries to bounce a billiard ball off a ball bearing, the interaction will send the ball bearing flying, changing its position, changing its momentum. The result would be that maybe some billiard balls would come back, as if they had bounced off the cushion of a billiard table. Others would keep going, having sent the ball bearing flying, and some might just graze the ball bearing. The resulting “picture” of the ball bearing would be “fuzzy” at best.
In the book quoted above, Lindley says that Heisenberg originally used the German word for “inexactness”, but Bohr came up with the word “uncertainty”. Lindley provides more information on the use and misuse of words about this aspect of quantum mechanics, but it is the phrase “uncertainty principle” that concerns me.
The phrase has been taken as a metaphor and applied to all sorts of situations in the macroscopic world to which it is only marginally, if at all, applicable. A corollary of this is the claim that the observer affects that which is observed – possibly true in the situation of an anthropologist observing a group of people, who may or may not put on a “performance” for him/her, but not necessarily true of an archeologist finding human or animal bones in a prehistoric grave. There are some effects an archeologist could have, of course: accidentally running a shovel through a bone and shattering it, moving a bone so that its precise location can no longer be reconstructed, or, if the bone were to be carbon-14 tested, contaminating it by touching it with a bare hand.
The warning here that I want to issue is to be very wary of those who take a carefully worked out bit of a scientific model and apply it as a metaphor to a completely different realm. Another case in point is the theory of genes and evolution being turned into a model for the evolution of ideas, the meme theory. It may be helpful for a time, but could be misleading in the long run.
Trying to describe life based on the uncertainty principle calls for the same caution, even though it may seem “poetically accurate”. Yes, quantum mechanics uses probability in its model of the way that the sub-atomic world interacts, and yes, many macroscopic, human events use probability to deal with the future, with risk, and even in reconstructions of the past, but the two realms, sub-atomic and macroscopic, operate differently, and logic and reasoning techniques that work in one realm do not always make the leap across the gap successfully to the other realm.
In all probability, this gap is bridged successfully only 5% of the time. That statistic, by the way, comes under the heading of what my older daughter told me: 43% of all statistics are made up on the spot. (or was it 48%?) She may have been paraphrasing an old Peanuts cartoon.
While I make light of statistics, and I could probably provide you with more statistics being blatantly and humorously misapplied, there is some serious intent here.
The 11 February 2011 issue of Science magazine, the journal of the American Association for the Advancement of Science (AAAS), has a special section devoted to data. There are 3 articles under the category “News”, 11 articles under the category “Perspectives”, and a series of related informational articles about the challenges of dealing with the data overload that has been created. One article, which is available only online and requires either membership or payment for a copy, concludes that “We have recently passed the point where more data is being collected than we can physically store.”3 If you would like to look at the articles, there is public access to those in the two categories that I mentioned above at:
The first story in the News category describes what happens to data after the physicists move on to the next big collaboration – the data may be “misplaced”. The tale is of a physicist going back to review the data from an experiment he had worked on 20 years before, and discovering that the data had been scattered, the software to access the data was obsolete, and “…[o]ne critical set of calibration numbers survived only as ASCII text printed on green printer paper…”4 which took a month to reenter by hand. Eventually, several years were spent reconstructing the data, then using it in light of the most recent theoretical advances to provide material for a number of papers, including one that was cited in a Nobel Prize. This recovery effort was part of the inspiration for the formation of a working group at CERN called Data Preservation in High Energy Physics (DPHEP).
The second story in the News category describes two collaborations on visualization software, both of which are cross-disciplinary, bringing together medical people and astronomers. In one collaboration, medical software used to display MRI results in 3-D has been adapted to display massive amounts of astronomical data in 3-D, leading to “…new discoveries that are incredibly difficult to do otherwise, such as spotting elusive jets of gas ejected from newborn stars.”5 The other collaboration takes image-analysis software developed for astronomers to “…analyze large batches of images, picking out faint, fuzzy objects”6 and adapts it to automate cancer detection. It is a very exciting use of software that uses statistical techniques to deal with masses of data.
I have included mention of these stories because they indicate how important it is to be able to deal with data using statistical analysis – and ultimately how important it is to understand how to transform data into information. Since the central theme of these posts is measurement, statistics are critical to understanding many kinds of measurement, from economics to physics, from sports to carpentry, etc. I hope that I have convinced you of the value of understanding how statistics are generated from various kinds of data sets.
A little “bookkeeping” relative to the posts as a whole. In two earlier posts, Marking Time (or at least calibrating it) and Measuring Light, I mentioned Ole Roemer’s brilliant attempt to measure the speed of light by observing the moon of Jupiter named Io when it disappeared behind Jupiter and then reappeared. I recently received a photo from the Hubble Telescope through an iPhone app, Star Walk, that shows Io as it transits in front of Jupiter. It’s such a great photo that I’ve included it below:
The tiny dot in the center is Io, the dot to the right and down a bit is Io’s shadow on the surface of Jupiter, and of course, the background is Jupiter.
The other piece of bookkeeping is that in the post titled Science and Measurement, I discussed the work of Thomas Kuhn, in particular, his book The Structure of Scientific Revolutions, and an article that he wrote called The Function of Measurement in Modern Physical Science. I treated him as an authority on how science measures and moves forward from paradigm shift to paradigm shift. This past week, a blogger? guest columnist? smart guy who has written several very interesting series of articles for the New York Times, Errol Morris, wrote a series of 5 articles with the overall title “The Ashtray”. While not strictly about measurement, I learned much from the articles.
Errol Morris is, among other things, a film maker who won an Oscar for the best documentary in 2004 for his film, The Fog of War: Eleven Lessons From the Life of Robert S. McNamara. This set of articles, “The Ashtray”, caught my attention when I read in the first one that Morris had been a student of Thomas Kuhn’s at Princeton, and that in 1972, Kuhn had settled a debate with Morris by hurling an ashtray at him. The series goes on to discuss how Kuhn may not be the authority on scientific measurement that I thought he was. I enjoyed the five articles quite a bit, so here are the URLs for them:
I guess that unless one spends much time in the actual field in which a person has won their fame or notoriety, one does not know how much uncertainty there is about that person’s reputation.
With that last bit, I will end this post, which seems to have drifted from its original promise or premise, but contains a number of ideas that I believe are closely related to the subject of measurement.
A bibliography of a sort for the information in the post:
Hughes, Ifan G. & Hase, Thomas P.A., Measurements and their Uncertainties, A Practical Guide to Modern Error Analysis, Oxford University Press, Oxford, England, 2010.
Lindley, David, UNCERTAINTY: Einstein, Heisenberg, Bohr, and the Struggle for the Soul of Science, Doubleday, New York, N.Y., 2007.
Kachigan, Sam Kash, Multivariate Statistical Analysis, A Conceptual Introduction, Second Edition, Radius Press, New York, N.Y., 1991.
Sanders, Donald H., and Smidt, Robert K., Statistics: A First Course, Sixth Edition, McGraw-Hill, Boston, etc. 2000, 1995.
Starbird, Professor Michael, Meaning from Data: Statistics Made Clear, The Great Courses, Chantilly, VA, Course No. 1487, 2006. www.thegreatcourses.com
Starbird, Professor Michael, What are the Chances?: Probability Made Clear, The Great Courses, Chantilly, VA, Course No. 1474, 2006. www.thegreatcourses.com
1 Lindley, David, UNCERTAINTY: Einstein, Heisenberg, Bohr, and the Struggle for the Soul of Science, Doubleday, New York, N.Y., 2007. pp. 146-147.
2 See the post titled Measuring Light for a description of the sizes of light wavelengths.
3 “Dealing with Data, Challenges and Opportunities: Introduction”, Special Section, SCIENCE, AAAS, Washington, D.C., Vol 331, 11 February 2011, p. 692.
4 Curry, Andrew, “Dealing with Data, Rescue of Old Data Offers Lesson for Particle Physicists”, Special Section, SCIENCE, AAAS, Washington, D.C., Vol 331, 11 February 2011, p. 694.
5 Reed, Sarah, “Dealing with Data, Is There an Astronomer in the House”, Special Section, SCIENCE, AAAS, Washington, D.C., Vol 331, 11 February 2011, p. 697.
6 Reed, Sarah, “Dealing with Data, Is There an Astronomer in the House”, Special Section, SCIENCE, AAAS, Washington, D.C., Vol 331, 11 February 2011, p. 696.