Tuesday, September 29, 2009

What makes it science?

Many people draw a division between the hard sciences and mathematics on the one hand, and everything else on the other. The implication is that one side is "really" science and the other is not, a claim that tends to upset members of "soft sciences" like psychology.

This post explores the question of how justified this division is.

The starting point for my thinking is something I learned from the wonderful essay In Oldenburg's Long Shadow about the serial pricing crisis.

To understand the serial pricing crisis you must first understand the science citation index. This is nothing more or less than an index of how many times a given paper has been cited by other published papers. When you look at it, you find that papers that appear in some journals are consistently cited more often than others. This is a direct measurement of how influential the journal is, and leads to the impact factor. When you look at journals across math and the hard sciences you find that there is a hierarchy of journals. At the bottom you have low impact journals that publish only for a small niche. But key papers from that niche are published in more prominent journals that are looked at by people in a wider range of subjects. And this goes all of the way to the highest impact journal of all, Nature, which is where people try to publish the absolute best work across all of math and the hard sciences. It doesn't matter whether you're a physicist or a biologist, the absolute best research goes to Nature.

So from the science citation index we can measure the impact factor of a journal, which in turn tells us its value to researchers. The value to researchers told publishers what universities were willing to pay, and so publishers have been steadily increasing the price of the most important journals. This costs universities more than they are happy with, and so is called a crisis. Librarians call journals "serials" (because you get a series of copies of a journal), hence this crisis is called the serial pricing crisis.

Now let's look at this in reverse. Across all of math and the hard sciences it is possible to make a somewhat reasonable comparison of how important any given paper is, and how good any given journal is. Furthermore there are small groups of editors whose judgment is regularly trusted to compare the best papers from different areas of science and select which are worthy to be in their journals. In the extreme example, the editors of Nature are trusted to draw comparisons across all of the hard sciences. And for the most part, scientists agree with these decisions.

When you think about it, this is truly remarkable. It implies that there is a relatively well shared concept of relative value across all of the sciences. Of course people in the hard sciences seldom remark on it; it is just how things are.

To see how remarkable it is, compare with the humanities and social sciences. They have no such hierarchy. Instead of one grand hierarchy you get independent clumps of researchers who talk to each other but not the other groups. And they find this so natural that I have seen social scientists express disbelief that, say, a physicist in fluid mechanics can hear a key result in particle physics and will know that it is important. But it is true. If you ask one physicist, "What are the 10 most important results in physics in the last 30 years" and take that list to another, the other physicist will agree that those are all important. If you ask a psychologist for a similar list and take it to another, the other is likely to not even recognize many of the items.

What is going on here is a confirmation of one of Thomas Kuhn's key claims in The Structure of Scientific Revolutions: in a mature science (his term), researchers have come to share a paradigm about what would be progress. When a paradigm has become so compelling that virtually all researchers in the area accept it, then people who are not in that field can see the agreement that progress is happening. When no paradigm can compel general acceptance, then from a distance all that is visible is confusion.

Kuhn is careful to point out that within the field there will be groups of researchers who are doing good work and making progress. This is certainly true. For instance, I've brought up psychology. Yet if you read books like Parenting From the Inside Out you will find that solid research is being done that comes up with valuable information. (I highly recommend this book to anyone, parent or not, who is willing to work through it carefully.) But the problem is that the case for this line of research is not compelling enough to convince other psychologists that this is the right way to try to understand the mind. So from a distance there isn't a clear impression of solid progress being made.

Therefore the hard/soft science division boils down to shared paradigms. In the hard sciences certain lines of research have become so compelling that everyone agrees that they are the right way to go. Because of this agreement, people in nearby fields get a clear picture of what progress looks like in that field. With clear pictures of what progress looks like in multiple fields, the ground is set for making comparisons between fields, which has evolved into a reasonably well shared value system across the entirety of the hard sciences.

The soft sciences share none of this structure. As a result there is no shared agreement within the soft sciences about what is important, let alone a shared agreement on the relative importance of different areas of science.

To close I would like to illustrate how much shared agreement there is within the hard sciences about what progress looks like. I'll do this by giving my personal top 10 lists of scientific advances in each century since science began to take off in the 1600s. I haven't tried to put them in any particular order. (They often are somewhat chronological.) While people may quibble with some of my specific choices, people who are well versed in the hard sciences will generally agree on the importance of these items.

  • 1600s
    1. Objects of different mass fall at the same rate (Galileo, physics)
    2. Telescope used for astronomy (Galileo, astronomy)
    3. Kepler's laws for planetary orbits (Kepler, astronomy)
    4. Circulatory system accurately described (Harvey, biology)
    5. Microbes discovered (Leeuwenhoek, biology)
    6. Hooke's law of elasticity (Hooke, physics)
    7. Newton's laws of motion (Newton, physics)
    8. Newton's law of gravity (Newton, physics)
    9. Speed of light first measured (Ole Römer, astronomy/physics)
    10. Calculus (Newton/Leibniz, mathematics)

  • 1700s
    1. Lightning explained as static electricity (Ben Franklin, physics)
    2. Fluid mechanics began to be analyzed (Bernoulli, physics)
    3. Linnaean taxonomy system created (Linnaeus, biology)
    4. Halley's comet's orbit predicted (Halley, astronomy)
    5. Coulomb's law for attraction of electric charges (Coulomb, physics)
    6. Oxygen discovered (Priestley/Scheele, chemistry) leading to the rejection of phlogiston (Lavoisier)
    7. Uranus discovered (William Herschel, astronomy)
    8. Conservation of mass demonstrated (Lavoisier, chemistry)
    9. Stability of solar system confirmed (Laplace, astronomy)
    10. Gravitational constant measured (Cavendish, physics)

  • 1800s
    1. Fourier series discovered, used to analyze heat transport (Joseph Fourier, mathematics/physics)
    2. Ice ages discovered, theory of The Flood rejected (Louis Agassiz, geology)
    3. Central Limit Theorem aka The Bell Curve (de Moivre/Laplace/Galton/Lyapunov etc, statistics) different versions were proven at different times, and Galton was making good use of it years before it was finally proven in generality by Lyapunov
    4. Thermodynamics (many people starting with Carnot, physics)
    5. Conservation of Energy (Joule/Mayer, physics)
    6. Descent with Modification aka Evolution (Darwin, biology)
    7. Germ theory (Pasteur, biology)
    8. Atomic theory (Avogadro/Loschmidt etc, chemistry)
    9. Maxwell's equations of electromagnetism (James Maxwell, physics)
    10. Periodic table (Mendeleev, chemistry)

  • 1900s
    1. Relativity (Einstein, physics)
    2. Radioactive Dating (Ernest Rutherford/Bertrand Boltwood, physics)
    3. Quantum Mechanics (Heisenberg/Schrödinger, physics)
    4. Gödel's Incompleteness Theorem (Kurt Gödel, mathematics)
    5. Hypothesis Testing (Ronald Fisher/Jerzy Neyman/Karl Pearson/Egon Pearson, statistics) - Egon was Karl's son
    6. The Structure of DNA (Watson/Crick/Franklin, biology)
    7. Continental Drift (proposed by Wegener and confirmed by lots of people at once, geology)
    8. The Big Bang (Georges Lemaître/Edwin Hubble, astronomy) general acceptance followed the discovery of the CMBR by Arno Penzias and Robert Wilson
    9. Synthesis of the Elements in Stars aka B2FH (Geoffrey Burbidge/Margaret Burbidge/William Fowler/Fred Hoyle, astronomy/physics)
    10. Standard Model (Sheldon Glashow/Steven Weinberg/Abdus Salam, physics) tens of billions of dollars have been spent verifying this theory!

  • 2000s - There is likely more disagreement over these
    1. Human Genome Project
    2. Neutrino oscillation
    3. Rapidly improving knowledge of planets around other stars
    4. Poincaré conjecture solved
    5. Age of universe measured to within 1% accuracy (it is 13.7 billion years old)
    6. FOXP2 critical for language
    7. Preserved soft tissue from dinosaur?
    8. Stem cells from skin cells
    9. New family of high temperature superconductors
    10. Molecular evolution is irreversible

Monday, September 28, 2009

Teaching linear algebra

In a recent Hacker News post I made reference to an interesting teaching experience I had in the mid-90s. This is a longer explanation of the same.

I was a graduate student in math at Dartmouth College. I wound up teaching an introduction to linear algebra course that was also the first course where students were asked to do proofs. The class was somewhere in the range of 15-20 students. If I remember correctly, this was in the fall of 1996.

In preparation for the class I set myself goals around how well the students would learn the material taught. After some thought I settled on four ideas that I would use:
  1. Homework not present at the start of class would not be accepted. However students were only graded on the best 20 out of 27 possible homework sets.
  2. All homework sets were cumulative. Generally 1/3 was the current day's material, 1/3 from the last week, and 1/3 from anywhere in the course. Those thirds were in increasing order of difficulty.
  3. Every class would start with a question and answer session to last no less than 10 minutes.
  4. Every student could expect to be asked at least one question every other class.

These ideas may seem odd, but there was a method to my madness. Here is each idea explained.
  1. Homework not present at the start of class would not be accepted. However students were only graded on the best 20 out of 27 possible homework sets.

    The point was to make sure that class started on time, with everyone ready to pay attention for question and answer time. I also didn't want to deal with people doing homework during lecture, evaluating sick excuses, etc. The leniency of not having to turn in 7 homework sets compensated for the rigidness of the policy. And cumulative homework sets meant that I didn't have to worry about students not practicing any given day's material.

    This worked even better than I hoped. The downside was that I had an argument on the second day when someone came in 2 minutes late and was not allowed to turn in his homework. But the first complaint was the last, and the students liked the freedom to decide when something else took precedence over doing homework.

  2. All homework sets were cumulative. Generally 1/3 was the current day's material, 1/3 from the last week, and 1/3 from anywhere in the course. Those thirds were in increasing order of difficulty.

    This was the most important idea I wanted to try. I had long been aware that research on memory had demonstrated that when you're reminded of something as you're forgetting it, it goes into much longer term memory. As a result periodic review at lengthening intervals is very effective in increasing long term recall. A typical effective study schedule is to review after half an hour, the next day, the next week, then the next month.

    Now of course you can tell students this until you're blue in the face - but they won't do it. However when the study schedule is disguised as homework, they don't have a choice.

    This really seemed to work. What I noticed on tests is that students were noticeably shaky on material they had learned in the previous week, occasionally didn't remember stuff for a half-month before that, but absolutely nailed every concept that they'd first learned at least 3 weeks earlier. I credit the forced review schedule from cumulative homework sets for much of that.

  3. Every class would start with a question and answer session to last no less than 10 minutes.

    For me this was the most important part of the class. The questions that came up in this session were my opportunity to refresh people on what they were forgetting, and were how I kept track of what topics should come in for more review on future homework sessions. Given my knowledge of how critical review is to learning, I honestly felt that time spent answering questions was more valuable than lecture. As long as there were questions, there was no maximum on how much time I was willing to spend on this.

    Of course the challenge is getting students to ask questions. My strategy was simple: I told them that someone will ask questions and someone will answer them, but they don't want me to be the one asking questions. On the second day nobody asked me any questions and I had to demonstrate. I picked a random person and asked her to explain a key point from the first day's lecture. She couldn't. I asked another student the same question. Again difficulty. I asked if everyone was sure that they had no questions. Someone asked me the question that I had been asking everyone else. I answered the question, answered the follow-up, and the point was made. I never again had to ask a question during question and answer period. :-)

  4. Every student could expect to be asked at least one question every other class.

    My goal here was to be sure that every student was awake and following the lecture. It was never my goal to embarrass anyone or put them on the spot. To that end I developed a rhythm. Every few minutes I'd stop, say, "Let's make that a question," ask the question, pause so everyone could think through the answer, then ask a random person the question. I made sure to rotate people around so that everyone got their turn fairly.

    The questions I'd ask were always straightforward. They were things like, "What is the result of this calculation?" Or, "Why is this step OK?"

    I treated failure to get the answer as my failure, not theirs. If they couldn't get the answers then they weren't following the lecture, and I needed to slow it down, figure out the rough spots, etc. It might seem that the constant interruptions were slow. But I found that having everyone pay attention more than made up for it. The class as a whole moved as fast as any other class - but with far greater comprehension. And the interactivity made the class become very open about asking questions.

    As a bonus I managed to convince the entire class that taking notes was not worthwhile. I learned this lesson about math in first year undergrad. What you do is read ahead in the textbook. If you really want a set of notes, you can make them from the textbook before class. Then show up at class having read the day's material and ready to pay attention. Then if anything that the professor says doesn't make sense to you when you're paying attention and have already read the day's lesson, then ask the question then and there. If you don't understand it, then probably nobody else does either. Add to that periodic reviews, and you'll have a huge edge in any math courses.

    Nobody ever believes that that works. But this class had no choice because there is simply no way to take notes and pay attention at the same time. Which meant that the note takers couldn't answer questions. But within a few days they learned to not take notes, and I believe did much better for it.
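
The first two policies above can be sketched in a few lines of Python. This is only an illustration, not what I actually used: the function names are made up, missing homework is assumed to count as zero, and the rule for where leftover problems go when a set does not split evenly into thirds is an assumption.

```python
def homework_grade(scores, keep=20, total=27):
    """Average of the best `keep` scores out of `total` homework sets.

    Assumption: a set that was not handed in simply counts as 0.
    """
    if len(scores) > total:
        raise ValueError("more scores than homework sets")
    padded = list(scores) + [0.0] * (total - len(scores))
    best = sorted(padded, reverse=True)[:keep]
    return sum(best) / keep


def homework_mix(n_problems):
    """Split one homework set into (current, last_week, whole_course) counts.

    Roughly one third each. Assumption: any leftover problems go to the
    most recent material first.
    """
    base, extra = divmod(n_problems, 3)
    current = base + (1 if extra > 0 else 0)
    last_week = base + (1 if extra > 1 else 0)
    return current, last_week, base
```

Under these assumptions a student who hands in 20 perfect sets and skips the other 7 gets the same grade as one who hands in all 27, which is the leniency that made the strict deadline tolerable.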

So how well did this package work? As far as my goals were concerned, much better than I had dreamed possible. What really brought this home was the final exam. Based on class performance I drew up a test that I thought was a fair test of what I thought they understood. I showed it to some fellow graduate students. They thought I was crazy. They thought the class would bomb, and were willing to bet me on whether anyone would get the bonus question.

The class aced the test. That bonus question? 70% of the class got it. I don't remember what the bonus question was, but I do remember another one that I thought was cute. It went like this. Let V be the vector space of all polynomials of degree at most 2. a) Prove that d/dx is a linear operator on V. b) You can put a coordinate system on V by mapping p(x) to (p(0), p(1), p(2)). (Please imagine that flipped 90 degrees so it is a column.) Find the matrix that represents d/dx in this coordinate system. My fellow grad students got me worried that this might be too advanced for an introductory linear algebra course. But I needn't have worried - the only significant errors were minor arithmetic mistakes in the calculation. And I think I dinged someone for not having enough detail in the proof.
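
For anyone who wants to check their answer to part b), here is a minimal Python sketch that builds the matrix one column at a time. It was not part of the exam, and the function names are made up for illustration. The idea is that the polynomial which is 1 at one of the points 0, 1, 2 and 0 at the other two has a standard basis vector as its coordinates, so the coordinates of its derivative give the matching column of the matrix.

```python
from fractions import Fraction

NODES = [0, 1, 2]  # the evaluation points from the exam question


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def basis_poly(j, nodes):
    """Coefficients of the polynomial that is 1 at nodes[j] and 0 at the other nodes."""
    poly = [Fraction(1)]
    for k, xk in enumerate(nodes):
        if k != j:
            denom = Fraction(nodes[j] - xk)
            # Multiply by the linear factor (x - xk) / (nodes[j] - xk).
            poly = poly_mul(poly, [Fraction(-xk) / denom, Fraction(1) / denom])
    return poly


def derivative(p):
    """Coefficient list of dp/dx."""
    return [i * c for i, c in enumerate(p)][1:]


def evaluate(p, x):
    """Value of the polynomial p at x."""
    return sum(c * x**i for i, c in enumerate(p))


def derivative_matrix(nodes):
    """Matrix of d/dx in the coordinates p -> (p(node) for each node)."""
    n = len(nodes)
    columns = [
        [evaluate(derivative(basis_poly(j, nodes)), x) for x in nodes]
        for j in range(n)
    ]
    # The lists above are columns; transpose them into rows.
    return [[columns[j][i] for j in range(n)] for i in range(n)]


D = derivative_matrix(NODES)
```

With the points 0, 1, 2 the rows of D come out as (-3/2, 2, -1/2), (-1/2, 0, 1/2) and (1/2, -2, 3/2). As a sanity check, p(x) = x^2 has coordinates (0, 1, 4), and multiplying by D gives (0, 2, 4), which are the coordinates of 2x.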

Furthermore I was lucky enough to talk to some of my students about the experience a few months later. The general consensus was that the material really stuck. Furthermore nobody studied for the final. No joke. As one girl said, "I tried studying because I thought I should, but I gave up after a half-hour because I already knew it all." That is how I think it should be - if you study properly through the course, then you won't need to study for the final. Because you've already learned it. And you'll have a leg up on the next course because you still remember the material that everyone else has forgotten.

So were there any downsides? Unfortunately there were some big ones. I had set goals around learning. I failed to set any around happiness. Having to pay attention during class was hard on the class. Also it motivated them to work hard. Since everyone worked hard and they thought that I was going to grade them on a curve, there was a lot of frustration that they wouldn't properly be recognized for their work. (In fact I gave half of them A's in the end.) This frustration showed up in the teacher evaluations at the end of the course. :-(

Therefore if I had to do it over I'd ask somewhat fewer questions, hand out a lot more compliments, make it clear that I would not grade on a curve, and if they performed anything like that first class, I'd be even more liberal with good grades. Of course the point is moot since I've found myself profitably displaced from math to software development. But if anyone decides to replicate my experience, I'd recommend paying more attention than I did to those issues.