Measuring Learning

Over the past year or more, I have read a number of disturbing stories of school districts firing teachers and closing whole schools for under-performing, as well as stories of districts that have found and fired teachers who contributed to cheating.  Both are unintended (or perhaps intended) consequences of No Child Left Behind (NCLB).  This has been a springboard for me to reflect on my own schooling: remembering teachers, tests and testing, and how and when I really learned.

As it has been quite a while since my early grade schooling (I finished grades 1 through 5 in 1956), I remember very little.  I have tried to recall my teachers – of the five homeroom teachers, I remember the names of only three, and can picture only one in my mind.  As to my attitude about tests, I don’t remember having one – that is, I probably did, but I don’t remember what the testing was like, and certainly don’t know if it was different from or the same as the testing today.  I do remember class sizes being small, and feeling like some of the teachers augmented my parents’ teaching and examples, not as surrogate parents, but as additional people who cared about and for me.

My sister has “helped” me in my recall effort.  Unsolicited, she sent me my kindergarten report card, a document that can only be described as embarrassing.  In it, I was reported as: “knowing [my] name, address and telephone number; counting to 25; knowing the days of the week; singing in tune; carrying out directions; and being kind polite and thoughtful.”  Oh boy!  No longer completely accurate – I can still count to 25 though – but what a set of metrics.  She also sent a couple of links to YouTube videos done by someone about my grade school containing their home movies from that time, with sentimental music and content that would only reach someone who had been there.  But none of this has helped to spark a recollection about testing and about teacher quality.

From 6th grade to 8th grade, I went to a junior high school, and remember a number of teachers from there.  6th grade homeroom was led by a teacher who, 4 years later, had a psychotic breakdown in front of my brother’s class, driven by the younger brother of one of my friends.  Oh well.  When I had him, he presented bogus, frightening, paranoid information that gave me nightmares for years.  I was and still am wholeheartedly in favor of the prodding that led to his breakdown.  He will remain nameless.

I assume that the testing that was done was largely multiple-choice and short answers, but I don’t have a way to verify that.  I do remember developing an aversion to tests at about that time, with the usual student sweaty palms, tightening of the chest, heart beat speeding up, etc., when it came time to perform even as insignificant a test as a short quiz.

The next year was equally awful, but for entirely different reasons.  I was having some relationship difficulties with my peers, most likely due to the onset of hormonal changes, so I dealt with them by pleading sick a lot and missing school as a result.  My homeroom teacher, whose name is engraved deeply in my memory, was alternately understanding and pissed off at me.  I did something early in the year, relative to the grading process, that set us up against each other.  He had a requirement that every two weeks, we had to turn in a book report on something we had read.  Innocently, I had begun reading War and Peace, and he mocked my ability to read it and understand it.  To cap off his mockery, he said that if I finished it and did a book report on it, he would let me miss the deadline for several weeks, and then give me credit for the entire year’s worth.  Well, I did finish, I did understand it, mostly, and I did a long book report on it that must have been sufficiently coherent for him to honor his challenge.

The result was that he made cracks about me “beating the system”, and how I should turn in more for the rest of the year.  Publicly.  But I was unashamed, and never turned in another one.  However, I did continue reading, and gave him evidence of that.  I missed tests, because I was “sick”, and made them up, with the usual sweaty palms.  And this was about the time that we began to have essays as part of the testing procedures.

His name, as I said, is deeply engraved in my memory, and he is perhaps the only teacher before high school that I would ever want to get back in touch with to thank.  When I think about the lessons I learned that year, they had more to do with my character than anything else: I feel that I have more understanding of others, and more “grit” for handling adversity as a result.  He never yelled at me; he mostly joked, mocked and cajoled me into being a much better and stronger person by the end of the year.  He is the only teacher I remember that I did not want to fail in my life after school, not because I wanted to go back and rub success in his face, but because I wanted to be able to show him how much he had helped me, whether intentionally or not.

I was comfortable with tests by the time of 8th grade, because I knew how to cram for multiple choice tests.  I was a good test taker because I had figured out the basics of the system, and had only to apply it to each teacher’s particular style or method.  Essays were harder, but I was fairly fluent and could provide enough relatively intelligent verbiage to get by.

After taking a stroll through my recollections of grades 1 to 8, I find myself questioning the sense of the NCLB act and its implementation.  I don’t believe that any of the teachers I had were really all that bad, with the exception of the 6th grade nut-case.  Would NCLB have highlighted his deficient psyche?  I doubt it.  Would any of the NCLB testing have identified the high quality of my 7th grade teacher?  I doubt that as well.

I feel a bit like Garrison Keillor describing Lake Wobegon, where all the kids are above average.  My teachers were all “above average”?  I doubt it, but I have rarely experienced teachers who entered the profession because it was their last choice job option: most appeared to be sincerely interested in teaching their chosen subjects, and more importantly, appeared to care about the students – well, with a few exceptions, but those kids were usually discipline problems or disruptive students – this was long before the days of Ritalin-drugging kids into a quasi-receptive stupor.

Multiple choice, short answer, and matching types of questions may have some validity for standardized testing, to see how much, if any, of the water in the trough has been drunk by the horses.  More than anything, such testing seems to me to be useful in making sure that students have been exposed to and are absorbing the “facts” which are important for the foundation of thinking, and also for diagnosing failures to absorb them.  However, it does not in any way test whether a teacher has neglected their duty to provide those facts.  I remember taking only three “standardized” tests, and none until high school, when I took the PSAT once and then the SAT twice.

A distinction must be made between the standardized tests and the tests I learned to take, with multiple choice, short answer and matching types of questions, that were created by my teachers.  In my cramming for these types of tests, I would go over my notes to see which “facts” were mentioned, to see what was important.  But to make sure that I had my understanding correct, once the important ones were identified, I would go to the textbook, because if I had to dispute a wrong answer, almost any of my teachers would have used the textbook either to show me the correct answer, or to accept the textbook answer over their own.  Needless to say, their answer almost always matched the textbook, but once or twice over the years, I was able to improve a test grade by showing how I had gotten an answer from the textbook that had been “incorrectly” marked as wrong.

With the standardized tests, the method was somewhat different, reading the preparation material, taking a big deep breath at the start, and just doing what I could.  This meant using knowledge of answers in many cases, and process of elimination and informed guessing in others, since there was no chance of an appeal, only the chance to try again, hoping to raise the score (which I didn’t).  But I was responsible for the results.  None of my teachers could have been individually faulted or targeted as being at fault for my mistakes or omissions.

A little background about NCLB.  I do have a collection of newspaper columns and articles that discuss some of the effects attributable to NCLB, but for the basic information, I relied on a book by Diane Ravitch that covers what NCLB is, how it is being used and what the effect has been on education.  The book is titled The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education.  The title gives away her perspective: she finds NCLB to be destructive of education.

NCLB was intended by Congress to make school districts and teachers accountable for the results of their efforts.  While this sounds like a laudable goal, the way that the bill was constructed and the way that it is being implemented have damaged school systems and education around the country.  The major points are:

  • Each state chooses the tests that they will use, defines three levels of performance, and determines what proficiency is for the tests.
  • Schools receiving federal funding must test English and Math proficiency each year for grades 3 to 8, and once in high school, and separate the test results by ethnic group, race, family income, etc.
  • All states must reach 100% student proficiency by 2013-2014.  Based on that, the states have set up timelines for achieving 100% proficiency in English and Math, and must show “adequate yearly progress”, AYP, based on their timelines, toward achieving the goal.
  • There are strict sanctions laid out for schools and school districts not reaching AYP each year, and very severe sanctions for not reaching the goal in 2013-2014.  The main reward was to have funding continued so that the school and school district could continue to function the way that it had.
  • All states must also participate in the National Assessment of Educational Progress (NAEP) standardized test on English and Math, delivered in 4th and 8th grades every other year.  The NAEP results are to act as an external monitor of the yearly progress.1

So, NCLB lets the states determine what the test contents are, and most likely, each state will use their own standardized test.  They determine what is considered “proficient”, and they determine their AYP.

While a standardized English or Math test, delivered with multiple choice, short answer, and matching types of questions may provide some insight into the proficiency levels of students, it is hard for me to believe that such a test will really determine how well a teacher has performed – this is too indirect, and there are far too many ways to “game” the system.  Additionally, there are many more factors that determine the performance of students on tests and teachers in classrooms than this method considers.  Yet careers are being destroyed, and workable schools dismantled based on NCLB, without any metric showing educational gains as a result.  In Ravitch’s book, she shows examples of schools and school districts that have registered impressive AYP while the scores on the NAEP have been flat, showing no gains and some losses over the same periods of time for those same schools and school districts.

There are a number of ways to look at NCLB to realize how flawed the whole concept is.  One is to realize that the method for accountability is based on that used by businesses regarding the performance of their sales people only.  If a company’s production line ran into significant difficulties that led to a decline in the quality of their products, but the company ignored the declining quality and only judged their managers on their sales people’s inability to sell “junk”, they would be addressing the wrong problem.  Likewise, if a real estate company were to fire their middle management because their sales people were unable to sell any property during the financial meltdown, again, this would be to ignore what the actual problem is.  But this is the way that NCLB is being used to make teachers, schools and school districts accountable.

The goal is to measure the proficiency of the students in two subjects: using a state-wide standardized test with multiple choice answers, the numbers are gathered, and if the numbers are too low, blame the teachers, schools and school districts.  One might expect other factors to be looked at, but apparently they are not.  The test is designed to measure an individual’s proficiency, yet rolled up, it is used to measure the effectiveness of teaching.

Often, a method of measuring is adapted from one discipline to another in a manner that misuses both the method and the results of measurement.  A prime example is described in an earlier post, which discusses Stephen Jay Gould’s book The Mismeasure of Man.  In what purported to be the same objective data-gathering techniques as were used in the physical sciences, the data turned out to be skewed in favor of the investigator’s biases, and the interpretation of the data was used to misjudge the potentials of people in various “sub-standard” ethnic classifications, and then limit their possibilities.

I alluded above to two problems: the ease with which the system can be gamed, and the additional factors associated with testing.  I first came across the former in the first chapter of Freakonomics: A Rogue Economist Explores the Hidden Side of Everything, by Steven D. Levitt and Stephen J. Dubner.  Based on the description in the book, Dr. Levitt was hired to use data from standardized multiple choice testing, gathered over a period of years at the Chicago School District, to identify which teachers had a high likelihood of cheating by entering answers for their students.  The negative incentives for teachers and schools whose students scored low were strict, and there were some positive incentives for high scores and improvements.  He was able to identify a number of teachers who had probably cheated; when the testing of their students was repeated with monitors present, none of the classes were able to repeat their results.  Those teachers were fired.

Entering answers for their students is a particularly egregious way to game the system; many others are more subtle, as covered in Ms. Ravitch’s book.  Methods that have evidently been used in response to NCLB include teaching only to prepare for the test; excluding other subjects and other material related to the test subjects; providing students with the answers; lowering the “proficiency” standard (34% right was mentioned as the number required to be considered a passing grade in one instance); restricting, at schools that are not public schools, the numbers of likely low-achieving students by setting up hurdles that they cannot jump; flunking the low-performers so that they are sent to a public school to depress the public school score; and even encouraging low-performers to stay home the day of the test.  She quotes something she calls Campbell’s law: “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”2

The final chapter in Ms. Ravitch’s book, Lessons Learned, describes what will not improve schools (NCLB is only one culprit), and provides a vision of what students should be capable of when finished with school.  It is not a prescription; rather, it is a guideline: “Students should regularly engage in the study of the liberal arts and sciences: history, literature, geography, the sciences, civics, mathematics, the arts, and foreign languages, as well as health and physical education.”3 While one might disagree with some of this, the goal of all education should be to turn out students who are self-reliant, able to think for themselves, and aware of the culture in which they will be functioning.

As a person who has some of those attributes, I am challenged to figure out where and how I learned these.  Certainly not from any of the multiple choice testing I was subjected to, though perhaps learning how to cope with that sort of situation might have contributed to my overall resilience.  I am certain that my teachers deserve much credit (even the nut-case gets credit for some learning) but also my family does, for they all have always supported me (or put up with me), as do my friends, my work-mates, and my good luck.

The one class that I have always valued the most, in terms of when I learned the course subject material better than in any other, was one that I took while I was in college.  The professor had a specific set of readings that had to be done, and a rigorous set of elements that had to be learned.  For some reason, this structure set me free: I ignored some of the reading that was uninteresting to me, but followed footnotes and references in the readings that were interesting.  In the end, my grade suffered, because I had ignored some of the elements I should have learned, but on the essay questions, for the first time in my life, I enjoyed myself responding to the topics, drawing on all of my digressions (though at the time I considered it research) for my answers.  The professor did not penalize me as much as he probably could have, but that was in part because I must have intrigued him.  A result was that once the class was over, the professor became a friend and mentor for the remainder of my years in college with whom I could discuss almost anything.

Would there have been any way to measure that kind of “success”?  If my grade had been looked at, would he have suffered for my inattention and “research”?  Would it have been fair to judge him on that?  Hardly.

The books cited in this post are:

Levitt, Steven D., and Stephen J. Dubner, Freakonomics: A Rogue Economist Explores the Hidden Side of Everything, HarperCollins e-books, 2005, 2006, Adobe Digital Edition September 2009.

Ravitch, Diane, The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education, Basic Books, New York, N.Y., 2010.

Both are worth the time to read.


1 This list has been adapted from that in Ravitch, Diane, The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education, Basic Books, New York, N.Y., 2010, pp. 107-109.

2 Ravitch, Diane, The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education, Basic Books, New York, N.Y., 2010, p. 171.  Her footnote for this quote is: Donald T. Campbell, “Assessing the Impact of Planned Social Change,” in Social Research and Public Policies: The Dartmouth/OECD Conference, ed. G. M. Lyons (Hanover, NH: Public Affairs Center, Dartmouth College, 1975), 35.

3 Ravitch, Diane, The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education, Basic Books, New York, N.Y., 2010, p. 242.
