Tuesday, July 6, 2010

Picard, not Data

Riley and Matt have told you before that it's a bad idea to average. Ken O'Connor will tell you that mindless number crunching is one of the cardinal sins of good assessment practice.1

But why?

Here's a story: I'm driving in my car. I check my odometer and I've just gone 25 miles. I check it again and I've now gone 50 miles total. I check again and I've gone 75 miles. I stop when I've gone 100 miles. If I take the average of each time I checked my mileage, I get 62.5 miles.

Totally unrelated story: I'm taking a test. The first time I get a 25%. I take it again and I get 50%. The next time I get 75%. Finally, I get 100%. If I take the average of each time I took that test, I get 62.5%.
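The arithmetic is easy to check. A minimal sketch (the scores are the ones from the story above):

```python
from statistics import mean

# Scores from four attempts at the same test, in order.
scores = [25, 50, 75, 100]

average = mean(scores)  # 62.5
latest = scores[-1]     # 100

print(f"average: {average}")  # 62.5 -- misrepresents where the student is now
print(f"latest:  {latest}")   # 100  -- the actual current level of mastery
```

The mean is correct as arithmetic and wrong as a report: the student's current level of mastery is the last reading, not the average of the whole trip.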

Learning is a journey. You cannot average different stages of the trip in any meaningful way. Not only is it an inappropriate use of averaging, but it also sends the wrong message. It tells students that the 100% they got the last time was nothing more than experimental error. It dismisses the growth that has happened.

I usually disapprove of number crunching for grades in general. But I understand that some people are required to do it.

So when can you average?

Going back to my car trip: I stop my car and get out. I look at the odometer. I take a GPS reading. I check the road signs. I check my map. I've now got four different measurements for how far I am at this exact moment.

Multiple measures for standards-based grading are good. It is in fact a requirement that you take multiple and varied measurements in any good assessment system. Ideally these would all occur at the same time, but realistically they'd be within a few days of each other.

In this case, it is acceptable to average your results as long as you don't do it mindlessly. Not all assessments are created equal. I wouldn't even think of averaging my GPS results with the ones I got by using a ruler and a map.

If you have to average multiple assessments, they should meet two criteria:

  1. The assessments all need to be quality measures of the learning goal. A lab called "Measuring Motion" isn't a valid assessment of that learning goal just because it's got it in the name. Check every assessment against your learning goals. Make sure you're assessing what you think you're assessing.
  2. The assessments all need to measure the same point in the learning progression. Usually this means temporal proximity. Don't average two assessments that occurred three weeks apart.
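The two criteria above can be sketched as a filter that runs before any averaging. Everything here is hypothetical (the record layout, the alignment flag, and the seven-day window are all made up for illustration); the alignment flag stands in for your own check of each assessment against the learning goal:

```python
from datetime import date
from statistics import mean

# Hypothetical records: (name, aligned-to-goal flag, date given, score).
assessments = [
    ("Measuring Motion lab", False, date(2010, 5, 3), 80),   # title match only
    ("Quiz 4",               True,  date(2010, 5, 24), 70),
    ("Exit interview",       True,  date(2010, 5, 26), 78),
    ("Unit 2 test",          True,  date(2010, 5, 1),  55),  # weeks earlier
]

window_days = 7  # criterion 2: only average measures taken close together
anchor = max(d for _, aligned, d, _ in assessments if aligned)

usable = [
    score
    for _, aligned, d, score in assessments
    if aligned                              # criterion 1: measures the goal
    and (anchor - d).days <= window_days    # criterion 2: same point in time
]

print(mean(usable))  # averages only Quiz 4 and the exit interview -> 74.0
```

The "Measuring Motion" lab is dropped by the first criterion and the weeks-old unit test by the second, so only the two recent, aligned measures get averaged.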
Here's where averaging gets really tricky:

The criteria must be evaluated on a per student basis. 

Not every assessment is a quality assessment for each and every student. The time span it takes to render an assessment obsolete varies by student. This relates directly to the statement by Chris Ludwig I quoted in my last post.

Your grades come from weighing the total body of evidence you've gathered against the standards you've set and communicated. Use averaging if it will help you make a better decision but don't let it make the decision for you.

To quote @johntspencer: "A simple glimpse at Star Trek reminds me that Data is meant to inform rather than drive." [source]

Data is useful. Data is good for advice. But Picard is the captain. Be the captain. Don't mindlessly average.

1: O'Connor says that if you must use mean, also take a look at median and mode to see if the mean is giving you a true picture of mastery.
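O'Connor's check is easy to automate. A sketch with made-up scores for one student on one standard (an early stumble followed by consistent mastery):

```python
from statistics import mean, median

# Made-up scores: one rough early attempt, then steady mastery.
scores = [40, 85, 90, 95]

print(mean(scores))    # 77.5 -- dragged down by the early attempt
print(median(scores))  # 87.5 -- closer to a true picture of mastery
```

When the mean and the median disagree this much, that's the signal that the mean isn't giving you a true picture.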

Data image from: http://upload.wikimedia.org/wikipedia/en/0/09/DataTNG.jpg
Picard image from: http://upload.wikimedia.org/wikipedia/en/6/6d/JeanLucPicard.jpg

Post publishing note: This was probably the first post all summer where I didn't link to Shawn's blog. I publish this, check my Reader.....and he also has a picture of Data! I swear, we're not the same person. He's much cooler than me. Literally. He curls in his backyard.


  1. Let's just be honest. We are the SBG Borg. Resistance is Futi... I mean, met with compassion and explanation.

  2. We've now been assimilated in the blog roll.

  3. You had me at the journey digression. Nicely done. Where does averaging come in with respect to determining an overall grade? It'd be moot if we all used SB reporting, but most still have to somehow distill it to A-F.

  4. I think you've hit the point where you could just stop marking. Period. The numbers actually carry less information than you already have, so why use them?

  5. @David Agreed. I think anything we do to distill it down to a single score is going to be an ugly kludge. I've written about what I do before, but I don't have a problem with averaging final grades, since at that point they're summative and no longer a moving target. I'm pretty sure Shawn establishes a few topics as must-haves and then averages the rest. It's up to you/dept/school to decide on something that makes sense for your context, as long as there's a solid justification for what you're going with and you aren't just number crunching. If your school/district has established a single method for converting to an overall grade for all subjects/grade levels, you're definitely on the wrong track.

    @meandthedoor I definitely agree that whenever you can offer feedback, without a numerical or letter grade, you should. That's one of my reasons for entering in grades by time, rather than by assignment.

    The larger issue that perhaps should be an entire post on its own:

    Philosophically, I do think there's value in a grade as a useful measure to track personal progress. I am not an "all grades are bad and can only be used for evil" person. It doesn't have to be a grade per se; you could use a giant goal thermometer like in Breakin' 2: Electric Boogaloo, but a numeric grade works. I think the more abstract the learning is, the more you need some sort of external measure to see your progress. For writing, I can compare previous papers, or in art I can compare my old artwork. It's not quite as straightforward to compare my improvement in solving quadratic equations without some sort of external indicator.

    Basketball example: I need feedback to improve my shot. A coach will guide me and I could video tape myself to compare my form as I go. But ultimately I would like to know that my shot percentage has gone up. The grade in this case wouldn't cause my improvement, but I'd need it to know if I was actually improving.

  6. Once upon a time (this was before the Internet but after Paris), when I was working for a major national laboratory, we needed to know the density of liquid hydrogen. We didn't have a CRC or couldn't find it therein, so we made a circuit of the halls, asking scientists and engineers what they thought. The distribution of answers was interesting (You should try this! Don't look it up! What do you think the answer is? Ask your smart friends!) BUT averaging would have been a very bad idea.

    —Tim Erickson

  7. Love it! I'm going to put a Captain Picard picture somewhere in my class - just for quiet inspiration!