Channeling David Cox

The Mr. Miyagi of questioning

"Mr. Buell, I don't get it."
"Get what?"

"I don't get the states of matter thing."
"Tell me what you do get."

"When things get hotter they usually expand."
"And why do you think that?"

"Well you showed us that ball-hoop thingie plus the balloon was getting bigger when we boiled water in the flask."
"What do you think that means?"

"I think it means the molecules are spreading out."
"And why do you think that?"

"We weighed the flask so I don't think heat is going in making it bigger......and then the dye spread faster in hotter water........" [we dropped dye in hot/cold water and watched it spread]
[...wait for it....]

"So when something melts it's really just the molecules moving around and not something different then?"
[...wait for it...]

"Nevermind Mr. Buell. I got it."

Massive Post on Common Formative Assessments

This one's long. Even by my wordy standards. Skip to The Prep and The Intervention if you just want to see what we're doing.

I'd like to say a huge base of research led me to start common formative assessments (also called benchmarks or interim assessments), but I think the results are mixed at best. "Works when it's done right" can be applied to almost every ed reform I've ever seen. Really there were two factors.

1. My kids.
Unless you're Steve Poizner, you don't look at East San Jose and think, "Isn't this where they filmed The Wire?" [1] On the other hand, we've got our own issues. Many of my kids don't come in with a lot of background knowledge or outside support. [2] Thus, their academic success is almost entirely dependent on my abilities as a teacher. If they don't get something at the end of the year, it's because I couldn't teach it in the right way for them to get it. I'm not the best teacher for every student. I don't want them to fail because I wasn't the right fit.

2. My teachers.
It's my 6th year. Here are the teachers I've worked with in 8th grade science:
Year 1: Mr B and Mrs. D
Year 2: Mr. S and long-term subs
Year 3: Mr. S and Mr. F
Year 4: Mr. S and we couldn't fill the spot so we dropped it and loaded our classes.
Year 5: Long-term subs, including a two month period where I taught every single 8th grader on a rotating schedule.
Year 6: Mr. L and a teacher who teaches a single section of 8th sci during my prep.
Stability hasn't been our strong suit. By my second year I was the most experienced teacher, so if you've ever wondered why I spend so much time reading blogs and twitter, now you know. You are all my mentors.

Common assessments are my response to those two factors. I needed to allow my students not to be limited by my teaching abilities and I needed to create some sort of stability in my department. Oh wait. I'm supposed to say something about test scores. Yeah. That too. If my principal is reading this (Hi Diane!) I did it to help us get to an 800 API.

Note: In California we use the terms Advanced, Proficient, Basic, Below Basic, and Far Below Basic. I'm going to use those here because we have a reasonably shared definition of them, not because I agree with the terms.

Grace, Greg, Dan and David all helped me out on this directly or indirectly. Hooray again for twitter/blog mentors. If you've read any of the DuFour books it's obvious we borrowed heavily from the PLC model.

(Edit: I should also mention the book Building a Professional Learning Community at Work by Bill Ferriter and Parry Graham. I let my principal borrow it in the spring and never got it back and I can't remember what ideas I got from it)

The Prep:
We're giving the benchmarks about every 6 weeks. We met the day before we started this cycle and will do so again at the beginning of the next one. We looked over the test, suggested some changes, and agreed on specific scoring criteria for the two short answer questions.

We give the benchmark on a Tuesday. Tuesday and Wednesday afternoon we just punch data into a spreadsheet that Greg sent me, then go home and take a look. On Thursday we meet and formulate a plan. Share what worked. Share what didn't. Write out some lessons. All that good stuff.

Friday, kids get their results back broken down by standard.

The Intervention:
Monday the kids line up at our doors and are called into different rooms based on results. They get targeted instruction. Right now our current benchmark is on four standards (matter and its properties, states of matter, structure of atoms, chemical vs. physical changes). Only two of us are teaching at any one time, so we're playing it by ear and will decide how to separate the kids once we see the results. As of now we anticipate four groups:

A very small group of kids who blow everything away. These kids we're planning to just set free during the week with whatever project they'd like. We figure some will want to perform their own investigations, some will want to serve as small group tutors, and some will want to create something (digital or something that goes boom or whoosh or boom then whoosh). Before you object, we're not ignoring them. They're going to be elbow deep in awesome.

Kids who are almost proficient to just barely proficient will get their own teacher.

Basic and Below Basic will get their own teacher.

Far Below Basic will get the third teacher who normally would be on prep. We're all going to be working for free through our prep this week, the part-time teacher is going 4 unpaid periods per day. Yeah, she's a champ. If you're in the Bay Area, offer her a full-time job.

Tuesday, line them up and call names based on the next standard. Wednesday repeat. Thursday repeat. Friday, same benchmark a second time. We keep a graph of the class results on the bulletin board.

Kids who don't score at a proficient level on the second one come after school for small group help until they are caught up. [3]

Monday, back in our normal classes to start the cycle all over again.

The Discussion:
Pretty much everyone I know hates their benchmarks. I get it. The math teachers and English teachers (district-mandated) hate theirs too.

Here's the key: The three of us have 100% control over our common assessment. Our district and our admins don't touch it.

We wrote our assessments. Yours suck because you bought them from Pearson and they don't align to the standards in your classroom or your state. I mean, they say they align to the state standards. But they don't ask the questions in the same way or at the same depth as your state test and there's just no way your class standards are in there so the information you get is pretty meaningless.

We all have hard and soft copies of the assessments we can refer to. How in the name of Zeus's butthole am I supposed to get any meaningful information if I'm just randomly guessing at what actually will be on the benchmark? Crazy idea: How about instead of guessing and surprises, I look at what's being assessed and target that. That way I can see if my methods were actually effective instead of seeing whether I guessed right.

No stakes. I'm not worried teachers will just give out answers or "teach to the test" because we own it. There's no admin looking at our test scores and hinting that we need to step it up because—even though you've got all the English language learners and we stick the trouble kids in your class because you're good with those kids—your test scores are 3% lower than the school average.

Fast turnaround. Our math and ELA departments get their benchmark results anywhere from 4 to 6 weeks after taking it. Useless. We're going from test to plan in 48 hours.

Focusing on a few key ideas. Our math department benchmarks sometimes have 1 or 2 questions per standard. You can't get any useful information from that, especially on a multiple choice test. We went with four key standards tested with 20 multiple choice questions. Five for each standard, with the questions getting increasingly difficult (i.e. the first question on states of matter is the easiest, the fifth question is the hardest). We added two short answer questions focusing on the two thinking skills we're emphasizing right now. Benchmark #2 will be about 80% current unit and 20% review. We're sticking with 5 questions per standard, so some of our benchmarks are slightly longer or shorter depending on how many standards we've been working on for that cycle.
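For the spreadsheet-averse, here's a rough sketch of how the "broken down by standard" results could be tallied from the 20 multiple choice answers. This is not the spreadsheet we actually use. The function name, the answer format, and especially the assumption that the questions sit in four blocks of five (one block per standard, in order) are all mine for illustration.

```python
# Assumption: questions 1-5 cover the first standard, 6-10 the second, etc.
# The real benchmark may order its questions differently.
STANDARDS = [
    "matter and its properties",
    "states of matter",
    "structure of atoms",
    "chemical vs. physical changes",
]
QUESTIONS_PER_STANDARD = 5


def score_by_standard(answers, key):
    """Return {standard: number correct out of 5} for one student.

    answers and key are equal-length sequences of letter choices.
    """
    expected = len(STANDARDS) * QUESTIONS_PER_STANDARD
    if len(answers) != expected or len(key) != expected:
        raise ValueError(f"expected {expected} answers and {expected} key entries")

    scores = {}
    for i, standard in enumerate(STANDARDS):
        start = i * QUESTIONS_PER_STANDARD
        end = start + QUESTIONS_PER_STANDARD
        # Count the questions in this standard's block that match the key.
        scores[standard] = sum(
            given == correct
            for given, correct in zip(answers[start:end], key[start:end])
        )
    return scores
```

Feed it one student's answers and the key, and you get back a count out of five for each standard, which is the piece a kid needs to see on Friday.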

We can respond to conditions on the ground. One of the science teachers had some personal things come up and had to miss some time. He won't be able to teach what's in the benchmark by the date we had scheduled it. We moved the date back. 

Without admin or district support, we're putting in a whole lot of extra time. No early release or paid subs. We have zero collaborative time built into our school year. No fancy electronic scoring so it's all hand coded. I might need to just take a sick day and spend it entering scores and looking at the data.

I'm not sure how sustainable this will be. On the other hand, if we can get all of our students across the stage at the end of the school year, it'll be worth it.

My required plug for standards-based grading.
With department-wide standards-based grading, the benchmark test itself is unnecessary. We would already have the necessary data in our gradebook and would be able to team up as needed. I get excited just thinking about how great it would be to have an ongoing stream of data to compare with other teachers. I don't have that right now. 

However, SBG still lends its own brand of awesome. Take a look at the three steps a student might go through. First, the student has an opportunity to learn in class. Second, in a targeted and leveled remediation. Third, after school. Three different levels. In standards-based grading, it doesn't matter when you learn it, as long as you learn it. Carlos drops major knowledge on the first benchmark and has an A in the class already. Brenda has some issues with states of matter, goes through the week of intervention and learns what she needs to learn. Brenda now has an A too. Mikey sleeps through both step 1 and step 2. Now, he's coming after school. Every week. With me. Special time. Mikey finally learns something because he's sick and tired of seeing my striped polo shirts every day. Mikey knows that it's not acceptable NOT to learn. Mikey also knows that I believe he can learn and that I won't give up on him. And yes, he's got his A. Mikey likey.

Am I making every kid come after school until they've got an A? Nope. No Ds or Fs would make me plenty happy. But you know what? If you've gone from an F to a C, all of a sudden, it's not too far to an A.

In case you're wondering how the benchmark fits into the actual gradebook, we've decided on two things:
  1. The benchmark score will be reported but not computed into the grade. We're just going to manually input it into the comments section of the report card.
  2. We all have our own system of grading (don't get me started) but we've agreed on a policy that if you pass the benchmark, you can't fail the class. I admit this one wasn't my idea. I worried that once you attach a grade you start worrying about cheating, and I always hate using a single assessment for anything permanent. However, it was pointed out that a student who can pass the benchmark demonstrates a minimum level of understanding and a few basic precautions can minimize cheating.
Here's a copy of the first benchmark in case you're curious. As an aside, writing the benchmark helped me empathize with the writers of the state tests. You kind of have to make it boring and vanilla. Whenever you introduce anything interesting, you have to worry about kids being freaked out by the strangeness and screwing with your data. It has to be straightforward and cut right to the point. Not to say that this is a good thing, but it was interesting for me to experience that.

Let me know if you have anything you think I should change/add/remove. What works for your school's benchmarks? What doesn't work? What would you change to make it work?

1: If you're not in California, Poizner ran in the GOP primary for governor. He wrote a book called Mount Pleasant which is the high school my kids feed into. It was his reflection on teaching (a single period, one semester) in the school. It was less than glowing. Our neighborhood does not smell like garbage and high school seniors aren't menacing anyone. On the flipside, everyone really does call the school Mount Pregnant. He's an outsider though so....not cool.
2: Just to be clear, the parents are supportive. They just often lack the ability, resources, or time to help their child at this point in their schooling. We often confuse unable with unwilling.
3: To answer Matt's question, reassessment is optional. Learning is not.

A group of science teachers on Twitter have created #SciDo. In its current form, it's a shared Google Docs folder where science teachers upload lessons to share with others. Mike Ritzius has been the primary force here but I don't know the history behind it. It's like BetterLesson but with actual lessons in it.

There's a Flickr group for science pics. It looks like they're also forming a student blogging network. There's also talk of setting up mentoring programs for new teachers on Twitter and creating a video tutorial library. Sounds cool and definitely the more the merrier. 

If you're interested, go to to request GDoc access.

EngDo has also sprung up for the English teachers. My GDoc folder shows something called MathDo as well but it's currently empty. If I find out the status of that I'll update this post. I'm interested in seeing how this evolves.

Aside: I have often lamented the fact that many science teacher bloggers are actually covert edtech bloggers. They don't blog about science teaching. Every post is "101 ways to use WallWisher" and how PollAnywhere revolutionized their lectures. It's not my bag of chips so I rarely follow those blogs or the developments in the world of SMS response systems. And yes, I'm fully aware that I'm a science teacher and I have never actually blogged a science lesson.

However, SciDo was started by a group of teachers so I'm totally supportive of that.

Compare and Contrast with SET

As a department we've decided to focus this trimester on two thinking skills: Classifying and Compare/Contrast. All this really means is 1) It's going on our common assessment and 2) We're planning to throw those two things at the kids whenever possible. My kids are fresh off 7th grade life science, which is all about classifying (angiosperm or gymnosperm? eukaryote or prokaryote?) so they're pretty strong on that. Asking them to actually create their own criteria usually throws them off for a while but they get the hang of it pretty fast.

Compare and contrast is a different beast. I'm not sure why, but it's something we've always struggled through. This year, I'm using the game SET to introduce it.

SET is pretty common in math classrooms. They even have a resources-for-teachers page. I'm sure Sue VanHattum has a few extra decks around her living room.

In the interest of not spending $96 on 8 decks of SET I used Keynote '08 to make a version. I'm just going to color print and laminate them. I'm not a designer in any way so if you're interested in making a better version I fully support that and will post yours.

I was too lazy to figure out how to make the squiggly character so I made rectangles instead. I also changed the colors.

Edit: Scribd is pretty much the worst thing ever so if it's not showing just click through and it'll work. I had to zip the keynote file because I can't figure out how to get or dropbox to share them. Here it is if you want to edit.

Mine prints out in landscape automatically.

Here are the rules for the uninitiated. It's the simplified version. Normally there are different shadings (striped, solid, outlined).

  1. Nine cards are placed face up on the table.
  2. The students take a look and yell out "Set" when they see a set. 
  3. The student who yells set has a few seconds to pick up his/her three cards.
  4. That player gets 1 point. 
  5. Three more cards are laid face up on the table.
  6. If everyone agrees there are no sets on the table (really rare, but students have a hard time seeing them at first) then three more cards are put out. These are not replaced when depleted.
  7. Play ends when the deck is depleted. Most points wins.
If you're playing the full version, 12 cards are on the table.

What's a set? Each card has three features: Shape, Color, and Number. In order to make a set, each feature must be the same on every card or different on every card. If you go try out the Daily Puzzle it makes a lot more sense.
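If the rule reads better as code, here's a small sketch of the check for the simplified three-feature version. The card format (a tuple of shape, color, number) and the example values are my own stand-ins, not anything from the actual deck.

```python
def is_set(cards):
    """Return True if three cards form a set.

    Each card is a (shape, color, number) tuple. For every feature,
    the three cards must be all the same or all different.
    """
    if len(cards) != 3:
        raise ValueError("a set is always exactly three cards")

    # zip(*cards) groups the three shapes, the three colors, the three numbers.
    for values in zip(*cards):
        # Exactly two distinct values means two cards match and one doesn't.
        if len(set(values)) == 2:
            return False
    return True
```

One distinct value means "all the same," three distinct values means "all different," and two is the only way a feature can break a set.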

I had written out a whole tutorial but found these screens here, which are much better. These are a set:

These are NOT sets:

Remember, I'm not using shadings.

It helps if students go through three questions:
  1. Are they all the same shape or are all different shapes?
  2. Are they all the same color or all different colors?
  3. Are they all the same number or all different numbers?
So here's the flow.

First day I introduce the rules and just let them play. Although the box says it's suitable for ages 6 and over, the thinking behind this is difficult at first so it'll take a while for kids to get it.

Second day we play a little more. Then I formalize it. I like to use a comparison matrix for compare/contrast. Venns don't really do it for me although bubble maps are alright. [1] After a little play I put this on each of their tables:

Mid-game they stop and take one set (3 cards). They put one card over each spot on the top then fill in the boxes for each attribute. In the far right box they write a sentence, "Card 1 has a squiggle, Card 2 has a........All of the cards have different shapes."

After the game is finished I ask them to shuffle the deck and pull out three random cards. They fill out the comparison matrix again and use it to decide if it's a set or not.
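Here's a rough sketch of what filling out the comparison matrix amounts to: one row per attribute, the value on each card, and a summary sentence for the far right box. The tuple layout, the function name, and the exact sentence wording are assumptions of mine, not the handout itself.

```python
FEATURES = ("shape", "color", "number")


def comparison_matrix(cards):
    """Build one row per feature for three (shape, color, number) cards.

    Each row is (feature, [value on card 1, card 2, card 3], sentence).
    """
    rows = []
    for index, feature in enumerate(FEATURES):
        values = [card[index] for card in cards]
        distinct = len(set(values))
        if distinct == 1:
            sentence = f"All of the cards have the same {feature}."
        elif distinct == len(values):
            sentence = f"All of the cards have different {feature}s."
        else:
            sentence = f"The cards are not all the same or all different in {feature}."
        rows.append((feature, values, sentence))
    return rows
```

Three random cards make a set only if none of the rows ends up with that third kind of sentence, which is the decision the kids make from their filled-in matrix.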

I'll update this post when I actually do it. I'm looking forward to it. I really just want something I can point at. When I want a kid to compare/contrast something I want to be able to just point to the SET deck or say, "Remember that card game we played?" and have that memory do the work. It turns out "something I can point at" drives a lot of my instructional decisions.

Update: Here's a full version in pdf and keynote. I wrote a followup post too.

1: The problem with Venns is they're not forced to compare specific attributes. I get things like, "Dogs bark. Cats like tuna." umm...ok.

