Chasing Pineapples – Part 1

In my column for NY ASCD, I considered the role of assessment literacy in education. Here on my blog, I want to poke at the idea in a bit more depth. And since this is my blog (“our” blog, actually; Theresa was a much more consistent writer than I was when we first started, and her reflections and thinking can be found throughout the archive), I’m going to draw a line between assessment literacy and the Common Core Learning Standards.

I have a favorite Common Core Learning Standard. I realize that’s a bit like saying I have a favorite letter in the alphabet, but there you go.

CCSS.ELA-LITERACY.W.11-12.1: Write arguments to support claims in an analysis of substantive topics or texts, using valid reasoning and relevant and sufficient evidence. (NYS added a sentence to this standard when the state adopted the CCLS: Explore and inquire into areas of interest to formulate an argument.)

Making the promise to NY students that we will do everything in our power to help them develop their ability to understand arguments, logic, evidence, and claims is, in my humble opinion, long overdue. In truth, I’m jealous that my teachers weren’t working towards this goal. I learned how to write a really solid five-paragraph essay in HS English, and it wasn’t until I was paying for my own education that I was introduced to the rules of argument and logical fallacies. Since I missed the chance during my formative years to explore this approach to discourse and discussion, I try to practice it as much as I can as an adult.

It’s my hunch that assessment illiteracy is having a dramatically negative impact on how we talk about public education. More to the point, I suspect that the same quirk that makes us fear Ebola more than texting while driving leads us to discuss and debate the state assessments with more energy, passion, and time than the assessments students see on a daily basis. My claim: when viewed as a data-collection tool mandated by politicians with a 10,000-foot perspective, the tests are benign. Their flaws and challenges are amplified when we connect them to other parts of the system, or when we view them through the same lens we use for assessments designed by those with a 10-foot perspective on student learning. When we chase the flaws in a test that takes up less than 1% of the school year, we end up chasing pineapples.
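(For back-of-envelope purposes, and these numbers are rough assumptions rather than official figures: a 180-day school year at roughly 6.5 hours a day is about 1,170 hours of school, and the state tests run on the order of nine hours of total seat time. Nine divided by 1,170 works out to around 0.8% of the year.)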

In the tradition of well-supported arguments, I want to focus on patterns more than individuals, and on a narrow, specific claim rather than a bigger narrative. (In other words, I’m not defending the tests, NYSED, Pearson, Bill Gates, APPR, or anything else.) The pattern across Twitter and in Facebook groups is a call for NYSED to release the NCLB-mandated tests so that the public (including parents) can judge their appropriateness, quality, length, use of copyrighted material, or whichever variable the person asking for the release wants to investigate. I absolutely support the Spencerport teachers’ desire to see the entire test, but a voice in the back of my head keeps asking, “Why? So what? What criteria are you using to determine if the test is any good?”

Last year, NYSED released 25% of the items, and a few bloggers shared their opinions about the quality of the items, but I haven’t been able to find any analysis of the released items against specific criteria. This is not to say such analyses don’t exist, just that they escaped my Google searches. This year, NYSED released 50% of the items, and the response has been that NYSED should release ALL of the items. Which, I suspect, is what NYSED wants to be able to do, but funding issues are preventing it from happening. I’ve been watching Twitter, hoping to see critical reviews of the released 50%, but instead there’s been lots of opining. Lots of “I” statements, not a lot of evidence-based claims. This, I suspect, is a side effect of assessment illiteracy across the board. We just aren’t any good as a field, much less as a collection of citizens, at assessing the quality of measurement tools.

So, what makes an assessment good? What makes it bad? Given the rules of quality test design outlined in the APA Testing Standards, why are we willing to accept the strength of the speaker’s opinion as the determining factor of quality? Is the quality of large-scale assessments a matter of opinion? I suspect anyone who has taken a large-scale test (from the driver’s test to the SATs) hopes that’s not the case. I know that numerous organizations, including the National Council on Measurement in Education, work to establish explicit criteria. The USDOE is instituting a peer review process for state assessments to ensure quality. PARCC is being as transparent as possible, including bringing together 400 educators and teachers to provide feedback. All of these groups use specific criteria to assess the quality of large-scale assessments. Yet in the media, social and traditional, one person’s opinion about “bad” or “hard” items is treated as if it’s the truth.

So, my confusion remains: if members of groups who do assessment for a living spend years establishing and sharing measures of quality for large-scale tests, what tools will the public use to assess that quality? How can we expect the general public to do anything more than “cardiac evaluation” (I know it because I feel it; not my phrase, I totally cribbed it from someone else) when the vast majority of classroom teachers receive little or no training during teacher prep in how to assess and evaluate assessments? When it comes to state assessments, is it more about chasing pineapples, making claims about a test’s quality, than actually catching them, supporting those claims with evidence?

And as I often do, I end up asking myself why it matters. If a parent says, “I think this item is too hard/developmentally inappropriate/unfair,” should that be enough to say that it is? How much of the science of the education profession actually belongs to members of the profession?
