The text below was shared by a colleague on the DATAG listserv. It's a worthwhile read for its calm and rational approach to the (pending) uproar about the changing scale scores.
Colleagues, 
I’ve been working with many teachers who are discouraged at the news that SED will raise cut points this year and thrust schools back into embarrassing situations because groups won’t make AYP. As we read the rhetoric from newspapers and politicians who hammer public schools with glee, it is easy to feel like teachers are the whipping boys and girls for so many of society’s problems. The move to value-added assessment and changes in teacher accountability lend credence to this interpretation, since increasing teacher accountability is being sold nationwide as the key to school improvement. If we accept the tone of the US DOE, longer tests are offered as a means to produce better data sets so we can identify our weak teachers and terminate them before they do more harm. Yes, I’m simplifying, but I’m reminded that DATAG began as a group dedicated to making appropriate use of data to inform professional development and school improvement without playing a blame game. We have to reengage in that effort every year.
My work with DATAG and with local teachers takes quite a different tack, one that is more positive and recognizes how much success New York has experienced in improving student achievement since NCLB began. I want to reframe the dialog around testing and raising cut points: it is a reflection of our success rather than evidence that we have been ineffective or settled for low performance all along.
First, let’s recall that our standards are criterion-based, not norm-referenced. That difference is important here. We have, however, treated our tests as norm-referenced in the ways we have allowed outsiders (and ourselves) to report them. The development of our standards came with extensive work by experts and with teachers who carefully described what they thought a student should do at a given grade level and subject, based on our standards documents. So, we define the kinds of achievement in ELA, for example, that a student in grade 8 should be able to demonstrate by the testing date in a given year. When we had a passing rate of 50% in the early years (forgive me for not looking up the statewide level 3 proficiency rate when this began, but it was dismal), we were embarrassed and we worked very hard to improve that rate. And, so that we were comparing students on tests of the same difficulty level from year to year, SED and the test developers worked hard to create tests that held the proficiency scale score at 650 for each succeeding year. That process was designed to hold that proficiency level constant even as the tests changed each year. What that grade 8 student had to know to reach 650 was judged to be equivalent from year to year. Folks who understand the IRT models (like our DATAG colleague and frequent presenter Kathy Feller) help us to understand that a 650 in 2004 is the same as a 650 in 2010.
Today, we see the proficiency rates of students and  schools rising every year.  Since we have different questions on the  test every year, we can’t cheat to produce this improvement.  Have our  tests tended to concentrate on some standards  more than others?  Yes, but that is because we have too darn many  standards and some are more important than others, on which most of us  can agree.  The point I want to make is that as NY educators, we have  done a good job of increasing student success in  becoming proficient on the skills we developed, and at the level that  we agreed was appropriate for each grade level.  We should be proud of that, and there should be regular  statements to that effect in recognition of our success.
Let me use a very simple analogy. Think of your gym classes and teaching students to jump over a bar. In grade 3 we want kids to jump over a bar that is 18” high. In grade 4 we raise it to 22”, and by grade 8 kids are jumping over a bar that is 40” high. When we began, our kids were not used to being asked to jump, so many of them couldn’t get to 40” by grade 8. Today, most do, since we’ve done well in getting them in shape. Now, we’ve decided that our initial, well-designed 40” grade 8 target is too low to ensure that students are ready for the real world and high school, so we raise the bar to 44”. That will cause kids to fail, but it sets the whole bar higher and we have to readjust.
Think sports. When I was in middle school the world high jump record was 7 feet. New methods of jumping and better coaching have pushed the record to 8 feet 0.46 inches. So if you set a goal in 1956 to be the best in the world, you targeted 7 feet, and if everyone you coached reached it you would be the best coach in the world. A decision to use 7 feet as your target would not be good enough today because 7 feet is no longer world-class.
Back to our cut points. Our leaders have decided that our goals are no longer world-class. When we set them, they were among the hardest in the nation. The national norm equivalent score for level 4, when we tested only grades 4 and 8, was the 96th to 98th percentile, which CTB acknowledged if you reached the right folks to talk to. We know the norm scores for each question because the questions came from nationally normed samples. [Note: norm sampled is not the same as norm-referenced. Norm sampled means the questions were tried out on students across the country before appearing on the final test. Norm-referenced refers to how scores are reported.] So, a level 4 student was better than 96-98% of students in the country.
And the level 3 kid had to hit the 68th percentile, which is a full standard deviation above average. We no longer have national norms available for our questions, so we can’t make those statements today. But while many states reduced their test difficulty or their cut point to improve NCLB passing rates, New York did not. We held to a 650 cut point for proficiency and we improved year after year. This was a significant success, but now it has been decided that our targets, though higher than those of most states, should be raised.
I wish that the dialog over raising cut points had focused on the original high standards and on our progress toward reaching them. Were we all praised for the progress we have made, we could more enthusiastically accept the idea that the world around us keeps raising its standards and we want to stay ahead of that rising tide. Let’s applaud our progress and reset our standards so that we are encouraged to continue our improvement.
This kind of dialog is not impossible to achieve. I hope leaders who roll this out can be more cognizant of our successes as they raise our cut points. We should explain this as a next step arising from our success, not as a response to deficiency, bad teaching, low standards, etc. After all, we do the same with our students: as they succeed, we raise the bar on what we expect from them because we know they are ready for greater challenges. That’s how I want SED to position these changes, and that’s how I would like to see these changes portrayed in the media by those who lead us in the coming years.
Dr. Brian Preston
Lower Hudson Regional Information Center
Elmsford, NY