Grand Rounds: Blogging DATAG meeting October 5, 2007

Before leaving for DATAG, this posting on G-Town Talks came up on my feed and I was immediately inspired. One of the other attendees at DATAG also uses Skype and I would have liked to explore Skyping during a meeting as a means of processing the content, but it was hard enough to follow David, much less type up notes and follow a Skype conversation.

Below are the notes I took during David Abrams' speech this morning. Any misquotes or misunderstandings are my fault.

There are more people in attendance today than I think I’ve ever seen here. David Abrams is our first speaker – as the Assistant Commissioner for all things testing related, he’s the top of the food chain.

So – before we get started. . . I’m wondering:
Will the rest of the state adopt the model being used by NYC? (3 part report cards based on value-added, parent surveys, and school walk-throughs)
When are they going to pitch NYStart out the window and start again? (Note: 10 mins into his presentation, David said it was no longer his cross to bear. That answers that question.)

Introduction from Brian Preston – yup, it’s the largest attendance. More than the summer conference. Our theme is “state assessment and future models”. Sexy!

Background on David Abrams – Original member of DATAG (from his time in Albany). He was a high school English teacher and I think this shows in his presentation style.

Recognition from David for the work of DATAG (wow – he talks really fast), especially during the changes in testing programs. Apparently, he gets letters from across the state that lead him to believe that not everyone understands the system and the reason for the changes. (Good plug for DIGs – getting people involved in local groups with a more relaxed environment).

For the irrelevant portion of the program – David advised us all to join DATAG so we can “get on the listserv and bitch”. There hasn’t been a lot of that lately but point taken.

It’s official – there is a next generation of the accountability system and it is in development.

David’s PowerPoint is pulled together from the Commissioner’s PowerPoints and will highlight the data points that David finds interesting. Starting with English. There’s been movement over the last few years in a positive direction. Some discussion about the area of Reading as its own focus (as opposed to literary or reading in the content areas). The idea of students as independent readers emerges in middle school. David is interested in exploring the concept of a middle level literacy profile. Getting those struggling middle level readers what they need. The reality is that there are very few reading teachers at the middle school or high school level.

Comment about tests being built from two primary languages – primary verbal language and primary math language – but I’m not sure what he meant by that. He’s used the phrase not happening "at scale" twice now. The discrepancy in data use and understanding across the state is apparently an issue.

Ah – the testing policy for the ELLs. As a data person, David wasn’t freaked out by the changes in the ELL policy. He wanted the ELA data as an additional data point. He shared that he was asked to present at hearings in DC but didn’t want to share what stance he took. His previous statement leads me to believe that he supported testing of ELLs in English after one year, rather than 3.

Native born students – came into system in K or grade 1. District should be getting these kids at grade level by Grade 3.

Newly arrived – some discussion about the role of literacy in Language 1 and the impact on education. Clearly, David has spent a lot of time thinking about this issue. All of his comments make sense in the large scale sense of the testing program. He has the ultimate view of aggregate data.

Students with Interrupted Formal Education – missed the point he was making here but he used the words heterogeneous and homogeneous about six times in the same sentence.

Test was designed to “get in and get out” – tests have 20-30 MC items but everyone knows “that one item can really make a difference for Level 4’s”. There is a need to refine the tails of the test in order to get better data for the Level 1’s and Level 4’s.

Discussion about Students with Disabilities. Raised the point that diagnosis patterns are not consistent across districts. Mindset about testing SWD has changed but more work is needed around who is being identified and who is not.

Interesting political point – David just referred to the reauthorization of Title 1 “which was called No Child Left Behind”. He said he’s trying to break the habit of calling it NCLB. Hum. . .

So – to summarize the first half hour. David is a proponent of longer tests that get better data and are more specific at the tails. He’s aware of issues for SWD and ELLs.

Another use of the phrase “at scale” – take a drink of Diet Pepsi.

Just got confirmation that the state tests are not designed to tell Mrs. Jones at Apple School how her kids are doing. They’re meant to assess how New York State schools are doing in implementing the NYS Learning Standards. It is always powerful to hear him say that and I wish he’d said it a little slower.

HS Math. Ok – this year is a baseline of math standards – all together now, “at scale”. The number of students at Level 1 in grade 8 is scary and has major implications for HS programs. Schools should really take a look at the Grade 9 math program. Every HS principal should be looking at how students are being taught math in Grade 9.

70% of the LEP population in NYS speaks Spanish. Wow. Students can take the math test in their dominant language so there are minimal ELL issues, but there are still gaps between Hispanic/Black and White/Asian students. Expects we’ll move into incremental movements on the math test.

Continued issue of item density at the tails. The break between Low 2 and High 1 is narrow. Need to identify who is “floating” right above standard cut points. Commissioner won’t talk about Standard Error because it will “break the brains of the press” (ha!) but districts need to be aware of those issues. He just flew through an example of a thermometer in the sun or in the shade but I lost the context, sorry.
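For the data geeks, here is a minimal sketch of what "floating" near a cut point could look like in practice. The cut score, standard error, and student scores are numbers I made up for illustration; they are not NYS values, and the one-standard-error band is my assumption, not anything David described.

```python
def near_cut(scale_score, cut_score, sem):
    """Return True if the score sits within one standard error of the cut."""
    return abs(scale_score - cut_score) <= sem

# Hypothetical values, NOT actual NYS cut points or standard errors
CUT = 650
SEM = 8
scores = {"A": 640, "B": 648, "C": 655, "D": 670}

# Students whose reported level could plausibly flip on a retest
floating = sorted(name for name, s in scores.items() if near_cut(s, CUT, SEM))
print(floating)
```

Students B and C land inside the band, so a district would want a second data point before deciding who is really above or below standard.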

Just got confirmation that David supports formative assessment and multiple measures. He said: some districts are buying formative assessment programs and using them as a summative program. This is bad. (I’m paraphrasing)

Review of standard setting process for Integrated Algebra test – I think there was some really important stuff there about pre-equating, post-equating, open-ended and closed responses but I wasn’t able to catch all of it and when I re-read what I did write, it made no sense. So, I’ll infer that David wants a dense test, not long and wants to make sure it’s done right. He just challenged people to prove that the Regents don’t test higher order skills – he can prove that they do. Any takers?

He said “fat data set” when describing data collected during the standard setting process. I'm totally stealing that.

Sample test for the Integrated Algebra will be out by Halloween. He would recommend that every High School in New York State pull together all math teachers and break apart the sampler. What is the range of difficulty? What are the standards? DO NOT DROP THE SAMPLER ON A KID’S DESK AND SAY “TAKE IT.” Teach the curriculum. The value of the sampler is for the instructional staff. Have meetings with Grade 8 math teachers. Don’t look at it in isolation. Consider the curriculum, consider the core standards document.

David has been arguing with SED about test design, psychometrics, and protecting the integrity of the test for the students. He confessed that he is scared by the fact that he is extracting measurement from children. Live human children. His job is to protect the individual rights of the individual students. Aww!

The most stable data set for setting a standard is the operational data. David has spent a long time and asked a lot of people about the best way to do standard setting. Unlike 3-8, when you’re standard setting six tests at a time, David is confident that they can do standard setting within one week for the math test. The entire testing program has been audited and peer reviewed by the USDOE. Arguments that the test is not aligned or unfair do not hold water. Conversion chart should be up by the morning of June 26th.

David confessed that he can’t sit still. Not a surprise.

He went to Pearson and gave them the “hairy death eye” on behalf of NYS. He reviewed their scoring procedures, saw their scanning center, and met with everyone who will be working on NYS assessments. He came out of the trip with several ideas on how to make things go smoothly in June.

He’s drafting a logistics memo that will be released by November. It will say “here are the rules, here’s what will happen, here’s what everything will look like.” This is to cut off (he actually used the phrase CYA) anyone who says they didn’t get notification in time.

Will do a full formal dry run in the Spring to vet any IT problems. Wants to make sure that files are accessible and can be viewed by all BOCES. Don’t worry about structure and format, as he’s given his word that the file formats will be compatible.

Information about the new tests will be posted at: www.emsc.nysed.gov/osa/new-math.htm

Accountability Update: Status versus Growth

Status Model: takes a snapshot of a subgroup’s or school’s level of student proficiency at one point in time and often compares that proficiency level with an established target. New cohort each time.

Growth: Variation on status. Take a snapshot at each consecutive year and compare to previous year. Lots of different approaches. Lots of discussion and issues of standard error.
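To keep the two definitions straight in my own head, here is a minimal sketch of the difference. Everything in it is an assumption on my part: the cut score, the target, the student scores, and especially the growth calculation, which is just the average change for students matched across two years. David was clear that there are lots of different approaches, so this is one simple illustration and not the NYS model.

```python
def percent_proficient(scores, cut):
    """Percent of students scoring at or above the cut score."""
    return 100.0 * sum(s >= cut for s in scores) / len(scores)

def status(scores, cut, target):
    """Status: one snapshot compared with a fixed target."""
    return percent_proficient(scores, cut) >= target

def growth(prev, curr):
    """Simple growth: mean score change for students tested in both years."""
    matched = [sid for sid in curr if sid in prev]
    return sum(curr[sid] - prev[sid] for sid in matched) / len(matched)

# Hypothetical scale scores for three students in consecutive years
year1 = {"s1": 630, "s2": 645, "s3": 660}
year2 = {"s1": 642, "s2": 651, "s3": 663}

met_target = status(list(year2.values()), cut=650, target=75)
mean_gain = growth(year1, year2)
print(met_target, mean_gain)
```

In this made-up school, the status check fails (two of three students are proficient, short of a 75% target) even though every student gained ground, which is the kind of gap between the two views that the status-versus-growth debate is about.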

Real issue is tension between governance and school improvement.

David likes the New York Yankees. Dislikes Red Sox.
Likes Lobster. Dislikes Liver.
These do not impact his judgment.
He does not like or dislike value- or growth-added. His job is to find out what works best and which is the most sound.

The goal is to build from status to growth. What are they doing? He won’t tell us. But he will tell us what he’s exploring.

Growth is not allowed. NYS is not in the model because we started a year too late. Other states started a year before NYS.

David has been meeting with USDOE around the NYS model. He shared the names of several places that do models – it’s all about growth NOT value-added.

There is tension in the next generation because people want the system to inform decisions at the school level. “The best way to inform school decisions is through a multiple-measure system of assessment.” However, this system needs a standard “spine”.

David is scared by Margaret Spellings. I wonder if that’s a like or dislike?

NYS is researching if we can build a full vertical scale. No time for discussion today. His PowerPoint at this point summarizes most of his monologue. Note that the new model will start in 08-09. I wonder if this will change after the federal election?

Transparent does not mean easily understandable. A system this complex cannot be easily understood. It is rocket science. I think that was a little shout-out to the Geeks in the room.

OK – here’s a question. He said growth NOT value-added, but the slide he just showed says the design is destined to align to value-added in 2010-11. I asked my question aloud – not sure I know the difference between the two. What I got from his response was that one is related to large-scale accountability, one isn’t. Social Studies was used as an example but I’m not sure how it fits into my question. I’m hoping the next session will explain the difference between value-added and growth-added at a slower pace.

David will forward four references to the listserv about large-scale assessment. Thung at Michigan State is a recommended author as well as some folks at UCLA.

Lunch break and then value-added.
