What does it take to truly develop teacher leadership?
When this question leapt off the page for me at a recent Communities for Learning session, it seemed that three days' worth of thinking had found a home. This post has been in draft form for a while, but I have decided that examining the research and practice around this essential question will be one focus for me in the upcoming year. Since this post captures my initial thinking on the topic, it is not heavy on research but merely my attempt to capture the "problem."
Consider these scenarios that recently presented themselves in my work:
District A has adopted a new series and carved out a 90-minute literacy block. Teachers are struggling with the use of the block and, despite having an onsite "coach," do not seem to be making good use of the time. In a planning meeting to discuss the development of the teachers, someone suggested starting by coaching those teachers closest to the ideal so that they could lead their colleagues. The building principal was not sure there were many teachers who were "close" and was concerned about how the coached teachers might be perceived by their peers.
District B has slowly been acquiring new technology resources for teachers to use, and the building principal has been committed to providing teachers with interactive whiteboards. As teachers see this equipment being used in the school, some are ready to embrace the technology (and the learning curve) and try out some lessons. The building principal asks the early users to showcase how they use the technology for their peers at a faculty meeting - none feel they have the expertise to do so, and the computer teacher demos something instead.
In the same district, one new (untenured) teacher has slowly been integrating the technology even though she has not been given one of the interactive whiteboards. She researches sites on the Internet, is taking a graduate course on media literacy, brings her class to the school lab weekly, and has integrated quite a bit of technology. She even selected a technology-based lesson for her observation with the building principal. She is not asked to present at the faculty meeting and has recently been passed over for the installation in her classroom of a new whiteboard purchased by the PTA.
In District C, a consultant has asked teachers who have been engaged in a long-term professional learning opportunity around discourse to share an instance where they took a risk and were successful. In the reflection around this question, teachers struggled to think of instances where they had been successful.
In each of these examples, I am certain that the district wants to foster teacher leadership and that there are teacher leaders available - yet they have not been tapped. What conditions must be in place in a school system for teacher leadership to be developed, and more importantly, to thrive in a sustained way? What dispositions must teacher leaders exhibit to be effective? In short, the essential question is what does it take to truly develop teacher leadership?
As I work to frame this question and my research better - I would appreciate any warm/cool feedback on the identification of the problem and question!!
Giving “Practice” Math Tests? Read me first. (Part 2)
Part 1 is here.
The most common reason for giving a practice math test is to identify students' weaknesses. Hopefully the first post showed why it's so critical to determine which weakness you're talking about: familiarity with the format, timing, etc. If you're worried about the math, there are particular ways to approach the practice test.
First, ignore how the student did on the assessment as a whole. It sounds counter-intuitive, but there is a rationale for it, I promise. How your students did on particular items matters more than how they did overall. There are a couple of reasons for this:
NYS Assessments are based on a criterion-referenced model. Typically, when you give a student 25 questions, you mark the correct responses, determine the fraction correct out of the total, and come up with a score. Generally, we talk about these scores as percentages. Because of the complexity of the NYS assessments and the fact that they focus on performance in relation to a standard or criterion, scores are NOT reported this way. In fact, the number of raw points needed to demonstrate mastery shifts from year to year depending on the standard setting process.
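To make that distinction concrete, here is a minimal sketch with entirely made-up numbers (the real raw-to-scale conversion chart and cut score come out of the state's standard setting each year), contrasting a classroom percent-correct score with a criterion-referenced scale score:

```python
# Minimal sketch, hypothetical numbers throughout: percent correct vs. a
# criterion-referenced scale score. The real conversion chart and cut
# score are set by the state's standard setting process each year.
raw_points = 39              # points a student earned (made up)
total_points = 69            # total points on the test (made up)
print(f"Percent correct: {raw_points / total_points:.0%}")

# Hypothetical slice of a raw-score-to-scale-score conversion chart.
raw_to_scale = {37: 645, 38: 648, 39: 650, 40: 652, 41: 655}
level_3_cut = 650            # minimum scale score for meeting the standard

scale_score = raw_to_scale[raw_points]
print(f"Scale score: {scale_score}, met standard: {scale_score >= level_3_cut}")
```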
It’s not the real deal. Regardless of the conditions we create, students know it’s not the real deal. Their performance may be inflated or deflated for that very reason and may not reflect their true performance.
NYS Test Design procedures. NYS follows a particular test design model that requires the test to include items of varying difficulty. I'm sure you've noticed, looking through the test, that some questions "feel" easier than others. This isn't a coincidence. Items are strategically chosen for the assessment to reflect a range of difficulty, based on how students performed on them during field testing. It doesn't make sense to include 25 questions on Book 1 that were missed by most students during field testing. So the test designers include items with a variety of difficulty - a few hard, a few easy, and most middle of the road. This concept of item difficulty is called "p-value" - most simply put, the percent of students who responded correctly to a question. In shorthand, we say items with high p-values are easy, while items with low p-values are hard for the particular group of students under discussion. So two districts side by side may have different p-values on the same item. We need a neutral standard or benchmark to act as judge and jury around item difficulty. That's where the state data come in.
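If it helps to see the arithmetic, here is a minimal sketch of the p-value idea using hypothetical responses (1 = correct, 0 = incorrect) rather than any real item data:

```python
# Minimal sketch: a p-value is just the proportion of a group of students
# who answered an item correctly. All responses below are hypothetical.
item_7_responses = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
item_3_responses = [1, 1, 1, 0, 1, 1, 1, 1, 1, 1]

p_value_7 = sum(item_7_responses) / len(item_7_responses)
p_value_3 = sum(item_3_responses) / len(item_3_responses)

print(f"Item 7 p-value: {p_value_7:.2f}  (low -> hard for this group)")
print(f"Item 3 p-value: {p_value_3:.2f}  (high -> easy for this group)")
```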
A great deal of data about the NYS tests is made public every year - including p-values. These data can tell us which questions are easy and which are hard. It's not a secret, and it requires only a smidge of background to use correctly. P-values are provided at a couple of levels. The one that is most important for our purposes here is Low Level 3. In this example, let's talk about fifth grade. My mental model around scale scores and p-values is to picture a giant swimming pool filled with every fifth grader in the state of New York who took the state assessment last year. Floating above each head is that student's scale score. Students from the Bronx to Buffalo, from Long Island to Lake Placid. Students with and without disabilities. Levels 1, 2, 3 and 4.
I can look at how ALL the students did on items, but included within the mix are students who really struggled and students who did really well. (We assume most questions were hard for students at Level 1, while most were easy for students at Level 4.) So I, as the data lifeguard, blow my whistle and call out every child who scored Level 1 or 2. Same for the Level 4's. Left in the pool are my Level 3's - every child who met the standard. Because I want the data to be as clean and precise as possible, I'm going to boot out every child who scored above the minimum standard - which in fifth grade in 2008 was a scale score of 650. Left in the pool are a few thousand students, all of whom scored right at the minimum standard, AKA scale score 650. For each question, I can look at how many of these students got it right and compare (or benchmark) my students to their performance. The graph below shows what that looks like:
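For readers who like to see the filtering step spelled out, here is a minimal sketch of the lifeguard move in pandas, with made-up students, scores, and items (in practice the state publishes Low Level 3 p-values directly, so you don't have to compute them yourself):

```python
# Minimal sketch of the "data lifeguard" filter: keep only students who
# scored exactly at the Level 3 cut (650 here) and use their per-item
# results as the benchmark. All data below are hypothetical.
import pandas as pd

state = pd.DataFrame({
    "scale_score": [602, 650, 650, 650, 688, 715],
    "item_07":     [0,   0,   1,   0,   1,   1],   # 1 = correct, 0 = incorrect
    "item_03":     [0,   1,   1,   1,   1,   1],
})

low_level_3 = state[state["scale_score"] == 650]          # the students left in the pool
benchmark_p = low_level_3[["item_07", "item_03"]].mean()  # p-values for the benchmark group
print(benchmark_p.round(2))
# item_07    0.33   <- hard even for students right at the standard
# item_03    1.00   <- easy for students right at the standard
```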
Out of ALL the students who scored 650, only 18% got question 7 correct. In other words, that was a hard question. My gut isn't telling me that. My students aren't telling me that. Students from across NYS are telling me that. Take a look and see how your students did on it. Odds are, they didn't do very well. It's not because you didn't teach it or because they just weren't listening. It could be because the wording tripped them up - just like it did for 82% of all students who scored a 650. The question is below:
Students are likely to pick A because it practically screams "PICK ME!" at them. Your students may know fractions inside out and sideways. Picking A and not C is an issue of testing sophistication, not mathematics. When reviewing similar problems with students, as much as possible, give them "PICK ME!" choices so they can learn what they look like and how to avoid their siren song.
At the other end of the difficulty continuum are easy questions. The Low Level 3's did pretty well on question 3. If you discover that your students didn't do well on questions like 3 (any item with a p-value higher than 80%), then your warning bells should start warming up.
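One way to operationalize those warning bells, sketched below with made-up numbers: flag any item where the Low Level 3 benchmark p-value is high (say, above 0.80) but your class's percent correct falls well below it (the 0.60 threshold here is just a judgment call):

```python
# Minimal sketch of the "warning bell" check on hypothetical data: items
# that were easy for the Low Level 3 benchmark group but that my class missed.
import pandas as pd

comparison = pd.DataFrame(
    {
        "low_level_3": [0.18, 0.88, 0.91, 0.72],   # state benchmark p-values (made up)
        "my_class":    [0.25, 0.55, 0.90, 0.70],   # my class's percent correct (made up)
    },
    index=["item_07", "item_03", "item_12", "item_20"],
)

easy_for_benchmark = comparison["low_level_3"] > 0.80
class_struggled = comparison["my_class"] < 0.60

print("Easy items my class missed:")
print(comparison[easy_for_benchmark & class_struggled].round(2))
```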
However, before assuming it's a strength or weakness, look for other evidence that the students understand the concept. Formative assessment can really come in handy here. You can pose a similar question and ask students to respond on their way out the door. This time though, ask:
Anne has completed 87% of the race. What fraction represents that portion of the race she has NOT finished?
If students get the math, they should pick A. If they pick C, it's probably a testing issue - they slid past the NOT. Anyone who picks B or D may have a problem with fractions in general. How did they do on question 15, which taps a similar understanding? (I use Tinkerplots to answer these questions. It's one of my favorite data toys.) The students will sort themselves into like-needed groups, depending on what the other instructional evidence shows.
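If you want to see the sorting logic written out, here is a minimal sketch with hypothetical students and answer choices, following the interpretation above (A = got the math, C = slid past the NOT, B or D = possible trouble with fractions):

```python
# Minimal sketch: sort exit-ticket responses into like-needed groups.
# Students and their choices are hypothetical.
exit_ticket = {"Ava": "A", "Ben": "C", "Cara": "B", "Dev": "A", "Eli": "D", "Fay": "C"}

groups = {"got_it": [], "testing_issue": [], "fraction_issue": []}
for student, choice in exit_ticket.items():
    if choice == "A":
        groups["got_it"].append(student)          # has the math
    elif choice == "C":
        groups["testing_issue"].append(student)   # slid past the NOT
    else:                                         # B or D
        groups["fraction_issue"].append(student)  # may struggle with fractions

for name, members in groups.items():
    print(name, members)
```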
So - if you're going to give the test to identify weaknesses:
- Consider how your students do on easy questions (high Low Level 3 p-values) versus hard (low Low Level 3 p-values) questions.
- Be aware of which wrong answers students give, as that's often more interesting than what they got right.
- Consult other evidence (formative and summative) before confirming the students have a mathematical weakness.
Giving “Practice” Math Tests? Read me first. (Part 1)
Just as in the weeks prior to the ELA assessment, many schools across the state are giving their students old copies of the state assessments to prepare them for the big day. Based on conversations with schools and fellow professional developers, these practice tests serve several different purposes. Regardless of the reason for giving the assessment, there are several strategic moves you can make to get the best return on your time investment and, hopefully, to minimize the impact on instructional time and on students' sense of what school is all about.
First things first. Be honest about the reason you're asking students to take the old assessment. "To prepare them for the test" is a big, broad goal. A common problem in test prep is trying to tackle two problems in one fell swoop. It's a given that when you're teaching students a new strategy, you introduce it with familiar content or low-level text. You wouldn't ask a middle school student to text-tag for the first time with a college-level text. The same holds true for practice tests. It's not fair to ask students to "do their best" on the math content and expect them to notice the format and structure at the same time. Their brains are going to be busy with the math. I'm going to tackle a couple of common reasons for giving practice tests over the next couple of days and highlight the benefits of approaching different purposes in different ways.
If your goal is to expose students to the test format:
There are few students on the planet as test-savvy as 8th graders. They have been tested since they were in fourth grade. They know what the test looks like. Some could even write it. If you work with middle level students, your time may be better spent telling them what's different in the grade 8 assessment (no editing, but extended writing). If the concern is that they really don't know the format, then give them time to do that - and only that. What do they notice about the font? About the spacing and the structure? The setup of the questions? What might trip them up on the actual test? Make sure they know how to use the ruler, the protractor, and the rules for getting as many points as possible on Book 2 and Book 3.
If your goal is to familiarize your students with timed testing:
Consider chunking the test. First, give your students the appropriate time to take Book 1 - and tell them the purpose of taking the practice test is to give them a sense of how much time they'll have. Next, when they're done, take ten minutes to process what happened. A Behavior over Time graph (below) is a great tool for helping students process their stress level. Did they feel more stressed at the beginning of the test? At the end? Finally, give them the support to develop a plan. If they freak out at the beginning, what can they do to avoid the freak-out? What helps them calm down? You'd be surprised at the ideas students generate during these types of conversations. It's also a nice way to reveal "rumors" that kids have heard.
Coming up tomorrow - how to tackle practice tests if your goal is to identify student weaknesses. I'd love to hear your thoughts on this!