NAEP Scores Revisited

Time after time we hear state politicians try to prove a point by taking things out of context.  This is especially true of NAEP scores, where they isolate one score for one year and pay no attention to the growth that has occurred.  They pay no attention to where you began and how far you've come because that doesn't suit their purpose.

And since we are now told that we have a fourth-grade "math crisis," it seems a good time to revisit a post from several months ago, since it is just as true today as it was then.

Then we called it: Okay Governor, About Those NAEP Scores

Politicians love numbers.  They can twist them and bend them, take them out of context, and omit the bad ones, all in their effort to make the voter feel good, or convinced, or something.  Take numbers about economic development.  Sometime around the end of the year the governor will announce that we created _______ new jobs in Alabama in the last 12 months.  But you never hear one also say that at the same time we lost _____ jobs and that, overall, the number of people working is actually lower than it was 12 months ago.

Since he voted to hire Michael Sentance as state school chief on Aug. 11, Governor Bentley has mentioned NAEP scores in Massachusetts over and over again.  His implication is that since Sentance is from Massachusetts and their NAEP scores are better than ours, we will soon be just like they are.

Unfortunately, the governor does not see the big picture or put things in context, and consequently gives us a very distorted view.

First, what are NAEP scores?

NAEP stands for the National Assessment of Educational Progress, the supposed gold standard of tests.  It is given every two years across the country to 4th and 8th graders.  Students and schools are picked at random.  About 2,500 students in Alabama are tested.  The test is 75 minutes long.  About 30 students are tested per school.  Some must be students with disabilities or English language learners.  Students are told that they do not get a grade for taking this test.

These tests began in 1992.  Since then Alabama has narrowed the gap between our scores and national scores in 4th grade reading and math and in 8th grade reading and math.  So we have been improving faster than schools across the nation, but no one bothers to tell this story.

And comparing Alabama gains to those in Massachusetts shows that except for 8th grade math, we have matched them stride for stride.  So we have been growing our NAEP scores for more than 20 years at the same rate as the Bay State.  But no one bothers to mention this, because then the picture suddenly looks much different.

And here is even more interesting info the governor has not mentioned.  One of the most important measures in education is the "achievement gap" between poverty and non-poverty students and between white and African-American students.  When you look at 8th grade reading and math and 4th grade reading and math, you see the "gap" in Alabama is SMALLER than the one in Massachusetts in every single case.

Yes, NAEP scores are higher in Massachusetts than in Alabama.  The last NAEP scores, from 2015, show they are number 1 in 4th grade reading and math and in 8th grade math, and number 2 in 8th grade reading.  But has their growth been remarkable?  Not necessarily, considering they were already tied for 3rd nationally in 4th grade reading in 1992, tied for 5th in 4th grade math, number 7 in 8th grade math, and number 4 in 8th grade reading in 1998 (as far back as scores go in this category).

Sure seems to me that when you study the numbers in their entirety and not just cherry-pick them to suit your narrative, Alabama has been running just as fast as Massachusetts for more than two decades.

And governor, do you think a real education consists of ONLY math and reading?  What about the sciences, the arts, extracurricular activities?  Do you know any Alabama students going to college on a band scholarship or to be on a debate team, much less an athletic team?

Children are so much, much more than just data points on a damn graph.  And the totality of an education cannot be neatly wrapped up in a few numbers politicians use to convince the public they are right and all of us are wrong.  The sweat and struggle teachers go through daily with special needs children and those who come to school hungry and with an abscessed tooth cannot be adequately valued with one test given every other year to a handful of students.

And to worship at the altar of NAEP is not what education should be about.

It is said that repetition is a great teacher.  But it is unfortunate that we have to keep explaining this to political leaders.

Friends Don’t Let Friends Misuse NAEP Data

As I've explained before, scores from the National Assessment of Educational Progress (NAEP) are constantly misused and abused.  Alabama legislators have done this time and again.  The governor has as well.  The same goes for certain educators.

So when I came across the following on the blog of Dr. Morgan Polikoff, a professor and education researcher at the University of Southern California's Rossier School of Education, I had to share it in its entirety.  It should be required reading for every member of the Alabama legislature.

This was posted in October 2015 as we were about to be hit with results from the 2015 test cycle.

At some point in the next few weeks, the results from the 2015 administration of the National Assessment of Educational Progress (NAEP) will be released. I can all but guarantee you that the results will be misused and abused in ways that scream misNAEPery. My warning in advance is twofold. First, do not misuse these results yourself. Second, do not share or promote the misuse of these results by others who happen to agree with your policy predilections. This warning applies of course to academics, but also to policy advocates and, perhaps most importantly of all, to education journalists.

Here are some common types of misused or unhelpful NAEP analyses to look out for and avoid. I think this is pretty comprehensive, but let me know in the comments or on Twitter if I’ve forgotten anything.

  • Pre-post comparisons involving the whole nation or a handful of individual states to claim causal evidence for particular policies. This approach is used by both proponents and opponents of current reforms (including, sadly, our very own outgoing Secretary of Education). Simply put, while it’s possible to approach causal inference using NAEP data, that’s not accomplished by taking pre-post differences in a couple of states and calling it a day. You need to have sophisticated designs that look at changes in trends and levels and that attempt to poke as many holes as possible in their results before claiming a causal effect.
  • Cherry-picked analyses that focus only on certain subjects or grades rather than presenting the complete picture across subjects and grades. This is most often employed by folks with ideological agendas (using 12th grade data, typically), but it’s also used by prominent presidential candidates who want to argue their reforms worked. Simply put, if you’re going to present only some subjects and grades and not others, you need to offer a compelling rationale for why.
  • Correlational results that look at levels of NAEP scores and particular policies (e.g., states that have unions have higher NAEP scores, states that score better on some reformy charter school index have lower NAEP scores). It should be obvious why correlations of test score levels are not indicative of any kinds of causal effects given the tremendous demographic and structural differences across states that can’t be controlled in these naïve analyses.
  • Analyses that simply point to low proficiency levels on NAEP (spoiler alert: the results will show many kids are not proficient in all subjects and grades) to say that we’re a disaster zone and a) the whole system needs to be blown up or b) our recent policies clearly aren’t working.
  • (Edit, suggested by Ed Fuller) Analyses that primarily rely on percentages of students at various performance levels, instead of using the scale scores, which are readily available and provide much more information.
  • More generally, “research” that doesn’t even attempt to account for things like demographic changes in states over time (hint: these data are readily available, and analyses that account for demographic changes will almost certainly show more positive results than those that do not).

Having ruled out all of your favorite kinds of NAEP-related fun, what kind of NAEP reporting and analysis would I say is appropriate immediately after the results come out?

  • Descriptive summaries of trends in state average NAEP scores, not just across two NAEP waves but across multiple waves, grades, and subjects. These might be used to generate hypotheses for future investigation but should not (ever (no really, never)) be used naively to claim some policies work and others don't.
  • Analyses that look at trends for different subgroups and the narrowing or closing of gaps (while noting that some of the category definitions change over time).
  • Analyses that specifically point out that it’s probably too early to examine the impact of particular policies we’d like to evaluate and that even if we could, it’s more complicated than taking 2015 scores and subtracting 2013 scores and calling it a day.

The long and the short of it is that any stories that come out in the weeks after NAEP scores are released should be, at best, tentative and hypothesis-generating (as opposed to definitive and causal effect-claiming). And smart people should know better than to promote inappropriate uses of these data, because folks have been writing about this kind of misuse for quite a while now.   

Rather, the kind of NAEP analysis that we should be promoting is the kind that’s carefully done, that’s vetted by researchers, and that’s designed in a way that brings us much closer to the causal inferences we all want to make. It’s my hope that our work in the C-SAIL center will be of this type. But you can bet our results won’t be out the day the NAEP scores hit. That kind of thoughtful research designed to inform rather than mislead takes more than a day to put together (but hopefully not so much time that the results cannot inform subsequent policy decisions). It’s a delicate balance, for sure. But everyone’s goal, first and foremost, should be to get the answer right.


Another Look At NAEP Scores

Several days ago I wrote about a senator taking to the floor of the Senate to denounce the latest Alabama scores on the National Assessment of Educational Progress.  As I pointed out, he did not bother to look at trend lines and instead singled out scores from one test year.

Most of us have tried to lose weight at some time.  (But for me, not lately.)  Let's say that we've worked hard so far in 2016 and have now lost 20 pounds.  However, we had a wee bit too much to eat last Easter Sunday, and when we hopped on the scales Monday, we were one pound heavier than the day before.  Should we pitch a fit about just that one day and declare that our diet is a total and complete failure?

I suppose only if we are in the Alabama Senate and trying to make the numbers tell a pre-determined story (that our public schools are going backwards).

So I looked at our NAEP scores again, the way we should look at how the diet has gone since the first of the year, not just on the Monday after a big Sunday dinner.  I found that if you look at Alabama scores for 4th grade reading and math back to 1992, at 8th grade math back to 1992, and at 8th grade reading back to 2002 (as far back as info on the national NAEP site goes for 8th grade reading), you find that GAINS in Alabama have EXCEEDED national gains in all four cases.

In 4th grade math, we went up 23 points while the national gain was 21 points.  In 8th grade math, we went up 15 points while the national increase was 14 points.  For 4th grade reading, Alabama increased 10 points against a national gain of 6 points.  And for 8th grade reading, we gained 6 points while the nation gained 3.

And here is something especially interesting.  The proposed RAISE/PREP Act says we will use something called a Value-Added Model (VAM) to determine how good our teachers are and how we can adjust education to make more rapid gains.

The first VAM was created by Dr. William Sanders in Tennessee and was put into use in the Volunteer State in 1992.  The proponents of this very inexact methodology want us to believe it is the best thing since sliced bread.  And since Tennessee has been using it for more than two decades, they must be blowing our doors off down here in Alabama.

Well, not so fast.  The truth is that when you compare NAEP scores in Alabama to those in Tennessee, you see that reading scores for our 4th graders have risen more from 1992 to 2015 than those of their counterparts in Tennessee.  The same goes for 8th grade reading scores from 2002 to 2015.

Of course, this hardly fits the narrative of those non-educators wanting to tell our teachers and schools how to do things.  But then, facts can be troublesome at times and often get in the way of political agendas.

 

Please Don’t Let Facts Get In The Way

Given that some in Alabama are hell-bent on showing how bad our public schools are, it is hardly a surprise when folks take things out of context and twist them any way they want to.

Take the National Assessment of Educational Progress (NAEP) tests that a senator recently railed about on the floor of the Senate.  He was upset that Alabama scores dropped slightly from 2013 to 2015.

But here is what he did not say.  NAEP is often called the "gold standard" of testing because it is a way to compare schools across the nation.  This test is given every two years in fourth grade reading and math and in eighth grade reading and math.  Students get no grade and therefore have no incentive to perform well.

Students and schools are picked at random.  The tests are not aligned to new standards.  About 2,500 students in Alabama are tested.  The test is 75 minutes long.  About 30 students are tested per school.  Some of them must be students with disabilities or English language learners.

There are 730,000 students in our public schools.  So we are judging all of them on the performance of just 2,500 of them, about a third of one percent.

Had the senator bothered to do his homework, he would have found that fourth grade math scores were down in 16 states and up in only four in the 2015 testing cycle.  He would have found that eighth grade math scores were down in 22 states.  So what happened here was not an anomaly.

Had he bothered to look on the NAEP website, he would have learned that the change in Alabama scores is not even considered statistically significant.

Had he bothered to look, he would have found that since Alabama started NAEP in 1992, we have narrowed the gap between our scores and national scores in fourth grade reading and math, as well as in eighth grade reading and math.

So the truth is that we are doing better–not worse.

But then, why let the truth get in the way when you are trying to prove an invalid point?

Never Let The Facts Get In The Way Of A Good Story

I know nothing about the Alabama Policy Institute.  Don’t know what purpose they supposedly serve, how they pay their bills, etc.  But with a name such as they have, one would think they should be all about research and policy statements and educating the public.

However, all I ever see are articles in some media outlet, such as this one, that are clearly about pushing an agenda instead of laying all the facts on the table.

In this article they argue that there is precious little relationship between resources spent on education and student performance.  To do this, they talk about how the state of Alabama ranks in comparison to other states on the most recent National Assessment of Educational Progress (NAEP) scores.

For instance, the article states, “Since 2000, rankings in math for fourth and eight grade students fell from 35th and 32nd place, respectively, to 51st and 50th place in 2015.”

The implication clearly is that our students are doing worse and whatever money we are spending on education is being wasted.  The only problem is that they are talking about state rankings, not the actual scores of students.

So let's look at what is meaningful: how students score on NAEP.

Alabama's fourth-grade math score in 2000 was 217.  In 2015 it was 231, an increase of 14 points, exactly matching the national gain of 14 points.  Eighth-graders increased five points, while the national increase was nine points.

In reading, fourth grade scores jumped ten points from 2002 to 2015 (the national increase was four points) and eighth grade scores rose six points (against a national gain of one point).  So of the four measures, we beat the national gain on two and matched it on one.

The state of Massachusetts is often considered to have the top school system in the nation.  Alabama showed a bigger gain in math scores for both fourth and eighth grade in this time period than they did.

Yet the folks at Alabama Policy Institute discard the truth and want us to believe our students are going backwards–not forward.

The article above says, "Alabama's rankings on the NAEP in math and reading have largely collapsed."  The only thing I see collapsing is the credibility of the organization when it puts out material such as this.