Suspect school rankings.

Feb 26

School rankings can be comprehensive or heterogeneous—but not both.

Several years ago Malcolm Gladwell took U.S. News & World Report to task, exposing the strange process underlying its annual rankings of colleges and universities. He argued persuasively that rankings can seek to be comprehensive (presenting information on many different variables in order to give as complete a picture of overall school quality as possible), but in that case they should compare only schools that are truly similar (e.g., in size, socio-economic status, or type of school). If heterogeneous schools are to be compared (e.g., all the colleges in the country, or all New Jersey high schools), the rankings should focus on a very narrow set of variables (e.g., number of students graduating in 4 years). In his words, “It’s an act of real audacity when a ranking system tries to be comprehensive and heterogeneous” (p. 70). Here are the major problems:

Selection of variables. Because there is no direct way to measure “how well [a school] manages to inform, inspire, and challenge its students” (p. 72), ranking algorithms rely on proxies for school quality. For example, I checked out the variables used by a best-selling New Jersey magazine to determine its Top Public High Schools for 2014.

The “School Environment” indicator combines the student/faculty ratio; the number of AP and/or International Baccalaureate (IB) subjects offered; and the percentage of 11th- and 12th-grade students taking at least one AP or IB test in English, math, social studies or science. Leaving aside what these components may or may not tell you about the “environment” of a school, they all place wealthier districts at a distinct advantage over those that have a harder time hiring more staff, or funding AP/IB teachers’ continuing professional development, or subsidizing students’ exam fees.
The “Student Performance” indicator relies on the number of students with an SAT score ≥ 1550; the percentage of students scoring ‘advanced proficiency’ on state standardized exams; and the percentage of students scoring ≥ 3 on AP tests or ≥ 4 on IB tests. Wealthier families can more easily afford test prep and specialized tutoring, which makes it harder to see how these numbers reflect the quality of classroom instruction.
The “Student Outcomes” indicator includes the 4-year adjusted graduation rate and the percentage of students enrolled in a 2- or 4-year college 16 months after high school graduation. Though it would be a good idea to keep track of how many students are still in school 16 months after graduation (also whether it’s the same college where they matriculated, and how well prepared they felt for college, etc.), I’m not convinced that every public school actually does so (unlike independent schools that rely on such stats, and maintain relationships with alumni for fundraising efforts). No, I don’t have proof, but I’m skeptical!

Weight of indicators. For reasons not expressed in the magazine’s methodology section, the three indicators above were weighted as follows: School Environment = 1; Student Performance = 1.5; Student Outcomes = 2.1. In other words, the variables in Student Outcomes were treated as more than twice as important an indication of overall school quality as those in School Environment, with Student Performance just about splitting the difference. Anyone with a reasonable command of Algebra 1 knows that a small change in this weighting scheme could yield upsets in the ranked results. How do we know this is the right mix for comparing 339 NJ high schools (including seven public charter schools)?
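
To make the Algebra 1 point concrete, here is a minimal sketch in Python. It assumes the composite score is a plain weighted sum of the three indicators, which may or may not match what the magazine actually computes. The three schools and their 0-100 indicator scores are invented purely for illustration; only the weights 1, 1.5, and 2.1 come from the published methodology, and the "tweaked" weight of 1.9 is my own hypothetical alternative.

```python
# Weights reported in the magazine's methodology section.
PUBLISHED_WEIGHTS = {"environment": 1.0, "performance": 1.5, "outcomes": 2.1}

# Hypothetical alternative: Student Outcomes nudged down from 2.1 to 1.9.
TWEAKED_WEIGHTS = {"environment": 1.0, "performance": 1.5, "outcomes": 1.9}

# Made-up schools with made-up indicator scores (0-100 scale).
SCHOOLS = {
    "School A": {"environment": 90, "performance": 70, "outcomes": 60},
    "School B": {"environment": 60, "performance": 70, "outcomes": 75},
    "School C": {"environment": 70, "performance": 65, "outcomes": 60},
}


def composite(scores, weights):
    """Weighted sum of indicator scores (an assumed formula)."""
    return sum(weights[name] * scores[name] for name in weights)


def rank(schools, weights):
    """Return school names ordered from highest to lowest composite score."""
    return sorted(
        schools,
        key=lambda school: composite(schools[school], weights),
        reverse=True,
    )


print("Published weights:", rank(SCHOOLS, PUBLISHED_WEIGHTS))
print("Tweaked weights:  ", rank(SCHOOLS, TWEAKED_WEIGHTS))
```

With these made-up numbers, trimming the Student Outcomes weight from 2.1 to 1.9 is enough to swap the top two schools: School B leads under the published weights, and School A leads under the tweaked ones. Nothing about either school changed; only the formula did.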

Ideology. “Rankings are not benign. They enshrine very particular ideologies” (p. 74). As an example, Gladwell notes that U.S. News gives twice as much weight to “selectivity” (i.e. how many of the college’s freshmen were in the top 10% of their HS class, how high their SATs were, and what percentage of applicants the college admits) as “graduation rate performance” (i.e. the college’s actual graduation rate vs. the predicted rate, given the SES and test scores of incoming freshmen). Yale does much better on the first indicator, whereas Penn State does much better on the second. Therefore, “the Yales of the world will always succeed at the U.S. News rankings because the U.S. News system is designed to reward Yale-ness. By contrast, to the extent that Penn State succeeds at doing a better job of being Penn State—of attracting a diverse group of students and educating them capably—it will only do worse.”

REFERENCES

Malcolm Gladwell, “The Order of Things: What College Rankings Really Tell Us.” The New Yorker, Feb 14 & 21, 2011.

“Top Schools 2014: Methodology.” New Jersey Monthly website, September 2, 2014.

Peter Horn
