One of the palpable weaknesses in the American justice system is the tendency for it to produce different outcomes for people from different social classes. Part of this is a result of discrepancies in the quality of legal representation people can afford, but part of it is also due to inconsistencies in the way morally questionable activities are judged.
Research on judgments of misdeeds often focuses on “moral licensing,” or the idea that certain circumstances can make us more amenable to bad behavior. One study that’s received a lot of attention found that people who bought organic food rather than a control food were less likely to volunteer to help a needy stranger. The implication is that doing one good deed made people feel it was less wrong to pass up an opportunity to do another good deed. Other studies have found that this kind of licensing can be influenced by the actions of people in your group or people with whom you share a random characteristic such as a birthday. One study even found that people grant themselves a license to do bad things by exaggerating the evil of things they previously decided not to do.
However, thus far most research on moral licensing deals with people making judgments about their own actions. We still don’t know much about when or why people are willing to grant third parties a license to do something questionable. Given that you have less knowledge about other people, it seems likely that different factors would be involved.
The University of Wisconsin’s Evan Polman sought to fill this gap by specifically examining whether the social status of a third party influenced the degree to which they were granted a moral license. In two studies Polman and his team told participants about somebody who engaged in questionable activity, but they manipulated status through the person’s name (Billy-Bob (low) vs. Winston Rivington (high) vs. James (control)) or job (janitor vs. CEO). The researchers found that both high- and low-status actors were judged significantly less harshly than actors in the control condition. It would seem that both high- and low-status people tend to be given more leeway to do bad things than people in the middle.
Of course it seems unlikely that high- and low-status people would be granted a moral license for the same reason. That led Polman to investigate a more interesting question, which is how exactly status influences moral licensing. Polman hypothesized that high-status people were given “credentials” while low-status people were given “credits.”
While credentials and credits similarly elicit less negative responses to misbehavior, they differ with respect to whether they alter observers’ perception of the negativity of the behavior itself (a form of perceptual change) or merely affect observers’ perception of the extent to which engaging in negative behavior is understandable or justified (a form of attitudinal change). Credentials bias perceptions of norm-violating behavior, leading people to perceive dubious behavior as less dubious; almost as if the behavior was not even a transgression (Effron & Monin, 2010)…
Credits, however, do not change observers’ perception of the behavior but offer counterbalancing capital so that a wrongdoer can transgress as long as their transgressions (so-called moral debits; Miller & Effron, 2010) do not exceed their credits (Nisan, 1991; Zhong, Liljenquist, & Cain, 2009). Thus, credits influence the extent to which observers sanction misbehavior by altering their attitudes toward the wrongdoing, such as viewing the wrongdoing as justified and tolerable.
In other words, when a high-status person does something questionable (e.g. stealing), you judge it less harshly because you believe the action is objectively less wrong (e.g. it was actually a clever manipulation of tax laws.) But when a low-status person does something questionable, you judge it less harshly because you understand why the person did it and you’re sympathetic to their situation (e.g. they needed money for food.)
Polman reasoned that when an action was ambiguous, high-status people would be judged less harshly because it would be easier to reinterpret their action as something justifiable. He decided to test this hypothesis by directly manipulating whether the morally questionable action was ambiguous or not. Participants read about a janitor or a CEO who hired only white candidates, but some participants were told the person admitted being reluctant to hire African-Americans (unambiguous condition), and some participants were told the hiring was legitimately based on merit (ambiguous condition). Sure enough, when the action was ambiguous, participants rated the action less harshly when it was done by the CEO rather than the janitor. On the other hand, when the action was unambiguous, and thus it was impossible to view the action as permissible, participants judged the action more harshly when it was done by the CEO and less harshly when it was done by the janitor.
Polman also measured the dispositional sympathy of participants. As predicted, participants who were more sympathetic tended to judge those who were low-status less harshly. However, dispositional sympathy had no effect on the judgments of high-status people. To Polman, this was evidence that low-status people were in fact being judged less harshly because their struggles were earning “sympathy credits” rather than “credentials.”
The findings suggest that high- and low-status people are judged differently, but not in a way that is universally advantageous to either of them. When the indecency of an action is ambiguous, high-status people will be judged less harshly because their actions are more likely to be interpreted in a positive light. However, when an action is unambiguous, low-status people will be judged less harshly because they are more likely to elicit sympathy.
As Polman points out, the research has implications for how you ought to defend yourself. If you’re clearly guilty, and thus your transgression is unambiguous, the optimal strategy may be to lower your status in the hope of receiving sympathy, for example by begging the victim for forgiveness. Alternatively, if there’s ambiguity in your actions, it’s worthwhile to try to raise your status in order to make it more likely your actions will be viewed in a more positive light. Similarly, if you’re a high-status person but the ambiguity of your actions is open to debate, the best defense may be to obfuscate the circumstances of your crime. Because you’re high status, when your behavior is ambiguous a judge is more likely to “credential” your behavior by deciding that your actions weren’t all that bad.
Is there evidence of any of this in the real world? You can certainly string together enough high-status crimes for a solid bout of confirmation bias. For example, when high-status people do something unambiguous, such as commit murder, there doesn’t seem to be any special leniency. On the other hand, when their actions are more ambiguous, such as those involving regulatory improprieties in the banking sector, high-status people do seem to often go unpunished. Of course this doesn’t really prove anything — some actual data is needed. Either way, Polman’s research is a good reminder that social status matters, but not always in the way you might think.
Polman, E., Pettit, N., & Wiesenfeld, B. (2013). Effects of wrongdoer status on moral licensing. Journal of Experimental Social Psychology, 49 (4), 614-623. DOI: 10.1016/j.jesp.2013.03.012
It often seems as though policy-making has devolved into nothing more than a contest where the goal is to blame as many people as possible (but not yourself) for the country’s problems. Fossil fuel companies blame environmental regulations for economic stagnation and high energy prices. Neocons blame civil libertarians for national security weaknesses. And of course, Westboro Baptist blames homosexuality for everything. Much of the blame appears to be farfetched (the exception is homosexuality, which since the days of Newton has been scientifically proven to cause hurricanes), and though the political motivations behind the blame game seem easy to understand, a new study led by Zachary Rothschild of the University of Kansas helps break down exactly when and why people latch on to a scapegoat.
Rothschild and his team were interested in examining how the potential culpability of one’s own group influenced moral outrage and blame for a third-party. They began their experiment by giving participants a survey that led participants to categorize themselves as middle class rather than working class or upper class. Participants then read an article about the struggles of working-class Americans, but in the in-group condition the article blamed the middle class for the struggles, in the out-group condition the article blamed the upper class, and in the unknown condition the article stated that economists don’t know the cause of working-class struggles. Participants then read another article about the status of illegal immigrants. In the viable scapegoat condition the article described the rising fortunes of illegal immigrants, while in the non-viable scapegoat condition the article described how illegal immigrants were also struggling to find work.
As expected, when illegal immigrants were viable scapegoats, participants were more likely to blame them for the struggles of the working-class when the cause of those struggles was unknown or attributed to their own group, the middle-class. Participants in the in-group condition also reported more moral outrage at illegal immigrants and a stronger desire for retributive action than participants in the out-group or unknown conditions. When illegal immigrants were not portrayed as a viable scapegoat (i.e. they were shown to also be struggling), participant perceptions about the cause of working class struggles had no effect on how they viewed illegal immigrants. In general, when a third-party was presented as a viable scapegoat, people were more likely to blame the third-party for a negative outcome when their own group was at risk of being viewed as responsible for that outcome.
The findings are intuitive, but nonetheless important. When it comes to the specific issue of immigration, the study highlights how simple political motivations can lead to bad policy. Though most economists agree that increasing immigration will boost the economy, as long as the economy is weak people from both political parties may be motivated to scapegoat immigrants in order to deflect blame and alleviate collective guilt about struggling families. This makes immigration reform more likely to occur once the economy is strong, but in the meantime families miss out on the economic benefits of immigration when they need them most.
More broadly, the politics of scapegoating can prevent problems from actually getting solved.
The irony of the present findings is that moral outrage in the present study arguably ensures the maintenance of the status quo of disadvantaged group’s suffering, while providing advantaged group members with an air of self-righteousness. That is, outraged advantaged group members may punish a supposed third-party culprit in the name of restoring justice for a disadvantaged group while simultaneously continuing to perpetrate harm against the disadvantaged without repair. Stated differently, there is the appearance of justice being served only to ensure that injustice is preserved.
You could alleviate your guilt and protect your group’s moral standing by working to help the disadvantaged, but it’s often easier to accomplish those things by blaming a third-party. In the end the disadvantaged group never gets help. So if you happen to see somebody throwing a lot of blame around, particularly toward groups that don’t seem to have the power ascribed to them, you should probably view that person with extreme skepticism.
Rothschild, Z., Landau, M., Molina, L., Branscombe, N., & Sullivan, D. (2013). Displacing Blame over the Ingroup’s Harming of a Disadvantaged Group can Fuel Moral Outrage at a Third-Party Scapegoat. Journal of Experimental Social Psychology. DOI: 10.1016/j.jesp.2013.05.005
In their 1968 book Pygmalion in the Classroom, Robert Rosenthal and Lenore Jacobson presented their groundbreaking research that showed teacher expectations are self-fulfilling prophecies. If two students start the school year at the same achievement level, the student the teacher is told is a high achiever will make more gains than the student the teacher believes is a low achiever.
Over the years researchers have confirmed that teacher expectations have small to moderate effects on student achievement, but more recent work suggests that the effects may not be the same for all students. For example, a new study by Temple’s Nicole Sorhagen provides strong evidence that the effects of teacher expectations vary by family income. Using longitudinal data from 10 cities, Sorhagen found that the effects of teacher expectations, whether they’re positive or negative, are stronger for students from low-income families.
This study investigated one aspect of the complex cognitive and behavioral processes underlying student–teacher relationships and found that early inaccurate teacher expectations were a lasting contributor to later academic performance. Furthermore, the findings suggest that self-fulfilling prophecies in the classroom vary across academic subjects and family income. Under- and overestimation of early math and language abilities, but not reading abilities, seemed to have a more meaningful effect on students from lower income families. The fact that self-fulfilling prophecies in first-grade classrooms exerted an especially lasting impact on the achievement of disadvantaged students raises the possibility that teachers’ underestimation of poor children’s academic abilities may be one factor that contributes to the persistent and worrisome gap in achievement between children from different socioeconomic backgrounds. On the other hand, teachers’ overestimation of abilities seemed to disproportionally help low-income students, suggesting that knowledge of self-fulfilling prophecies in the classroom could be relevant to policies aimed at ameliorating the achievement gap between low- and high-income students, especially considering the persistence of the achievement gap in America.
One neat thing about the study is that it hints at a mechanism for the power of school culture in low-income communities. A positive “no-excuses” environment is one way to encourage teachers to have high expectations for all students, and the study suggests these expectations will have a larger impact on low-income students. Something similar should also occur with regard to peer effects. If a student ends up in a classroom with higher achievers, the teacher is more likely to expect that they too are a higher achiever, and the impact of those higher expectations will be stronger for low-income students.
The importance of teacher expectations is also a reminder of why it’s important to evaluate teachers using multiple measures. While there’s often a big fuss made about the things teachers do that aren’t captured by test scores, the impact of teacher expectations is something that can only be captured by test scores. Even if you observed a teacher every single day of the year you wouldn’t pick up on whether they’re being sufficiently optimistic and equitable in their expectations for students. Such behaviors are too subtle to capture — what you need is a broad all-encompassing assessment that will incorporate the impact of teacher expectations even if it can’t specifically identify it.
Sorhagen, N. (2013). Early teacher expectations disproportionately affect poor children’s high school performance. Journal of Educational Psychology, 105 (2), 465-477 DOI: 10.1037/a0031754
Imagine you take a test consisting of a reading passage and two multiple choice questions. After a few seconds, you’re 99.9% sure about the correct answer to the first question. Two of the answers are absurd, a third doesn’t quite seem right, and the fourth clearly aligns with what the question is about. But when it comes to the second question, you’re less sure. Only one answer is absurd, and while you’re confident a second choice is wrong, both of the remaining choices seem to answer the question. After a few minutes of thought you decide one of them is superior, but you’re only about 75% sure and you’re left feeling slightly discouraged.
Here’s my question: Which test question is more likely to elicit a “higher order” thinking skill?
The context of all this is the widespread negative reaction to New York City’s new Common Core aligned tests. Here’s the lede in the New York Times:
Students at the Hostos-Lincoln Academy in the Bronx blamed the English exams for making them anxious and sick. Teachers at Public School 152 in Manhattan said they had never seen so many blank stares. Parents at the Earth School in the East Village were so displeased that they organized a boycott.
As New York this week became one of the first states to unveil a set of exams grounded in new curricular standards, education leaders are finding that rallying the public behind tougher tests may be more difficult than they expected.
Complaints were plentiful: the tests were too long; students were demoralized to the point of tears; teachers were not adequately prepared. Some parents, long skeptical of the emphasis on standardized testing, forbade their children from participating.
Confusion, discouragement, and a stack of incomplete exams can surely be signs of terrible test questions. But you would expect those things with almost any new test, and therefore they don’t rule out the possibility that the test questions are actually new and improved. For example, even if the new tests were objectively better, and even if substantial efforts were made to prepare students, you would still expect some increase in student discomfort simply because the new tests aren’t what students and teachers are used to. You would also expect an increase in student discomfort because the new NYC tests are designed to do a better job identifying hard-to-measure thinking skills, and these types of questions ought to involve correct answers that are less obvious. The tests don’t even have to be so different to drastically alter the student experience. Imagine that on 20% of the questions students have 20% more doubt about their answers. Over the course of the whole test that ought to leave students distraught with their performance and short on time.
Up to a point, that’s not the worst thing in the world. Imagine President Obama in the war room trying to decide what to do about Libya. He listens to all his advisors, reads all the intelligence briefings, thinks through the potential consequences, and eventually chooses a plan of action. Is he 100% sure he chose the right path? Probably not. Does he feel great about his decision? I doubt it. But that’s the nature of solving difficult problems. Doubt and discomfort creep in when you push your cognitive skills to the limit. Obviously taking the 4th grade ELA exam isn’t quite the same, but isn’t this ultimately the type of critical thinking we’re aiming to prepare students to do? If we want to build a generation of students who aren’t merely “bubble fillers” and who actually learn from their tests, I’m afraid that what people saw in New York is what the initial steps may sometimes look like. It’s worth remembering that old tests are old tests for a reason: they didn’t do a good job evaluating the skills that were deemed important.
Standardized tests are clearly a complex issue, and we absolutely need to do a better job preparing students and teachers for all the trials and tribulations that new tests will bring. But we should also be wary of making visible student reactions the driving force in evaluating a new exam. None of this is to say there weren’t terrible questions on the New York City exams (I haven’t seen the tests), and there’s good reason to believe the tests were too long. As with any exam, there are surely a slew of experts ready to point out exactly why the test design was so terrible. But I think it’s a mistake to make judgments about the efficacy of a new test strictly based on the number of blank stares it elicits. If we’re serious about attempting to measure real critical thinking skills, the tests that successfully do it are going to initially make students uncomfortable.
If you’re adamant that standardized testing is terrible, then the reactions of various teachers and students probably gave you all the information you need. But for those of us who believe in the potential of an accountability system that makes use of student test scores, it’s important to remember that we’re still early in the process of assessment development. Perhaps testing won’t prove to be the answer, but we’ve barely scratched the surface of what research and technology can do for evaluating and identifying individual skills. With “next-generation” tests beginning to arrive, exams are going to repeatedly go through major changes in a relatively short amount of time, and it’s important to remain patient and not overreact to student reactions.
When there’s not unity among opposition movements:
Wars within states have become much more common than wars between them. A dominant approach to understanding civil war assumes that opposition movements are unitary, when empirically, most of them are not. I develop a theory for how internal divisions within opposition movements affect their ability to bargain with the state and avoid conflict. I argue that more divided movements generate greater commitment and information problems, thus making civil war more likely. I test this expectation using new annual data on the internal structure of opposition movements seeking self-determination. I find that more divided movements are much more likely to experience civil war onset and incidence. This analysis suggests that the assumption that these movements are unitary has severely limited our understanding of when these disputes degenerate into civil wars.
All in all, it’s just some more political science research with the potential to save tens of millions of dollars in foolish military spending. But sure, let’s go ahead and de-fund it all.
Political scientists and poli-sci minded journalists have recently upped the snark and condescension aimed at those in the media who don’t understand that the president can’t make people do things they don’t want to do. (For good examples, see Ezra Klein, Jonathan Chait, and Brendan Nyhan.) The bottom line in all these pieces is that people need to admit that there simply isn’t any secret sauce of leadership, messaging, or glad-handing that will get Republican congressmen to take votes that jeopardize their reelection.
But while Klein, Chait, Nyhan, and Co. thankfully take an axe to ignorant punditry, I think they gloss over a key explanation for why the myth of presidential power is so widespread and so difficult to kill: If the president truly lacks the ability to get important things done, it means the media has failed to uphold its most indispensable public responsibility.
At the moment, the cause of Obama’s powerlessness is that members of the opposition party have abandoned the desire to govern in order to ensure their own personal reelections. With an opposition that doesn’t have a modicum of interest in cooperating, there’s nothing Obama can do to pass necessary legislation. In the face of poorly aligned incentives, our political system is floundering.
But the media fancies itself the guardian of our great democracy, and so if our political system is broken, it’s because the media failed to prevent it from happening. After all, if it’s truly impossible for us to pass climate change legislation, why did our great newspaper columnists not warn us that such an outcome was fast approaching?
To combat the dissonance caused by the realization that they failed to prevent our political system from rotting, pundits will cling to any tangible explanation for Obama’s failure. He just needs to exhibit more leadership, make better speeches, or recreate scenes from popular Hollywood films. Publishing these evidence-free analyses may seem like a difficult thing to do, but it’s not as difficult as admitting that you’ve failed at your job and let the country down. And so while it may seem absurd to us for Maureen Dowd to suggest that Obama should take advice from an Aaron Sorkin character, to her it’s much less absurd than the alternative, which is that she completely missed the breakdown of our political system as it was happening right under her nose.
Accountability is all the rage these days, whether it’s with regard to schools, hospitals, government agencies, or the local Geico car insurance branch. But not all accountability is the same, and a thought-provoking new study led by Penn’s Philip Tetlock examines how political ideology and trust can influence support for various accountability systems.
The study used a sample of MBA students to investigate the distinction between “outcome accountability” (where the focus is on a tangible end result) and “process accountability” (where the focus is on effort and adherence to best practices). Tetlock and his team hypothesized that conservatives would be more likely to support outcome accountability while liberals would be more likely to support process accountability. They reasoned that conservatives tend to attribute outcomes to personal characteristics rather than external factors, and thus a person ought to be held responsible for whether they generate a desired outcome. Conservatives also value protecting organizations from free riders who create a facade of good faith, and thus they would be skeptical of the efficacy of process accountability. Liberals, on the other hand, would be more likely to want to protect lower status employees from being judged based on uncontrollable events, and thus they would prefer process accountability.
In the initial experiment, the researchers tested their hypothesis in an unspecified policy domain — i.e. they simply described different accountability systems in a “large company.” Sure enough, conservatives tended to prefer outcome accountability while liberals tended to prefer process accountability.
But things got more interesting once the researchers introduced two specific policy domains that emphasized different values. One domain was education, an area where efficiency is the central value accountability is designed to uphold. (Yes, you might disagree with this notion of efficiency, but I think it’s valid in the context of the experiment.) Participants were asked to consider an outcome accountability option, in which teachers were evaluated based on standardized test scores, and a process accountability option, in which teachers were monitored through occasional observations. The second domain was equal employment opportunity hiring, a domain where equality is the central value the accountability is designed to uphold. Participants were asked to consider an outcome accountability option, in which managers were evaluated based on the number of minority employees hired, and a process accountability option, in which managers received diversity training and had their hiring decisions monitored by the human resources department.
As expected, in the education scenario, conservatives preferred outcome accountability, and their preference was even stronger than in the initial “unspecified” scenario. However, in the diversity scenario, liberals were more likely to prefer outcome accountability and conservatives were more likely to prefer process accountability. In other words, when equality was at stake, it was liberals who wanted employees to reach certain metrics regardless of how it was done. The ideological nature of these preferences also tended to make them fairly sticky. When confronted with evidence that employees had acted dishonestly to subvert the system, both liberals and conservatives were significantly more likely to say they would switch accountability systems when the subversion occurred in their non-preferred system (e.g. process accountability for liberals in the diversity domain). This may help explain some of the stalemate over teacher evaluations and affirmative action.
Thus, managers who initially prefer process or outcome accountability and then discover that employees have subverted their preferred system will opt to tinker on the margins by closing loopholes, not by adopting a system they initially disliked. When they learn that employees failed to implement promised best-practices for checking racial bias, conservative managers will not suddenly embrace outcome accountability in equality-salient policy domains. Nor will liberals suddenly embrace outcome accountability in efficiency-salient domains when they discover that employees failed to implement promised processes.
As the researchers point out, the broader lesson is that accountability preferences tend to emerge from a complex recipe rather than a single ingredient.
Our claims are modest: (1) when managers think about how to structure accountability systems in particular domains, the spotlight for some managers seems to be more on ‘‘getting the job done’’ and for others, more on treating employees equally/preventing prejudice; (2) which managers fall in which categories is a joint function of ideological outlook and the degree to which the work setting has become a focal point for policy debates (‘‘politicization’’).
It would be interesting to see how the study would play out in the domain of healthcare policy. For liberals, healthcare clearly involves equality, as the priority is for everybody to have health insurance. But the potential for a single-payer system to keep prices low and ensure conditions don’t worsen due to a lack of treatment may also make it an efficiency issue. For conservatives, healthcare seems like an issue where efficiency would be important, but it’s clear there’s also a focus on the lack of fairness in having income redistributed to buy insurance for poor people. All of this is to say that I’m not sure what the result would be, and it would be interesting to find out.
Tetlock and his team also conducted a follow up experiment that sought to further investigate the role of ideology in situations where there was complete information about employee trustworthiness. Before doing so, they identified a positive and negative type of both process and outcome accountability.
(A) Opportunity-focused outcome accountability empowers employees to use their creativity to go beyond standard operating routines and gives them chances to benefit from upside uncertainties of effort-outcome links (Grant & Ashford, 2008; Simons, 2005, 2010).
(B) Punitive outcome accountability sends a no-excuses message to employees (Rodgers, 1993) and shifts the risk of uncertain effort-outcome linkages to employees (Williamson, 1991).
(C) Employee-protective process accountability rewards good faith effort and shifts the downside risk of uncertain effort-outcome links from employees onto management (Scholten, van Knippenberg, Nijstad, & De Dreu, 2007; Siegel-Jacobs & Yates, 1996), with potential benefits of reducing stress and fear of mistakes and failure (Lee, Edmondson, Thomke, & Worline, 2004; Schoemaker, 2011).
(D) Punitive process accountability increases monitoring of processes to prevent faking of good-faith efforts and shifts the risk of uncertain effort-outcome links onto employees (Patil, Vieider, & Tetlock, 2013).
In the second experiment participants were told that employees either wanted to do a good job, or get paid as much as possible for as little work as possible. Participants were also told that either a great deal of luck was involved in how employee effort translated into work quality, or that almost no luck was involved. They were then asked about accountability preferences.
The researchers found that regardless of ideology, people chose punitive accountability systems (B & D) when employees were untrustworthy and positive accountability systems (A & C) when employees were trustworthy. This occurred even when there was no reliable link between effort and outcome. In addition, there was no significant relationship between ideology and whether outcome or process accountability was preferred. It would seem that ideology plays a role, but it may only serve as a heuristic when no good information about employee trustworthiness is available.
What Can This Teach Us About Education Policy?
I think these findings do help illuminate the contours of the debate on teacher accountability. Regardless of ideology, almost everybody thinks teachers are trustworthy. The idea of “teacher bashing” is used to rally union supporters, but nobody serious actually believes that a significant number of teachers are slacking off because they know it’s hard to get fired. That means the acceptable accountability options are A and C. Many people, myself and Barack Obama included, don’t see most manifestations of option C, the process accountability system, as doing enough to ensure accountability. This view is based on the fact that observations rarely rate teachers at a level that necessitates consequences, and that a few annual observations may not be enough to ensure that a teacher is doing a good job. Therefore the clear choice becomes option A, which in practice involves using test scores as part of a teacher’s evaluation.
Others look at the idea of using test scores and all they see is option B: a system that unfairly punishes teachers and blames them for every shortcoming in the system. For these people, the clear choice is option C, even if it’s imperfect.
Unfortunately, there doesn’t seem to be much room for compromise between these two trenches. Value-added measures could get better at controlling for certain environmental factors, but at some point you run up against the wall of innovation in data collection and statistical analysis. This makes it difficult to convince people that value-added measures are option A rather than option B. Similarly, we could strengthen the stakes tied to teacher observations, but any system with teeth would likely be seen as a punitive outcome accountability system akin to B, rather than a positive process accountability system akin to C.
One thing that makes this a difficult issue is that we’ve moved on to debating policies when the real disagreement is in the underlying philosophical issues. Ultimately, teacher accountability is about what level of teacher turnover will maximize the talent of a teaching staff, as well as the degree to which teachers should be immune from that turnover. Put another way, how many prospective new teachers would quickly become better than incumbent teachers, and how many incumbent teachers, because of the nature of their profession, should have their jobs protected regardless of how good the new candidates are? That’s basically what the teacher accountability debate is about. (Yes, teacher accountability is also about helping teachers get better. And we should definitely have a system focused on helping teachers get better. But these systems would hypothetically help new teachers get better too, and so whether or not a system helps teachers improve is largely irrelevant to whether a school would be better off replacing a teacher with somebody new.)
Now you might think that few teachers have good replacements and that it’s insane, unproductive, and possibly illegal for teachers to have so little job security. Alternatively you might think that new graduates from teacher programs are pretty good, and that teachers, like most other employees, ought to be replaced if their manager is able to find somebody significantly better. For years the consensus was basically the former view: that all competent teachers should have their jobs protected. More recently there has been a marginal shift toward the latter view: that we could improve teaching staffs by pushing to replace more teachers. Many think we should push even further, while many believe we should go back. This is the core of the debate about using test scores to evaluate teachers, and debates about VAM models and teacher observations merely obfuscate the real issue of how much job security is appropriate. (And yes, in this case I’m essentially using test scores and VAM as a proxy for “policy that will lead to more teacher turnover.”) Again, I don’t think there’s a silver bullet compromise, I just wanted to vent about people dancing around the issues at the core of policy debates.
Tetlock, P., Vieider, F., Patil, S., & Grant, A. (2013). Accountability and ideology: When left looks right and right looks left. Organizational Behavior and Human Decision Processes, 122(1), 22-35. DOI: 10.1016/j.obhdp.2013.03.007
New research from the University of Maine’s Richard Powell:
Prior public opinion research has identified a wide range of circumstances in which polling results may be tainted by social desirability bias. In races pitting a Black candidate against White opponents, this has often been referred to as the “Bradley effect” (aka “Wilder effect” or “Dinkins effect”), by which survey respondents overstate their preference for Black candidates running against White opponents. This study examines the accuracy of polling on same-sex marriage ballot measures relative to polling on other statewide ballot issues in all states voting on the issue from 1998 to 2012, controlling for a range of theoretically relevant contextual factors. There has been a great deal of speculation, though little empirical evidence, that polling systematically understates opposition to same-sex marriage. Consistent with social desirability bias, this study finds that opposition to same-sex marriage is about 5% to 7% greater on election day than in preelection polls.
The next frontier for marriage equality will be getting people to say what they think and do what they say.
I’ve been making my way through Richard Kahlenberg’s biography of Albert Shanker, and one of the recurring themes that jumps out at me is the way long-lasting policies can emerge from the confluence of short-lived circumstances. Previously I wrote about this kind of “policy stickiness” in the context of higher salaries for teachers with master’s degrees, but you can also see it play out with regard to dismissing teachers based on seniority, a practice known as “last in, first out” (LIFO).
Recently, reformers have attempted to do away with LIFO because it’s an arbitrary, albeit straightforward, way of determining whom to dismiss. But according to Kahlenberg, 30 years ago the climate of discrimination in the country was so bad that Shanker felt we needed LIFO because it was arbitrary.
Shanker and the AFT argued that seniority was worth preserving. Seniority “has proven to be the most effective mechanism for protecting workers—regardless of race, religion, sex or age—against capricious and arbitrary actions of their employers,” the union said. Seniority prevented racist employers from firing blacks first. And it also protected workers “from the whims and prejudices of their employers” not related to race. Finally, seniority was important as a union principle because it “prevents divisiveness among working people by distributing scarcity in a way that is objectively fair and therefore perceived to be fair.” (p. 241)
What better way to protect against racial discrimination than to mandate that everybody be discriminated against based on experience? The problem is that even if it was a smart thing to do at the time, the policy seems to have outlived its use. Nowadays the threat of a teacher being dismissed strictly because they are Black or Jewish is much less severe, and even if somebody were to attempt to pull it off, it’s unlikely they would get past the existing union protections. Meanwhile, Shanker’s final justification of maintaining unity plays right into the hands of critics who claim the unions put their own interests ahead of those of students. Shanker is effectively saying that allowing a superior teacher to be fired is a price worth paying for union solidarity.
One interesting takeaway from all this is that if attempts to do away with LIFO had begun earlier, so that there was less overlap with the push to utilize value-added measures, reformers might have been more successful in their efforts to eliminate it. But once teacher concerns about value-added measures began to grow, the fear of unknown arbitrariness rekindled the desire for an arbitrariness that was well-known. Just as Shanker felt LIFO was necessary to prevent dismissals due to racial discrimination, many teachers now feel LIFO is necessary to prevent dismissals due to what they perceive to be unfair VAM scores.
From a new paper in the Journal of School Health:
We used data from the 2009 National Youth Risk Behavior Survey (YRBS). Logistic regression analyses evaluated the association between insufficient sleep and school violence behaviors, controlling for demographic factors. In addition to examining main effects, interaction terms were entered into the models to examine whether potential associations varied by sex or race/ethnicity…Students with insufficient sleep had higher odds of engaging in the majority of school violence-related behaviors examined compared to students with sufficient sleep. Males with insufficient sleep were at increased risk of weapon carrying at school, a finding not observed for females with insufficient sleep. White students with insufficient sleep had higher odds of missing school because of safety concerns, a pattern that did not emerge among Black and Hispanic/Latino students.
Nothing too earth-shattering, but it’s a reminder that we could do a better job with school scheduling. For example, in many places starting the school day later would allow kids to get more sleep and leave them with less unsupervised time after school.