As teacher accountability measures have become more prevalent over the past decade, the evidence base on their potential bias against teachers of color has grown as well. In this paper, we use item response theory (IRT) to build on this evidence base, finding that teachers of different identities but comparable teaching expertise receive systematically biased observation ratings on specific items across the continuum of instructional effectiveness.