Monday, December 9, 2019

Effective Teaching

In fall 2009, the Bill & Melinda Gates Foundation launched the Measures of Effective Teaching (MET) project to test new approaches to measuring effective teaching. The goal of the MET project is to improve the quality of information about teaching effectiveness available to education professionals within states and districts—information that will help them build fair and reliable systems for measuring teacher effectiveness that can be used for a variety of purposes, including feedback, development, and continuous improvement. The project includes nearly 3,000 teachers who volunteered to help us identify a better approach to teacher development and evaluation, located in six predominantly urban school districts across the country: Charlotte-Mecklenburg Schools, Dallas Independent School District, Denver Public Schools, Hillsborough County Public Schools (including Tampa, Florida), Memphis City Schools, and the New York City Department of Education.

As part of the project, multiple data sources are being collected and analyzed over two school years, including student achievement gains on state assessments and supplemental assessments designed to assess higher-order conceptual understanding; classroom observations and teacher reflections on their practice; assessments of teachers' pedagogical content knowledge; student perceptions of the classroom instructional environment; and teachers' perceptions of working conditions and instructional support at their schools.

The project is directed by Thomas J. Kane, Deputy Director, and Steven Cantrell, Senior Program Officer, at the Bill & Melinda Gates Foundation. Our lead research partners include:

- Mark Atkinson, Teachscape
- Nancy Caldwell, Westat
- Charlotte Danielson, The Danielson Group
- Ron Ferguson, Harvard University
- Drew Gitomer, Educational Testing Service
- Pam Grossman, Stanford University
- Heather Hill, Harvard University
- Eric Hirsch, New Teacher Center
- Dan McCaffrey, RAND
- Catherine McClellan, Educational Testing Service
- Roy Pea, Stanford University
- Raymond Pecheone, Stanford University
- Geoffrey Phelps, Educational Testing Service
- Robert Pianta, University of Virginia
- Rob Ramsdell, Cambridge Education
- Doug Staiger, Dartmouth College
- John Winn, National Math and Science Initiative

Introduction

For four decades, educational researchers have confirmed what many parents know: children's academic progress depends heavily on the talent and skills of the teacher leading their classroom. Although parents may fret over their choice of school, research suggests that their child's teacher assignment in that school matters a lot more. And yet, in most public school districts, individual teachers receive little feedback on the work they do. Almost everywhere, teacher evaluation is a perfunctory exercise. In too many schools, principals go through the motions of visiting classrooms, checklist in hand. In the end, virtually all teachers receive the same "satisfactory" rating.[1]

The costs of this neglect are enormous. Novice teachers' skills plateau far too early without the feedback they need to grow. Likewise, there are too few opportunities for experienced teachers to share their practice and strengthen the profession. Finally, principals are forced to make the most important decision we ask of them—granting tenure to beginning teachers still early in their careers—with little objective information to guide them.
If we say "teachers matter" (and the research clearly says they do!), why do we pay so little attention to the work they do in the classroom? If teachers are producing dramatically different results, why don't we provide them with that feedback and trust them to respond to it? Resolving the contradiction will require new tools for gaining insight into teachers' practice, new ways to diagnose their strengths and weaknesses, and new approaches to developing teachers.

In the fall of 2009, the Bill & Melinda Gates Foundation launched the Measures of Effective Teaching (MET) project to test new approaches to identifying effective teaching. The goal of the project is to improve the quality of information about teaching effectiveness, to help build fair and reliable systems for teacher observation and feedback.

OUR PARTNERS

Although funded by the Bill & Melinda Gates Foundation, the MET project is led by more than a dozen organizations, including academic institutions (Dartmouth College, Harvard University, Stanford University, University of Chicago, University of Michigan, University of Virginia, and University of Washington), nonprofit organizations (Educational Testing Service, RAND Corporation, the National Math and Science Initiative, and the New Teacher Center), and other educational consultants (Cambridge Education, Teachscape, Westat, and the Danielson Group). In addition, the National Board for Professional Teaching Standards and Teach for America have encouraged their members to participate. The American Federation of Teachers and the National Education Association have been engaged in the project. Indeed, their local leaders actively helped recruit teachers.

[1] The 2009 New Teacher Project study, The Widget Effect, found that in evaluation systems with two ratings, "satisfactory" and "unsatisfactory," 99 percent of teachers earned a satisfactory rating. In evaluation systems with more than two ratings, 94 percent of teachers received one of the top two ratings and less than one percent were rated unsatisfactory.

Yet our most vital partners are the nearly 3,000 teachers in six school districts around the country who volunteered for the project. They did so because of their commitment to the profession and their desire to develop better tools for feedback and growth. The six districts hosting the project are all predominantly urban districts, spread across the country: Charlotte-Mecklenburg Schools, Dallas Independent School District, Denver Public Schools, Hillsborough County Public Schools (including Tampa, Florida), Memphis City Schools, and the New York City Department of Education.

THE THREE PREMISES OF THE MET PROJECT

The MET project is based on three simple premises.

First, whenever feasible, a teacher's evaluation should include his or her students' achievement gains. Some raise legitimate concerns about whether student achievement gains measure all of what we seek from teaching. Of course, they're right. Every parent wants their children to build social skills and to acquire a love of learning. Likewise, our diverse society needs children who are tolerant. However, these goals are not necessarily at odds with achievement on state tests. For instance, it may be that an effective teacher succeeds by inspiring a love of learning, or by coaching children to work together effectively. We will be testing these hypotheses in future reports, using the data from our student surveys.
For example, it may be possible to add measures of student engagement as additional outcome measures. This would be particularly useful in grades and subjects where testing is not feasible.

Others have raised separate concerns about whether "value-added" estimates (which use statistical methods to identify the impact of teachers and schools by adjusting for students' prior achievement and other measured characteristics) are "biased" (Rothstein, 2010). They point out that some teachers may be assigned students that are systematically different in other ways—such as motivation or parental engagement—which affect their ultimate performance but are not adequately captured by prior achievement measures. As we describe below, our study aspires to resolve that question with a report next winter. At that time, we will be testing whether value-added measures accurately predict student achievement following random assignment of teachers to classrooms (within a school, grade, and subject). However, in the interim, there is little evidence to suggest that value-added measures are so biased as to be directionally misleading. On the contrary, in a small sample of teachers assigned to specific rosters by lottery, Kane and Staiger (2008) could not reject that there was no bias and that the value-added measures approximated "causal" teacher effects on student achievement. Moreover, a recent re-analysis of an experiment designed to test class size, but which also randomly assigned students to teachers, reported teacher effects on student achievement which were, in fact, larger than many of those reported in value-added analyses (Nye, Konstantopoulos, and Hedges, 2004). Value-added measures do seem to convey information about a teacher's impact. However, evidence of bias at the end of this year may require scaling the value-added measures themselves down (or up). But that's largely a matter of determining how much weight should be attached to value-added as one of multiple measures of teacher effectiveness.

Second, any additional components of the evaluation (e.g., classroom observations, student feedback) should be demonstrably related to student achievement gains. The second principle is fundamental, especially given that most teachers are receiving the same "satisfactory" rating now. If school districts and states simply give principals a new checklist to fill out during their classroom visits, little will change. The only way to be confident that the new feedback is pointing teachers in the right direction—toward improved student achievement—is to regularly confirm that those teachers who receive higher ratings actually achieve greater student achievement gains on average. Even a great system can be implemented poorly or gradually succumb to "grade inflation." Benchmarking against student achievement gains is the best way to know when the evaluation system is getting closer to the truth—or regressing. Accordingly, in our own work, we will be testing whether student perceptions, classroom observations, and assessments of teachers' pedagogical content knowledge are aligned with value-added measures.
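That benchmarking check is easy to express in code. The sketch below is illustrative only: the column names and every number in the example are invented rather than taken from MET data, and the rating tiers stand in for whatever scale a district uses. It simply asks whether mean gains rise with the rating tier.

```python
# Illustrative only: do higher-rated teachers post larger student
# achievement gains? Column names and values are hypothetical.
import pandas as pd

def benchmark_ratings(df: pd.DataFrame) -> pd.Series:
    """Mean value-added gain for each observation-rating tier.

    df needs two columns:
      'rating'      -- the tier a teacher received (e.g., 1-4)
      'value_added' -- that teacher's student achievement gain
    If the means do not rise with the rating, the evaluation
    system is drifting away from student outcomes.
    """
    return df.groupby("rating")["value_added"].mean().sort_index()

# Example with made-up numbers:
df = pd.DataFrame({
    "rating":      [1, 2, 2, 3, 3, 4, 4],
    "value_added": [-0.15, -0.05, 0.0, 0.05, 0.10, 0.12, 0.20],
})
print(benchmark_ratings(df))                  # means should rise with rating
print(df["rating"].corr(df["value_added"]))   # simple correlation check
```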
Third, the measure should include feedback on specific aspects of a teacher's practice to support teacher growth and development. Any measure of teacher effectiveness should support the continued growth of teachers by providing actionable data on specific strengths and weaknesses. Even if value-added measures are valid measures of a teacher's impact on student learning, they provide little guidance to teachers (or their supervisors) on what they need to do to improve. Therefore, our goal is to identify a package of measures, including student feedback and classroom observations, which would not only help identify effective teaching, but also point all teachers to the areas where they need to become more effective themselves.

The Measures

To limit the need for extensive additional testing, the MET project started with grades and subjects where most states currently test students. We included those teaching mathematics or English language arts in grades 4 through 8. In addition, we added three courses which serve as gateways for high school students, where some states are using end-of-course tests: Algebra I, grade 9 English, and biology. The following data are being collected in their classrooms.

Measure 1: Student achievement gains on different assessments

Student achievement is being measured in two ways, with existing state assessments and with three supplemental assessments. The latter are designed to assess higher-order conceptual understanding. By combining the state tests and the supplemental tests, we plan to test whether the teachers who are successful in supporting student gains on the state tests are also seeing gains on the supplemental assessments. The supplemental assessments are the Stanford 9 Open-Ended Reading assessment in grades 4 through 8, the Balanced Assessment in Mathematics (BAM) in grades 4 through 8, and the ACT QualityCore series for Algebra I, English 9, and Biology.

Measure 2: Classroom observations and teacher reflections

One of the most difficult challenges in designing the MET project was to find a way to observe more than 20,000 lessons at a reasonable cost. Videotaping was an intriguing alternative to in-person observations (especially given our aspiration to test multiple rubrics), but the project had to overcome several technical challenges: tracking both students and a non-stationary teacher without having another adult in the classroom pointing the camera and distracting children, sufficient resolution to read a teacher's writing on a board or projector screen, and sufficient audio quality to hear teachers and students. The solution, engineered by Teachscape, involves panoramic digital video cameras that require minimal training to set up, are operated remotely by the individual teachers, and do not require a cameraperson.[2] After class, participating teachers upload video lessons to a secure Internet site, where they are able to view themselves teaching (often for the first time). In addition, the participating teachers offer limited commentary on their lessons (e.g., specifying the learning objective). Trained raters are scoring the lessons based on classroom observation protocols developed by leading academics and professional development experts. The raters examine everything from the teacher's ability to establish a positive learning climate and manage his or her classroom to the ability to explain concepts and provide useful feedback to students. The Educational Testing Service (ETS) is managing the lesson-scoring process.
Personnel from ETS have trained raters to accurately score lessons using the following five observation protocols:

- Classroom Assessment Scoring System (CLASS), developed by Bob Pianta and Bridget Hamre, University of Virginia
- Framework for Teaching, developed by Charlotte Danielson (2007)
- Mathematical Quality of Instruction (MQI), developed by Heather Hill, Harvard University, and Deborah Loewenberg Ball, University of Michigan
- Protocol for Language Arts Teaching Observations (PLATO), developed by Pam Grossman, Stanford University
- Quality Science Teaching (QST) Instrument, developed by Raymond Pecheone, Stanford University

A subset of the videos is also being scored by the National Board for Professional Teaching Standards (NBPTS). In addition, the National Math and Science Initiative (NMSI) is scoring a subset of videos using the UTeach Observation Protocol (UTOP) for evaluating math instruction, developed and field tested over three years by the UTeach program at the University of Texas at Austin.

[2] Similar cameras have been developed by other suppliers, such as thereNow (www.therenow.net). A commercial version of the camera used in the MET project is available from Kogeto (www.kogeto.com).

Measure 3: Teachers' pedagogical content knowledge

ETS, in collaboration with researchers at the University of Michigan's Learning Mathematics for Teaching Project, has developed an assessment to measure teachers' knowledge for teaching—not just their content knowledge. Expert teachers should be able to identify common errors in student reasoning and use this knowledge to develop a strategy to correct the errors and strengthen student understanding. The new assessments to be administered this year focus on specialized knowledge that teachers use to interpret student responses, choose instructional strategies, detect and address student errors, select models to illustrate particular instructional objectives, and understand the special instructional challenges faced by English language learners.

Measure 4: Student perceptions of the classroom instructional environment

Students in the MET classrooms were asked to report their perceptions of the classroom instructional environment. The Tripod survey instrument, developed by Harvard researcher Ron Ferguson and administered by Cambridge Education, assesses the extent to which students experience the classroom environment as engaging, demanding, and supportive of their intellectual growth. The survey asks students in each of the MET classrooms whether they agree or disagree with a variety of statements, including: "My teacher knows when the class understands, and when we do not"; "My teacher has several good ways to explain each topic that we cover in this class"; and "When I turn in my work, my teacher gives me useful feedback that helps me improve." The goal is not to conduct a popularity contest for teachers. Rather, students are asked to give feedback on specific aspects of a teacher's practice, so that teachers can improve their use of class time, the quality of the comments they give on homework, their pedagogical practices, or their relationships with their students.

Measure 5: Teachers' perceptions of working conditions and instructional support at their schools

Teachers also complete a survey, developed by the New Teacher Center, about working conditions, school environment, and the instructional support they receive in their schools.
Indicators include whether teachers are encouraged to try new approaches to improve instruction and whether they receive an appropriate amount of professional development. The survey is intended to give teachers a voice in providing feedback on the quality of instructional support they receive. The results potentially could be incorporated into measuring the effectiveness of principals in supporting effective instruction. Although we have not yet had a chance to analyze those data for the current report, they will be included in future analyses.

Stages of Analysis

The MET project will be issuing four reports, starting with this one. In this preliminary report of findings from the first year, we focus on mathematics and English language arts teachers, in grades 4 through 8, in five of the six districts. (The student scores on the state tests were not available in time to include teachers in Memphis.) We report the relationships across a variety of measures of effective teaching, using data from one group of students or school year to identify teachers likely to witness success with another group of students or during another school year. At this point, we have classroom observation scores for a small subset (less than 10 percent) of the lessons collected last year. Given the importance of those findings, we will issue a more complete report in the spring of 2011, including a much larger sample of videos. Our aim is to test various approaches to classroom observations. Third, late in the summer of 2011, researchers from RAND will combine data from each of the MET project measures to form a "composite indicator" of effective teaching. That report will assign a weight to each measure (classroom observations, teacher knowledge, and student perceptions) based on the results of analyses indicating how helpful each is in identifying teachers likely to produce exemplary student learning gains.

Our goal is to identify effective teachers and effective teaching practices. To do so, we need to isolate the results of effective teaching from the fruits of a favorable classroom composition. It may well be easier to use certain teaching practices or to garner enthusiastic responses from students if one's students show up in class eager to learn. If that's the case, we would be in danger of confusing the effects of teachers with the effects of classroom characteristics. Like virtually all other research on the topic of effective teaching, we use statistical controls to account for differences in students' entering characteristics. But it is always possible to identify variables for which one has not controlled. The only way to resolve the question of the degree of bias in our current measures is through random assignment. As a result, teachers participating in the MET project signed up in groups of two or more colleagues working in the same school, same grade, and same subjects. During the spring and summer of 2010, schools drew up a set of rosters of students in each of those grades and subjects and submitted them to our partners at RAND. RAND then randomly assigned classroom rosters within the groups of teachers in a given grade and subject (so that no teacher was asked to teach in a grade, subject, or school where they did not teach during year one). Within each group of teachers in a school, grade, and subject, teachers effectively drew straws to determine which group of students they would teach this year.
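That blocked lottery is straightforward to describe in code. The sketch below illustrates the idea only; RAND's actual procedure and data formats are not described in this report, and every name in the example is invented.

```python
# Within-block randomization: rosters are shuffled only among teachers who
# share a school, grade, and subject, so no teacher receives a class they
# would not otherwise teach. Illustrative sketch; not RAND's actual code.
import random

def assign_rosters(blocks, seed=42):
    """blocks: dict mapping (school, grade, subject) ->
       {'teachers': [...], 'rosters': [...]} with equal-length lists."""
    rng = random.Random(seed)
    assignment = {}
    for key, group in blocks.items():
        rosters = group["rosters"][:]
        rng.shuffle(rosters)  # the "drawing straws" step, within the block
        for teacher, roster in zip(group["teachers"], rosters):
            assignment[teacher] = roster
    return assignment

blocks = {
    ("School A", 4, "math"): {
        "teachers": ["T1", "T2"],
        "rosters":  ["roster_1", "roster_2"],
    },
}
print(assign_rosters(blocks))
```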
At the end of the current school year, we will study differences in student achievement gains within each of those groupings to see if the students assigned to the teachers identified using year one data as "more effective" actually outperform the students assigned to the "less effective" teachers. We will look at differences in student achievement gains within each of those groups and then aggregate up those differences for "more effective" and "less effective" teachers. Following random assignment, there should be no differences—measured or unmeasured—in the prior characteristics of the students assigned to "more effective" or "less effective" teachers as a group. If the students assigned to teachers who were identified as "more effective" outperform those assigned to "less effective" teachers, we can resolve any lingering doubts about whether the achievement differences represent the effect of teachers or unmeasured characteristics of their classes.

Better student achievement will require better teaching. The MET project is testing novel ways to recognize effective teaching. We hope the results will be used to provide better feedback to teachers and establish better ways to help teachers develop.

What We're Learning So Far

Before describing the measures and analysis in more detail, we briefly summarize our findings so far.

- In every grade and subject, a teacher's past track record of value-added is among the strongest predictors of their students' achievement gains in other classes and academic years. A teacher's value-added fluctuates from year to year and from class to class, as succeeding cohorts of students move through their classrooms. However, that volatility is not so large as to undercut the usefulness of value-added as an indicator (imperfect, but still informative) of future performance. The teachers who lead students to achievement gains in one year or in one class tend to do so in other years and other classes.

- Teachers with high value-added on state tests tend to promote deeper conceptual understanding as well. Many are concerned that high value-added teachers are simply coaching children to do well on state tests. In the long run, it would do students little good to score well on state tests if they fail to understand key concepts. However, in our analysis so far, that does not seem to be the case. Indeed, the teachers who are producing gains on the state tests are generally also promoting deeper conceptual understanding among their students. In mathematics, for instance, after adjusting for measurement error (the adjustment is sketched just after this list), the correlation between teacher effects on the state math test and on the Balanced Assessment in Mathematics was moderately large, .54.

- Teachers have larger effects on math achievement than on achievement in reading or English language arts, at least as measured on state assessments. Many researchers have reported a similar result: teachers seem to have a larger influence on math performance than English language arts performance. A common interpretation is that families have more profound effects on children's reading and verbal performance than teachers. However, the finding may also be due to limitations of the current state ELA tests (which typically consist of multiple-choice questions of reading comprehension). When using the Stanford 9 Open-Ended assessment (which requires youth to provide written responses), we find teacher effects comparable to those found in mathematics. We will be studying this question further in the coming months, by studying teacher effects on different types of test items. However, if future work confirms our initial findings with the open-ended assessment, it would imply that the new literacy assessments, which are being designed to assess the new common core standards, may be more sensitive to instructional effects than current state ELA tests.

- Student perceptions of a given teacher's strengths and weaknesses are consistent across the different groups of students they teach. Moreover, students seem to know effective teaching when they experience it: student perceptions in one class are related to the achievement gains in other classes taught by the same teacher. Most important are students' perceptions of a teacher's ability to control a classroom and to challenge students with rigorous work.
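The adjustment for measurement error mentioned above is the standard psychometric correction for attenuation: the observed correlation is divided by the square root of the product of the two measures' reliabilities. A minimal sketch follows; the reliability values in the example are invented for illustration, not the MET project's estimates.

```python
# Disattenuation of a correlation for measurement error. Standard
# psychometric correction; the numbers below are invented, NOT the
# MET project's reliability estimates.
def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """r_true = r_observed / sqrt(rel_x * rel_y)."""
    return r_observed / (rel_x * rel_y) ** 0.5

# e.g., an observed correlation of 0.38 between two noisy teacher-effect
# estimates, each with reliability 0.7, implies a true correlation of ~0.54:
print(round(disattenuate(0.38, 0.7, 0.7), 2))  # -> 0.54
```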
While student feedback is widely used in higher education, it is rare for elementary and secondary schools to ask youth about their experiences in the classroom. Nevertheless, soliciting student feedback is potentially attractive for a number of reasons: the questions themselves enjoy immediate legitimacy with teachers, school leaders, and parents; it is an inexpensive way to supplement other more costly indicators, such as classroom observations; and the questionnaires can be extended to non-tested grades and subjects quickly. Our preliminary results suggest that the student questionnaires would be a valuable complement to other performance measures.

Classroom observations are the most common form of evaluation today. As a result, our goal is to test several different approaches to identifying effective teaching practices in the classroom. In our work so far, we have some promising findings suggesting that classroom observations are positively related to student achievement gains. However, because less than 10 percent of the videos have been scored, we will wait until April to release results on the classroom observation methods.

MEASURING TEACHER-LEVEL VALUE-ADDED

In order to put the measures of student achievement on a similar footing, we first standardized test scores to have a mean of 0 and a standard deviation of 1 (for each district, subject, year, and grade level). We then estimated a statistical model controlling for each student's test score in that subject from the prior year, a set of student characteristics, and the mean prior test score and mean student characteristics in the specific course section or class which the student attends. (We provide more details in the Technical Appendix.) The student characteristics varied somewhat by district (depending upon what was available), but typically included student demographics, free or reduced-price lunch, ELL status, and special education status.[3] The statistical model produces an "expected" achievement for each student based on his or her starting point and the starting point of his or her peers in class. Some students "underperformed" relative to that expectation and some students "overperformed." In our analysis, a teacher's "value-added" is the mean difference, across all tested students in a classroom with a prior-year achievement test score, between their actual and expected performance at the end of the year. If the average student in the classroom outperformed students elsewhere who had similar performance on last year's test, similar demographic and program participation codes—and classmates with similar prior-year test scores and other characteristics—we infer a positive value-added, or positive achievement gain, attributable to the teacher.
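The sketch below walks through those steps, assuming a flat student-level table. It is a deliberate simplification: the covariate list is abbreviated and the column names ('score', 'prior_score', 'frl', 'class_id', 'teacher_id') are stand-ins for the fuller specification in the Technical Appendix, not the project's actual code or schema.

```python
# Simplified value-added sketch: standardize scores, regress the current
# score on the prior score, a student covariate, and the classroom's mean
# prior score, then average residuals by teacher.
import pandas as pd
import statsmodels.api as sm

def teacher_value_added(df: pd.DataFrame) -> pd.Series:
    d = df.copy()
    # 1. Standardize scores to mean 0, SD 1. (The MET model does this
    #    separately for each district, subject, year, and grade.)
    for col in ("score", "prior_score"):
        d[col] = (d[col] - d[col].mean()) / d[col].std()
    # 2. Peer-composition control: classroom mean of prior achievement.
    d["class_prior_mean"] = d.groupby("class_id")["prior_score"].transform("mean")
    # 3. Expected achievement from an OLS fit; the residual is each
    #    student's actual minus expected score.
    X = sm.add_constant(d[["prior_score", "frl", "class_prior_mean"]])
    d["residual"] = sm.OLS(d["score"], X).fit().resid
    # 4. A teacher's value-added is the mean residual across his or her
    #    tested students.
    return d.groupby("teacher_id")["residual"].mean()
```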
Using this method, we generated value-added estimates on the state assessments and the supplemental assessments for up to two course sections or classrooms that teachers taught during 2009-10. We also calculated value-added estimates for teachers on state math and ELA test scores using similar data we obtained from the districts for the 2008-09 school year. (To be part of the MET project, a district was required to have some historical data linking students and teachers.)

[3] The student-level covariates used in the regressions included, in Charlotte-Mecklenburg: race, ELL status, age, gender, special education, gifted status; in Dallas: race, ELL, age, gender, special education, free or reduced lunch; in Denver: race, age, and gender; in Hillsborough: race, ELL, age, special education, gifted status, and free or reduced lunch; in NYC: race, ELL, gender, special education, free or reduced lunch. Differences in covariates across districts may reduce the reliability of the value-added estimates.

In addition to state tests, students in participating classes took a supplemental performance assessment in spring 2010. Students in grades 4-8 math classes took the Balanced Assessment in Mathematics, while students in grades 4-8 English language arts classes took the SAT 9 Open-Ended Reading assessment. We chose these two tests because they included cognitively demanding content, were reasonably well-aligned with the curriculum in the six states, had high levels of reliability, and had evidence of fairness to members of different groups of students.

Balanced Assessment in Mathematics (BAM): Each of the test forms for the Balanced Assessment in Mathematics (BAM) includes four to five tasks and requires 50-60 minutes to complete. Because of the small number of tasks on each test form, however, we were concerned about the content coverage in each teacher's classroom. As a result, we used three different forms of the BAM—from the relevant grade levels in 2003, 2004, and 2005—in each classroom. In comparison to many other assessments, BAM is considered to be more cognitively demanding and measures higher-order reasoning skills using question formats that are quite different from those in most state mathematics achievement tests. There is also some evidence that BAM is more instructionally sensitive to the effects of reform-oriented instruction than a more traditional test (ITBS). Appendix 1 includes some sample items from the BAM assessment.

SAT 9 Reading Open-Ended Test: The Stanford 9 Open-Ended (OE) Reading assessment contains nine open-ended tasks and takes 50 minutes to complete. The primary difference between the Stanford 9 OE and traditional state reading assessments is the exclusive use of open-ended items tied to extended reading passages. Each form of the assessment consists of a narrative reading selection followed by nine questions. Students are required to not only answer the questions but also to explain their answers. Sample items from the Stanford 9 OE exam are available in Appendix 2.

MEASURING STUDENT PERCEPTIONS

College administrators rarely evaluate teaching by sitting in classrooms—as is the norm in K-12 schools.
Rather, they rely on confidential student evaluations. Organizers of the MET project wondered whether such information could be helpful in elementary and secondary schools, to supplement other forms of feedback.

The MET student perceptions survey is based on a decade of work by the Tripod Project for School Improvement. Tripod was founded by Ronald F. Ferguson of Harvard University and refined in consultation with K-12 teachers and administrators in Shaker Heights, Ohio, and member districts of the Minority Student Achievement Network.

For the MET project, the Tripod surveys are conducted either online or on paper, at the choice of the participating school. For online surveys, each student is given a ticket with a unique identification code to access the web site. For the paper version, each form is pre-coded with a bar code identifier. When a student completes a paper survey, he or she seals it in a thick, non-transparent envelope. The envelope is opened only at a location where workers scan the forms to capture the data. These precautions are intended to ensure that students feel comfortable providing their honest feedback, without the fear that their teacher will tie the feedback to them.

The Tripod questions are gathered under seven headings, or constructs, called the Seven C's. The seven are: Care, Control, Clarify, Challenge, Captivate, Confer, and Consolidate.
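Scoring such a survey presumably means rolling item responses up into the seven constructs. The sketch below shows one plausible way to do that (average a classroom's agreement ratings within each construct); the item keys and the item-to-construct mapping are invented for illustration and are not Tripod's actual instrument.

```python
# One plausible construct-scoring approach: average each student's 1-5
# agreement ratings within a construct, then average across the classroom.
# The item keys and mapping below are invented, NOT Tripod's instrument.
import pandas as pd

ITEM_TO_CONSTRUCT = {
    "knows_when_we_understand": "Clarify",   # hypothetical placements
    "several_ways_to_explain":  "Clarify",
    "useful_feedback_on_work":  "Challenge",
}

def classroom_construct_scores(responses: pd.DataFrame) -> pd.Series:
    """responses: one row per student, one column per item (ratings 1-5).
    Returns the classroom's mean rating for each construct."""
    long = responses.melt(var_name="item", value_name="rating")
    long["construct"] = long["item"].map(ITEM_TO_CONSTRUCT)
    return long.groupby("construct")["rating"].mean()

# Example with three students:
survey = pd.DataFrame({
    "knows_when_we_understand": [4, 5, 3],
    "several_ways_to_explain":  [4, 4, 5],
    "useful_feedback_on_work":  [3, 4, 4],
})
print(classroom_construct_scores(survey))
```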
