The New SAT Facts November, 2006 Wayne Camara & Amy Schmidt.



Executive Summary
Purpose of Briefing: To provide an overview of recent research conducted on the new SAT.
Research and analysis designed to meet three demands:
1. Provide baseline data concerning score use and other characteristics of the new SAT compared to the old SAT
2. Respond to questions from stakeholders concerning the new SAT
3. Develop a base of new knowledge on the Writing test and the impact of changes to the SAT

Research to Date on the New SAT
Score Change for PSAT/NMSQT Test Takers
Construct Comparability and Continuity in the SAT
Essay Reliability
Effect of New SAT Length on Performance (Fatigue)
Consequences of Adding Writing to K-12 Instruction
Impact of Taking Advanced Math Courses on New SAT Math Items
Standardized Differences on Ethnic Subgroups
Relationship Between Essay Features and Essay Scores
Effects of Short-Term Coaching on Writing Test Performance
Discrepant Scores between CR and Writing

Score Change for PSAT/NMSQT (P/N) Test Takers (Oh, Wright, & Zanna, 2005) Analyzed score changes and repeater patterns for P/N; results used to develop table of expected SAT score ranges for P/N Score Report Plus. Based on test-takers who took both 2003 and 2004 P/N, and those who took both 2004 P/N and spring 2005 (March, May or June) SAT

P/N Score Change Study Highlights
On average, 2003 sophomores repeating the P/N as juniors improved their reading score by 3.3 points, their math score by about 4.4 points, and their writing score by about 4.1 points.
On average, 2004 juniors taking the P/N received junior-year SAT scores that were 2.5 points higher in reading, 1.9 points higher in math, and 1.4 and 1.3 points higher in writing (MC and composite, respectively).
The correlations between the old (2003) and new (2004) P/N scores ranged from .82 to .86 for the three subtests.
The correlations between the new P/N and the new SAT ranged from .81 to .87.

P/N Score Change

Construct Comparability and Continuity in the SAT (Oh & Sathy, 2006) Study assessed whether the changes to the SAT had an impact on the constructs measured by the test. Results are based on factor analysis of data from a sample of students taking both the previous version and new version of the SAT during the 2003 field trial.

Highlights of Results from Construct Comparability Study
Critical Reading: Exploratory factor analysis revealed at least 2 distinct factors, one comprising sentence completion and analogy items and one comprising critical reading and passage-based reading items. This finding suggests that construct continuity is maintained in the new SAT for the sentence completion item type and for the passage-based reading/critical reading item types. A 2-factor model without analogy items provided the best fit to the data.

Highlights of Results from Construct Comparability Study, Continued
Math: Results suggested that the new math test is essentially unidimensional, as was the previous version. Tests of dimensionality revealed a small yet statistically reliable secondary factor related to geometry items in both the old and new SAT.

Score Equity Assessment of Transition from SAT I to the new SAT (Dorans, Cahn, Jiang, & Liu, 2006)
Study assessed whether the changes in the College-Bound Seniors 2006 means were due to population shifts or to changes to the SAT.
Using operational data from the first year of new SAT administration, Score Equity Assessment was used to estimate what the subgroup means would have been had the SAT not changed.

Highlights of Results from Score Equity Assessment Study, Continued Linkages between the new SAT and the old were examined for population invariance across gender groups. Results suggested that the equating functions were invariant across gender groups, providing support for the comparability of scores from the old SAT to the new SAT.

Essay Reliability within the SAT Reasoning Test (Allspach & Walker, 2005)
Study designed to estimate various forms of reliability associated with the SAT essay.
3,776 juniors from 35 high schools participated in the study.
Four different essay prompts were used.
Students wrote on two different essay prompts at two different times, about 2 weeks apart.
Essays were read by raters trained in the same way as raters for operational SAT essay readings.

Essay Reliability Study
Types of reliability estimates:
Single-rater (inter-rater) reliability – Correlation between observed scores from 2 raters scoring the same essay. Represents consistency of any given rater in scoring an essay.
Double-rater reliability – Correlation between total essay scores from two pairs of raters scoring the same essay. Represents consistency in the scoring method itself when 2 raters are used.
Observed essay reliability – Correlation between examinees’ total scores on 2 different essays. Represents the proportion of true-score (writing ability) variance in the essay score.
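The three estimates defined on this slide are all Pearson correlations taken over different pairs of scores. The sketch below illustrates that structure only. The ratings are made up for six hypothetical examinees, the rater names `r1` to `r6` are placeholders, and none of the numbers come from the Allspach & Walker study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def add(u, v):
    """Element-wise sum of two raters' scores (a two-rater total)."""
    return [a + b for a, b in zip(u, v)]

# Hypothetical ratings on a 1-6 scale; NOT data from the study.
# Essay A was read by four raters, essay B by two raters.
r1 = [4, 3, 5, 2, 6, 4]
r2 = [4, 3, 4, 2, 5, 5]
r3 = [5, 3, 5, 2, 6, 4]
r4 = [4, 2, 4, 3, 5, 4]
r5 = [3, 3, 5, 3, 5, 4]  # essay B
r6 = [4, 2, 5, 2, 6, 3]  # essay B

# Single-rater reliability: two raters, same essay.
single_rater = pearson(r1, r2)

# Double-rater reliability: two rater pairs, same essay.
double_rater = pearson(add(r1, r2), add(r3, r4))

# Observed essay reliability: two-rater totals on two different essays.
observed_essay = pearson(add(r1, r2), add(r5, r6))

print(round(single_rater, 2), round(double_rater, 2), round(observed_essay, 2))
```

The only thing that changes between the three estimates is which pair of score vectors is correlated; the study's actual estimation procedure may have differed in detail.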

Highlights of Results from Essay Reliability Study
The average single-rater reliability coefficient across the 4 prompts was approximately .79.
The average double-rater reliability was about .88.
The average observed essay reliability was about .67.
70% of scores between 6-8; 80% of scores between 6-9
Reader agreement: 56% exact; 96.5% +/- 1 pt; 3.5% > +/- 2 pts (go to third reader)
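As a rough plausibility check, the single-rater and double-rater figures can be compared using the Spearman-Brown formula, a standard psychometric result for the reliability of a score built from k parallel ratings. The slides do not say this formula was used, so applying it here is an assumption made for illustration.

```python
def spearman_brown(r_single, k=2):
    """Projected reliability when k parallel ratings are combined."""
    return k * r_single / (1 + (k - 1) * r_single)

# Reported average single-rater reliability from the slide.
projected_double = spearman_brown(0.79, k=2)
print(round(projected_double, 2))  # 0.88
```

The projected value matches the reported double-rater reliability of about .88, which is what one would expect if the two raters behave like parallel measurements.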

Investigating the Effect of New SAT Test Length on the Performance of Regular SAT Examinees (Wang, 2006) Using the data from the March 2005 SAT administration, a recent study examined test-taker performance on eight SAT sections which were presented to examinees in different orders and in different positions. The study looked at the average percent of items answered correctly and the average number of items omitted for different sections of the test.

If the increased length of the SAT caused test-taker fatigue, we would expect: The percent of items answered correctly to decrease for the later sections of the test, when the students would be feeling fatigue. The students’ omit rates to increase for later sections of the test.
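The two expectations above translate into a simple comparison of percent correct and omit rate by section position. The following is a minimal sketch of that comparison; the record layout and the counts are hypothetical and are not taken from the Wang (2006) data.

```python
def section_rates(records):
    """Aggregate (position, items, correct, omitted) records by position.

    Returns {position: (percent_correct, omit_rate_percent)}.
    """
    totals = {}
    for position, items, correct, omitted in records:
        t = totals.setdefault(position, [0, 0, 0])
        t[0] += items
        t[1] += correct
        t[2] += omitted
    return {
        pos: (100 * c / i, 100 * o / i)
        for pos, (i, c, o) in sorted(totals.items())
    }

# Hypothetical examinee-by-section counts; NOT data from the study.
records = [
    (1, 20, 12, 1),  # section presented first
    (1, 20, 14, 0),
    (8, 20, 13, 1),  # same section presented eighth
    (8, 20, 13, 0),
]

rates = section_rates(records)
for position, (pct_correct, omit_rate) in rates.items():
    print(position, pct_correct, omit_rate)
```

Under the fatigue hypothesis, percent correct would fall and the omit rate would rise as the position number increases; flat rates across positions, as in this toy data, would be evidence against it.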

The average percent of items correct was consistent throughout the entire test: The results were similar for gender, racial/ethnic, and language groups, and for different levels of ability as measured by total SAT score.

The average omit rate was NOT higher at the end of the test: The average omit rate for the last 6 items was also NOT higher at the end of the test:

Summary of Fatigue Study Findings: Study conducted on March 05 SAT and replicated on Oct 06 administration. Results also compared to SAT I and no changes were detected. On average, students got the same percent of items correct on later sections of the test as on earlier sections. On average, students did not omit a larger number of items on later sections of the test. These findings provide evidence that any fatigue that students may have felt did not impair their performance in any way.

The Impact of Taking Advanced Math Courses on Performance on the New SAT Math Items (Deng & Kobrin, 2006)
Evaluated whether taking more advanced math courses in high school gives students an advantage on the new SAT items testing Algebra II content.
Study analyzed new SAT field trial data.
Computed standardized mean differences in average item performance on the old and new content across groups of students with various course-taking patterns.
Ran differential item functioning (DIF) analyses to explore whether items functioned similarly for students of equal ability with different course-taking patterns.
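A standardized mean difference expresses the gap between two group means in standard-deviation units. The slide does not say which variant the study used, so the sketch below assumes the common pooled-SD form (Cohen's d); the group values are hypothetical proportions correct, not study data.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical average item performance (proportion correct) per student.
took_advanced_course = [0.6, 0.7, 0.8]
no_advanced_course = [0.5, 0.6, 0.7]

d = standardized_mean_difference(took_advanced_course, no_advanced_course)
print(round(d, 2))  # 1.0
```

Because the result is unit-free, the same calculation can be repeated separately for old-content and new-content items and the two values compared directly.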

Math Course-taking Study: Summary of Results
Students who took one or more advanced courses scored higher than those who did not take any advanced course or just planned to do so.
Students who planned to take one or more advanced courses scored higher than those who did not plan to take any advanced course.
Items measuring the new content were more sensitive to the effects of taking advanced math courses than items measuring the old content.
Several sub-content areas within Algebra II and Geometry had a large percentage of items showing DIF.

The Relationship Between Essay Features and Essay Scores (Kobrin, Deng, & Shaw)
This study investigated:
the relationship between several features of SAT essay responses and essay scores;
whether essay scores are predictable from features of the prompt;
subgroup differences (racial/ethnic, gender, and language) in the frequency of essay response features and their correlation with essay scores.

Essay Research Study—Phase I
Phase I focused on essay length and scores.
2,820 essays were sampled from 6 different SAT forms (both east & west coast prompts) that were administered in March, May, & June of ’05.
Examined the relationship between essay score and:
number of words
number of paragraphs
whether students reached the 2nd page
whether students wrote in first-person (used the pronoun “I”).

Phase I Results: Correlation of Length with Essay and SAT-W Scores (Kobrin, Deng & Shaw, Under Review)
The range of correlations with essay scores across the six prompts was .57 to .68 for number of words and .27 to .38 for number of paragraphs.

More Phase I Results
Reaching the Second Page
Students who reached the second page scored about 2 pts higher than those who did not. After controlling for number of words, this was reduced to less than one pt (.7).
Using First-Person
About 50% of students used first-person. The mean score for students using first-person was 6.9 compared to 7.3 for students not using first-person. There was substantial variation across prompts in the use of first-person responses. Some prompts appeared more conducive to a first-person response than others, but the voice used appeared to have very little impact on essay score.

Effects of Short-Term Coaching on Standardized Writing Tests (Hardison & Sackett, 2006) Can coaching increase scores on the SAT essay? Does that coaching increase scores only on the specific essay, or does it also increase the test-taker’s actual writing ability that the test is intended to measure?

Methods for Short-Term Coaching Study
Six Ph.D. students were hired to develop coaching strategies for a training program, similar to those offered by test-prep companies.
50 first-year college students participated in a 9-hour training program (training group); 49 students did not receive training (control group).
Both groups completed pretest and posttest essays from CLEP.
Participants also completed two additional essays developed to mimic writing tasks that a student might encounter in a college setting.

Results of Short-Term Coaching Study
After controlling for ability (using ACT scores), students receiving training did indeed score significantly higher on the essay.
Coaching was particularly effective for those with lower writing performance, but actually led to a decrease in scores for high performers.
Coaching also produced significant improvement in performance on the generalizability tests when compared to the control group.
Results suggest that SAT essays may be susceptible to coaching, but score inflation may reflect at least some improvement in overall writing ability.

Students with discrepant CR and W scores
Correlation between SAT CR and W about ,000 students had a significant discrepancy between scores.
Of these, 50% had a CR score that was 1 SD > W (63% male); 50% had a W score 1 SD > CR (63% female).
No significant difference among students in HSGPA.
Results by ethnicity and best language not significant:
Whites > CR; Asians > Writing
English Speakers > CR; ELL > Writing

Update on New SAT Scores: College Bound Seniors 2006

What comes next? Research planned or in progress…
Impact of SES on SAT & College Success
Validity Study of the SAT Reasoning Test
Consequential Validity of SAT Writing
New SAT/ACT Concordance
Evaluating Formula Scoring vs Right Scoring
Placement Validity of Math and Writing Tests

Comparison of 2006 College-Bound Seniors with Previous Cohorts
Sub-group
Gender: Male; Female
Race/Ethnicity: No Response; American Indian or Alaskan Native 111; Asian or Pacific Islander 899; Black 10; Mexican or Mexican American 454; Puerto Rican 111; Other Hispanic or Latino 345; White 5156; Other 344
Best Language: No Response 1676; English; English and Another 788; Another Language 233

2004, 2005, & 2006 College-Bound Seniors
Columns: ‘04 to ’06 Changes; ‘05 to ’06 Changes
Rows: Highest Verbal; Highest Math; Highest Composite; Latest Verbal; Latest Math; Latest Composite; Highest Composite (single admin)

Major Changes in College-Bound Seniors Cohort Scores
pts (7V, 3M)
pts (9V, 7M)
pts (5V, 3M)
pts (4V, 1M)
pts (5V, 2M)
pts (3V, 3M)
Math has not dropped 2 pts in 1 year since 1978.
The last time Verbal dropped more than 1 pt was: pts pts

Retesting Patterns and Scores

                          2004      2005      2006      Diff 04-06   Diff 05-06
1-time test takers   N    636,655   645,629   682,005   45,350       36,376
                     %
                     Mean Verbal
                     Mean Math
time test-takers     N    542,589   563,028   545,173   2,584        -17,855
                     %
                     Mean Verbal
                     Mean Math
time test-takers     N    195,215   216,883   187,194   -8,021       -29,689
                     %
                     Mean Verbal
                     Mean Math

Overall Retesting Changes on SAT
Total Students: 1,429,007; 1,475,623; 1,464, ; ,616 (3.3%); -10,879 (-.7%); 35,737 (2.5%)
Total Tests*: 2,492,683; 2,630,388; 2,547, ; ,705 (5.5%); -83,021 (-3.2%); +54,684 (2.2%)
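Several figures on this slide are truncated in the transcript, but the surviving values can be cross-checked against each other. The sketch below assumes the first three columns are consecutive cohorts (labelled `first`, `second`, `third` here because the column headers did not survive) and that the remaining three are first-to-second, second-to-third, and first-to-third changes. The third-cohort totals it prints are implied by that arithmetic, not read from the source.

```python
def pct(change, base):
    """Percentage change relative to a base count, rounded to one decimal."""
    return round(100 * change / base, 1)

# Values that survive intact in the transcript.
students_first, students_second = 1_429_007, 1_475_623
students_second_to_third, students_first_to_third = -10_879, 35_737

tests_first, tests_second = 2_492_683, 2_630_388
tests_second_to_third, tests_first_to_third = -83_021, 54_684

# Third-cohort totals implied by each reported change (assumed column layout).
students_third_a = students_second + students_second_to_third
students_third_b = students_first + students_first_to_third
tests_third_a = tests_second + tests_second_to_third
tests_third_b = tests_first + tests_first_to_third

print(students_third_a, students_third_b)  # both routes should agree
print(tests_third_a, tests_third_b)

# Reported percentages recomputed from the counts.
print(pct(students_second - students_first, students_first))  # slide: 3.3
print(pct(students_second_to_third, students_second))         # slide: -.7
print(pct(tests_second - tests_first, tests_first))           # slide: 5.5
print(pct(tests_second_to_third, tests_second))               # slide: -3.2
```

Both routes to each third-cohort total agree and the recomputed percentages match the slide, which supports the assumed column layout.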

CB Srs Score Changes and Subgroups
CR -5 (males -8, females -3) – largest drop since 1994
Math -2 (males -2, females -2)
Underrepresented minorities show overall gains:
  Income < $20k: CR +2, M +1
  Non English Speaking: CR +5, M +2
Private school: CR -11, M -4
Non-Response rate (no change in %) but large decrease in scores
Score gaps decrease:
  In CR among all ethnic minorities except Other Hispanic (no change)
  In Math for Asian, Black, and Puerto Rican subgroups
Score decline evident in first SAT taken:
  First SAT in 05 (CR 498, M 507.9); in 06 (CR 494.8, M 508.5) (CR -3.2, M +.6)
  No difference in age at testing among first-time test takers (mean 17.2 yrs)
HSGPA increases: increase in GPA from 05 to 06; mean is 3.33 (with 43% of students having an HSGPA > A-)

Ethnic Differences Slightly Reduced in CR (Effect Sizes)
Columns: Group; Critical Reading; Writing
Rows: Asian; Black; Mexican Am; Puerto Rican; Latin Am; White; Females

Ethnic Differences Slightly Reduced in Math (Effect Sizes)
Group
Asian .52
Black; Mexican Am; Puerto Rican; Latin Am; White; Females

Students who take a Core Curriculum or more significantly outperform those taking less than a Core Curriculum.
Chart panels: Number of Students; SAT Scores

Core vs Non-Core
Core = 4 yrs of English, 3 yrs of Math (with Algebra), 3 yrs of Science, 3 yrs of Social Studies
Columns: N (%), CR, M; 2006 N (%), CR, M; 06-05 N (%); CR (06-05); M (06-05)
Core +: 909,049 (77.3); ,452 (77.0); ,597 (-0.3); CR -3; M +1
Core -: 267,278 (22.7); ,728 (23.0); ,450 (+0.3); CR -6; M -5