1
A Brief History of Assessment & Testing
“It is necessary to call into council the views of our predecessors, in order that we may profit by whatever is sound in their thought and avoid their errors.” (Aristotle, De Anima, Bk. 1, ch. 2, 403b20-23)
Being a scholar requires that one be familiar with the history and development of one’s chosen field of study. It is hardly possible to know where education is and where it may be headed without some knowledge of where it came from and how it arrived at its present state. Today we will briefly address the history and development of formalized assessment and testing so that you may better understand how we came to be where we are with respect to this issue.
2
What is assessment? Assessment: the process of gathering information to make informed decisions.
To begin with, let us review the definition of the term. We will use a slightly broader definition today than we did in the last presentation so as not to limit ourselves to the realm of education, since assessment also occurs in places other than the classroom.
3
Why do we need assessment? Categorize; Diagnose; Measure Change; Predict an Outcome; Certification.
Second, we ought to know why we need to know about assessments, and “because the teacher said so” is not a sufficient reason. Assessments are used all the time. We use them to categorize things such as who can see which movies (G = general audiences; PG = parental guidance suggested; PG-13 = parents strongly cautioned for children under 13; R = restricted, no one under 17 without a parent or guardian), how good a movie is (“three stars”, “two thumbs up”), the quality of food products (grade A and grade AA in eggs; regular, lean, extra lean, and super lean in hamburger), and the quality of gasoline (regular, plus, and premium). We use assessments to diagnose conditions such as malignant or benign tumors, ADD, schizophrenia, and color blindness. Assessments measure changes, such as the amount of rainfall in a given month, the amount of material learned over a year, and changes in health from diet and exercise programs.
Assessments help us to predict outcomes. For example, if a person smokes cigarettes we can estimate with reasonable confidence their odds of getting lung cancer or heart disease. There are also tests to estimate your probability of succeeding in college (ACT/SAT), graduate school (GRE), law school (LSAT), or medical school (MCAT). You can even log on to eHarmony, take a personality profile, and “begin the exciting journey toward finding your true love.”
Finally, we use tests to certify that a certain condition exists or that a competency has been successfully demonstrated. We certainly would not want to let just anyone open a medical practice without some type of certification that they are qualified to diagnose conditions and prescribe treatments. Likewise, to ensure that financial transactions are legitimate we have laws requiring the use of Certified Public Accountants. And of course, every teacher in the public school system has to be certified before they can gain employment as a teacher. In this class, earning a B- or better certifies that you are competent to continue in the education program. These “gatekeeper” assessments exist to ensure an acceptable standard of quality.
4
Sir Francis Galton ( ): Gifted genius; Founder of social science; Pioneered use of statistics in psychological research; Teacher of Karl Pearson and Charles Spearman; Cousin to Charles Darwin; Eugenics.
Though assessment is as old as time, formal attempts to measure human intelligence and cognitive function began in the late 1800s with Francis Galton. Galton was a gifted genius and used his abilities to pioneer several fields of research. Even without his work in the social sciences he would still be noted as one of the leading scientists of his time for his work in meteorology. Within the social sciences he was a pioneer in applying the “hard science” methods of statistics to psychological research. He was also the teacher and mentor of Karl Pearson (developer of the Pearson product-moment correlation coefficient) and Charles Spearman (developer of Spearman's rho, the rank correlation coefficient).
It just so happens that Galton’s cousin was Charles Darwin, who is well known for his book On the Origin of Species and his theory of natural selection. Galton applied the question of natural selection to humans and started the debate on nature vs. nurture. Are people “smart” because of the genes they received from their parents, or because of the environment in which they were raised? Today the majority of psychologists tend to agree that nature and nurture are both essential aspects of “intelligence.” Perhaps the best description is that nature, or what we get from our parents, lays the foundation on which we build. Nurture, or the conditions we are exposed to and gain experience from, determines the form and features of the house.
Back in the days of Galton the nature argument was winning the debate and led to the concept of eugenics. Eugenics, in blunt terms, is the application of animal husbandry to humans. On this view, an intelligent man and woman will produce intelligent children regardless of living conditions. Conversely, two less intelligent parents have no chance of producing intelligent children, even if the children are raised with wealth and privilege.
5
James Cattell ( ): First person with title “Professor of Psychology”; Mental Tests and Measurements (1890); Creator and/or editor of Psychological Review, Science (AAAS), and Popular Science; President of the APA, 1895.
James Cattell picked up where Galton left off and worked to analyze correlations between academic performance and intelligence as measured by his mental tests. It turned out that no statistically significant relationship could be found between these two variables. Cattell is also noteworthy because he was the first person to hold the title “Professor of Psychology.” One of his most prominent publications was Mental Tests and Measurements, in which he described his theories of measuring mental capacity. Cattell’s work with mental tests was a good start, but the tests soon proved to be unreliable and were replaced by a more acceptable test developed by Alfred Binet.
6
Alfred Binet ( ): New Methods for the Diagnosis of the Intellectual Level of Subnormals; Binet-Simon Scale; Mental age.
Alfred Binet worked with mentally handicapped children, or what were called in his day “subnormals”, and found it useful to try to predict what these children were capable of learning so that they could become functional and contributing members of society. It should be noted that his tests were not designed to categorize people as “mentally retarded” or with any of the other labels of the era that now carry negative connotations (e.g. “moron”: mental age of 8-12, IQ of 51-70; “imbecile”: IQ of 26-50; “idiot”: IQ of 0-25), but rather to discover the potential within each individual and help people achieve that potential. His contributions include his landmark publication New Methods for the Diagnosis of the Intellectual Level of Subnormals, the Binet-Simon intelligence scale, and the concept of a mental age (e.g. though a person may be 23 years old, they might have the mental function of a typical 7-year-old).
7
William Stern ( ): The Psychological Methods of Intelligence Testing; Modified the Binet-Simon scale to get the “mental quotient”, which later became known as IQ.
William Stern came along and slightly modified Binet’s mental age concept by dividing the mental age by the person’s chronological age to produce the “mental quotient,” which became known as the “Intelligence Quotient” (IQ). The ratio is conventionally multiplied by 100, so a person whose mental age matches their chronological age scores 100. On modern tests the norm for an IQ score is 100, with a standard deviation of 15.
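To make the arithmetic concrete, here is a minimal sketch of the quotient described above. The function names are hypothetical, and the scaling by 100 is the conventional one rather than something stated on the slide. The second function shows a deviation-style score consistent with a norm of 100 and a standard deviation of 15, assuming a standardized (z) score is already known.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Stern-style quotient: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age / chronological_age


def deviation_iq(z_score: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Score on a scale with the given norm (mean) and standard deviation."""
    return mean + sd * z_score


if __name__ == "__main__":
    # An 8-year-old performing at the level of a typical 10-year-old.
    print(ratio_iq(10, 8))            # 125.0
    # The earlier example: a 23-year-old with a mental age of 7.
    print(round(ratio_iq(7, 23), 1))  # 30.4
    # One standard deviation above the norm.
    print(deviation_iq(1.0))          # 115.0
```

The ratio form behaves oddly for adults, since mental age levels off while chronological age keeps rising, which is one reason a deviation-based scale is the more useful reading of "a norm of 100 with a standard deviation of 15."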
8
Henry Goddard ( ): Large-scale implementation of IQ tests; The Kallikak Family; 27 states practiced eugenics based on his work; Huge influence on military, immigration, and society.
Henry Goddard became interested in intelligence testing and rose to prominence by implementing large-scale IQ tests. He was a strong supporter of the Nature theory, which he felt he had satisfactorily demonstrated in his publication The Kallikak Family. The story goes something like this: a young man from a respectable New Jersey family joins the militia during the Revolutionary War. While away from home he meets a young barmaid who is considered “subnormal” (i.e. she was not proper breeding stock for a gentleman). He has an illegitimate child with the barmaid, but abandons them both. Returning home he marries a proper woman worthy of his family's estate and has several children by her. Goddard traced both lines of descendants and reported that the illegitimate child and his line were, like the barmaid, “subnormal”, while the other children and their descendants became fine and proper members of elite society. The reason, according to Goddard, was the bad genes of the barmaid and the good genes of the proper wife.
Goddard became highly influential during his time and was called to testify before Congress on issues of intelligence. Following Goddard's advice, 27 states passed laws implementing the practice of eugenics. This meant that those who were not “intelligent” should not be allowed to reproduce. Thousands of people were sterilized or lobotomized against their will. Though I cannot confirm it, I have heard it said that Goddard claimed he could estimate intelligence by appearance alone and was employed at Ellis Island to inspect incoming immigrants. As the story goes, he looked at one person and told authorities that he was “subnormal” and should not be admitted. It turned out that the person in question was a Nobel Prize laureate. In any case, Goddard had a huge impact on the military, immigration, and society.
9
Lewis Terman ( ): Refined the Binet-Simon test and produced the Stanford-Binet Intelligence Scale. The Stanford-Binet (1916) set the testing benchmark for over two decades.
Lewis Terman started his career as a school teacher, then became an administrator, and ended up at Stanford University. For his Ph.D. thesis, Terman decided to see what mental tests could do in distinguishing unusually backward students from very bright ones. His study was titled "Genius and Stupidity: A Study of Some of the Intellectual Processes of Seven 'Bright' and Seven 'Stupid' Boys." Later, in 1916 while at Stanford, Terman published a revised Binet-Simon scale for American populations. This "Stanford Revision of the Binet-Simon Scale" soon became known as the "Stanford-Binet" and was by far the best available individual intelligence test. It set the benchmark for intelligence testing for more than two decades.
10
The Army Alpha & Beta: Developed by the American Psychological Association. Terman, Goddard, & Robert Yerkes were the principal group heading up the project. Yerkes organized and led a staff of forty psychologists who produced the test in only two months. 1.7 million tests administered during WWI.
With US involvement in WWI the Army soon found itself in need of a great many new soldiers. More soldiers means you need more officers to lead them in battle. The problem was to determine which recruits were capable of being officers and which should be left in the enlisted ranks. To solve this problem the Army called upon the APA to develop a test that would sort the officers from the enlisted men. One of the problems with testing was that it was usually conducted in one-on-one sessions and took several hours to administer, score, and interpret. What the Army needed was something that could be done quickly and efficiently with large numbers of examinees. The project was headed by Terman, Goddard, and Robert Yerkes. Yerkes served in an administrative capacity and was responsible for leading the 40 psychologists who developed the test in only two months (by contrast, it takes about two years to put together a comparable test today). During WWI 1.7 million of these tests were administered.
11
The Army Alpha: The mother of all tests; 212 questions in multiple-choice (MC) and true/false (T/F) format; Used to classify draftees into officer or enlisted ranks.
The first test developed by Yerkes’ team became known as the Army Alpha, and is considered the mother of all modern standardized tests. It consisted of 212 questions in MC and T/F format, the responses to which would determine whether a person would be an officer or a regular enlisted soldier. Though the test was fairly decent, the team ran into one problem it had not anticipated: what to do with recruits who couldn’t read.
12
The Army Beta: Used to classify illiterate draftees; Composed mostly of pictures and diagrams.
The Army Beta test solved the problem of illiterate draftees. Rather than requiring examinees to read prompts and responses, the Beta used pictures and diagrams that could be responded to without reading.
13
The testing explosion: Following the Army Alpha, intelligence testing became a multimillion-dollar industry. The Army Alpha and other subsequent tests transformed the once cumbersome and costly methods of testing into a process that was easy, inexpensive, and commonly accepted.
The development and use of the Army Alpha and Beta tests put standardized testing on the map, and it was here to stay. Suddenly everyone felt it necessary to test for everything, and testing soon became a multimillion-dollar industry. Entities such as the Educational Testing Service, the College Board, and a host of smaller test publishers were writing tests to measure every possible talent a student might possess so that students could be placed in the proper study track to fulfill their destiny as defined by these tests. The Army Alpha had taken what was normally a long, cumbersome process and turned it into a quick and efficient one with all the benefits of the contemporary industrial assembly line.
14
Voices of Dissent: Walter Lippmann (Testing stamps a label which, once applied, is difficult to remove. Tests serve the prejudiced and powerful.); Stephen Jay Gould (The Mismeasure of Man); Edwin Boring (What is intelligence?).
Fortunately there were a few people who did not jump on the bandwagon and took the time to think about what was going on. Walter Lippmann argued against tests on the basis that they stamp a label on the examinee which, once applied, is difficult to remove. He also claimed that tests were a way for the prejudiced and powerful to retain their position and oppress those whom they considered “subnormal.”
Stephen Jay Gould, a paleontologist and evolutionary biologist at Harvard, wrote a nice little book called The Mismeasure of Man. In it he describes several problems and fallacies in measuring intelligence and the stereotypes such measurement promoted. In one example he describes how researchers equated larger brains with greater intelligence. To test this theory the researchers compared the volume of male and female skulls from a burial tomb. To determine the volume of a skull the researchers filled it with down feathers, then weighed the feathers. It does not take a rocket scientist to realize that men are generally larger than women and thus their skulls would hold more feathers. However, just in case there was a problem, the researchers tended to pack the feathers as tightly as they could in the male skulls and leave them as loosely lofted as possible in the female ones. Apparently all of these researchers were single men, because any woman can tell you that just because men have bigger brains does not mean that they use them.
Perhaps the most scathing criticism of IQ testing came from Edwin Boring, a Harvard psychologist, who posed the question “What is intelligence?” No one has come up with a satisfactory response to this question. Boring, emphasizing his point, replied “whatever it is, these tests measure it”.
15
The past 50 years: Elementary and Secondary Education Act (ESEA) of 1965, aka Title I; Minimum Competency Testing (MCT); Lake Wobegon Effect; A Nation at Risk; Standards-based reforms; WYTIWYG (from WYSIWYG).
Adopted as the centerpiece of President Lyndon Johnson's War on Poverty, the ESEA had the noble objective of closing the achievement gap between privileged and underprivileged children. It was going to do that by directing billions of federal dollars into a dazzling array of special programs focused especially on the children of poverty. With a budget now approaching $14 billion a year, the ESEA supports no fewer than five dozen major programs extending into virtually every area of school life. Thirty-four years later, the government's own assessments have shown that the ESEA, particularly its major prong, Title I, has fallen short of fulfilling these lofty hopes. After an expenditure of $120 billion, the achievement gap has not narrowed; indeed by some measurements it has widened. In Fiscal 1999 another $8 billion in Title I funds is expected to flow through local schools. And some of the aid programs under a dozen other ESEA titles have been as unproductive as Title I.
Minimum competency testing has made minimum competence the norm. In addition, it has greatly inflated and overstated achievement because no school wants to be considered below average (even though, by definition, roughly half of them fall below the median). This has led to what is known as the Lake Wobegon Effect, after Garrison Keillor’s fictional town of Lake Wobegon, “where all the women are strong, all the men are good-looking, and all the children are above average.”
A Nation at Risk, the landmark report from the 1980s, induced a call for higher standards in education, which of course prompted another testing revolution. Today all states are required to test students for achievement of content standards. This often has the undesirable effect of forcing teachers to teach only content that appears on the test, and it gave rise to the acronym WYTIWYG (pronounced “witty whig”) for “What you test is what you get.”
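The point about "below average" schools can be checked with a few lines of arithmetic. The sketch below uses made-up scores for ten hypothetical schools; the numbers and the helper name `share_below` are illustrative assumptions, not data from any real assessment.

```python
from statistics import mean, median

# Hypothetical average test scores for ten schools (illustrative only).
school_scores = [62, 68, 71, 74, 75, 77, 80, 83, 88, 95]


def share_below(scores, cutoff):
    """Return the fraction of scores strictly below the cutoff."""
    return sum(s < cutoff for s in scores) / len(scores)


group_median = median(school_scores)
group_mean = mean(school_scores)

print(share_below(school_scores, group_median))     # 0.5
print(share_below(school_scores, group_mean))       # 0.6
print(all(s > group_mean for s in school_scores))   # False
```

Half of the schools sit below the group's median, and it is impossible for every school to exceed the group's own mean. A report in which every school is "above average" is therefore comparing against some other, usually older or lower, reference group.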
16
The past 50 years: National Assessment of Educational Progress (NAEP), The Nation's Report Card; Third International Mathematics and Science Study (TIMSS); Inclusion.
Other tests have been developed to see how US education is functioning as a whole (NAEP), and others still compare education across various nations of the world (TIMSS). The results of these tests are often reported in the news. Unfortunately the news generally tends to highlight only the negative results of these tests while ignoring any achievements. Finally, educational assessments have recently been unfairly affected by requiring ALL students to participate, including those who for various reasons or conditions are what Binet would have called “subnormals.” Including these students' low, statistical-outlier scores tends to pull down the average and obscure what the “average” student truly knows.
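How strongly a few very low scores move an average depends on which average is reported. The sketch below uses invented scores (both lists are hypothetical) to compare the mean and the median before and after two outlier scores are included.

```python
from statistics import mean, median

# Hypothetical class scores (illustrative only).
typical_scores = [72, 75, 78, 80, 82, 85, 88]
# The same class with two very low outlier scores included.
all_scores = typical_scores + [20, 25]

print(mean(typical_scores), median(typical_scores))      # 80 80
print(round(mean(all_scores), 1), median(all_scores))    # 67.2 78
```

In this made-up case the mean falls by almost 13 points while the median falls by only 2, which suggests that reporting the median alongside the mean gives a clearer picture of the typical student when every student is included.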
17
Reasons for using intelligence tests: Reasonably good at predicting performance in school. Mental abilities are very much an inherited trait (twin and adoption studies). Mental measurement is useful, beneficial to society, and is one of psychology's greatest contributions to modern life.
Despite the problems with intelligence tests there are several good reasons for using them. 1. They are reasonably good at predicting performance in school. They are not perfect and cannot measure attributes such as attitude, determination, and perseverance, but they are fairly good at predicting school performance. 2. Mental abilities are very much an inherited trait (twin and adoption studies). Several research studies have shown the effect of Nature on intelligence. In recent years Nature has taken a back seat to Nurture, but for those who like to argue this debate there is some strong evidence for the influence of genetics (see Steven Pinker’s The Blank Slate: The Modern Denial of Human Nature). 3. Millions of people have benefited from the proper use of assessments to diagnose conditions that impact learning, and consequently have become more capable of reaching their full potential.
18
Concerns about Testing/Assessing: Test Anxiety; Categorizing & Labeling; Effects on Self-Concept; Self-Fulfilling Prophecies; Test Fairness/Bias.
Points for discussion on the negative aspects of testing.
19
Concerns about Not Testing/Assessing: How do you know if a person is inclined to succeed? How do you know if a person can really perform a function? How do you choose between different people, organizations, services, and businesses?
Points for discussion on the consequences of NOT testing.