Thursday, July 18, 2013

Standardized Tests in the Homeschool - What's With The Numbers?

A while back, I wrote a blog article about homeschool and standardized testing.  I explained some of the reasons why people use standardized testing.  I'm all about everyone having the freedom to use whatever evaluation method they choose to assess their students' progress during the year.  I tend to think that a portfolio review is more authentic -- it can capture progress over time. You can gather samples of the BEST your student has to offer as well as showing their AVERAGE or WORST -- because we all know that learning is a process fraught with steps forward and back.

But, there is something that just satisfies the curiosity of a homeschooling mama {or at least this homeschooling mama} and I just need a little, teeny peek at how my kids are doing compared with their same-grade peers.

photo credit: (C) Shannan Muskopf  (Flickr)


Which leads me to this post -- it is all about what the numbers mean.

So you get this nice letter back from Testing Service USA {totally made that up, and not intending to represent a real company}, you open it, look for the composite score, and -- phew! -- your student did fine.  Or not.  And you are worried.  What do all the numbers mean, anyhow?  How can you use this to help your son or daughter?  Did you just waste $20-$40?

A. Percentiles.  This is not the same as percent.  I love the example given in my BJU Press "Guidelines for Test Interpretation" flyer, which says, in effect, that a 55% on a test is a failing grade, but a subtest score at the 55%ile shows a student performing in the average range.  Big difference between percent and percentile.

The highest percentile score possible is 99.  The lowest is, well, 1.  Think of it this way:  there are 100 same-grade level students in a room.  A percentile shows you how many of those students your son/daughter did better than.  So if your student scores at the 60%ile, that means that your son/daughter did better than 60% of students in the same grade.  He or she also did worse than 40% {this would be the 'glass-half-empty' point of view}.
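If it helps to see the "100 students in a room" idea as arithmetic, here is a minimal sketch in Python.  The function name and the sample scores are made up for illustration; real test publishers look percentiles up in their own norm tables rather than computing them like this.

```python
def percentile_rank(raw_score, norming_scores):
    """Percent of the norming sample that scored below raw_score.

    Reported percentiles run from 1 to 99, so the result is clamped
    to that range.
    """
    below = sum(1 for s in norming_scores if s < raw_score)
    pct = round(100 * below / len(norming_scores))
    return max(1, min(99, pct))


# Hypothetical norming sample: 100 students with raw scores 1 through 100.
sample = list(range(1, 101))

# A student with a raw score of 61 did better than 60 of the 100 students.
print(percentile_rank(61, sample))
```

Notice that the percentile depends entirely on who is in the room, which is why the makeup of the comparison group matters so much.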

Who are these 100 students?  They are the norming sample.  What is that? you ask.  I found this concise explanation here at Education.com written by L.G. Cohen and L.J. Spenciner:

A norm-referenced test is a standardized test that compares a student's test performance with that of a sample of similar students who have taken the same test. After constructing a test, the test developers administer it to a standardization sample of students using the same administration and scoring procedures for all students. This makes the administration and scoring "standardized." [emphasis added] The test scores of the standardization sample are called norms, which include a variety of types of scores. Norms are the scores obtained by the standardization sample and are the scores to which students are compared when they are administered a test.
So, the norming sample is a group of same-grade boys and girls who have been given this test at roughly the same time of year that you are administering it.  I've actually helped out by giving new speech and language tests to typically developing students before.  It can take 6-12 months for a new norm-referenced test to be administered to typically developing students (which is usually judged by the fact that they do not have an IEP in school).

One thing you might notice when ordering a standardized test is that you have to choose a "fall" or "spring" grade level.  Obviously, students given a 4th grade test in the spring will perform much better on 4th grade material than they would at the beginning of the academic year.

Notice that the phrase "Standardized Test" has to do with the way it is administered.  If you choose, say, to read aloud the reading comprehension portion of the standardized test -- well, your student's results will really have nothing to do with reading.  You've changed that subtest from a reading test to an auditory or listening comprehension test.  Your results will be meaningless numbers on the page.  I have had many, many conversations with home school mamas who wanted to read the reading comprehension subtests to their students.  They are not trying to cheat the system; they are just unaware of how that changes the entire test and its results.

A similar conversation I've had with moms is about timed tests.  Again, for your test to really mean anything, you must follow the directions and watch the time limits.  Otherwise, your results are really going to over-estimate your child's ability.  Some mamas have said to me, "but I don't want to test him with the stress of time" or "I just want to see how well he'd do without time limits."  If that is the case, don't use a test that has a time limit.  I'd like to suggest using the Stanford Achievement Test (available at BJU Press), which has suggested time limits for each subtest -- but they are only suggestions.  I used this test when Luke was a 2nd grader and was slow to read and process reading information.

B. The one part of the test that I always see mamas' chests get puffed up about is the grade equivalent.  Yes, there is something satisfying about seeing your 6th grader receive a PHS (post high school) grade equivalent on science or reading or math.  But we have to think more carefully about what this means.

A grade-equivalent score relates more to your child's raw score (this is the number of questions on a subtest or test that your child answered correctly) than to any of the statistical standardized scores.  Again, I'm going to quote from BJU Press' brochure:

"Grade equivalents are not indicators of grade placement.  They are only estimates of  students standing in a continuum of learning.  In our example, Sally's GE of 4.7 does not mean that she is ready for fourth-grade material because she was not tested on fourth-grade material.  It means only that she has a thorough mastery of the material covered on a second grade test."

Here's another great explanation of grade-equivalent scores that I found at Cobb County, GA School District:

The Grade-Equivalent score compares your child’s performance on grade-level material against the average performance of students at other grade levels on the same material and is reported in terms of grade level and months. If your 5th grade child obtains a grade-equivalent of 10.5 on a standardized math or reading test, it does not mean that your child is solving math problems or reading at the mid-10th grade level. It means that she or he can solve 5th grade math problems and read 5th grade material as well as the average 10th grade student can read and solve 5th grade math problems. Your child is performing much better than the average 5th grader but most likely would not perform as well if tested using 10th grade material as they have not yet been exposed to 10th grade material...

Although I hesitate to beat this issue into the ground, imagine one hundred 10th graders taking your child's 4th grade ITBS or CAT or Stanford achievement test.  If your child scores a grade equivalent of 10.4 (the digit after the decimal point is the month of the school year) in reading, this score is telling you that your child is answering a similar number of problems correctly as the average student in the 10th grade, 4th month taking this test.  Has your child mastered the 4th grade material?  More than likely, yes.  Can he do algebra or geometry or chemistry like a 10th grader?  Well, the test you just administered won't be able to tell you;  it wasn't designed to assess if your child knows 10th grade subjects -- only 4th grade. You would need to administer a 10th grade test to your 4th grader to assess his/her mastery of 10th grade material like chemistry and geometry.
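For the curious, here is a rough sketch in Python of the idea behind a grade equivalent.  Every number in the table is hypothetical, and the function names are my own; actual publishers build these tables from their norming data and interpolate between grade levels rather than picking the nearest entry.

```python
# Hypothetical median raw scores earned on ONE 4th grade reading subtest
# by students at different grade levels (keys are "grade.month").
medians_on_4th_grade_test = {
    "4.1": 28,
    "4.8": 33,
    "6.2": 38,
    "8.0": 42,
    "10.4": 45,
}


def grade_equivalent(raw_score, medians):
    """Return the grade level whose median raw score is closest to raw_score."""
    return min(medians, key=lambda ge: abs(medians[ge] - raw_score))


def split_grade_equivalent(ge):
    """Split a grade equivalent such as "10.4" into (grade, month)."""
    grade, month = ge.split(".")
    return int(grade), int(month)


ge = grade_equivalent(45, medians_on_4th_grade_test)
print(ge, split_grade_equivalent(ge))
```

The thing to notice is that every score in the table comes from the same 4th grade test.  Nothing in it measures 10th grade material.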


C.  The last set of standardized scores I want to go over is Stanines.  Professionally, I rarely used them, but they were often included as a possible way to look at test scores. Stanine scores group students at a particular grade into 9 groups.

From: http://www.ciil-miles.net/ETerms_SSamp4.asp


You can see on the graph above that stanines are more of a range of scores than a precise score.  Considering that any number of issues might affect your student's performance (was it too hot in the room?  Was s/he hungry?  tired?  cold?  feeling stress?  bored?), a stanine might be a better way of assessing performance -- because we all have good and bad days.
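To show how stanines lump percentiles into ranges, here is a small Python sketch.  It assumes the commonly published conversion, in which the nine groups hold roughly 4%, 7%, 12%, 17%, 20%, 17%, 12%, 7% and 4% of students; the function name is my own, and a particular publisher's table may differ by a point at the boundaries.

```python
from bisect import bisect_right

# Lowest percentile that falls into stanines 2 through 9
# (assumed from the commonly published conversion table).
STANINE_CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile):
    """Map a percentile (1-99) to a stanine (1-9)."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    return bisect_right(STANINE_CUTOFFS, percentile) + 1


# The 45th and 55th percentiles land in the same middle stanine.
print(stanine_from_percentile(45), stanine_from_percentile(55))
```

So a student who scores at the 45%ile on a bad day and the 55%ile on a good day would get the same stanine both times.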

If you are interested in reading more about standardized assessment scores, I've found several additional resources you might consider reading:

There are many other articles out there.  I hope this has helped you to understand more about your student's results.



