Are You Testing Too Much? (Part 2 of 2)

How to Improve Assessment for Learning

In part 1, I shared one way that you could be testing too much – putting too much into one test. That post shared two solutions to the “big” test problem.

  • Making the test bigger (not a good solution)
  • Using unidimensional tests

I promised to address four more topics here, in part 2:

  1. How to know if you’re using unidimensional tests.
  2. Examples of how to create unidimensional tests for different subjects.
  3. Reasons why you should start using unidimensional tests now!
  4. The second way you could be testing too much.

The four topics will take a closer look at the question, Are you testing too much?


Are You Using Unidimensional Tests?

I don’t know of any school that uses this term – unidimensionality. However, some schools use these tests in small applications, usually related to RTI.

Sometimes they go under the term curriculum-based measurements (CBMs). They are not to be confused with curriculum-based assessments (Here’s a discussion on the use of CBMs).

Some common formative assessments can capture the essence of unidimensional tests. I attended a small-group session about 8 years ago with Chris Jakicic, just as she was finishing her book, Common Formative Assessment. In that presentation, she delivered a simple way for teachers to design unidimensional tests (I recommend her book).

Here are a few checkpoints to see if you’re using unidimensional tests:

  • Are the results consistent from test to test? Either consistently flat or consistently growing. This is the opposite of test scores that jump up and down and up.
  • Is the test focused on the set of skills and concepts related to one topic or standard?
  • Are the test items focused on three levels of performance within that topic?
  • Is the test quick to administer?
  • Does the test really tell what the student knows or can do?

A unidimensional test is one that tests one dimension of a skill. It tests one aspect of a topic and one set of skills and knowledge related to overarching course ideas.

Tests designed in this manner can be given more frequently because they take less time to administer. They also give better feedback to teachers about their effectiveness and feedback to students about their growth.

Examples of Unidimensional Tests

Let’s look at two examples from elementary reading and middle school Algebra.

It’s the typical “big test” habit for schools to give a mini-version of the end-of-year criterion-referenced test. You know, multiple-choice questions focused on the genre and set of skills that were recently taught.

Instead of this status quo assessment practice, let’s look at how the unidimensional test improves the assessment routine. Also, see how these assessments actually address the concerns of Anderson and Krathwohl (2001)* when they posited, “Different types of objectives require different approaches to assessment.”

Unidimensional Assessments Done Right

Disclaimer >>> I’m no expert in reading or Algebra curriculum, so the experts will certainly take this to another level. But even with my limited knowledge, I’d wager $500 that the split-half and test-retest reliability of the assessments will improve drastically! And I’m serious about the wager…email me and we can make it happen.
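For readers who haven't met the term, here is a rough sketch of how split-half reliability is commonly estimated: split the items into two halves (odd and even items here), total each half for every student, correlate the two sets of totals, and apply the Spearman-Brown correction to estimate reliability for the full-length test. The function names and the 0/1 response layout are my own assumptions for illustration, not part of any particular gradebook or testing tool.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(responses):
    """Estimate split-half reliability from item-level scores.

    responses: one list per student, each holding 0/1 scores per item
    (a hypothetical layout chosen for this sketch).
    """
    # Total the odd-numbered items (1st, 3rd, ...) and even-numbered items.
    odd_half = [sum(row[0::2]) for row in responses]
    even_half = [sum(row[1::2]) for row in responses]
    r_half = pearson(odd_half, even_half)
    # Spearman-Brown correction: each half is only half the test length.
    return 2 * r_half / (1 + r_half)
```

A value closer to 1 suggests the two halves rank students in the same order. Note that the sketch will fail with a division by zero if every student earns the same total on one half, so it needs real variation in scores to say anything.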

You’ll see in the following examples that the tests are focused on a single power skill (read below about power standards and power skills). The tests have these defining characteristics:

  • Quick to administer.
  • Three levels of questions.
  • Scoring describes student proficiency with prerequisite knowledge, basic application, and mastery.

4th-Grade Reading Example

The unidimensional test is made of 15 test items and can be administered in 20 minutes. The assessed topic is summarizing expository texts. The first 5 test items assess tier 2 vocabulary terms related to the content and thinking processes (e.g. main idea, paraphrasing, expository). The next 5 test items are lengthy sentences, and students identify the correct paraphrase of each sentence. Questions 11-13 are brief paragraphs, and students identify the main ideas. The final 2 test items ask students for summaries of the previous paragraphs.

7th-Grade Algebra Example

The assessed skill is solving algebraic equations – a big pre-algebra topic. The first 5 test items would be designed to assess prerequisite key terms and one-step equations from 6th-grade math. The next 5 questions will assess students’ ability to solve two-step equations and inequalities. The last 5 questions will connect solving equations and inequalities to real-world applications using word problems.

Benefits of these Two Examples

In each of the examples above, you will find three levels of test items: prerequisite, basic, and complex. This could easily translate into a performance scale for standards-based grading or into a score on a traditional 100-point scale. Because of its design, the test score will tell three levels of mastery:

  1. Prerequisite – basic concepts or prerequisite skills are mastered.
  2. Basic – basic version of the skill is mastered.
  3. Complex – the target level of mastery for this skill.
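The three-level scoring above can be sketched in a few lines of code. This is only an illustration under stated assumptions: it uses the 5/5/5 item layout from the Algebra example, and the 80% cutoff for calling a level "mastered" is a number I picked for the sketch, not a rule from this post. The names LEVELS, MASTERY_CUTOFF, level_scores, and highest_level_mastered are hypothetical.

```python
# Hypothetical layout for a 15-item test, following the Algebra example:
# which item numbers belong to each level.
LEVELS = {
    "prerequisite": range(1, 6),   # items 1-5
    "basic": range(6, 11),         # items 6-10
    "complex": range(11, 16),      # items 11-15
}

# Assumed cutoff: a level counts as mastered at 4 of 5 items correct.
MASTERY_CUTOFF = 0.8


def level_scores(correct_items):
    """Return the fraction of items answered correctly at each level."""
    correct = set(correct_items)
    return {
        name: sum(1 for item in items if item in correct) / len(items)
        for name, items in LEVELS.items()
    }


def highest_level_mastered(correct_items):
    """Walk the levels in order and stop at the first one not mastered."""
    scores = level_scores(correct_items)
    mastered = "none"
    for name in LEVELS:  # dicts keep insertion order
        if scores[name] < MASTERY_CUTOFF:
            break
        mastered = name
    return mastered
```

For example, a student who answers items 1-8, 10, 11, and 12 correctly would be reported at the "basic" level: all prerequisite items right, 4 of 5 basic items right, and only 2 of 5 complex items right. The same per-level fractions could feed a standards-based performance scale or be combined into a 100-point score.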

What’s more important is the alignment issue. Take a look at another quote from Anderson and Krathwohl (2001):

If instruction is not aligned with assessments, then even high-quality instruction will not likely influence student performance on those assessments…the results of the assessments will not reflect achievement.

You know alignment is critical, but let’s think about it from a test item perspective. If instruction targets prerequisite skills and knowledge, basic versions of the target skill, and complex thinking, why don’t tests also reflect this variety?

Great instruction addresses a progression of learning, and great assessments should do the same. This is what unidimensional assessment does. And assessments that do this have a higher likelihood of encouraging learning instead of discouraging learning.

How to Create Unidimensional Assessments

Here’s a quick overview of how to design a unidimensional test.

1. What are the specific power skills?

Using the power standard approach (read more from Doug Reeves here), teachers will focus in on the most important 2-3 skills or power standards of the next 2-4 weeks. These are the power skills that need to be unpacked.

2. What does the power standard look like as a test item?

Think of the end product. Is it an open-ended short answer response, multiple choice with multiple steps, or the creation of a diagram or graphic?

Then teachers will determine the content knowledge and concepts that support the end product or final test item. Think of the power standard in terms of its learning progression.

3. Design the prerequisite test items.

What declarative knowledge, vocabulary terms, or concepts do students need to understand in order to master this skill? What are the building-block skills that progress toward this skill? Design a third of the test items to this content. Note: Students should be able to complete this section of the test in only a few minutes.

4. Design test items for the basic form of the skill.

It goes without saying, the skill worth assessing is complex and higher-level. Otherwise, it doesn’t deserve its own assessment. That being said, about a third of the test items should represent the skill in a straightforward, basic form. Remember the summarizing example above? Finding the main idea was the basic form of the more complex summarizing.

5. Design your complex test items.

These are the test items that typically make up those unreliable “big tests” that I dissected in part 1. They require strategy, critical thinking, and multi-step problem-solving. They should only be mastered when students actually master the target skill. A high score on a unidimensional test shows advanced mastery – that’s different from other types of tests, where a high score can reflect a variety of factors (e.g. socio-economic status, prior knowledge, test-taking skills).

I tried not to be too thorough or technical here, but I hope you see how unidimensional tests differ from traditional testing approaches and hold a clear advantage in terms of feedback to teachers and fairness to learners. Let’s recap the benefits.

Why You Should Use Unidimensional Tests Now!

Yes, this heading is designed to grab your attention, but it’s not altogether sensational. You should use unidimensional tests now. Here are 13 reasons why:

  1. Engage teachers in prioritizing their curriculum around power standards.
  2. Transform power standards into power skills.
  3. Increase your school’s understanding of the connections between standards, learning, and assessment.
  4. Build stronger understanding of alignment between prerequisite knowledge and higher-level skills.
  5. Create transparent assessment protocols that help district educators and community members share understandings of expected student learning.
  6. Engage teachers in developing tests that show a progression of learning, not just mimicry of state standardized tests.
  7. Use tests that have a higher degree of reliability.
  8. Provide teachers with feedback on knowledge gaps (without having to interpret standards reports and guess at probable causes for gaps in student learning).
  9. Provide parents with accurate reports on what students know and can do.
  10. Give students a simple picture of what they need to learn.
  11. Help students understand how their learning equates to actual test performance.
  12. Increase visible learning from better assessment and data practices (read more here and here).
  13. Maximize the power of sensitive data to determine the impact of your practices (read more on Sensitive Data vs. Lethargic Data).

The 2nd Way You Could Be Testing Too Much

Let’s end this discussion by going back to our original question, Are you testing too much? You can test too much in a single test or you can test too often. This second problem is witnessed over and over without much effect on learning. You know you’re testing too often when:

  • You don’t have time to thoroughly reteach students who indicated a need from the previous assessment.
  • You don’t have time to really understand the results of the test.
  • The results from test to test don’t show the impact of your instruction.
  • Students aren’t intrinsically motivated to take a test.

Yeah, that last one pretty much captured most schools everywhere in the United States.

I’m not claiming a silver bullet here, but in our current classrooms and core courses, unidimensional tests are the closest thing we have to getting reliable, formative and actionable data about student learning.

What’s more, unidimensional assessments are designed to hit the sweet spot in testing frequency. They are short. They are sensitive. This means they should be given every 2-4 weeks, preferably right around 12-15 days.

With this frequency, a teacher can identify reteaching needs and provide learning opportunities. Also, different instructional approaches can be implemented and given time to work (or not).

Reassessing after 12-15 days will give the learners and teachers full opportunity to show significant growth. The interval is just long enough for an impact to happen, but the tests are not so frequent that learning time is lost or students judge them to be monotonous.

Moving Forward with Assessment

There’s no time for a 3-year transition plan. Start now designing and using unidimensional tests. Additionally, consider other forms of assessment that are responsive to students, provide agency, and paint a picture of the whole child. There’s no time for subpar and unreliable assessment practices. Our students, our futures, deserve better.

In two upcoming posts, I am asking the experts to share their thoughts on the next steps for assessment practice, and I will share my own thoughts regarding an untouched topic – sensitive data.

Thank you for reading.

Please consider these other related readings:

*Anderson, L., Krathwohl, D., et al (2001). A taxonomy for learning, teaching, and assessing. New York, NY.

Some Tests are Just Bad, Like Testing Too Much
Free to Share