The ROAR-CAT’s out of the bag: A reliable, efficient reading assessment

As RuPaul famously says, “Reading is fundamental.” It forms the foundation of how we learn, communicate, and engage with the world around us. That’s why it’s essential to have good, reliable ways to measure reading ability. A parent might do this by asking their child to sound words out or try reading a sentence. Researchers, clinicians, and teachers could do it the same way, but what if they need to assess a whole classroom, or even an entire school?

Reading is one of the most important skills we can learn. But how do we assess reading ability? Photo by Andrea Piacquadio (pexels.com).

Conducting many one-on-one reading assessments would be resource-intensive and time-consuming. The Rapid Online Assessment of Reading (ROAR) is a web-based tool that solves this problem. ROAR is a lexical decision task: test-takers see combinations of letters (e.g., how or xop) and must quickly decide whether each is a real word. Some combinations are more challenging than others – there could be fake words that look real, or real words with unusual spellings. Using these difficulty levels alongside the test-taker’s responses, ROAR automatically computes the test-taker’s reading ability.
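To get a feel for how responses and item difficulties combine into a single reading-ability score, here is a toy sketch using a one-parameter (Rasch) item response model. The function names, the grid search, and the specific numbers are illustrative assumptions for this post, not ROAR’s actual scoring code:

```python
import math

def rasch_prob(ability, difficulty):
    """Probability of a correct response under a one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def estimate_ability(difficulties, responses, steps=200):
    """Maximum-likelihood ability estimate via a simple grid search
    over the range [-4, 4] logits."""
    best_theta, best_ll = 0.0, -math.inf
    for i in range(steps + 1):
        theta = -4.0 + 8.0 * i / steps
        ll = 0.0
        for d, correct in zip(difficulties, responses):
            p = rasch_prob(theta, d)
            ll += math.log(p) if correct else math.log(1.0 - p)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta

# More correct answers on the same items yields a higher ability estimate.
stronger = estimate_ability([-1.0, 0.0, 1.0], [True, True, True])
weaker = estimate_ability([-1.0, 0.0, 1.0], [True, False, False])
```

The key idea is that getting a hard item right is stronger evidence of skill than getting an easy item right, and the model weighs each response accordingly.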

ROAR is already great at measuring reading ability, but is there a way to make it more efficient for large-scale use in schools? In their latest research published in Behavior Research Methods, Wanjing Anya Ma, Adam Richie-Halford, Amy K. Burkhardt, Klint Kanopka, Clementine Chou, Benjamin W. Domingue, and Jason D. Yeatman (pictured below) answer this question, letting the ROAR-CAT out of the bag.

Authors of the featured article. Top row, left to right: Wanjing Anya Ma, Adam Richie-Halford, Amy K. Burkhardt, Klint Kanopka. Bottom row, left to right: Clementine Chou, Benjamin W. Domingue, and Jason D. Yeatman.

Standard ROAR shows test-takers each set of letters in a completely random order. In this study, the researchers enhanced ROAR using computerized adaptive testing (CAT), a technique that selects the most informative set of letters to display next based on the test-taker’s performance. For example, it’s not super helpful to show a poor reader a word like “homogenization,” or to show a strong reader a word like “dog.” CAT is already used in popular tests like the GRE, the GMAT, and the Duolingo English Test. The researchers’ new tool, ROAR-CAT, works by constantly adapting ROAR’s lexical decision task based on the test-taker’s performance.
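The adaptive step the paragraph above describes can be sketched in a few lines. Under a Rasch model, the most informative next item is the one whose difficulty is closest to the current ability estimate; this is a simplified illustration of the general CAT idea, not the item-selection rule ROAR-CAT actually ships with:

```python
def select_next_item(current_ability, item_difficulties, administered):
    """Pick the unadministered item whose difficulty best matches the
    current ability estimate -- under a Rasch model, this is the item
    whose response carries the most (Fisher) information."""
    candidates = [i for i in range(len(item_difficulties))
                  if i not in administered]
    return min(candidates,
               key=lambda i: abs(item_difficulties[i] - current_ability))

# A struggling reader gets an easy item; a strong reader gets a hard one.
difficulties = [-2.5, -1.0, 0.0, 1.5, 3.0]
select_next_item(-2.0, difficulties, set())  # -> 0 (difficulty -2.5)
select_next_item(2.0, difficulties, set())   # -> 3 (difficulty 1.5)
```

After each response, the ability estimate is updated and the selection repeats, which is why adaptive tests can reach a reliable score with far fewer items than a fixed, random sequence.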

Figure depicting the ROAR lexical decision task. Participants are shown combinations of letters and must decide if they form a real word. The instructions are narrated like a story to keep young participants engaged.

In Study 1, the researchers computed the difficulty level of over 200 letter combinations, a process called calibration. They found that item difficulties were highly consistent across schools varying in socioeconomic status and across students of different ages, including those with language-based learning difficulties. For example, hust was consistently more difficult for students than ggnoi. This means ROAR can be used to assess diverse groups of students in schools!
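The paper calibrates items with proper item response theory modeling; as a rough intuition, an item’s difficulty can be read off from how often a sample of students gets it right. This is a deliberately simplified sketch (the function and the continuity correction are my own illustration, not the authors’ procedure):

```python
import math

def calibrate_difficulty(responses):
    """Estimate an item's difficulty (in logits) from a sample of
    correct/incorrect responses: the lower the proportion correct,
    the higher the difficulty estimate."""
    n = len(responses)
    # Add 0.5 to numerator and 1.0 to denominator (a standard continuity
    # correction) so all-correct or all-wrong items stay finite.
    p = (sum(responses) + 0.5) / (n + 1.0)
    return -math.log(p / (1.0 - p))

# A plausible pseudoword like "hust" fools more students than an obvious
# non-word like "ggnoi", so it earns a higher difficulty estimate.
hust_difficulty = calibrate_difficulty([True, False, False, True, False])
ggnoi_difficulty = calibrate_difficulty([True, True, True, True, False])
```

The consistency finding in Study 1 amounts to showing that estimates like these line up across very different groups of students, which is what licenses using one calibrated item bank for everyone.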

In Study 2, the researchers used these letter combinations to directly compare traditional ROAR to ROAR-CAT. Nearly 500 students in grades 1 through 8 participated. To reach a high level of reliability, ROAR-CAT took 40% fewer trials than traditional ROAR – 75 items vs. 125 items. This is exciting evidence that ROAR-CAT is effective and efficient!

Figure comparing empirical test reliability, bias, and mean squared error between ROAR-CAT and traditional ROAR (ROAR-Random) over the course of testing. Compared to traditional ROAR, ROAR-CAT reached higher reliability and lower mean squared error in fewer trials, and consistently performed more effectively.

Next, the researchers ran a study comparing ROAR-CAT to two reading assessments commonly used in schools: the Fountas & Pinnell Benchmark Assessment (F&P) and the FastBridge Curriculum-Based Measurement for Reading (FAST CBM). In F&P, students read selected passages for 20-30 minutes before their reading ability is judged. In FAST CBM, reading ability is judged by how many words students can read from one passage in one minute. Both tests require students to read aloud. In grades 1 & 2, ROAR-CAT aligned extremely well with both of these well-established measures of reading ability without requiring test-takers to read aloud.

Figure showing the correlations between ROAR-CAT’s reading ability scores and those provided by FAST CBM and F&P in grades 1 and 2. The results suggest that ROAR-CAT aligns extremely well with both of these measures in both age groups.

These studies are the first to combine CAT with a lexical decision task. Altogether, ROAR-CAT is a valid, efficient assessment of reading ability for a wide range of learners. In the words of the researchers,

“ROAR-CAT serves as both a reliable screening solution and a robust research tool, grounded in data from diverse student populations and already making an impact in hundreds of classrooms across the US.”

The tools show a lot of promise in identifying reading difficulties in children, and have even been officially recognized as part of California’s universal dyslexia screening program.

ROAR-CAT is designed for long-term flexibility, as it can easily be updated with new test items over time. It is built using jsCAT, which integrates seamlessly with online research tools like jsPsych. If you’re a researcher, teacher, or parent, you can try this quick, science-backed reading assessment online! Reading really is fundamental, and ROAR-CAT represents a huge improvement in how effectively and efficiently we can assess reading skills. You could say the ROAR-CAT’s officially out of the bag.

Featured Psychonomic Society article

Ma, W. A., Richie-Halford, A., Burkhardt, A. K., Kanopka, K., Chou, C., Domingue, B. W., & Yeatman, J. D. (2025). ROAR-CAT: Rapid Online Assessment of reading ability with computerized adaptive testing. Behavior Research Methods, 57(1), 56. https://doi.org/10.3758/s13428-024-02578-y

Author

  • Anthony Cruz is a PhD Candidate in the Department of Psychology at Western University. Under the supervision of Dr. John Paul Minda, he studies category learning, the process by which people learn to sort objects into groups. His research explores how spaced learning (taking breaks while studying) and metacognition (reflecting on your own learning) can enhance memory and help people learn categories more effectively.

The Psychonomic Society (Society) is providing information in the Featured Content section of its website as a benefit and service in furtherance of the Society’s nonprofit and tax-exempt status. The Society does not exert editorial control over such materials, and any opinions expressed in the Featured Content articles are solely those of the individual authors and do not necessarily reflect the opinions or policies of the Society. The Society does not guarantee the accuracy of the content contained in the Featured Content portion of the website and specifically disclaims any and all liability for any claims or damages that result from reliance on such content by third parties.