In April 2012, Mark D. Shermis, then the dean of the College of Education at the University of Akron, made a striking claim: “Automated essay scoring engines” were capable of evaluating student writing just as well as human readers. Shermis’s research, presented at a meeting of the National Council on Measurement in Education, created a sensation in the world of education, both among those who see such “robo-graders” as the future of assessment and among those who believe they are worse than useless.
The most outspoken member of the second camp is undoubtedly Les Perelman, a former director of writing and current research affiliate at the Massachusetts Institute of Technology. “Robo-graders do not score by understanding meaning but almost solely by use of gross measures, especially length and the presence of pretentious language,” Perelman charged in an op-ed published in the Boston Globe earlier this year. Test-takers who game the programs’ algorithms by filling pages with lots of text and using big words, Perelman contended, can inflate their scores without actually producing good writing.
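Perelman’s objection is easy to demonstrate in miniature. The Python sketch below is a hypothetical illustration, not the code of any actual scoring engine: it scores an essay using only the two “gross measures” he names, total length and average word length, with weights invented purely for the example.

```python
# A toy illustration of Perelman's critique: a scorer driven only by
# surface features. Hypothetical sketch; the features and weights are
# invented and do not reflect any real essay-scoring engine.

def naive_essay_score(essay: str) -> float:
    words = essay.split()
    if not words:
        return 0.0
    word_count = len(words)  # rewards sheer length
    avg_word_length = sum(len(w) for w in words) / word_count  # rewards "big words"
    # Arbitrary weights: longer essays with longer words score higher.
    return 0.01 * word_count + 1.0 * avg_word_length

concise = "Tests should measure thinking, not typing speed."
padded = ("Notwithstanding multitudinous considerations, " * 50
          + "standardized examinations necessitate comprehensive reconceptualization.")

print(naive_essay_score(concise))  # low score: short essay, plain words
print(naive_essay_score(padded))   # high score: long and polysyllabic, yet nearly meaningless
```

Because nothing in such a score depends on meaning, a test-taker can raise it simply by writing more and reaching for the thesaurus, which is precisely the gaming Perelman describes.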