Monday, January 03, 2005

Tackling online searches of handwritten material

IHT - NY Times: "Google's experimental service Google Scholar scours the Web for academic papers. But while the site, at http://scholar.google.com, may prove an asset for students and those who work in areas like science, neither it nor any other search engine is very useful for historians whose research involves plowing through documents from the time when handwriting ruled. 'There is an enormous amount of handwritten stuff locked away in many archives, libraries and museums,' said R. Manmatha, an assistant professor with the Center for Intelligent Information Retrieval at the University of Massachusetts at Amherst.

Some sophisticated handwriting recognition systems are in use. The U.S. Postal Service and postal agencies around the world use them to read addresses at sorting stations. But Manmatha said the experience developed from those systems was not particularly useful when he and two graduate students, Toni Rath and Victor Lavrenko, began work on their manuscript-searching project. The current systems have to cope with only a limited range of material - for example, names and addresses - written in a consistent format.

To develop their system, Manmatha and his students obtained about 1,000 pages of George Washington's correspondence that had been scanned from microfilm by the Library of Congress. Though no library or archive has yet approached Manmatha about the system, he will brief Google on it early in 2005. With sufficient funds for software development and document scanning, he said, it might be possible within a decade for people to search historical manuscripts as easily as they now locate anything else on the Web."
