Automating the image creation
Our legacy site provided full-page black & white images, along with smaller images of those entries that had been transcribed. For our new site, we planned some key improvements:
- full-page images scanned in color from an original copy (vs. a modern facsimile copy)
- entry images appear alongside the transcription (vs. on a separate page)
- images for both the 1755 (1st folio) and 1773 (4th folio) editions (vs. just the 1755 edition)
The Cordell Collection of Dictionaries at Indiana State University provided us with thousands of full-page scans, and students volunteered to help us crop and merge them into entry images. Unfortunately, the hand-creation method was taking too long. We needed to be able to automate the process.
Co-PI Amy Giroux recruited a team of computer science students to help. UCF computer science majors must complete a two-semester senior design project in order to graduate. These students (Jacob Rodriguez, Brian Smith, Vincent Cardaman, and Miguel Severino) designed software that used computer vision and image manipulation to create the entry images automatically. More specifically, here’s what the software did:
- Analyzed each full-page image for page-level rotation and corrected for it
- Parsed each page into word-level image files based on the headwords
- Analyzed the word-level images for rotation and corrected for it
- Combined multi-part word images into single files of a common width to keep the text size consistent across the parts
- Adjusted contrast and brightness of multi-part words to make the image more uniform
- Named the resulting file based on the project’s standard naming conventions and saved it in a directory structure arranged by edition (1755/1773) and beginning letter of the word
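To make the last few steps more concrete, here is a minimal sketch in Python of how multi-part images might be brought to a common width and how an output path might be built by edition and first letter. It is not the students' code: their software used computer vision libraries, while this sketch uses plain lists of grayscale pixel values so that it runs on its own. The function names (`resize_to_width`, `stack_parts`, `entry_path`) and the `headword_edition.png` file name pattern are assumptions made up for illustration; the project's actual naming conventions are not described here.

```python
from pathlib import PurePosixPath

# A grayscale image as rows of pixel values (0-255). Illustrative stand-in
# for a real image type from an imaging library.
Image = list[list[int]]


def resize_to_width(img: Image, width: int) -> Image:
    """Nearest-neighbour resize to `width`, preserving the aspect ratio."""
    src_h, src_w = len(img), len(img[0])
    scale = width / src_w
    new_h = max(1, round(src_h * scale))
    return [
        [
            img[min(src_h - 1, int(y / scale))][min(src_w - 1, int(x / scale))]
            for x in range(width)
        ]
        for y in range(new_h)
    ]


def stack_parts(parts: list[Image]) -> Image:
    """Scale every part to the widest part's width, then stack top to bottom."""
    width = max(len(part[0]) for part in parts)
    combined: Image = []
    for part in parts:
        combined.extend(resize_to_width(part, width))
    return combined


def entry_path(edition: int, headword: str) -> PurePosixPath:
    """Build edition/letter/file path. The file name pattern is hypothetical."""
    if edition not in (1755, 1773):
        raise ValueError("edition must be 1755 or 1773")
    if not headword:
        raise ValueError("headword must not be empty")
    word = headword.lower()
    return PurePosixPath(str(edition)) / word[0] / f"{word}_{edition}.png"


if __name__ == "__main__":
    top = [[10, 20]]                      # 1 row, 2 pixels wide
    bottom = [[1, 2, 3, 4], [5, 6, 7, 8]]  # 2 rows, 4 pixels wide
    merged = stack_parts([top, bottom])
    print(len(merged), "rows, each", len(merged[0]), "pixels wide")
    print(entry_path(1755, "Lexicographer"))
```

Scaling each part to the same width before stacking is what keeps the text size consistent when an entry is split across columns or pages; a production version would use a proper image library and a smoother resampling method.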
These tasks were especially challenging given the dictionary’s antique print conventions (e.g., the long “s”) and two-column layout. You can see more details of the project in their PowerPoint slides.
The software works quite well on high-resolution images. (High-resolution images: that’s foreshadowing. More on that topic later.) About 70% of the time, it correctly determines where each entry image should begin and end, and when an entry breaks across the bottom of the page, the software stitches the pieces together.
When the images are flawed, we need to crop and stitch them by hand. Usually, the problem is that the software has included too much or too little in the image. Occasionally it produces beautiful “Frankenstein” images that combine the beginning of one word with the end of another. Senses 2-3 of this image actually belong to a different entry, but at first glance, it looks great, doesn’t it?
We expect to make 83,938 entry images for our project. By creating this software, Jacob Rodriguez, Brian Smith, Vincent Cardaman, and Miguel Severino helped us leap much closer to the finish line. Thanks!