The Plan

Produce a plaintext edition of the first edition of the Oxford English Dictionary, based on page scans hosted by the Internet Archive (IA).

The Secret Sauce

IA ran a fairly accurate OCR pass on the pages and packaged the results as EPUB files. Once you extract that text, the most difficult step is complete. What remains is:

  • organizing the text into entries
  • correcting the OCR output
  • adding markup
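
As a rough illustration of the extraction step, here is a minimal sketch that pulls plain text out of an EPUB using only the Python standard library. It relies on the fact that an EPUB is a ZIP archive containing (X)HTML documents. The function name `epub_to_text` and the helper class are hypothetical, not part of this project, and the sketch assumes the documents are UTF-8 and that archive order is close enough to reading order (strictly, reading order is defined by the spine in the package file, which this sketch ignores).

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def epub_to_text(path):
    """Return the text of every (X)HTML document in an EPUB, in archive order.

    Hypothetical helper: assumes UTF-8 content and ignores the spine order.
    """
    chunks = []
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextCollector()
                parser.feed(archive.read(name).decode("utf-8", errors="replace"))
                parser.close()  # flush any buffered trailing text
                chunks.append("\n".join(parser.parts))
    return "\n\n".join(chunks)
```

Calling `epub_to_text("volume1.epub")` (the file name is a placeholder) would return one string with a blank line between documents. The output is still raw OCR text, so the remaining steps listed above apply to it unchanged.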

The Sources

Introduction, Supplement, and Bibliography

Volume 1: A and B

Volume 2: C

Volume 3: D and E

Volume 4: F and G

Volume 5: H to K

Volume 5 Part 1: H

Volume 5 Part 2: I to K

Volume 6 Part 1: L

Volume 6 Part 2: M and N

Volume 7: O and P

Volume 8 Part 1: Q and R

Volume 8 Part 2: S to Sh

Volume 9 Part 1: Si to Sq

Volume 9 Part 2: Su to Th

Volume 10 Part 1: Ti to U

Volume 10 Part 2: V to Z