Digitization is the way of the future, and with the recent deal between authors and Google Books, that future may in fact be bright for all parties.

In the course of my dissertation work I often have to track down primary sources, and when those sources are particularly rare it becomes difficult. Or it used to be difficult. Now I Google it.

Exhibit A: This morning I needed to track down some homilies on Hebrews by Chrysostom. Being a dedicated Greek Geek, I wanted the “original,” which meant I needed Patrologia Graeca volume 63. Where am I going to get it? Google Books, of course: they have the entire series digitized and downloadable for your convenience. This is what sites like Google Books and archive.org are made for: primary sources in the public domain.

Image view of v63 of Patrologia Graeca

Here are some screenshots for you. The first is the standard scan, downloadable as a PDF. The second is Google’s attempt at a little OCR, which is obviously struggling with both the Greek and the Latin. This is to be expected. I did a little natural language processing way back when; a lot of OCR software will “guess” the letters based not only on shape, but also on the software’s (limited) understanding of the language, which for Greek and Hebrew is probably NULL. Still, I was impressed, and this is a harbinger of great things to come.

OCR view of v63 of Patrologia Graeca
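
To make the "guessing" concrete, here is a toy sketch in Python of how shape evidence and language knowledge can be combined. It is only an illustration: the scores, the bigram table, and the names `shape_scores`, `bigram_prior`, `UNSEEN`, and `best_reading` are all made up for this example and are not taken from Google's OCR or any real engine.

```python
import itertools
import math

# Hypothetical shape scores: for each glyph on the page, how strongly its
# shape matches each candidate letter. The numbers are invented.
shape_scores = [
    {"t": 0.9, "l": 0.1},
    {"h": 0.4, "b": 0.6},  # a smudged "h" that looks slightly more like "b"
    {"e": 0.8, "c": 0.2},
]

# Hypothetical language knowledge: how plausible each letter pair is.
bigram_prior = {("t", "h"): 0.30, ("h", "e"): 0.30, ("t", "b"): 0.001}
UNSEEN = 0.001  # assumed score for any pair the model knows nothing about


def best_reading(shape_scores, bigram_prior, use_language=True):
    """Score every candidate string and return the most plausible one."""
    best, best_score = None, -math.inf
    for letters in itertools.product(*shape_scores):
        # Evidence from the shapes alone.
        score = sum(math.log(pos[ch]) for pos, ch in zip(shape_scores, letters))
        if use_language:
            # Evidence from how likely each adjacent pair of letters is.
            for pair in zip(letters, letters[1:]):
                score += math.log(bigram_prior.get(pair, UNSEEN))
        if score > best_score:
            best, best_score = "".join(letters), score
    return best


print(best_reading(shape_scores, bigram_prior, use_language=False))
print(best_reading(shape_scores, bigram_prior, use_language=True))
print(best_reading(shape_scores, {}, use_language=True))
```

With shape alone the smudged letter is read as "b"; with the language table the reading is pulled back to "the". Passing an empty table stands in for a language the software knows nothing about: every pair gets the same score, so the result falls back to the shape-only guess, which is roughly the situation with the Greek here.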

So what primary sources have you been trying to track down? How do you use research tools like these? Post in the comments!

Related posts:

  1. Search your PDFs with OCR
  2. Get Real Text out of your Scanned Documents
  3. Send Web Documents Straight to Google Docs
  4. Even More PDF Tools
  5. A Guide to Using Zotero in Biblical Studies: Collecting, Annotating and Citing Bibliographic Data

  11 Responses to “Amazed by Google Books”

  1. As a side note: I know some authors are upset that current books with sample pages are scanned by Google. I'd say 20% of the time I find the reference I'm looking for, but 80% of the time it causes me to track down the full hard copy of the book. This often leads to a purchase.

    On the other hand, if their book had not been scanned by Google with liberal sample pages, I would never buy it. I'd never know that they treated the subject I'm interested in or that it could possibly be a valuable resource.

  2. Project Gutenberg has some Greek sources as well: http://www.gutenberg.org/browse/languages/el

    There isn’t a great deal there but it may be helpful to somebody.

  3. The issues are tricky for books not in the public domain. Google recently reached a settlement with some folks regarding that issue (see the link in the article, or the Related Posts links).

    With public-domain books, however, the issues are not complicated, and the benefits are amazing.

  4. Tommy, how do you reference these in your dissertation? As if you had the hard copy of the book in front of you or as an internet source?

  5. As if the hard copy were in front of me (since they are scans of the original that include page numbers). I guess in the interest of full disclosure you could list the work as normal in your bibliography and add a link to your online source at the end, but that seems unnecessary to me provided the source is an original scan.

    If the source is NOT a scan (like a plain-text version of the Early Church Fathers) AND you need to reference page numbers (a lot of times for ancient material you don't, since there are standard subdivisions similar to biblical chapter-verse notation) everything changes, but it really depends on the primary source. In general I follow SBL guidelines. They have some specifics on Ancient Christian Writings on p 84 ff.

    Am I making sense?

    • Yes, that makes sense. I think that if the point of the reference is to allow anyone else to look it up and see what you are seeing then it's a lot easier just to cite as if it is a hard copy. And actually, it is no different from working from a photocopied section, I suppose.

      I am just nervous that someone (by which I mean my external examiner) will come along and ask when it was that I had access to some exceptionally obscure text hidden away in the Vatican and I'll have to explain that I just googled it. But I'm sure you're right. ;)

  6. Yes, I do like Project Gutenberg. Excellent stuff! Thanks for the link.

  7. Tommy,
    Is the digitized version only available to read online? I downloaded the .pdf version of an old version of LSJ (http://books.google.com/books?id=moTvy2iYtcEC&amp… but it is the actual scan of the old book, which is hard to read at many points. Is this the normal way they do things, where the digitization is not available for download?
    Ben

    • The quality of the scans varies by the book. The quality of PG that I mentioned above is pretty rough at points (makes me want to take a class on text criticism), but at least I have it without driving to Duke.

      The LS scan is particularly bad. It seems that some of the pages are not scans at all but pictures, and the scans themselves are low resolution. The technology isn't perfect yet, and scans are inconsistent because they are archived from a variety of sources, and there is no "training" provided on how to do it, etc.

      But it was available to download from the link you provided. Do you not see the "Download" link?

      Also, that was not the only scan of LS available. Here is a better one: http://books.google.com/books?id=nBwQAAAAYAAJ&amp… . I think it's the Little Liddell, though. Browse around. Scans are provided by libraries and other sources that are participating with Google, so maybe there is a better version available.

      The real problem is that the Greek at least is not searchable, which makes it difficult to find things. Still no replacement for the actual book, at least for Greek, but it's showing promise, and for rare items it is irreplaceable!


© 2011 Nerdlets Suffusion theme by Sayontan Sinha