Identifying Syriac Fragments in the Digital Age

January 5, 2023

This story is part of an ongoing series of editorials in which HMML curators and catalogers examine how specific themes appear across HMML’s digital collections. From the Eastern Christian collection, Dr. James Walters shares this story about Fragments.

What would you do if someone handed you a page that had clearly been torn out of a book and asked if you could identify the source of the fragment? Where would you start?

Recently, the Eastern Christian cataloging team at HMML has been working on a large collection of manuscripts from the library of the Dominican Friars of Mosul (DFM). Among those materials, we came across DFM 00809, a small, unidentified parchment fragment bearing Syriac writing, likely torn from a larger manuscript.

Parchment was a very common writing material for Syriac manuscripts until the 12th–13th centuries, when it gradually began to be replaced by paper. Given the use of parchment and the fragment’s particular Syriac script (known as Estrangelā, associated with earlier stages of Syriac literature), it seemed reasonable to conclude that this fragment was considerably older than many other manuscripts in the DFM collection.

Two sides (front and back) of a Syriac fragment on parchment. (DFM 00809)

Looking at both sides of the fragment, what is visible to the naked eye is only a few words at the edge of a single folio. Some of these words are extremely common words like “God” (Syriac: alāhā) and “king” (Syriac: malkā), making it difficult to narrow down potential matches for the text.

As you’ll see, I was eventually able to determine that the fragment comes from a witness to the Testament of Ephrem, a text that claims to be an autobiographical account of the life of the famous Syriac author Ephrem the Syrian, but that was almost certainly written by another author well after his death.

Images of a printed edition of the Testament of Ephrem, edited and published by M. Rubens Duval in 1901 (Le Testament de Saint Éphrem, Paris: Imprimerie Nationale). Photographs by Google Books (replicated here under Fair Use guidelines).

I would like to say that the identification of this particular fragment was the result of a deep knowledge of Syriac literature, but the simple truth is that a correct identification was only possible through the combination of technological advancements and a significant amount of luck.

The primary breakthrough came from the application of OCR (Optical Character Recognition) technology to a wide variety of languages and writing systems. (Note: if you aren’t familiar with OCR, it is the same technology that makes it possible for you to search a PDF document.) With automated OCR available, some repositories of digitized materials, such as Google Books and the Internet Archive, now generate a transcription of each book’s text and pair it with the scanned page images; as a result, any word in that text can be found using a search engine. OCR technology has existed for some time, but only recently has it been harnessed for more complex writing systems like Hebrew, Syriac, Arabic, and Persian.

The luck involved in identifying DFM 00809 is twofold: 1) one of the surviving lines preserves a two-word phrase rare enough to allow an easy positive identification (the search returned a single result rather than thousands), and 2) the correct text just happens to exist in digital format through Google Books. Given how few words from the fragment survive and how few Syriac texts have been digitized with OCR, it is remarkable that such an identification was possible.
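For readers curious about why a rare phrase matters so much, here is a minimal Python sketch of phrase searching over OCR text. It is only an illustration, not how Google Books or the Internet Archive actually index their scans. The names `normalize`, `find_phrase`, and `SAMPLE_PAGES` are hypothetical, and the sample "pages" are placeholder strings built from the words mentioned in this story rather than real transcriptions.

```python
import unicodedata

# Words taken from this story; everything else below is a made-up illustration.
COMMON_WORD = "ܐܠܗܐ"      # alāhā, "God"
RARE_PHRASE = "ܥܠ ܕܗܘܢܝ"  # "on account of my mind"

# Hypothetical OCR output keyed by (book, page). Placeholder word strings only.
SAMPLE_PAGES = {
    ("book A", 1): "ܐܠܗܐ ܡܠܟܐ",
    ("book A", 2): "ܡܠܟܐ ܐܠܗܐ ܥܠ ܕܗܘܢܝ",
    ("book B", 7): "ܐܠܗܐ",
}


def normalize(text: str) -> str:
    """Drop combining marks (such as vowel points) and collapse whitespace,
    so that pointed and unpointed spellings of a word compare as equal."""
    decomposed = unicodedata.normalize("NFD", text)
    unpointed = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return " ".join(unpointed.split())


def find_phrase(phrase: str, pages: dict) -> list:
    """Return the (book, page) keys whose normalized text contains the phrase."""
    needle = normalize(phrase)
    return [key for key, text in pages.items() if needle in normalize(text)]


if __name__ == "__main__":
    print(len(find_phrase(COMMON_WORD, SAMPLE_PAGES)), "pages contain the common word")
    print(find_phrase(RARE_PHRASE, SAMPLE_PAGES), "contain the rare phrase")
```

In this toy corpus the common word appears on every page, while the two-word phrase narrows the search to a single page, which is the same effect that made the identification of DFM 00809 possible.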

A screen-capture of the successful search result that allowed for the identification of the text. Photograph by Google Books (replicated here under Fair Use guidelines).

What’s even more remarkable is that the Syriac phrase that ultimately returned a successful search result—ܥܠ ܕܗܘܢܝ (“on account of my mind,” or “because of my mind”)—was not even in the main text of the Testament of Ephrem book that was scanned with OCR; it appears in a footnote from the printed book’s editor (M. Rubens Duval), which provides alternate readings from specific manuscripts that the editor used in the creation of the edition.

In this case, it is notable that one of the manuscripts that the editor mentions in the footnote, which he designates as “G,” is elsewhere identified only as a “manuscript from Mosul.” Unfortunately, Duval does not provide any additional details about this manuscript, but it is possible that the fragment DFM 00809 is from the very manuscript that he consulted for his edition.

The status of the rest of the manuscript from which the fragment originated is unknown. It has not yet been found among the other manuscripts in the collection of the Dominican Friars of Mosul, so unfortunately, it is possible that it is either lost or destroyed. The survival of such manuscript fragments, and their further preservation in digital images, highlights the importance of digitization, cataloging, and access through preservation.

Despite all the progress that we have made on cataloging fragments, there remain hundreds of unidentified fragments, not only in Syriac, but in other languages as well. So if you ever want to try your hand at solving a manuscript fragment puzzle, there is still plenty of work to be done!

A collection of small manuscript fragments, awaiting identification. (DSCQ T 00136)
