SBL Forum Archive
XML: A Procrustean Bed for Biblical Texts?

XML (Extensible Markup Language) is a way of marking up a text so that a computer can tell how to process and display it. It identifies where a paragraph begins and ends and, more generally, records the logical structure and flow of the content. XML markup uses angle brackets, as in <verse id="Matt.1.24">. These tags are not normally visible to the reader, but they inform the computer that a verse is about to begin so that it can display the text accordingly.

Without XML we wouldn't be able to interchange texts easily or use the same text for an article, a webpage, and a monograph. But XML has limitations, which raise the question of whether it forces undue conformity on literary complexity.

For example, suppose a first-year New Testament class were asked to open their Bibles and identify the beginning and end of the sentence in Matt 1:24-25:

24. When Joseph awoke from sleep, he did as the angel of the Lord commanded him; he took her as his wife, 25. but had no marital relations with her until she had borne a son; and he named him Jesus.

It would be no surprise if every member of the class completed the assignment correctly.

If, however, you were familiar with XML and wanted to mark up the Matthean text, you would face a difficult choice. You could present the text with the traditional versification. To enable searching and display by verses, you would write:

<verse id="Matt.1.24">When Joseph awoke from sleep, he did as the angel of the Lord commanded him; he took her as his wife,</verse><verse id="Matt.1.25">but had no marital relations with her until she had borne a son; and he named him Jesus.</verse>

On the other hand, you may want to record other divisions in the text. To allow linguistic analysis of sentence level structures, you could write:

<sentence>When Joseph awoke from sleep, he did as the angel of the Lord commanded him; he took her as his wife, but had no marital relations with her until she had borne a son; and he named him Jesus.</sentence>

But the second method loses the versification information that is traditionally used to locate passages in the biblical text. The ideal would be to mark the text as follows:

<verse id="Matt.1.24"><sentence>When Joseph awoke from sleep, he did as the angel of the Lord commanded him; he took her as his wife,</verse><verse id="Matt.1.25"> but had no marital relations with her until she had borne a son; and he named him Jesus.</sentence></verse>

Unfortunately, because XML requires elements to nest properly inside one another, all extant XML software will allow the text to have either the verses or the sentences marked as shown, but not both.
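The restriction can be seen with any standard XML parser. The following is a minimal sketch using Python's built-in xml.etree.ElementTree module; the <passage> wrapper element and the helper name is_well_formed are assumptions added for illustration (XML requires a single outermost element) and are not part of the examples above.

```python
import xml.etree.ElementTree as ET

# Verses only: every element closes before the next one opens.
# The <passage> wrapper is a hypothetical root element added for this sketch.
VERSES_ONLY = (
    '<passage>'
    '<verse id="Matt.1.24">When Joseph awoke from sleep, he did as the angel '
    'of the Lord commanded him; he took her as his wife,</verse>'
    '<verse id="Matt.1.25">but had no marital relations with her until she '
    'had borne a son; and he named him Jesus.</verse>'
    '</passage>'
)

# Verses and sentence together: </verse> arrives while <sentence> is still open.
OVERLAPPING = (
    '<passage>'
    '<verse id="Matt.1.24"><sentence>When Joseph awoke from sleep, he did as '
    'the angel of the Lord commanded him; he took her as his wife,</verse>'
    '<verse id="Matt.1.25"> but had no marital relations with her until she '
    'had borne a son; and he named him Jesus.</sentence></verse>'
    '</passage>'
)


def is_well_formed(xml_text):
    """Return True if a standard XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(VERSES_ONLY))   # the verse-only markup parses
print(is_well_formed(OVERLAPPING))   # the overlapping markup is rejected
```

The parser rejects the second text at the first </verse>, because the <sentence> opened inside that verse has not yet been closed.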

The condition illustrated above is what is known in XML literature as "overlapping hierarchies." The sentence in the example "overlaps" the end of the verse (marked with </verse>) and continues into another verse. A first-year student can easily see both the verses and the sentence, or other syntactic or narrative structures that are found in a text. XML software, on the other hand, does not know how to proceed when it reaches the end of the first verse and the sentence that began inside it has not ended.

This problem is not limited to the verse-versus-sentence case but extends to such traditional topics as the redaction history of the Hexateuch (JEDP), alignment of the synoptic gospels, narrative or poetic structures, variant textual traditions, and others. One of the principal difficulties of applying XML to biblical texts is XML's poor support for encoding complex texts. Scholars using XML can face ugly choices that ultimately force a biblical text to lose some of its richness in order to conform to the limitations of XML as a tool.

The problem of overlapping hierarchies is not a new one, and a variety of approaches have been developed to address it. Some of those solutions exist only in theory or rely upon software not yet written. Others mandate special processing and a good deal of expertise in markup systems to implement and use. A solution that requires expertise to use or relies upon non-existent software is obviously not a good choice for biblical scholars.
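To give a sense of what "special processing" can mean, one workaround often described in markup circles is to keep one hierarchy as ordinary elements and reduce the other to empty "milestone" tags that only mark where a unit begins. The article does not say which approaches it has in mind, so the sketch below is an illustration only; the element name verse-start and the function verses_from_milestones are hypothetical.

```python
import xml.etree.ElementTree as ET

# The sentence is a normal element; verse boundaries are empty marker tags.
# <verse-start/> is a hypothetical element name chosen for this sketch.
MILESTONED = (
    '<sentence>'
    '<verse-start id="Matt.1.24"/>When Joseph awoke from sleep, he did as the '
    'angel of the Lord commanded him; he took her as his wife,'
    '<verse-start id="Matt.1.25"/> but had no marital relations with her '
    'until she had borne a son; and he named him Jesus.'
    '</sentence>'
)


def verses_from_milestones(xml_text):
    """Rebuild verse texts by collecting the text that follows each marker.

    This only handles the flat case shown here, where a verse's text sits
    directly after its marker and contains no further elements.
    """
    root = ET.fromstring(xml_text)
    verses = {}
    for marker in root.iter("verse-start"):
        # ElementTree stores the text after an element in its .tail attribute.
        verses[marker.get("id")] = (marker.tail or "").strip()
    return verses


for verse_id, verse_text in verses_from_milestones(MILESTONED).items():
    print(verse_id, "->", verse_text)
```

The text is now well-formed, but the verses are no longer elements that ordinary XML tools can select directly; they have to be reassembled by custom code such as the function above, which is the kind of extra expertise the paragraph warns about.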

The SBL has been sponsoring research to devise a solution to the overlapping hierarchy problem that does not require expertise to implement or use. In a nutshell, the current research is focused on making the XML processor (software that processes the text) see the text much as the first-year student does. In other words, when told to find verses Matt 1:24-25, from:

<verse id="Matt.1.24"><sentence>When Joseph awoke from sleep, he did as the angel of the Lord commanded him; he took her as his wife,</verse><verse id="Matt.1.25"> but had no marital relations with her until she had borne a son; and he named him Jesus.</sentence></verse>

The processor sees only the verse markup, ignoring the <sentence> tags:

<verse id="Matt.1.24"><sentence>When Joseph awoke from sleep, he did as the angel of the Lord commanded him; he took her as his wife,</verse><verse id="Matt.1.25"> but had no marital relations with her until she had borne a son; and he named him Jesus.</sentence></verse>

And when told to process sentences in the text, the processor sees only the sentence markup, ignoring the <verse> tags:



<verse id="Matt.1.24"><sentence>When Joseph awoke from sleep, he did as the angel of the Lord commanded him; he took her as his wife,</verse><verse id="Matt.1.25"> but had no marital relations with her until she had borne a son; and he named him Jesus.</sentence></verse>
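The idea of a processor that looks at one hierarchy at a time can be sketched in a few lines of Python. This is not the SBL's software, only an illustration under simplifying assumptions: the function name view, the regular expression, and the <passage> wrapper are all hypothetical, and the tag pattern assumes that attribute values never contain a ">" character.

```python
import re
import xml.etree.ElementTree as ET

# The marked-up passage, with verses and sentence overlapping.
MARKED_UP = (
    '<verse id="Matt.1.24"><sentence>When Joseph awoke from sleep, he did as '
    'the angel of the Lord commanded him; he took her as his wife,</verse>'
    '<verse id="Matt.1.25"> but had no marital relations with her until she '
    'had borne a son; and he named him Jesus.</sentence></verse>'
)

# Matches a start or end tag and captures the element name.
TAG = re.compile(r'</?([A-Za-z][\w.-]*)[^>]*>')


def view(text, hierarchy):
    """Hide every tag except those named `hierarchy`, then parse the result.

    The hypothetical <passage> wrapper supplies the single root element
    that an XML parser requires.
    """
    def keep_or_drop(match):
        return match.group(0) if match.group(1) == hierarchy else ""

    filtered = TAG.sub(keep_or_drop, text)
    return ET.fromstring("<passage>" + filtered + "</passage>")


# Asked for verses, the processor sees two verse elements.
for verse in view(MARKED_UP, "verse").findall("verse"):
    print(verse.get("id"), "->", verse.text.strip())

# Asked for sentences, it sees one sentence spanning both verses.
for sentence in view(MARKED_UP, "sentence").findall("sentence"):
    print(sentence.text)
```

Each filtered view is well-formed on its own, so ordinary XML tools can take over once the other hierarchy has been hidden.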

Such processing allows scholars to mark any structure they find in a text, without regard to the customary limitations of XML. Using a variation of that processing, it is also possible to determine when a particular word or set of words occurs within different structures in the text, which may overlap.
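The question of which structures a given word falls within can be sketched by recording, for every element, where it starts and ends in the plain text. The code below is an illustration under the same simplifying assumptions as before (no ">" inside attribute values); the function names spans and structures_containing are hypothetical, and only the first occurrence of the word is examined.

```python
import re

MARKED_UP = (
    '<verse id="Matt.1.24"><sentence>When Joseph awoke from sleep, he did as '
    'the angel of the Lord commanded him; he took her as his wife,</verse>'
    '<verse id="Matt.1.25"> but had no marital relations with her until she '
    'had borne a son; and he named him Jesus.</sentence></verse>'
)

# Captures: "/" for an end tag, the element name, and the raw attribute text.
TAG = re.compile(r'<(/?)([A-Za-z][\w.-]*)([^>]*)>')


def spans(text):
    """Return the plain text and a list of (name, attrs, start, end) spans.

    Offsets refer to positions in the plain text. Start and end tags are
    paired per element name, so different hierarchies may overlap freely.
    """
    plain_parts, found = [], []
    open_tags = {}          # element name -> stack of (attrs, start offset)
    offset = pos = 0
    for m in TAG.finditer(text):
        chunk = text[pos:m.start()]
        plain_parts.append(chunk)
        offset += len(chunk)
        closing, name, attrs = m.group(1), m.group(2), m.group(3).strip()
        if closing:
            open_attrs, start = open_tags[name].pop()
            found.append((name, open_attrs, start, offset))
        else:
            open_tags.setdefault(name, []).append((attrs, offset))
        pos = m.end()
    plain_parts.append(text[pos:])
    return "".join(plain_parts), found


def structures_containing(text, word):
    """List (name, attrs) for each structure enclosing the word's first occurrence."""
    plain, found = spans(text)
    at = plain.find(word)
    if at == -1:
        return []
    end = at + len(word)
    return [(name, attrs) for name, attrs, s, e in found if s <= at and end <= e]


print(structures_containing(MARKED_UP, "angel"))
print(structures_containing(MARKED_UP, "Jesus"))
```

Here "angel" is reported inside verse Matt.1.24 and the sentence, while "Jesus" is reported inside the same sentence and verse Matt.1.25, even though the two verses and the sentence overlap.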

Scholars who encode biblical texts for more than temporary use should be mindful of the problem of overlapping hierarchies and the proposed solutions to those problems. A good starting place for researching those problems and the range of possible solutions is at: http://www.sbl-site2.org/Overlap. Papers that address this problem with citations to prior solutions as well as the ongoing research by the SBL are located at this site.

Poorly used, XML is a procrustean bed for biblical texts. Used with attention to its limitations and the ongoing development of better tools, XML can serve the needs of biblical scholars as they explore the complexities of biblical texts.

Patrick Durusau is the SBL's Director of Research and Development.

Citation: Patrick Durusau, "XML: A Procrustean Bed for Biblical Texts?," SBL Forum, n.p. [cited Feb 2006]. Online: http://sbl-site.org/Article.aspx?ArticleID=131

 



© 2024, Society of Biblical Literature. All Rights Reserved.