Appendix B. Transforming a TEI document

This appendix explains how to create a stylesheet in Oxygen that transforms a more complicated XML document, such as a letter encoded in TEI.

1. Create an empty stylesheet

Start Oxygen, then open and validate the TEI file. Next, select File > New and choose XSL Stylesheet in the window that appears. This should produce the XML declaration and an empty <stylesheet> element. As explained in the XSLT tutorial, each template must be included in the document as a direct child of the <stylesheet> element. Also recall that the first template that you write must match either the root element or the root node. More concretely, we have two options for the match attribute of our first template. We can use either

<xsl:template match="/">

or

<xsl:template match="TEI.2">
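The practical difference between the two options shows up in the paths written inside the template. The following is a minimal sketch of the first option, assuming (as in the rest of this appendix) that <text> is a direct child of <TEI.2>. With match="/" the context is the root node, so paths start with the name of the root element. With match="TEI.2" the context is the root element itself, and the same path would be written without that first step.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Option 1: match the root node. The context is the document as a whole,
       so paths inside this template begin with the root element's name. -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="TEI.2/text"/>
      </body>
    </html>
  </xsl:template>

  <!-- Option 2 would use match="TEI.2"; the path above would then be
       written as select="text", because the context is already TEI.2. -->

</xsl:stylesheet>
```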

Next, we shall create an HTML file which has more or less the following structure:

<html>

<head><title>TEI</title></head>

<body>

<table>

[title of the letter]
  [text of the letter]

</table>

</body>
</html>

The body of the HTML file that we will make will contain a table. In HTML files, tables are often used to position the various elements on the page. Such a table, together with the required HTML tags (<html>, <head> and <body>), belongs to the "basis" of the web page, and these can be created as soon as the root element or node is found in the document. The table consists of two rows. The first row contains only one column, and the title of the letter should be printed at this location.

The second row contains two columns, but the first of these remains empty. It only creates a "margin" on the page. The second column in the second row contains all the text of the letter. Since we want to create an HTML file, we can include in our first template all the HTML tags that should minimally appear in the output file. In the stylesheet below you can see that the template has a match attribute with the root element (TEI.2). This template also contains all the HTML tags that produce the basic layout of the page.

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="TEI.2">

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
</head>
<body text="#333333" bgcolor="#FFFFCC"><font face="Arial, Helvetica, sans-serif">
<table border="0" width="100%" cellpadding="40">
<tr>
<td colspan="2">
<h3><u><i> <!-- title of the letter goes here --> </i></u> </h3>
</td>
</tr>
<tr>
<td width="17%"></td>
<td width="83%"><!-- text of the letter goes here -->
</td>
</tr>
</table></font></body>
</html>

</xsl:template>

</xsl:stylesheet>

Next, we need to make sure that the appropriate text is extracted from the XML document and that it is entered at the right location in the HTML file.

2. Using XSLT Statements

Selecting the title of the letter

We can begin by entering the title of the letter in the first row of the HTML table. One of the locations from which the title can be selected is the teiHeader. The teiHeader contains a <titleStmt>, which is contained within <biblFull>, which in turn is contained within <sourceDesc>, inside <fileDesc>. The complexity of working with TEI is caused primarily by the fact that the paths that you need to write can easily become very lengthy, since the file consists of many different levels. With such long paths, it is very easy to make a typing mistake, in which case the XPath expression will produce no results.

One way of checking whether you have keyed in the correct path to a location is to copy the expression that you use in the XSLT stylesheet into the XPath text field in the top right-hand corner of the Oxygen workbench. Realise that if you use this window, no specific context has been set. For this reason, each path that you type in must always depart from either the root element or the root node. If the path is correct, there will be a message that the XPath query was successful, and the result of the query will be displayed in a separate window underneath the XML document.

In this XPath window in Oxygen, you can check that the XPath expression

TEI.2/teiHeader/fileDesc/sourceDesc/biblFull/titleStmt/title

does indeed select the text that we need.

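This XPath expression can now be used in an <xsl:value-of> statement. Inside the template that matches TEI.2, the context node is already the <TEI.2> element, so the expression must be written relative to it, that is, without the leading TEI.2/ step.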
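A minimal sketch of how the statement could look inside the first template, with the HTML pared down to the title for brevity. It assumes the header structure described above. Because the template matches TEI.2, the path is relative to that element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Inside this template the context node is the TEI.2 element -->
  <xsl:template match="TEI.2">
    <html>
      <body>
        <h3>
          <!-- Relative path: no leading "TEI.2/" step is needed here -->
          <xsl:value-of select="teiHeader/fileDesc/sourceDesc/biblFull/titleStmt/title"/>
        </h3>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

In the full stylesheet, the <xsl:value-of> element would take the place of the comment <!-- title of the letter goes here -->.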

Selecting the text of the letter

Since the template that we have created is already quite long, this would be a good point to start a second template. The first template sets up the basics of the HTML page and selects the relevant information from the <teiHeader>, placing the title in the appropriate place in the HTML file. Other aspects of the transformation can then be arranged in other templates. Remember that you can use the <xsl:apply-templates> element to invoke other templates.

All the text that we need to include from the letter is contained within <div1>, which is found under <body>, under <text>, so it would be very convenient if the second template set the current context to the <div1> element within <text>. Recall that the <xsl:apply-templates> element may contain a select attribute which points to a location further down in the hierarchy. In that case, the XSLT processor looks in the stylesheet for a template whose match attribute matches the selected element. If a match is found, the context changes to that element, and the instructions inside the template are followed.

To change the context to <div1> in our current stylesheet, we must first write a template which has a match attribute with the value "div1". Secondly, we need to invoke this second template from the first template. To do this, replace <!-- text of the letter goes here --> with <xsl:apply-templates select="text/body/div1"/>. Once <div1> is matched, all the XPath expressions in the second template can depart from the <div1> element.
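The two steps can be sketched as follows. The HTML of the first template is reduced to the second table row, and the body of the second template is left as a placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- First template: builds the page and hands over to the div1 template -->
  <xsl:template match="TEI.2">
    <html>
      <body>
        <table>
          <tr>
            <td width="17%"></td>
            <td width="83%">
              <xsl:apply-templates select="text/body/div1"/>
            </td>
          </tr>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- Second template: every path written in here starts from div1 -->
  <xsl:template match="div1">
    <!-- instructions for the opener, the paragraphs and the closer go here -->
  </xsl:template>

</xsl:stylesheet>
```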

We shall display the name of the place and the date from the <opener> in the TEI document first, using right alignment. To align text in HTML, the element that contains this text should be given the align attribute with a value of left, center or right. As explained earlier, you can select the value of an element by using the <xsl:value-of> element:

<xsl:template match="div1">
<p align="right">
<xsl:value-of select="opener/dateline/name"/>
<xsl:text>, </xsl:text>
<xsl:value-of select="opener/dateline/date"/>
</p>
</xsl:template>

Next, you need to select the salute. This text will be given left alignment, but since this is the default way of aligning text in HTML, you do not need a special instruction to align text to the left. There is normally only one <salute> element, so it can also be added to the output simply by using <xsl:value-of>.
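A possible version of the div1 template with the salute added is sketched below. The path opener/salute is an assumption: it supposes that <salute> is a child of <opener> in your letter, so check the location in your own TEI file and adjust the path if necessary. The template is meant to be merged into the stylesheet built so far.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="div1">
    <!-- Place and date, aligned to the right -->
    <p align="right">
      <xsl:value-of select="opener/dateline/name"/>
      <xsl:text>, </xsl:text>
      <xsl:value-of select="opener/dateline/date"/>
    </p>
    <!-- Salute, left-aligned by default.
         Assumption: salute is a child of opener; adjust the path if not. -->
    <p>
      <xsl:value-of select="opener/salute"/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```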

The situation for the <p> elements in the TEI letter is different, as there will often be more than one paragraph in the letter. To ensure that every single paragraph is selected, use <xsl:for-each select="p">.

Within the <xsl:for-each> block, you have various options for selecting the paragraphs. You can choose, for instance, <xsl:value-of select="."/>. This selects the text value of the <p> element, including the text of all its subelements. As explained in part 11 of the tutorial, the disadvantage is that the child nodes of the <p> element can then no longer be processed individually. Another option is <xsl:apply-templates select="."/>. This instructs the XSLT processor to look for a matching template for the <p> element. If no matching template is present, a built-in template is used, which copies the text of the element and looks for matching templates for each of its subelements in turn. The second option is more flexible because it allows you to use templates for some of the subnodes of the <p> element as well.

A paragraph in a TEI-encoded letter may contain <lb/> elements, each of which indicates a line break in the original letter. You can display these line breaks by writing a separate template for the <lb> element. This can be a very simple template: all it needs to do is output an HTML line break each time an <lb/> element is found in the XML file. When you use <xsl:apply-templates>, the template for <lb> is invoked each time an <lb/> element is encountered in the XML document.
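The second option could be written as in the sketch below, which assumes that the <p> elements are direct children of <div1>. Each TEI paragraph is wrapped in an HTML <p> element, and its contents are passed on to whatever templates the stylesheet provides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="div1">
    <!-- Visit every p element that is a direct child of div1 -->
    <xsl:for-each select="p">
      <p>
        <!-- No template matches p in this sketch, so the built-in rules
             copy its text and look for templates for its subelements -->
        <xsl:apply-templates select="."/>
      </p>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```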

<xsl:template match="lb">
<br/>
</xsl:template>

The closer can be created in much the same way as the opener. Note, however, that the <signed> element is centred, so you need to write the following code for this particular element:

<p align="center">
<xsl:value-of select="signed"/>
</p>
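The path signed in this code resolves only when the context node is the <closer> element. One way to arrange that, offered here as a sketch rather than as the only approach, is to give <closer> a template of its own and to invoke it from the div1 template. The sketch assumes that <closer> is a direct child of <div1>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="div1">
    <!-- opener, salute and paragraphs are handled here as described earlier -->
    <!-- Assumption: closer is a direct child of div1 -->
    <xsl:apply-templates select="closer"/>
  </xsl:template>

  <!-- Inside this template the context node is closer, so "signed" resolves -->
  <xsl:template match="closer">
    <p align="center">
      <xsl:value-of select="signed"/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Alternatively, the paragraph could stay inside the div1 template, in which case the path would become closer/signed.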