Appendix C. XSL:FO

You should now have a fairly good idea of how XML files can be transformed into HTML files. As mentioned in the introduction of this course, XSL can be used for different types of transformations. This section discusses XSL:FO, a language for giving formatting instructions. In combination with XSLT, XSL:FO enables you to create a PDF file on the basis of an XML document.

A stylesheet that can be used to perform this type of transformation is tei-pdf.xsl, which you can download from this website. When you open this stylesheet in Oxygen, you will see that the document contains both XSLT elements and XSL:FO instructions. There is a distinction between transforming and formatting. When a file is transformed, the information from the original file is generally re-used in a different way: the items in the file may be sorted differently, or there may be a shift from one mark-up language to another. Formatting instructions, on the other hand, deal with aspects such as the font size, the font family, the colour of the text, and the margins and dimensions of a page. If you use XSLT to transform XML into HTML, the distinction is somewhat blurred, because HTML tags are very often used to create a visual presentation of a text in a web browser.
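To illustrate how the two languages are mixed in one stylesheet, here is a minimal sketch of a single template rule. It is not taken from tei-pdf.xsl; the match pattern `tei:p`, the use of the TEI namespace, and the formatting values are assumptions chosen for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <!-- Transforming (XSLT): select every paragraph in the source document.
       The TEI namespace and element name are assumptions. -->
  <xsl:template match="tei:p">
    <!-- Formatting (XSL:FO): describe how the selected text should look.
         The font and spacing values are illustrative only. -->
    <fo:block font-family="serif" font-size="11pt" space-after="6pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

The elements with the `xsl:` prefix decide which parts of the source are processed, while the elements with the `fo:` prefix describe the appearance of the result. A complete stylesheet would also need a template that produces the `<fo:root>` element.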

To transform example.xml into a PDF file, follow these steps:

This part of the course does not provide a detailed explanation of how to write a stylesheet in XSL:FO, since it is a rather complicated language. What follows is only a very general description of the stylesheet tei-pdf.xsl. A more detailed description of XSL:FO can be found in the official W3C Specifications or in this tutorial.

Some of the central concepts used in XSL:FO should be very familiar to you if you already have some experience with Adobe InDesign. When you open the stylesheet and inspect the source code, you should see that the template contains two distinct parts:

<?xml version="1.0" encoding="ISO-8859-1"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

<fo:layout-master-set>
<fo:simple-page-master master-name="default-page-master">
...
</fo:simple-page-master>
</fo:layout-master-set>

<fo:page-sequence master-reference="default-page-master">

</fo:page-sequence>

</fo:root>

In general terms, <fo:layout-master-set> and <fo:page-sequence> correspond to the distinction in InDesign between template mode and publication mode, respectively. The <fo:layout-master-set> contains one or more page templates, which are called <fo:simple-page-master> elements. Such a master page contains the structural elements which should appear on every page of the document. Typographical specifications such as the general page dimensions and margin settings should be defined on this page. The actual text is then placed in a page within the <fo:page-sequence> element. The <fo:page-sequence> refers to the <fo:simple-page-master> by the name that has been specified in its master-name attribute ("default-page-master" in the above example).
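As a sketch of how page dimensions and margins might be specified on a master page, the example below defines an A4 page and a page sequence that refers to it by name. The dimensions, margin values and sample text are assumptions made for illustration; they are not the values used in tei-pdf.xsl.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <fo:layout-master-set>
    <!-- The master page: page size and margins (illustrative A4 values) -->
    <fo:simple-page-master master-name="default-page-master"
        page-width="210mm" page-height="297mm"
        margin-top="20mm" margin-bottom="20mm"
        margin-left="25mm" margin-right="25mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- master-reference must match the master-name given above -->
  <fo:page-sequence master-reference="default-page-master">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>The actual text of the document goes here.</fo:block>
    </fo:flow>
  </fo:page-sequence>

</fo:root>
```

If the value of master-reference does not match any master-name in the <fo:layout-master-set>, a formatter has no page template to apply and will typically report an error.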

The layout of each page is defined in terms of regions. A page in XSL:FO can contain five regions: region-body (the body of the page), region-before (the header of the page), region-after (the footer of the page), region-start (the left sidebar, in left-to-right text) and region-end (the right sidebar, in left-to-right text). Text is normally placed within the region-body, inside <fo:block> elements. Such blocks are comparable to the text frames in InDesign.
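The regions can be sketched as follows. This example declares a body, a header and a footer on the master page and then fills each of them from the page sequence. All sizes and the sample text are assumptions for illustration; the flow names `xsl-region-body`, `xsl-region-before` and `xsl-region-after` are the default names that XSL:FO assigns to these regions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <fo:layout-master-set>
    <fo:simple-page-master master-name="default-page-master"
        page-width="210mm" page-height="297mm"
        margin-top="15mm" margin-bottom="15mm"
        margin-left="20mm" margin-right="20mm">
      <!-- The body margins leave room for the header and the footer -->
      <fo:region-body margin-top="20mm" margin-bottom="20mm"/>
      <fo:region-before extent="15mm"/>
      <fo:region-after extent="15mm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <fo:page-sequence master-reference="default-page-master">
    <!-- Repeated on every page: header and footer -->
    <fo:static-content flow-name="xsl-region-before">
      <fo:block font-size="9pt" text-align="center">Sample header</fo:block>
    </fo:static-content>
    <fo:static-content flow-name="xsl-region-after">
      <fo:block font-size="9pt" text-align="center">
        <fo:page-number/>
      </fo:block>
    </fo:static-content>

    <!-- The running text, which flows from page to page in the body -->
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-family="serif" font-size="11pt" space-after="6pt">
        First paragraph of the text.
      </fo:block>
      <fo:block font-family="serif" font-size="11pt">
        Second paragraph of the text.
      </fo:block>
    </fo:flow>
  </fo:page-sequence>

</fo:root>
```

Only region-body is required on a master page; the other four regions are optional and can be left out when no header, footer or sidebar is needed.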