Appendix C. XSL:FO
You should now have a fairly good idea of how XML files can be transformed into HTML files. As mentioned in the introduction of this course, you can use XSL for different types of transformations. This section discusses XSL:FO (XSL Formatting Objects, usually written XSL-FO), a language for giving formatting instructions. In combination with XSLT, XSL:FO enables you to create a PDF file on the basis of an XML document.
A stylesheet that can be used to perform this type of transformation is tei-pdf.xsl, which you can download from this website. When you open this stylesheet in Oxygen, you will see that the document contains both XSLT elements and XSL:FO instructions. There is a distinction between transforming and formatting. When a file is transformed, this generally means that the information from the original file is re-used in a different way. It may mean that the items in the file are sorted in a different way, or that there is a shift from one mark-up language to another. Formatting instructions, on the other hand, always deal with aspects such as the size of the font, the font family, the colour of the text, the margins and the dimensions of a page. If you use XSLT to transform XML into HTML, the distinction is somewhat blurred, because HTML tags are very often used to create a visual presentation of a text in a web browser.
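The difference between the two kinds of instructions can be illustrated with a small fragment. The following is only a sketch and is not taken from tei-pdf.xsl: it assumes that the source document marks paragraphs with a <p> element in the TEI namespace, and the font settings are arbitrary example values. The elements with the xsl: prefix select content from the source (transforming), whereas the element with the fo: prefix describes how that content should look (formatting).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <!-- Transforming (XSLT): match every paragraph in the source.
       The element name tei:p is an assumption about the source file. -->
  <xsl:template match="tei:p">
    <!-- Formatting (XSL:FO): describe how the paragraph is presented.
         The font and spacing values are illustrative only. -->
    <fo:block font-family="serif" font-size="11pt" space-after="6pt">
      <!-- XSLT again: process whatever the paragraph contains -->
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

On its own this fragment does not yet produce a complete PDF, because the output still lacks the page layout that is described later in this appendix.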
To transform example.xml into a PDF-file, follow these steps:
- Create a new transformation scenario. To do this, click on the "Configure
Transformation Scenario" button. It has an icon with a spanner and a
screwdriver.
- In the window that pops up, click on "New". Another window appears.
At the top of this new window, you can see a text field in which you can
enter a name for the scenario that you are about to create. Type, for
instance, "Transformation1". This is the name that will appear in the
"Configure Transformation Scenario" window. The new window contains three tabs:
- Click on the tab which is marked "XSLT". This screen contains a text field which is labelled "XSLT URL", and here you can select the stylesheet that you want to use. Click on the "open" button to navigate to tei-pdf.xsl.
- The second tab is labelled "FO Processor". Transformations on the basis of stylesheets which contain XSL:FO instructions need to be performed by an FO processor. Oxygen comes with a built-in FO processor, Apache FOP. To activate this processor, you need to select the check box in front of "Perform FO Processing".
- The rightmost tab allows you to name the output file. After "Save
as", type in the name that you would like to give to the PDF file
(e.g. output.pdf). If you do not specify a path, Oxygen will automatically
save the output in the same directory as the stylesheet and the XML file.
- Click on "OK" in the current window to switch back to the underlying window (the "Configure Transformation Scenario" window). Check that the scenario that you have just created appears in the list and that it is selected, and then click on "OK" to close this window.
- If a scenario is available and if it has been configured correctly, you
can execute the transformation by clicking on the "Apply Transformation"
button, which is marked by a red arrow.
If the transformation is successful, a PDF file with the name that you have specified will have been created.
This part of the course will not provide a detailed explanation of how to write a stylesheet in XSL:FO, since it is a rather complicated language. What follows is only a very general description of the stylesheet tei-pdf.xsl. A more detailed description of XSL:FO can be found in the official W3C Specifications or in this tutorial.
Some of the central concepts which are used in XSL:FO should be very familiar to you if you already have some experience with working in Adobe InDesign. When you open the stylesheet and inspect the source code, you should see that the template contains two distinct parts:
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="default-page-master">
      ...
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="default-page-master">
  </fo:page-sequence>
</fo:root>
In general terms, <fo:layout-master-set> and <fo:page-sequence> correspond to the distinction in InDesign between template mode and publication mode, respectively. The <fo:layout-master-set> contains one or more page templates, which are called <fo:simple-page-master> elements. Such a page master defines the structure which every page that is based on it should have. Typographical specifications such as the general page dimensions and margin settings should be defined here. The actual text is then placed within the <fo:page-sequence> element. The master-reference attribute of the <fo:page-sequence> refers to the <fo:simple-page-master> by the name that has been specified in its master-name attribute ("default-page-master" in the above example).
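To make this division concrete, here is a minimal complete XSL:FO document. It is a sketch and not the content of tei-pdf.xsl: the A4 page size, the margin values and the sample text are assumptions chosen for illustration. Note the <fo:flow> element, which XSL:FO requires inside a <fo:page-sequence>; it names the region into which the text is poured.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- "Template mode": the page master defines page size and margins.
       The dimensions below (A4 with example margins) are assumptions. -->
  <fo:layout-master-set>
    <fo:simple-page-master master-name="default-page-master"
        page-width="210mm" page-height="297mm"
        margin-top="20mm" margin-bottom="20mm"
        margin-left="25mm" margin-right="25mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- "Publication mode": the actual text, laid out on pages that
       follow the page master named in master-reference -->
  <fo:page-sequence master-reference="default-page-master">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Text of the document.</fo:block>
    </fo:flow>
  </fo:page-sequence>

</fo:root>
```

If the text does not fit on one page, the FO processor creates as many pages as are needed, all based on the same page master.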
Each page in XSL:FO is divided into five regions, and the layout of the page is defined in terms of these: region-body (the body of the page), region-before (the header of the page), region-after (the footer of the page), region-start (the left sidebar, in a script that is written from left to right) and region-end (the right sidebar). Text is normally placed within the region-body, inside a <fo:block> element. Such blocks are comparable to the text frames in InDesign.
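The regions can be sketched as follows. Again, this is an illustration under stated assumptions and not the layout used in tei-pdf.xsl: all measurements and the header text are invented for the example. The margins of the region-body leave room for the four surrounding regions, whose size is given by the extent attribute. Text that is repeated on every page, such as a header or a page number, is placed in <fo:static-content>; the running text is placed in <fo:flow>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <fo:layout-master-set>
    <!-- All measurements are example values -->
    <fo:simple-page-master master-name="default-page-master"
        page-width="210mm" page-height="297mm"
        margin-top="10mm" margin-bottom="10mm"
        margin-left="10mm" margin-right="10mm">
      <!-- The body margins keep the text clear of the other regions -->
      <fo:region-body margin-top="20mm" margin-bottom="20mm"
          margin-left="20mm" margin-right="20mm"/>
      <fo:region-before extent="15mm"/>  <!-- header -->
      <fo:region-after extent="15mm"/>   <!-- footer -->
      <fo:region-start extent="15mm"/>   <!-- left sidebar -->
      <fo:region-end extent="15mm"/>     <!-- right sidebar -->
    </fo:simple-page-master>
  </fo:layout-master-set>

  <fo:page-sequence master-reference="default-page-master">
    <!-- Repeated on every page: header and page number -->
    <fo:static-content flow-name="xsl-region-before">
      <fo:block text-align="center">Running header</fo:block>
    </fo:static-content>
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center">Page <fo:page-number/></fo:block>
    </fo:static-content>
    <!-- The running text goes into the body region -->
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="11pt">First paragraph of the text.</fo:block>
      <fo:block font-size="11pt">Second paragraph of the text.</fo:block>
    </fo:flow>
  </fo:page-sequence>

</fo:root>
```

The names xsl-region-before, xsl-region-after and xsl-region-body are the default names which XSL:FO gives to these regions, so they can be used without being declared in the page master.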