<refentry xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="5.0" xml:id="make.index.markup">
<refmeta>
<refentrytitle>make.index.markup</refentrytitle>
<refmiscinfo class="other" otherclass="datatype">boolean</refmiscinfo>
</refmeta>
<refnamediv>
<refname>make.index.markup</refname>
<refpurpose>Generate XML index markup in the index?</refpurpose>
</refnamediv>

<refsynopsisdiv>
<src:fragment xml:id="make.index.markup.frag">
<xsl:param name="make.index.markup" select="0"/>
</src:fragment>
</refsynopsisdiv>

<refsection><info><title>Description</title></info>

<para>This parameter enables a very neat trick for getting properly
merged, collated back-of-the-book indexes. G. Ken Holman suggested
this trick at Extreme Markup Languages 2002 and I'm indebted to him
for it.</para>

<para>Jeni Tennison's excellent code in
<filename>autoidx.xsl</filename> does a great job of merging and
sorting <tag>indexterm</tag>s in the document and building a
back-of-the-book index. However, there's one thing that it cannot
reasonably be expected to do: merge page numbers into ranges. (I would
not have thought that it could collate and suppress duplicate page
numbers, but in fact it appears to manage that task somehow.)</para>

<para>Ken's trick is to produce a document in which the index at the
back of the book is <quote>displayed</quote> as XML markup. Because the index
is generated by the FO processor, all of the page numbers have been resolved.
In other words, instead of having an index at the back of the book that
looks like this:</para>

<blockquote>
<formalpara><info><title>A</title></info>
<para>ap1, 1, 2, 3</para>
</formalpara>
</blockquote>

<para>you get one that looks like this:</para>

<blockquote>
<programlisting>&lt;indexdiv&gt;A&lt;/indexdiv&gt;
&lt;indexentry&gt;
&lt;primaryie&gt;ap1&lt;/primaryie&gt;,
&lt;phrase role="pageno"&gt;1&lt;/phrase&gt;,
&lt;phrase role="pageno"&gt;2&lt;/phrase&gt;,
&lt;phrase role="pageno"&gt;3&lt;/phrase&gt;
&lt;/indexentry&gt;</programlisting>
</blockquote>

<para>After building a PDF file with this sort of odd-looking index, you can
extract the text from the PDF file and the result is a proper index expressed in
XML.</para>
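
<para>As a minimal sketch of how that first pass might be set up, a
customization layer can switch the parameter on before the document is
formatted. The import URI below is an assumption (it is the usual
location of the namespaced FO stylesheet); substitute whatever path
your installation uses.</para>

<programlisting>&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0"&gt;

  &lt;!-- Assumed location of the stock FO stylesheet; adjust as needed. --&gt;
  &lt;xsl:import href="http://docbook.sourceforge.net/release/xsl-ns/current/fo/docbook.xsl"/&gt;

  &lt;!-- Emit the index as literal XML markup in the formatted output. --&gt;
  &lt;xsl:param name="make.index.markup" select="1"/&gt;

&lt;/xsl:stylesheet&gt;</programlisting>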

<para>Now you have data that's amenable to processing, and a simple Perl script
(such as <filename>fo/pdf2index</filename>) can
merge page ranges and generate a proper index.</para>
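
<para>To illustrate the goal of that step, the three consecutive page
numbers in the extracted entry shown earlier would collapse into a single
range. The fragment below is a hand-written sketch of such a merged
entry, not the actual output of <filename>fo/pdf2index</filename>, whose
exact format is not described here.</para>

<programlisting>&lt;indexdiv&gt;&lt;title&gt;A&lt;/title&gt;
&lt;indexentry&gt;
&lt;primaryie&gt;ap1, 1-3&lt;/primaryie&gt;
&lt;/indexentry&gt;
&lt;/indexdiv&gt;</programlisting>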

<para>Finally, reformat your original document using this literal index instead of
an automatically generated one and <quote>bingo</quote>!</para>

</refsection>
</refentry>