Formatting dictionaries with CSS

February 26th, 2008 by Eric Albright

In evaluating CSS as a stylesheet language for formatting dictionaries, I started putting PrinceXML through its paces. I first tried what I considered to be the hardest dictionary layout, the Cobuild dictionary. While I think I have matched many of its features, the sidenotes are just not going to happen without specialized support for them in CSS. (The closest I could get was a float, but of course if you have more than one within a line, they just write on top of each other.) That result is here. I then switched to a more typical layout, which had no problems at all. That result is here. You can get all the files to reproduce this exercise here.

Types of style

There are really a number of items which contribute to the style of a dictionary:

  1. Selection of fields
  2. Order of fields
  3. Textual markup - characters or text that is added before, after, or around items to distinguish a field from surrounding text
  4. Character styles - font changes
  5. Paragraph styles
  6. Page layout - columns

CSS actually allows us to handle items 1 and 3-6. (Selection of fields can be handled by setting the display property to none.) All the textual markup in the examples was done using the CSS content property.
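
As a minimal sketch of how these items map onto CSS properties, the rules below hide one field, add textual markup around another, and set a character style and a column layout. The class names etymology, part-of-speech and entries are assumptions for illustration and are not necessarily the ones used in the sample files; the slash delimiters are likewise only an example.

```css
/* Item 1: selection of fields. Leave a field out of this layout.
   (.etymology is a hypothetical class name.) */
.etymology { display: none; }

/* Item 3: textual markup supplied by the stylesheet, not stored in the data. */
.pronunciation::before { content: "/"; }
.pronunciation::after  { content: "/ "; }

/* Item 4: character style. (.part-of-speech is a hypothetical class name.) */
.part-of-speech { font-style: italic; }

/* Item 6: page layout. Two columns via the CSS3 multi-column properties.
   (.entries is a hypothetical wrapper around all entries.) */
.entries { column-count: 2; column-gap: 1cm; }
```

Item 2 has no counterpart in the sketch: the fields are rendered in the order in which they occur in the document.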

CSS3 Selectors

Another interesting behavior of CSS 3 is that you cannot select the first element having a class containing the word ‘pronunciation’:

.pronunciation:first-of-type

You can only use the :first-of-type selector to select the first element with a particular element name, so generic div and span elements with class attributes would have to be converted to named XML elements instead. There is a way around this, given that our document will be generated from another format: actually add the classes first-of-type and last-of-type to the generated data. Then the data becomes:

<span class="pronunciation first-of-type">...</span><span class="pronunciation">...</span><span class="pronunciation last-of-type">...</span>

and, when there is only one pronunciation:

<span class="pronunciation first-of-type last-of-type">...</span>
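
With those classes in the data, ordinary class selectors can do the job that :first-of-type cannot. The following is only a sketch; the slash delimiters and comma separator are assumed for illustration.

```css
/* Opening delimiter before the first pronunciation only. */
.pronunciation.first-of-type::before { content: "/"; }

/* Separator after every pronunciation... */
.pronunciation::after { content: ", "; }

/* ...except the last, which gets the closing delimiter instead.
   This selector is more specific, so it overrides the rule above. */
.pronunciation.last-of-type::after { content: "/ "; }
```

An element carrying both classes, as in the single-pronunciation case, picks up the opening and the closing delimiter and no separator.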

Playing with both the xml and xhtml varieties in IE7 and Firefox 2 shows that both browsers do a much better job with the xhtml than with the xml.

Column-span

The only other problem I ran into was that Prince does not yet support the column-span property. This ended up not being a big problem, since I just wanted the heading to span both columns and was able to work around this by giving the first page of the section a 12cm top margin and floating the heading into this space.

4 Responses so far »

  1.

    Greg said,

    February 28, 2008 @ 1:50 am

    I tried the files. I want to thank you for doing this work. It is a good start and looks interesting.

    A couple of points. The .xhtml file included will not load in Oxygen or PrinceXML. It complains of character problems.

    I wonder if the right way is to put the first-of-type inside the class attribute or if a separate attribute should be created for this? The converted file looks excessively bloated with such long designators.

    I also noticed the sample has complicated text formatting but doesn’t include the more complicated elements of dictionary formatting: including pictures and tables. On our page of samples, these were included.

    There seems to be a discussion of how to describe the letter headers in section 14 of:
    http://www.w3.org/TR/2007/WD-css3-gcpm-20070504/ (I found crop marks discussed in the same document.) It also describes how to place pictures in the ways we want.

    So it is a good start. I don’t know if PrinceXML implements enough of this working group article to allow us to use it for these purposes. For us to implement it, we may need to set our priorities too.

  2.

    Eric Albright said,

    March 4, 2008 @ 10:38 am

    Greg, sorry about the character problems. I’ve fixed those and replaced the file. The ‘first-of-type’ class attribute could be ‘first’ or ‘f’ just as well, but what you gain in space, you lose in readability. My purpose here was clarity. The converted file is transient; I would be more concerned about what the stylesheet authors will use.

    In the updated version, I also added an image and a locator (letter header). I still haven’t included tables, since table support shouldn’t really be a version 1 feature.

  3.

    Håkon Wium Lie said,

    March 7, 2008 @ 4:31 pm

    This is great work — you are pushing CSS and Prince to where few have ventured!

    Sidenotes are probably best handled with floats and negative margins in CSS. The idea is that you float a sidenote to the side and then use a negative margin value to push it outside the box from where the sidenote naturally appears.

    Unfortunately, Prince6 has a bug where floats may overlap when pushed fully outside their box. The bug has a workaround: ensure that there is a tiny bit of overlap. Both the bug and the workaround can be seen here:

    http://www.princexml.com/howcome/2008/tests/float.html
    http://www.princexml.com/howcome/2008/tests/float.pdf

    Column-span isn’t implemented yet, but page-floating content will span multiple columns. So, you can make the first heading span multiple columns by way of:

    h1 { float: top }

    I use that technique in this document:

    http://www.princexml.com/howcome/2008/wikipedia/s2.pdf

    The document has been created with this command:

    prince --no-author-style -s http://www.princexml.com/howcome/2008/wikipedia/wiki2.css http://en.wikipedia.org/wiki/Soviet_Union -o s2.pdf

    -h&kon

  4.

    Greg said,

    March 8, 2008 @ 12:50 am

    Eric,

    I guess I find it more elegant to handle the punctuation like this:
    <span class="variants"> <span class="variant">v1</span><span class="variant">v2</span></span>

    .variants::before { content: '(' }
    .variants::after { content: ')' }
    .variant + .variant::before { content: ',' }

    This avoids the whole need to pre-process with XSLT to add first-of-type and last-of-type and it also keeps track of the structure of the document.
