September 27, 2004

The Perils of PDF: The Joy of TeX

In our last installment, I looked at OpenOffice and found it wanting (for my purposes, at least). Today, I'll talk about a solution I like better...but first, some history.

Don Knuth is one of the grand old men of the field of computer science. In the 1960s, when he was one of the grand young men of the field of computer science, he began writing a multi-volume tome entitled The Art of Computer Programming. And in the late 1970s, when the proofs for a new edition of the second volume came back from the publisher, Don was purely disgusted at what he found.

Computer science, at its base, is heavily mathematical, and The Art of Computer Programming includes vast quantities of mathematical notation, some of it rather novel. And young Don considered that the publisher had done a lousy job of typesetting it. Reflecting further, he decided that the problem was that he didn't have the right tools. And so, in classic nerd style, he took a break from working on his book and developed a typesetting program he called "TeX" (which, by the way, is pronounced "tech", not "tecks"). TeX is really, really good at putting neatly set type on a page; its algorithm for breaking and justifying lines is the accepted standard. And it's really, really good at mathematical typesetting. And on top of that, it's programmable. But it's kind of low-level, and it's tricky to use.

So in 1985 a fellow named Leslie Lamport came along and wrote a package on top of TeX that he called LaTeX. LaTeX makes TeX easy to use. You write your document as a plain text file, and indicate the logical structure (chapter headings, section headings, etc.) with a special mark-up notation. The result is somewhat similar to the HTML used to create the page you're reading (or, rather, HTML is somewhat like LaTeX, since Tim Berners-Lee didn't invent the World Wide Web until 1989); but when you process it, what you get is nicely typeset output. And using TeX is rather like using HTML--you're constantly needing to check your work in a browser of some kind.

I used LaTeX quite a bit for about a year back in the late 1980's, and really liked it, using it for memos and software documentation both. Eventually I switched to a different project using different hardware, and didn't have TeX readily available to me; after that I languished along with word processors until I started using HTML in the mid-1990's. I took to HTML like a duck to water. HTML's one defect, as I saw it, was that it didn't have a macro language; it was memories of LaTeX that later led me to remedy that lack with a tool I call Expand. And somehow I never went back to using LaTeX.

So a couple of weeks ago I started looking into free ways to produce high-quality PDF output--and ran into an interesting name: "pdflatex". LaTeX and PDF together? Interesting! Perhaps, just perhaps.... So I went looking for a LaTeX system for my PowerBook--and Googled my way into a maze of twisty little passages, all more or less the same. There are dozens of slightly different TeX distributions out there, all of them mostly interoperable, and each with its own documentation on-line--and mostly that documentation is in PDF. It took me quite a while to figure out where I was and what I was doing and which version of LaTeX I should use.

I ended up with two packages, the first of which is Gerben Wierda's packaging of TeX-Live. TeX-Live is a TeX/LaTeX distribution augmented with a vast array of add-on packages; it's maintained by the TeX Users Group. Gerben Wierda adds a few additional packages and a very nice installer. Now, LaTeX is a command-line tool, which is sometimes convenient and sometimes a nuisance. So the other package I downloaded is a LaTeX "front-end" called TeXShop. TeXShop provides a tightly coupled editor and viewer so that you can edit your document, press a button, and see the freshly typeset output immediately. It's pretty spiffy.

The proof of the pudding is in the eating, so they say. Once I got TeX-Live and TeXShop installed, was I able to make it do what I need? The answer is a resounding yes. I spent a couple of evenings reading some on-line LaTeX tutorials and refreshing my memory. After that, it took me about an hour to convert the text of Through Darkest Zymurgia from its original form (HTML with Expand macros) to LaTeX format, and there was very little hand-editing involved. I just wrote a couple of short scripts and let the computer do its thing. And then all I needed to do to create the PDF file was push a button.

By way of contrast, it took roughly three hours to print the resulting PDF file on my inkjet printer, so that I could give my brother a copy of the novel to read. Not too shabby.
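
For the curious, the conversion step described above can be sketched in a few lines. This is not the script that was actually used on the novel: the choice of Python, the tag table, and the names html_to_latex, escape_latex, and wrap_document are all assumptions made for illustration, and Expand macros are not handled at all. It only shows the general idea of mapping a handful of HTML tags onto LaTeX's logical mark-up and escaping the characters LaTeX treats specially.

```python
import html
import re

# Hypothetical tag-to-LaTeX table; a real manuscript would need more entries.
TAG_MAP = {
    "<h1>": r"\chapter{", "</h1>": "}\n\n",
    "<h2>": r"\section{", "</h2>": "}\n\n",
    "<p>": "", "</p>": "\n\n",
    "<em>": r"\emph{", "</em>": "}",
    "<b>": r"\textbf{", "</b>": "}",
}

# Characters LaTeX treats specially, with their escaped forms.
SPECIALS = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape LaTeX special characters in one pass (no double escaping)."""
    return re.sub(r"[\\&%$#_{}~^]", lambda m: SPECIALS[m.group()], text)


def html_to_latex(source):
    """Convert a tiny subset of HTML to LaTeX; unknown tags are dropped."""
    pieces = []
    # With a capturing group, re.split alternates text, tag, text, tag, ...
    for index, token in enumerate(re.split(r"(<[^>]+>)", source)):
        if index % 2 == 1:
            pieces.append(TAG_MAP.get(token.lower(), ""))
        else:
            # Decode entities such as &amp; first, then escape for LaTeX.
            pieces.append(escape_latex(html.unescape(token)))
    return "".join(pieces).strip() + "\n"


def wrap_document(body):
    """Wrap converted text in a minimal preamble (book class, for \\chapter)."""
    return (
        "\\documentclass{book}\n"
        "\\begin{document}\n\n"
        + body
        + "\n\\end{document}\n"
    )


if __name__ == "__main__":
    sample = "<h1>Chapter One</h1><p>It was a <em>dark</em> &amp; stormy night.</p>"
    print(wrap_document(html_to_latex(sample)))
```

Saving the printed result to a .tex file and running pdflatex on it (or opening it in TeXShop and pushing the button) should then produce the PDF.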

There's still quite a lot to do, of course, before the manuscript is ready to be uploaded to CafePress. I'll talk about that in the next thrilling installment of The Perils of PDF!

Posted by Will Duquette at September 27, 2004 08:48 PM

steve h said:

I should have known it...

I was introduced to TeX and LaTeX just last year, and loved it. The ability to think of document-creation in the context of a programming language was exhilarating.