
LaTeX Tough Love

October 9th, 2006

There’s a post over at the defmacro blog that may serve as a high-level introduction to TeX for those unfamiliar with the markup language used by academics. The article states that TeX got it right by separating content from style, and that modern attempts to reinvent this particular wheel should be abandoned as they are unnecessary. In his conclusion, the author states,

Word Processors are the least useful components of modern office suits. An argument about Microsoft Word vs. Word Perfect is a false dilemma as there are better alternatives.

Wow. I completely agree that, once you get past the initial learning curve, working with LaTeX (or, more commonly these days, pdflatex) is not so bad. In fact, it’s pretty nice! Typically you’re using a style that someone else made, and at that point you’re just writing text in your favorite text editor and getting very pretty output at the end of the compilation pipeline. The combination of XHTML and CSS comes fairly close to this, but it has no accepted format that represents a static compilation of the content and style into a single file, and it supports much less powerful styling than TeX. But that power comes at enormous expense. When submitting to an IEEE-managed event, you may be using the IEEE LaTeX class, which has items like this:

% V1.6
% LaTeX is a little to quick to use hyphenations
% So, we increase the penalty for their use and raise
% the badness level that triggers an underfull hbox
% warning. The author may still have to tweak things,
% but the appearance will be much better “right out
% of the box” than that under V1.5 and prior.
% TeX default is 50
\hyphenpenalty=750
% If we didn’t adjust the interword spacing, 2200 might be better.
% The TeX default is 1000
\hbadness=1350
% IEEE does not use extra spacing after punctuation
\frenchspacing

In fact, there are many instances of minor tweaks to values in order to make LaTeX do the right thing. Does this remind anyone else of something you’d expect to find on quirksmode despite the fact that there is no multiple-browser problem with TeX? At the end of the day, LaTeX is very sensitive to minor formatting changes, the style sheets are difficult to work with, and you will still likely end up with some coupling of style with your content. This last point is seen when you include typographic instructions (e.g. italic, bold, monospacing) in your content, or when positioning / sizing an image at the top of a column. The former activity is a result of the fact that most people don’t want to touch the prescribed styling for fear of breaking something. The latter often involves specifying exact dimensions for images which are only relevant given a particular text layout, which is specified in the separate class file.
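
To make the coupling concrete, here is a minimal sketch of both cases. The macro name \term is hypothetical, and example-image is the placeholder graphic from the mwe package (assumed to be installed; substitute any image file of your own). It is only an illustration of the idea, not a recommendation from any particular class file.

```latex
\documentclass[twocolumn]{article}
\usepackage{graphicx}

% Hypothetical semantic macro: the styling decision lives in one place.
\newcommand{\term}[1]{\textit{#1}}

\begin{document}

% Coupled: the typographic choice is written directly into the content.
A \textit{badness} value measures how poorly a line fits.

% Decoupled: the content only says what kind of text this is.
A \term{badness} value measures how poorly a line fits.

% Coupled: exact dimensions that only make sense for one layout.
% \includegraphics[width=3.4in]{example-image}

% Less coupled: the width follows whatever column the class defines.
\begin{figure}[t]
  \centering
  \includegraphics[width=\columnwidth]{example-image}
  \caption{A figure sized relative to the current column.}
\end{figure}

\end{document}
```

If the class later changes the column width, or the house style for terms changes from italic to bold, only the class or the one macro definition needs to be touched. The placement hint [t] is still a layout instruction sitting in the content, which is the kind of residual coupling that is hard to avoid entirely.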

It is unfortunate that modern word processors threw out the separation of style and content with WYSIWYG editing as a justification, but I don’t think they’re useless. The fact is that spending time with TeX reveals some key annoyances that modern word processors have addressed.

First, many people fall back to using a generic text editor to work with TeX files. This leads to a situation where final formatting of a document (e.g., ascertaining length, making sure page breaks are appropriate, ensuring figures are legible) involves typing an edit, saving the file, running a lengthy compilation step (particularly when working with bibliographies, where a new citation typically means running LaTeX, then BibTeX, then LaTeX twice more to resolve the references), and then viewing the output PDF in a PDF reader. These steps are somewhat integrated in some applications, but it still feels a bit clumsier than even the old days of having to go to Print Preview in WordPerfect to make sure everything still looked okay. Initial layout of the document is a wonderful experience because the style sheet does all the work, but fine tuning becomes painful because of the discrete compilation and view steps. These stages of document creation are a strong reminder of the benefits of modern document formatting tools. Let me emphasize here that I think TeX authoring has an advantage in the early stages of document creation, since all you need is a text editor, but this use of a less specific tool is not without its shortcomings in the later stages of the authoring process, when you may feel pressured to use a heavier, integrated application for the faster feedback cycle. In fact, I would say that the ease of using any text editor for document creation, together with this particular audience’s existing familiarity with emacs and vi, has lessened the incentive to create modern, integrated TeX editors.

Second, in my experience I see more spelling and grammar mistakes in TeX documents than in Word or OO.o documents. Surely there are many factors at play here, but I think most people just don’t bother setting up a spell-check routine when working in a generic text editor. The integrated help in modern word processors is often annoyingly intrusive, but it does prevent some mistakes.

Third, collaborating on documents is difficult with TeX. This is, really, a result of the fact that any text editor can edit a TeX document, which leads to a situation where everyone uses something different. This is great in that people get to choose their own software, but features like Word’s Track Changes are left by the wayside. Word and OO.o can actually be used in concert while tracking changes with only minor quirks, but Notepad and Emacs offer no such collaborative feature on their own.

As a secondary aspect to this problem, bibliographies can sometimes be tricky to manage among separate people as well. This is more of a usage issue than anything else, but it is a consequence of separating the document into its component parts. Many people maintain a large BibTeX file that includes all the papers they’ve referenced in their writings (aside: BibDesk is a great program for doing this on Mac OS X). Each entry has a name (its citation key), so you can just include a reference to the citation in your document. This is very practical since one often cites the same source in multiple papers on a particular topic. However, this creates a usage model where each author maintains their own large reference library, which must be linked in to the document at compilation time. One consequence is that a single person may have to be the one to compile the document, because theirs is the BibTeX file that includes all the necessary references with the correct names assigned to each reference. (If my BibTeX file refers to a paper as cite-Foo and you called it cite-Bar, then whichever of us compiles our shared TeX file will see the other’s citations come out unresolved.) Alternatively, a group can build up a fresh BibTeX file by having each person copy the relevant portions of their personal BibTeX library into a new file. This loses some of the benefits of maintaining references in one location (duplication of information being generally a bad thing), and represents a fairly strict coupling between separate files.
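
The key mismatch can be sketched in a single self-contained file. Everything in the entry below is a placeholder (the author, title, and the file name shared.bib are hypothetical); only the keys cite-Foo and cite-Bar come from the scenario above. The filecontents environment is used here just so the example writes its own .bib file.

```latex
% Writes shared.bib next to this file on the first run
% (an existing file of that name is left untouched).
\begin{filecontents}{shared.bib}
@misc{cite-Foo,
  author = {A. Author},
  title  = {A Placeholder Paper},
  year   = {2006}
}
\end{filecontents}

\documentclass{article}
\begin{document}

% Resolves: shared.bib has an entry with this key.
My co-author's library knows this paper as~\cite{cite-Foo}.

% Would not resolve: no entry named cite-Bar exists in shared.bib,
% so LaTeX would warn about an undefined citation and print [?].
% I had been calling the same paper~\cite{cite-Bar}.

\bibliographystyle{plain}
\bibliography{shared}

\end{document}
```

Compiling this takes the usual cycle of pdflatex, bibtex, then pdflatex twice. Uncommenting the second citation shows the failure mode: the document still compiles, but the reference is left unresolved until the collaborators agree on one key or one shared .bib file.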

Fourth, style sheet accessibility. This one is totally debatable, and if you’re a LaTeX class writer, then there’s no issue here. I personally prefer CSS’s syntax.

I’ll stop there, and end with this: When I want to produce a scholarly paper, I open TextMate and warm up pdflatex. The output is, quite simply, superb. The author gets truly professional-looking layout, kerning, and typography from very modest effort. However, if I needed my document to be styled radically differently from one of the handful of styles that I have installed, I wouldn’t even bother trying to get TeX to do what I want. The fact is that TeX got some key things right: separation of style from content, plain text source file formats, and compilation into a very exacting document format. But there is a lot of room for improvement here, and more modern word processors showcase how the TeX authoring experience could be improved. Last I checked, Writely doesn’t support TeX syntax yet, but when it, or a similar service, does, look for it to address many of the issues I raised above.

Anthony Software, Software Design

  1. October 10th, 2006 at 14:56 | #1

    As you’ve mentioned at the end, I think that the best way to use LaTeX is to integrate it with other programs, especially web-based ones.

    LaTeX produces beautiful output, but it comes at the price of a learning curve. Users don’t want that, they just want the “beautiful output” part.

    Imagine how beautiful it would be if you could click one button on top of this article, and download the pdf of this post which is generated with LaTeX typesetting.

    As for using LaTeX on everyday documents, I think it’s a lost battle. In times when people tend to think visually, and when you need to create visually rich documents, LaTeX is definitely not the best choice.

  2. October 10th, 2006 at 17:15 | #2

    Petar, I think you’re right about LaTeX for everyday use being unreasonable. What I wonder though is if a typesetting program can give up that market (which is the vast majority of the market as a whole) and still survive in its own niche. Since it is so old, and since academics tend to be so contrarian :) , there seems little chance of LaTeX going anywhere any time soon. That said, LaTeX isn’t going to rise to any new challenges in the same way that it’s not going to produce uglier output going into the future. It’s a relatively stationary target, yet nobody has come along to offer something better. This is probably a case of there just not being a market for anyone to enter; LaTeX is essentially the soil of academia: it’s just there.

    It does however seem that newer technologies like CSS and even WPF could benefit from studying what LaTeX does well. WPF, in particular, seems to offer some very nice text layout options that I would hope CSS and its ilk can compete with over the next year or two.

  3. October 12th, 2006 at 12:47 | #3

    Here’s an example of a LaTeX use case I would like to see more of:
    According to this [1], goffice [2] (on-line office suite) is using LaTeX to produce high-quality output.

    I’m not able to try it because it’s not free, but I think they are on the right track. Users want simple solutions to create beautiful documents, and integrating LaTeX like this is IMO the right way to go.

    [1] http://www.sutor.com/newsite/blog-open/?p=1132
    [2] http://www.goffice.com

  4. October 13th, 2006 at 00:34 | #4

    Petar, nice find! That does look like just the sort of thing we’re talking about, but that website is really leaving me cold. I couldn’t find any kind of free trial or even a screencast. Next question: how much of LaTeX is exposed? If you present a subset of the available formatting commands in GUI form, how does everything pan out if you then go ahead and use other TeX markup that’s not offered by the GUI?

    Boy, I can’t say enough how disappointing it is that the website doesn’t give a better representation of that software.

  5. October 27th, 2006 at 15:24 | #5

    As far as I could tell from their F.A.Q. (you’re absolutely right about the website… it’s awful) they do not expose LaTeX at all, they just use it as a backend, which, if implemented correctly is (IMHO) the right way to go for the ordinary users.

    But more news appeared this week: the new OpenOffice.org 2.0.4 supports direct LaTeX export! I still haven’t tried it, so I can’t comment on the implementation, but I’ll post my impressions on my blog as soon as I can.
