I'm an unhappy user of the LaTeX text-formatting system. The journals I publish in all require it. But I hate it.
It's hard to find the information needed to understand how to use LaTeX. Lamport's book lacks much of what you need; the on-line documentation focuses on minutiae, and does not provide a conceptual orientation to the typesetting system. I'll try to provide that orientation here. If you find LaTeX as confusing and user-hostile as I do, this page is for you.
Documentation installed along with TeX can be read with the texdoc command. Type
texdoc texdoc
at a shell prompt to see how that works.
Then the command
texdoc texlive
will show the documentation for texlive itself. (You can skip its Chapter 3, about how to install the package, if you've used Debian's apt tool to install texlive.) Installing the texlive package actually installs several other packages, so you get dozens of TeX-related commands, like latex and bibtex and pdflatex, as well as management commands like tlmgr, recommended fonts, and other utilities.
Now, let's concentrate on LaTeX.
Superficially, LaTeX appears simple: take the text you want to typeset, add some formatting commands to it, run the marked-up text through the latex program, and the output (after a little further processing) is a nicely-typeset PostScript or PDF file that can be printed. It seems like a fancier version of HTML. But this notion is oversimplified and misleading.
The input to the latex command is a flat ASCII file, which contains both text to be formatted, and instructions to the formatter (LaTeX/TeX). The immediate output is not itself printable; it's a “device-independent” or DVI file, which must be post-processed before printing. The ultimate output should be a beautifully typeset document. But a lot happens in the meantime; if anything goes wrong, you have a mess on your hands. So you need to understand a few details.
LaTeX is really a set of macros for the underlying typesetting system, TeX, whose formatting commands are very low-level. LaTeX is intended to provide a high-level language to access the power of TeX. Unfortunately, it hides so much of what's actually going on that it's usually difficult to track down the problems that inevitably occur.
Furthermore, LaTeX and TeX aren't just markup codes like HTML. They're full-fledged programming languages. Think of their formatting commands as forming a program, whose data are the text to be set in type, and whose final output is a printed page. From this point of view, the LaTeX macros are like subroutines that you call to perform particular tasks (like setting a heading in larger type). Supplemental macro packages, like graphicx (which allows images to be added to the typeset pages), are like subroutine libraries.
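To make the subroutine analogy concrete, here is a minimal sketch. The macro name \keyword and the sample text are invented for this illustration; \scalebox is one of the commands that the graphicx package provides.

```latex
\documentclass{article}
\usepackage{graphicx}   % a "subroutine library": provides \scalebox, \includegraphics, ...

% Define a "subroutine" with one parameter, written #1 in its body.
% (\keyword is a hypothetical name chosen for this example.)
\newcommand{\keyword}[1]{\textbf{#1}}

\begin{document}
A \keyword{macro} is called by name, and its argument is substituted
into the definition before anything is typeset.

% A call into the graphicx "library": set the text at twice its natural size.
\scalebox{2}{Twice as large}
\end{document}
```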
Another programming language related to typesetting and printing is PostScript. It, too, is a complete programming language; in many ways, a more complete one than TeX. And TeX documents often are converted to PostScript for printing. But, while PostScript is intimately connected to the marks that get put on the page, TeX and LaTeX know nothing of the actual glyphs and graphics to be printed or displayed. They deal only with how things are arranged or “laid out” on the page, not the forms of the things themselves. So, while a PostScript font describes in excruciating detail the actual shapes of letters and other characters, TeX understands only their bounding boxes. The characters themselves are hidden from TeX in font files, instead of being available to the program as they are in PostScript.
But, although fonts are central to typesetting, the TeX user doesn't see much information related to fonts. Unfortunately, fonts come in several quite different forms; so there have to be programs — invoked automatically by LaTeX — that convert raw font data to the form TeX needs. Indeed, there is a whole infrastructure of scripts and libraries that are used by LaTeX, which relies on a special system of directories and subdirectories and indexes to find everything it needs.
That's fine when everything works smoothly in the background. But problems arise if this system is misconfigured in any way; not everything that goes wrong is due to errors in the user's input.
Let's begin with that input file. In what follows, I'll call it ms.tex.
The input file, ms.tex, contains both high-level instructions to the LaTeX interpreter, and the data (i.e., text) that TeX ultimately formats. This mixture of program and data can be confusing.
In this respect, LaTeX input resembles a session in the vi editor, which has separate command and input modes. But in vi, the editor switches from command to input mode only when explicitly told to do so. LaTeX and TeX aren't as explicit: when they finish processing a command, the formatters return to input mode automatically. This makes the input file more difficult to read, because a command can be terminated by just a blank. So it's a little hard to keep track of what mode you're in at a particular place in the input document.
TeX recognizes math mode in a special way: material to be set as mathematics is delimited by $ signs, which toggle between normal-text input and math-input modes.
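A small sketch of the $ toggle; the sentence itself is only an illustration.

```latex
\documentclass{article}
\begin{document}
% The first $ of each pair switches to math mode; the second switches back.
The area of a circle is $A = \pi r^2$, where $r$ is the radius.
\end{document}
```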
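Curly braces delimit a group, and a typeface change made inside a group lasts only until the group ends. So if you type
some text in Roman type and then {\it something in Italics}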
you get
some text in Roman type and then something in Italics
as the formatted output.
Unfortunately, curly braces are used for other purposes as well. They're used to delimit the arguments to some of those formatting subroutines (or macros). A common example would be
\section{This is a section heading}
which sets the words This is a section heading in larger, boldface type. The text of the heading, in braces, is the argument of the \section command, which acts like a LaTeX subroutine.
This “overloading” of braces is confusing: the command to switch to a section heading is placed outside the braces, but the command to switch typefaces is inside the braces. Just another example of bad user-interface design.
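The two uses of braces can be put side by side in a short sketch. The last line uses \textit, a standard LaTeX command that takes the italic text as an argument in the same way \section does; the sample sentences are only illustrations.

```latex
\documentclass{article}
\begin{document}
% Braces as an argument delimiter: the command stands outside the braces.
\section{This is a section heading}

% Braces as a group: the command stands inside, and its effect
% ends at the closing brace.
Some text in Roman type and then {\it something in Italics} and Roman again.

% \textit takes the text as an argument instead, as \section does.
Some text in Roman type and then \textit{something in Italics} and Roman again.
\end{document}
```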
Yet another construct is the environment, which has the form
\begin{environment}
input to be formatted in a particular way goes here
\end{environment}
An environment sets up a particular style in which text is formatted. Some environments make lists: itemize, enumerate, description. Some make other types of displayed matter: center, figure, table, equation. There are many others. (Again, the variety of treatments is potentially confusing.)
The most important environment is the one that contains all the text of a document: you might have noticed that it must be contained within a
\begin{document}
. . .
\end{document}
pair. (The document environment sets up ordinary paragraph-style text formatting.) As other environments can be nested inside the document environment, you could (rightly) expect that further nesting is possible. A list, or a displayed equation, might be inside a table element, for example.
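Nesting can be sketched in a few lines: a bulleted list inside a numbered list, inside the document environment. The item text is made up for the example.

```latex
\documentclass{article}
\begin{document}
Ordinary paragraph text belongs to the document environment.

\begin{enumerate}
  \item The first numbered item.
  \item The second item holds a nested bulleted list:
    \begin{itemize}
      \item an inner item;
      \item another inner item.
    \end{itemize}
\end{enumerate}
\end{document}
```

Each \end must name the same environment as the most recent unclosed \begin.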
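The part of the input file before \begin{document} is called the preamble. There, the command
\documentclass{article}
loads the article document class, one of the standard classes that come with LaTeX (every document must declare a class); and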
\documentclass{article}
\usepackage{graphicx}
tells it to load an additional macro package needed to include illustrations. (Here, the graphicx package makes available commands like \includegraphics. You can think of the macro package as a subroutine library that makes available special operations for some particular purpose.)
Often the preamble also contains a few special commands or definitions to alter the defaults of the plain-vanilla macros.
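As an illustration of such adjustments, the sketch below changes three defaults of the article class. The particular settings are arbitrary choices for the example, not recommendations.

```latex
\documentclass{article}

% Preamble adjustments: these override defaults set by the article class.
\setlength{\parindent}{0pt}            % no indentation at the start of a paragraph
\setlength{\parskip}{1ex}              % vertical space between paragraphs instead
\renewcommand{\abstractname}{Summary}  % heading printed above the abstract

\begin{document}
\begin{abstract}
This abstract is headed ``Summary''.
\end{abstract}

First paragraph.

Second paragraph, separated from the first by space rather than indentation.
\end{document}
```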
Many commands also accept optional arguments, which are given in square brackets. For example,
\documentclass[12pt]{article}
\usepackage[dvips]{graphicx}
modifies the typesetting style by using 12-point body type instead of the default 10-point; and the [dvips] option to the graphicx macro package tells it to add information that will be needed by a post-processor called dvips (more about this later).
Some formatting commands have no required arguments (in braces), but may take an optional argument (in square brackets). For example, the \\ command (which forces a line break) can be followed by an optional vertical space in square brackets; so
\\[.5cm]
ends the current line and adds half a centimeter of vertical space before the next one.
Some commands also have a variant form, selected by appending an asterisk to the command name: \section*, for example, produces a section heading without a number. One might have expected a sensible programmer to have allowed for such variations by including another regular argument, instead of modifying the name of the command with an asterisk. Perhaps it's useful to think of this as an implicit logical argument, whose value (true or false) depends on the presence or absence of an asterisk at the end of the name.
To compound confusion, some commands take both optional arguments and these “star forms” (as they're called in LaTeX jargon).
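A sketch of these variations using standard commands; the titles and text are placeholders.

```latex
\documentclass{article}
\begin{document}
\tableofcontents   % filled in on the second run of latex

% Optional argument: a short title to use in the table of contents.
\section[Short title]{A long section title that appears in the text}

% Star form: the heading is not numbered and not listed in the contents.
\section*{An unnumbered section}

% Both at once: \\* forbids a page break after the line break,
% and [.5cm] adds half a centimeter of vertical space.
First line\\*[.5cm]
Second line
\end{document}
```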
The percent sign, %, begins a comment: TeX ignores everything from the % to the end of the line. This can have side effects. In input mode, the end of a line (like spaces and tabs) is treated as “white space”, which normally becomes just an inter-word space in the typeset text. But the white space implied by the end of a line is commented out if the last character on a line is %. So
You can comment out white%
space at the end of a line.
typesets as
You can comment out whitespace at the end of a line.
Actually, all following whitespace, including any at the start of the next line, is obliterated by this trick. The same thing happens in command mode, where a % at the end of a line is sometimes used to suppress whitespace.
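The usual place to see this trick is inside a macro definition that is spread over several lines. In the sketch below, \shout is a hypothetical macro name invented for the example.

```latex
\documentclass{article}

% Without the % signs, each line end inside the definition would become
% a space in the output every time the macro is used.
\newcommand{\shout}[1]{%
  \textbf{#1}%
}

\begin{document}
% Typesets "abc" with a bold b and no spaces around it.
Spaces appear only where typed: a\shout{b}c.
\end{document}
```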
The percent sign commonly causes problems when you want to use it literally in the input text. To typeset “99% of LaTeX users have problems,” you have to type 99\% of LaTeX users . . . . In fact, \% is really a TeX command that means “set a percent sign”.
char | literal | special meaning (unescaped) | char | literal | special meaning (unescaped)
---|---|---|---|---|---
% | \% | introduces a comment | ^ | \^{} | introduces a superscript in math mode
# | \# | introduces a macro's formal parameters | _ | \_ | introduces a subscript in math mode
$ | \$ | math-mode delimiter | { | \{ | begins a group
& | \& | data separator in tables | } | \} | ends a group
~ | \~{} | non-breakable space in text | \ | \textbackslash | begins a command
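Most of these obey the rule: put a backslash before a special character, to make it appear literally in the formatted text. There are three exceptions. In LaTeX, \^ and \~ are accent commands, so a bare circumflex or tilde needs an empty argument, \^{} and \~{} (or the named commands \textasciicircum and \textasciitilde). And we can't apply the rule to the backslash itself, because \\ is the special command to force a line break (yet another confusing inconsistency); use \textbackslash in text, or \backslash in math mode.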
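A short sketch that typesets all ten special characters literally. It uses the named commands \textasciicircum, \textasciitilde and \textbackslash for the three characters that a plain backslash prefix does not handle in ordinary text.

```latex
\documentclass{article}
\begin{document}
% Seven characters need only a backslash in front:
\% \# \$ \& \_ \{ \}

% The other three are safest with named commands in ordinary text:
\textasciicircum{} \textasciitilde{} \textbackslash{}
\end{document}
```

The empty braces after each named command keep TeX from swallowing the space that follows it.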
Usually, the first time you try to run latex on your input file, you'll get an error message. Unfortunately, most of the error messages are unhelpful or misleading. In fact Alan Hoenig, in his book TeX Unbound, refers to
. . . error messages that are mystical, opaque, and vaguely frightening. Experience soon teaches you that the best thing is to ignore the messages . . . .
The error messages are certainly one of the most frustrating features of TeX and LaTeX. To begin with, there's no useful indication of how to recover from an error, or how to quit and try again: standard responses such as quit and bye just generate new errors.
And it doesn't help that TeX and LaTeX present their errors in slightly different formats. However, Sections 8.2 and 8.3 of Lamport's book do explain some of the commoner errors of LaTeX and TeX, respectively.
Each error message includes the number of the input line that TeX was reading when it stopped, shown as l. followed by the number. Of course, that's just the line where the error was detected. The actual cause of the error might be many lines above it. Sometimes there's a mis-typed command name, and in these cases, the error message is usually useful: it breaks the input line where the error was found, so look closely at the break in the line to see if something is obviously wrong there.
An error message that begins with
! LaTeX Error:
is a LaTeX error. But if it begins with just ! and doesn't actually say LaTeX, it's an error detected by TeX.
TeX error messages are the most mysterious, because they often refer to cryptic internal details of LaTeX macros. If you find TeX complaining about some command you don't recognize, and which is certainly not in your input file, it's likely that something you did (often much earlier) has so confused LaTeX that it has expanded a macro in a way that's nonsense to TeX itself. Such errors can be difficult to figure out.
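To see both kinds of message, you can run latex on a deliberately broken file like the sketch below; the misspelling and the mismatched environment are planted for the purpose. At the ? prompt that follows each message, pressing Return should carry on to the next error, and typing x should exit.

```latex
\documentclass{article}
\begin{document}
% Misspelled command: TeX itself reports "! Undefined control sequence."
\secton{Introduction}

% Mismatched environment: LaTeX reports a message starting "! LaTeX Error:"
\begin{itemize}
  \item An item.
\end{enumerate}
\end{document}
```

The first message comes from TeX, which has never heard of \secton; the second comes from the LaTeX macros, which notice that the \end does not match the \begin.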
Fortunately, there is a good explanation of both TeX and LaTeX error messages at MIT's website.
A good way to catch potential mystifying errors before they can elicit puzzling messages is to run your file through lacheck, the LaTeX syntax checker. It won't catch all syntax errors, but it does look for simple things like mis-matched braces, which are often the cause of mysterious error messages from TeX. (You'll need to install the Debian package lacheck to get this utility; then see man lacheck for details.)
Running the command line
latex ms.tex
does not produce a printed page. Instead, it produces a “device-independent” file, ms.dvi. Why?
When TeX and LaTeX were written, computers were slow. It made sense to split the text-formatting task into two parts: determining where to put the characters on a page, and actually putting them there. And printers were slow, so it made sense (and still does) to check the formatted output on the terminal screen before committing it to paper.
The ms.dvi file doesn't contain any actual characters. It just tells some post-processing program where to put them. Then you can use one post-processor to view the result on the screen, and a different one to print a hard copy, without having to re-run latex.
This division of labor is examined more thoroughly in the next section of this discussion, which goes into more technical detail.
Copyright © 2005, 2006, 2010, 2021 Andrew T. Young