But aspell is an ungodly mess to configure. Its documentation was written by its original author, who had more problems with English than just spelling: his syntax and punctuation are as haphazard as his orthography. So aspell itself is very disorganized. It combines many disparate functions in a single executable, so it's not easy to figure out how to use it. And it suffers from “creeping featurism”, and has become a complex “little language” of its own, with command-line arguments that correspond to interactive commands.
Despite these drawbacks, aspell has been widely adopted, largely because of its flexibility in suggesting possible correct spellings for a badly mangled word. Probably this is due to its multiple ways of finding possible variations of a misspelled word.
The user's problem is to make sense of this chaotic collection of utilities. Unfortunately, the documentation for aspell is so disorganized that you have to read through it several times before you can find all the pieces that might solve your particular problem. What's needed is a more coherent overview of aspell. This Web page is a start in that direction.
But aspell knows nothing about any human language. The identification of errors, as well as their suggested replacements, depends on a dictionary, or word-list, of correct spellings. So aspell also embodies utilities for maintaining and extending the dictionaries it uses. However, those dictionaries come in several different formats, are stored in multiple locations, and have different attributes — so there are also utilities to convert dictionaries from one kind to another.
In addition to dictionaries, there are grammar files that have rules for affixes; suggestion files that have rules describing common spelling errors; and configuration files that specify the locations of all the other files, and which ones should be invoked for each language. (And nearly everything that can be specified in a configuration file can also be specified on the aspell command line, or in an environment variable.)
In Debian, the simplest installation of aspell also installs dictionary and other language-dependent files for a default language. But you will quickly find problems with these defaults: usually the default dictionary has too many or too few words for your purposes. It certainly won't have your e-mail address or your username, and probably won't know the names of your friends and co-workers, which will all be flagged as spelling errors if you try to spell-check your outgoing e-mail. So let's start with dictionary problems.
Words that you commonly type but that are missing from the default dictionary should be put into what the aspell documentation calls a “personal” dictionary, or wordlist. It's almost, but not quite, just a list of words: there has to be a special one-line header that tells aspell what language it's for, and what its encoding is. For English, this header line is
personal_ws-1.1 en 0 utf-8
This personal wordlist file normally is named .aspell.en.pws and is placed in your home directory. That's the default location, which, like everything else, is configurable: it can be specified in a configuration file, an environment variable, or even on the aspell command line. But, as with everything else in this complicated system, it's best to use the defaults unless you have a really urgent reason (and not just a whim, or curiosity) to do otherwise.
The default location [ ~/ ] and default filename [ .aspell.XX.pws ] — where XX is the language code — are configured into aspell's environment. If this *.pws file has the expected name and location, aspell will automatically add its contents to the default list of correctly-spelled words for this language.
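As a minimal sketch of what such a file looks like, the following builds a personal wordlist in a scratch directory rather than in your real home directory. The variable names (demo_dir, pws) and the three sample words are purely illustrative assumptions; only the header line and the filename pattern come from the description above.

```shell
#!/bin/sh
# Sketch: build an English personal wordlist in a scratch directory.
# demo_dir, pws and the sample words are hypothetical, for illustration only.
demo_dir=$(mktemp -d)
pws="$demo_dir/.aspell.en.pws"

# The one-line header must come first.
printf '%s\n' 'personal_ws-1.1 en 0 utf-8' > "$pws"

# After the header, list one correctly-spelled word per line.
printf '%s\n' 'Debian' 'aspell' 'wordlist' >> "$pws"

# Show the finished file.
cat "$pws"
```

Once you are happy with the contents, you would copy the file to the default name in your home directory (taking care not to overwrite a wordlist you already have there).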
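You can list the many configurable variables that aspell uses, along with their current settings, with the command
aspell dump config
To see the value of a single variable, append its name to that command:
aspell dump config dict-dir
aspell dump config home-dir
aspell dump config personal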
Finally, if you want to spell-check a different language whose code is XX, you can add the option “-l XX” to those aspell commands. For example,
aspell -l hr dump config personal
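If you want to check several languages at once, a small loop saves typing. This is only a sketch: default_pws_name is a hypothetical helper that builds the filename from the .aspell.XX.pws pattern described above, and the two language codes are just examples. The script falls back to printing the expected name if aspell is not installed.

```shell
#!/bin/sh
# Sketch: compare the expected personal wordlist name with what aspell reports.
# default_pws_name is a hypothetical helper, not part of aspell.
default_pws_name() {
    printf '.aspell.%s.pws\n' "$1"
}

for code in en hr; do
    expected=$(default_pws_name "$code")
    if command -v aspell >/dev/null 2>&1; then
        # Ask aspell itself which personal wordlist it would use.
        actual=$(aspell -l "$code" dump config personal)
        printf '%s: expected %s, aspell reports %s\n' "$code" "$expected" "$actual"
    else
        printf '%s: expected %s (aspell not installed, cannot check)\n' "$code" "$expected"
    fi
done
```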
As I mentioned above, the personal wordlist names always end with .pws. These files (apart from the one-line header) contain only lists of correctly-spelled words, and are plain text files, readable and editable with any text editor.
But the system dictionaries that end in .rws are large files that contain many thousands of words. Back around the turn of the century, when aspell was first introduced, personal computers were still fairly limited in mass storage, so these files of several tens or hundreds of kilobytes were compressed to save disk space. And the .rws files use special compression algorithms to make them more compact.
One of the tricks used to make dictionaries compact is to separate the grammatical inflections, which change word forms in languages like Latin and Russian, from word stems. By separating stems from prefixes and suffixes, the main dictionary can be reduced to some tens of thousands of words, plus a much shorter file describing the rules for conjugations and declensions. In aspell, these rules are put into a separate file, named XX_affix.dat for the language coded as XX. (Despite the misleading “dat” extension, these are not binary data files, but plain text.) On Debian, you'll find one affix file in the /usr/lib/aspell directory for each language you have installed.
These affix files gather all the affixes that belong to each inflection pattern as a set of prefixes or suffixes, denoted by a single letter. These one-letter codes are then attached to the stem or uninflected form in a “compressed word list”, stored as a binary file with a .cwl extension. You'll find gzipped versions of some of these files in the /usr/share/aspell/ directory.
The language that aspell uses by default is reported by
aspell dump config lang
Many languages have regional dialects with slightly different spelling conventions. For example, it's well known that American English uses different spelling conventions than British English.
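You can see which variety, if any, is currently selected with
aspell dump config variety
Dictionaries also come in different sizes; the size currently selected is shown by
aspell dump config size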
If you find that aspell doesn't recognize a lot of correctly-spelled words in the text you are checking, you should increase the size setting in your configuration file.
If you need a bigger dictionary, but no larger size is available, you will just have to augment the personal word list in your home directory.
If you find that aspell suggests many alternative spellings that look obviously wrong, your dictionary size may be too big; try reducing it. But the trouble may also be that you are just asking aspell to try too hard to generate alternatives to words it does not recognize. In that case, change the default setting of the sug-mode variable from "normal" to "fast", or otherwise tweak the selection of suggestions (such as by changing the "sug-edit-dist" variable from 2 to 1, or changing "sug-typo-analysis" from "true" to "false").
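Here is a sketch of what those settings might look like in a configuration file, which takes one option name and its value per line. To stay out of harm's way, it writes the sample to a scratch directory; the variable names (conf_dir, conf) are hypothetical, and the choice of which lines to leave commented out is only an illustration of the alternatives just described.

```shell
#!/bin/sh
# Sketch: write a sample aspell configuration to a scratch directory.
# conf_dir and conf are hypothetical names; nothing in your home directory is touched.
conf_dir=$(mktemp -d)
conf="$conf_dir/aspell.conf"

cat > "$conf" <<'EOF'
# Make aspell try less hard when generating suggestions.
sug-mode fast

# Alternatives: tweak the individual suggestion settings instead.
# sug-edit-dist 1
# sug-typo-analysis false
EOF

# Show the finished file.
cat "$conf"
```

On a typical installation the per-user configuration file is .aspell.conf in your home directory, so that is where lines like these would normally go; check the per-conf variable with aspell dump config if you are unsure.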
But be careful: if you ask aspell to use a file that doesn't exist, it will complain. So don't call for a local spelling list that doesn't exist. (You can always make a dummy file that just has the 1-line header but no list of words, if you're not sure you need it.)
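One way to make such a dummy file safely is to create it only when it is missing, so that an existing wordlist is never clobbered. In this sketch, ensure_wordlist is a hypothetical helper (not an aspell command), and the demonstration runs in a scratch directory with made-up contents.

```shell
#!/bin/sh
# Sketch: create a header-only wordlist, but only if the file does not exist yet.
# ensure_wordlist is a hypothetical helper: $1 = file path, $2 = language code.
ensure_wordlist() {
    if [ ! -e "$1" ]; then
        printf 'personal_ws-1.1 %s 0 utf-8\n' "$2" > "$1"
    fi
}

scratch=$(mktemp -d)

# Case 1: the file is missing, so a dummy with just the header is created.
ensure_wordlist "$scratch/.aspell.hr.pws" hr

# Case 2: the file already exists with a word in it, so it is left alone.
printf '%s\n' 'personal_ws-1.1 en 0 utf-8' 'Debian' > "$scratch/.aspell.en.pws"
ensure_wordlist "$scratch/.aspell.en.pws" en

cat "$scratch/.aspell.hr.pws" "$scratch/.aspell.en.pws"
```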
The complete aspell manual can be read with
info aspell
Debian now has "man" pages for the associated commands that come in the aspell package, like preunzip and prezip-bin. These help you convert among the several dictionary-file formats.
Copyright © 2021, 2023 Andrew T. Young