A good detailed explanation of how to set up Grub is available at https://help.ubuntu.com/community/Grub2/Setup. While there are several other HOWTOs available, they deal with the details of isolated parts of the booting process. This page is intended to provide a broader view of the matter.
A one-card loader contained 24 instructions that read your program into fixed locations in the machine's memory (all of 8192 words). When the operator had put your deck in the card reader, he punched a button marked LOAD CARDS on the control panel. This started the card reader; copied the first row of bits (from the bottom row of the punched card) into the first two words of memory; and transferred control of the machine to the first location. Those two instructions had to read the rest of the card into successive locations in memory (i.e., core storage).
The rest of the instructions on that first card had to read in the rest of the cards in your program's “object deck”. Each card contained the address into which its first instruction was loaded. At the end of your absolute-binary deck was a card with the address of the first instruction in your program; the loader had to transfer control to this address to start your program. You were expected to end the program with a HALT instruction; when your program finished, the machine just sat there until the operator put another deck in the card reader and pressed the LOAD CARDS button again.
If you used FORTRAN, you got a “relocatable” deck instead of absolute binary cards. This required a “relocatable loader” that adjusted the address fields of some instructions to suit the places where they were stored in memory. The relocatable loader was a couple of dozen cards, containing instructions on how to adjust the addresses. It began with a one-card loader very similar to the absolute-binary loader: the instructions on the first card told the machine how to load the rest of the relocatable loader, and the whole relocatable loader loaded your relocatable routines. But, as there was still no operating system, the machine just stopped when your program was done.
Eventually, early in the 1960s, there appeared the first operating system, called the Fortran Monitor System. FMS could be told to compile your FORTRAN source deck to machine-language instructions (the so-called object deck), and then load and run the object deck! We were pretty impressed by that. It was so efficient that it could run somebody else's program automatically when yours finished! Amazing!
FMS lived on a big tape. The operator mounted the tape on a drive and then pressed the LOAD TAPE button — which turned on the tape unit; copied the first two words from the tape into memory, and transferred control to the first instruction. The first record on the tape had the equivalent of a one-card loader image, which loaded the system into memory, and transferred control of the machine to the system. Wow.
But the overall pattern of the whole process was the same: a tiny piece of machine code (the first 2 instructions) read in a little piece of code (the 1-card loader) that read in a bigger piece (the relocatable loader) that read in the final operating system. Each step made the running code more capable; piece by piece, we go from nothing to everything we need. So, metaphorically, the machine pulls itself up by its bootstraps, until it's fully up and running.
When you turn on your computer, it knows nothing about filesystems or operating systems. It goes through a short process that's stored in a CMOS EEPROM or a flash-memory chip that contains both a list of the peripherals (disks, memory controllers, busses, keyboards, graphics and network-interface cards, monitors, etc.) and a hardware-testing routine (to make sure they all work): the Power-On Self-Test, or POST.
The routines that allow the CPU to talk to the peripheral devices during this stage are in a part of the firmware's memory that was originally called the Basic I/O System, or BIOS. It has become customary, if incorrect, to refer to everything in the firmware as “the BIOS”, even though the BIOS is really just one part of it. (Modern computers have even replaced the old BIOS with a more capable Extensible Firmware Interface, or EFI.)
At the end of the POST, the firmware's initialization routine looks for a particular peripheral device — which might be a hard disk, an optical disk, a floppy disk, a USB device, or a network interface — where (according to data stored in the firmware) there is an operating system to be booted. We'll call this designated device the boot device, without worrying much about what it really is. But, for the sake of definiteness, let's suppose it's a hard disk.
Data are stored on disks in logical segments called blocks . Typically, a block is 512 bytes. When an old, “Legacy”-style BIOS tries to boot an operating system, it reads the first block from the boot device into memory, and transfers control to the beginning of that block. (This block is usually called the Master Boot Record, or MBR.) Actually, only 446 bytes of the MBR are available for this first-stage loader, because the rest of it usually holds the disk's partition table.
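To make those numbers concrete, here is a small sketch, using only standard POSIX tools, of how the 512-byte MBR is laid out. It builds a scratch file and never touches a real disk:

```shell
# A scratch-file sketch of the MBR layout just described; it only writes a
# temporary file, never a real disk.  The pieces are: 446 bytes of loader
# code, a 64-byte partition table (4 entries of 16 bytes each), and the
# 2-byte boot signature 0x55 0xAA -- 512 bytes in all.
mbr=$(mktemp)
head -c 446 /dev/zero  > "$mbr"    # room for the first-stage loader code
head -c 64  /dev/zero >> "$mbr"    # the partition table
printf '\125\252'     >> "$mbr"    # the boot signature (octal for 0x55 0xAA)
size=$(wc -c < "$mbr" | tr -d ' ')
echo "$size"                       # total size of the block: 512
rm "$mbr"
```

On a real machine you could inspect the live MBR with something like dd if=/dev/sda bs=512 count=1 | od -A d -t x1 (as root, and taking great care never to swap if= and of=); the last two bytes should be 55 aa.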
You can see that the first 446 bytes of the boot block play the same role in a desktop computer that the 24 instructions of a one-card loader did on the ancient mainframe I used back in 1959: they have to be able to load the rest of the boot-loading system into memory. Then that remaining, much larger, piece of boot-loader code has the job of loading the operating system itself into memory.
Debian provides a man boot page that describes this process clearly. It wisely describes the 446-byte chunk of code in the MBR as the “primary loader”, and the rest of the OS loader as the “secondary loader” — a useful distinction. The primary loader works with absolute, hardware addresses (i.e., the locations of physical blocks on the disk); the secondary loader contains modules that can read your filesystem, and load the files that contain the kernel and some utilities into memory.
Before we get to the complications, you might take a look at some files in your /boot directory. Back when we used LILO, you'd see several 512-byte files in /boot with names like boot.0200 and boot.0300, which were copies of old boot-blocks from a floppy disk and a hard disk, respectively. If you use the old Grub (now called “grub-legacy”), look at /boot/grub; there's a 512-byte file named stage1, which was Grub's boot-block.
Now, however, we have Grub2, which has a more complicated boot-loader. Instead of a one-block primary loader, it can use a larger primary boot “image” designed to read modules from a more specialized core image as the secondary loader. And there is a further complication: because enormous multi-terabyte disks have too many sectors to be addressed with only 32 bits, most modern 64-bit systems use a still more complicated booting scheme, EFI (the Extensible Firmware Interface), which has replaced the old BIOS and its Master Boot Record (MBR) style of booting. A good guide to EFI booting of Debian-based systems is at https://help.ubuntu.com/community/UEFI.
Here's a good rule of thumb:
Too clever is dumb.
— Ogden Nash
Well, it turned out that early versions of Grub couldn't read ext4 filesystems; so I ended up having to boot from a rescue disk, and struggle to get things working normally again. (At that time, there wasn't the profusion of Web pages devoted to migrating from ext3 to ext4 filesystems that a Google search reveals today.) Even with a rescue disk, it's a nuisance to have to cope with Grub's peculiar user interface; without a rescue disk, it would have been a complete disaster.
My struggle was mostly due to an unfamiliarity with Grub, combined with the transition from the old version (now disparagingly known as “grub-legacy”) to Grub2 (now confusingly known as just plain “grub”).
The situation wasn't helped by the unclear terminology used in the Grub documentation, where “install” is used for both the installation of the whole Grub system on a computer, and the process of setting it up to boot your machine. Worse than this overloading of the word “install” is the more complex overloading of the word “boot”: we have a boot partition whose mount-point is /boot; then there's the “boot image”: the file boot.img that gets installed (!) on the Master Boot Record by being copied from the boot directory by the grub-install command. Of course, the “image” files in the boot directory must not be confused with the graphic image files that are also discussed at length in the grub documentation. Worst of all, although it's bad enough that the words “boot” and “root” are very similar to begin with, the Grub configuration files contain menu-stanzas in which “root=something” points to the partition that contains the /boot directory, followed just two lines later by a linux command line in which “root=something” points to the partition that contains the root filesystem. So sometimes “root” means “root”, and sometimes “root” means “boot”.
Are you confused yet? I sure was.
Fortunately, there is a short account of setting up grub in The Debian Administrator's Handbook at http://debian-handbook.info/browse/stable/sect.config-bootloader.html. A longer account, reasonably well written, is at https://help.ubuntu.com/community/Grub2/Setup on the Ubuntu website. An even longer and more complete account of how Grub works is in its Wikipedia article.
Probably the best guide to recovering from Grub disasters, as well as the use of the normal Grub-shell command line, is How to Rescue a Non-booting GRUB 2 on Linux, which clearly explains how to use the infamous Grub Rescue shell. (I have a short treatment of it here.)
Besides the long and confusing info manual, the complete Grub documentation is available at http://www.gnu.org/software/grub/manual/, but it's huge: the PDF version is 130 pages long. However, it's somewhat clearer than the info page for Grub, because the cross-references are easier to follow in PDF format than in info.
But the big problem is that when the system is booting, none of the facilities of a running system are available. There is no filesystem, just some storage devices that may have pieces of one or more filesystems on them. The files may be fragmented into disjoint blocks. There is no shell like bash available. Somehow, the bootloader has to find the blocks for the system's files, concatenate them in memory, and then initialize and transfer control to the operating system.
To make the process manageable, the authors of Grub devised a shell-like interpreter, and a scripting language devoted to booting. (This reminds me of PostScript: the PS interpreter concentrates on putting ink marks on a page, while the Grub interpreter concentrates on finding partitions on disks, and files on those partitions, as well as the special requirements of several common operating systems.) The Grub documentation awkwardly describes this scripting as its “command-line interface”; but it would be more sensible to just call it “Grubscript”. It's intended to look like standard POSIX shell-script, with common commands like echo and ls; but it's actually considerably different, so that the similarities are misleading rather than helpful. (The full syntax of Grubscript is hidden in section 6.3 of the Manual, “Writing full configuration files directly”.)
Because Grub's need for a fairly high-level understanding of disk partitions (and the filesystems on them) can't be met with the low-level approach of absolute block-number addresses that is the only thing available early in the booting process, it has to get the filesystem stuff and the script interpreter up quickly. That lets it (or the user) do everything else in the higher-level language of Grubscript, which controls the actual booting of the OS kernel.
I'll divide Grub into two areas. The first is the executable part that runs at boot time. Because Grub can boot any of several operating systems, it has to ask you which one to boot. So it presents a menu of possibilities that you can choose from; one will boot by default after a delay of a few seconds, if you make no choice. If it has problems finding the pieces needed to boot your selection, it drops you into a primitive shell that allows you to locate those pieces manually, and put together a workable Grubscript command line to boot Linux. (This is the dreaded command-line interface of the Grub “rescue shell”; a detailed discussion of dealing with these problems is in the Ubuntu Community Help Wiki.)
Notice that the executable stuff that runs at boot-time — not only the primary and secondary stages of the boot-loader, but also the stand-alone program that displays the menu, and uses your menu selection to boot the selected OS; and the Grubscript interpreter, and all its commands — all have to run before a regular operating system is available. In particular, the various filesystems don't get mounted on their mount points until the kernel has been booted and can read /etc/fstab; Grub sees just unmounted filesystems on isolated disk partitions. So modules that can do all these things by themselves, and read your filesystems to find them when they're needed, have to be part of Grub's boot-time apparatus. These modules are in the /boot/grub/i386-pc directory if your hardware boots in the old MBR/BIOS mode, or in /boot/grub/x86_64-efi if it boots in the modern EFI mode.
The other area of Grub is the part that sets up the part that runs at boot time. This is Grub's installation system: it organizes the boot-time menu, collects the executable modules that actually boot your machine, and tells the boot-time stuff where (and how) to find everything, such as the kernel to be booted, and its root filesystem. It also installs the primary stage of the boot-loader in the MBR, and puts the later stages where the primary stage can find them.
These things are done by ordinary executable programs and scripts, such as the commands:

grub-install, which writes the primary bootloader, boot.img, to the MBR or the EFI partition;

grub-mkconfig, which prints a new copy of the grub.cfg script to its standard output; and

update-grub, which saves that new copy of grub.cfg at /boot/grub/grub.cfg.
There are so many of these isolated pieces that make up the installer part of Grub that it may help to show their relationships here.
The basic specifications for configuring Grub are shell scripts (or rather, fragments of shell scripts) in two areas of the /etc directory:

the /etc/grub.d directory, which contains numbered pieces of shellscript that are executed in numerical order, and whose output is blocks of Grubscript text; and

/etc/default/grub, which sets various configuration parameters used by Grub.
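As a concrete illustration, here is what a typical /etc/default/grub might contain. The key names are the standard ones documented in the Grub manual; the values shown here are hypothetical, not taken from any real machine:

```shell
# A hypothetical /etc/default/grub, for illustration only: the key names
# are the standard ones (the "keys" of the Manual's section 6.1), but the
# values are made up.  Note that this file is plain shell code: it does
# nothing but assign variables.
GRUB_DEFAULT=0                      # boot menu entry number 0 by default
GRUB_TIMEOUT=5                      # seconds before the default entry boots
GRUB_DISTRIBUTOR="Debian"           # used in building the menu-entry titles
GRUB_CMDLINE_LINUX_DEFAULT="quiet"  # added to the linux line for normal boots
GRUB_CMDLINE_LINUX=""               # added for normal and recovery boots alike
```

Because it is just shell code, any change you make here takes effect only when the file is next read by grub-mkconfig (or update-grub) to regenerate grub.cfg.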
These pieces might well have been combined; and in fact they are combined into a single shell script by the update-grub command. That command (which is really just a front-end for grub-mkconfig) runs the resulting shell script, whose output is the actual Grubscript input file for Grub's bootloader. The default destination for that output is the file /boot/grub/grub.cfg, which is described more fully below.
That file, /boot/grub/grub.cfg, is pure Grubscript text. It tells the part of Grub that runs at boot time how to display a menu of possible ways to boot your system, as well as how to carry out the choice you select from the menu — or else, how to carry out a default boot sequence, if you do nothing within a few seconds. The part of Grub that interprets the grub.cfg script is the normal module that lives in a subdirectory of /boot/grub/.
The full list of configuration parameters that can appear in the /etc/default/grub file is given in section 6.1 of the Grub manual ("Simple configuration handling"), which is also available in the info page of grub. This part of the Manual calls these variables "keys".
Notice that /etc/default/grub itself is just a section of ordinary shell-script code that does nothing but initialize a set of shell variables whose names all begin with GRUB_. These variables' values are used by the whole shell script assembled from the numbered sections in /etc/grub.d when grub-mkconfig runs it to generate a new grub.cfg file.
One reason Grub is so confusing is that pieces of it are scattered all over your disk. Some of them are in the /boot/grub directory, which holds many Grub executable modules and some Grubscript configuration files; some are shellscript fragments in the /etc/grub.d directory, where you might expect to find Grub's configuration files; and some are environmental parameters for those scripts that are set in /etc/default/grub.
Likewise, there is no grub command, and no grub manpage. Instead, there are about 20 different commands that set up various pieces of the whole system, each with its own man page. The installation commands are in /usr/sbin. However, there is a huge Grub Manual available with info grub. It's like LaTeX: gloriously powerful and adaptable for the expert, but a nightmare for the casual user.
Fortunately, there is a single file that tells Grub what to do when it boots your machine: /boot/grub/grub.cfg. This is a simple Grubscript text file that describes Grub's whole configuration. If you learn to read this configuration file, you can understand how Grub carries out the booting process.
Because grub.cfg is generated by the shellscripts in the /etc/grub.d directory, using the environmental parameters in the /etc/default/grub file, you can think of it as the major link between Grub's installation subsystem and its booting subsystem. It's the main output from the installation process, and the main input to the booting process.
If you configure Grub correctly, the menu items presented by the booting subsystem will work correctly, and you'll never have to deal with the weird conventions and obscure commands of the rescue shell.
At the top, there are some general sections produced by the /etc/grub.d/00_header and /etc/grub.d/05_debian_theme shell-scripts. These set up some basic information and make a few subroutines (i.e., "modules") available to the secondary bootloader. For example, if your hard disk uses the traditional MS-DOS partitioning scheme, you'll see the line insmod part_msdos, which inserts a module that can read an MS-DOS partition table. If your system uses the ext2/ext3/ext4 group of filesystems, you'll see insmod ext2, which can read those filesystems. The timeout interval appears at the end of the 00_header section, copied directly from the value set in /etc/default/grub. Such things are detected automatically by the installation subsystem, and we don't expect to see problems here.
The interesting stuff begins with /etc/grub.d/10_linux, which sets up the menu entry for the default operating system. The word menuentry is followed by text describing the system to load, enclosed in single quotes; this will be displayed in the menu at boot time. Then come some --class declarations that tell the booting subsystem what kind of OS it's expected to boot. This first line ends with a left (opening) brace; everything that follows, up to a closing brace, is included in this menu item.
First comes load_video and some more module-insertions, which are pretty obvious. The first tricky item is
set root='(hd0,msdos1)'
which illustrates two of Grub's strange quirks. First of all, “root” here does not mean the root filesystem. Instead, it means the value of Grub's oddly-named “root” variable — which actually means the disk partition that contains the filesystem where the grub.cfg file lives. Well, if you have a separate /boot filesystem (as I do), this really is a pointer to that filesystem's partition; so “root” here really means “boot”. (Of course, if your /boot directory were actually in the same disk partition that contains the root directory, then Grub's root parameter would designate your root partition.)
But this pointer, (hd0,msdos1), is an example of Grub's peculiar notation. The hd0 part means the disk identified in the important file /boot/grub/device.map, which contains a persistent identifier for the disk Grub calls (hd0). Of course, the msdos1 part indicates the first primary partition on this disk. Though the hd0 part resembles the old scheme for naming IDE disks in the /dev directory, it's different — because this is Grubscript, not shell-script. (And these days, Linux calls the disk /dev/sda, or something like that; the old /dev/hdX names are gone.)
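For reference, here are a few examples of Grub2's device notation. The Linux names in the notes are only the usual correspondences, not guarantees; and note that Grub2 numbers partitions starting from 1, while grub-legacy counted from 0:

```
(hd0)           the whole disk that device.map calls hd0 (often /dev/sda in Linux)
(hd0,msdos1)    the first primary partition of a disk with an MS-DOS partition table
(hd0,gpt2)      the second partition of a GPT-partitioned disk
```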
Next comes a line beginning with search, which tells the boot-subsystem to look for something in this partition:
search --no-floppy --fs-uuid --set=root d074378b…
The --no-floppy flag tells Grub not to waste time searching floppy drives. The --fs-uuid flag says that the UUID of a filesystem partition is coming up; then come --set=root and the actual UUID itself (a string of gibberish at the end of the line). This line is a command in the Grubscript language that sets the Grubscript variable root to the Grubscript name of the partition whose UUID is the one specified.
After echoing a comment telling the user at boot-time that Grub is trying to boot the specified Linux kernel, the menu entry has a line beginning with linux that specifies the kernel's command-line:
linux /vmlinuz-3.2.0-4-686-pae root=UUID=0899983c… ro video=630x480 quiet
There is the name of the kernel file, followed by root=UUID= and the UUID of the kernel's root filesystem. The line ends with the arguments that will be fed to the kernel, like ro (meaning “mount the root FS read-only”), and a video mode. So, in the “linux” line, “root” means the root filesystem — unlike the usage a few lines earlier. (That's because the root= in the “linux” line is an argument to the kernel itself, and not a Grub variable being set.)
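Putting these pieces together, a complete menuentry stanza in grub.cfg looks roughly like this. This is only a skeleton: the kernel version, device name, and UUIDs are made-up placeholders, not values copied from any real machine:

```
menuentry 'Debian GNU/Linux' --class debian --class gnu-linux --class os {
        load_video
        insmod part_msdos
        insmod ext2
        set root='(hd0,msdos1)'
        search --no-floppy --fs-uuid --set=root UUID-OF-THE-BOOT-FILESYSTEM
        echo    'Loading Linux SOME-KERNEL-VERSION ...'
        linux   /vmlinuz-SOME-KERNEL-VERSION root=UUID=UUID-OF-THE-ROOT-FILESYSTEM ro quiet
        initrd  /initrd.img-SOME-KERNEL-VERSION
}
```

Notice the two meanings of “root” in one stanza: the set and search lines set Grub's root variable (the partition holding /boot), while root= on the linux line names the kernel's root filesystem.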
NOTE: a list of the possible arguments in the kernel command line is available at https://www.kernel.org/doc/html/v4.14/admin-guide/kernel-parameters.html. Not all of the possible kernel parameters are described by man bootparam. Some of the arguments are really kernel parameters, and some are Grub variables; so be careful.
You can see the parameters that were actually used to boot a running kernel with the command cat /proc/cmdline. This can be helpful in debugging, especially if you had to alter these parameters at boot time by editing the command line.
An obscure kernel parameter that may be needed to boot a kernel on a USB device is rootwait. USB devices are often slow to become available, and may be reported by Grub as “not found” when they are merely not ready for use when Grub searches for them.
The full syntax of a menuentry is hidden in section 16.1.1 of the Grub manual (or its info version.) Some more detailed discussion of menuentry construction is in the Ubuntu wiki.
Most of the installation system works well automatically; so only a few changes usually need to be made to make Grub do what you want. The trick is to learn which parts need to be tweaked. You can experiment by adding menuentry items to the end of /etc/grub.d/40_custom, followed by running update-grub, and re-booting.
Better yet, you can change any of the scattered Grub configuration files, and then safely see how this would change a new grub.cfg, with the simple command line
grub-mkconfig | less
which lets you examine the grub.cfg file that would have been generated and installed in /boot/grub/grub.cfg if you had run Debian's update-grub command. (The default output of grub-mkconfig goes to standard output; so your working grub.cfg file doesn't get changed this way.) So your actual configuration file is undisturbed, while you can see the effects of changes to files in /etc/grub.d or /etc/default/grub.
It's also useful to use grub-script-check to make sure your new grub.cfg is free of syntax errors. This program reads from its standard input, so it's easy to couple it to grub-mkconfig:
grub-mkconfig | grub-script-check
If everything is OK, grub-script-check produces no output.
Remember to keep an unaltered copy of any file you change, so you can quickly reverse the changes if they don't do what you want. And be especially careful not to leave an incorrect set of Grub configuration files in place when installing a new kernel, as the upgrade will invoke update-grub automatically, and overwrite your grub.cfg.
Before the boot process gets to Grub and its menu, it's initiated by settings in the firmware on the motherboard. You can interrupt the boot process and check the startup settings in the firmware by pressing some special key at the start of the boot. Even when Grub has started, the /etc/grub.d/30_uefi-firmware menu item may provide access to those startup menus. Make sure you are telling the hardware to boot from the right device, and in the right mode (BIOS/Legacy vs. UEFI).
And grub-mkconfig itself is just a POSIX shell script that executes the numbered scripts in /etc/grub.d in numerical order. (And you can intervene, if you want, by adding more numbered scripts to this directory.) So, if you can read standard Bourne shell code, you can figure these things out. However, there are lots of tricky cross-references that can make the details difficult to follow. In particular, the operation is guided by the various Grub environmental variables that are set in /etc/default/grub.
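You can demonstrate that pattern safely in a scratch directory. This toy sketch imitates grub-mkconfig's main loop (the real script also handles options, locking, and helper functions; this is only the core pattern): run each executable, numbered script in order, and concatenate whatever they print:

```shell
# A toy imitation of grub-mkconfig's main loop, run in a scratch directory
# so the real /etc/grub.d is untouched: execute each executable, numbered
# script in order, concatenating their standard output.
dir=$(mktemp -d)
printf '#!/bin/sh\necho "# header section"\n' > "$dir/00_header"
printf '#!/bin/sh\necho "menuentry sketch"\n' > "$dir/10_linux"
chmod +x "$dir/00_header" "$dir/10_linux"
out=$(
  for script in "$dir"/*; do      # pathname expansion sorts the names,
    if [ -x "$script" ]; then     # so 00_* runs before 10_*
      "$script"
    fi
  done
)
printf '%s\n' "$out"
rm -r "$dir"
```

The numbered names are what make the ordering work: the shell's pathname expansion sorts them, so 00_header runs before 10_linux, just as in the real /etc/grub.d.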
NOTE: these variables are not the “Special environment variables” described in Chapter 15.1 of the Grub manual. Instead, the variables set in /etc/default/grub are the ones described as “keys” in Section 6.1 of the Manual, “Simple configuration handling”.
One important feature of the standard operation is that the currently running kernel is the one that will be selected in the first (i.e., number zero) menuentry stanza in the grub.cfg that is produced. And, as the default menu selection is also number zero, that means that Grub will normally boot the current OS again when you re-boot.
Or, to make a different kernel the default menu item, you only need to run (as root) update-grub to rewrite /boot/grub/grub.cfg while you are running the OS version you prefer. Then, when you re-boot, that version will automatically be the default.
But if you have put some fancy booting arrangements of your own in the /etc/grub.d/40_custom shell script, your customized booting sequence won't use the new kernel. Worse yet, after several kernel upgrades you might remove some old, outdated kernels and their associated initrd files; then the stanzas in grub.cfg that referred to them won't work at all.
So it's essential to check your /boot/grub/grub.cfg file a few times a year to make sure the kernels mentioned there are still available in /boot. Better yet, fix up your additions to /etc/grub.d/40_custom to use the newest kernel version. Then run update-grub to make those changes effective.
And if a new kernel was installed to fix security issues, make sure the older ones are removed from both /boot and the configuration files in /etc/grub.d !
Copyright © 2011, 2015, 2017, 2018, 2021 – 2023 Andrew T. Young