Using the ext3 filesystem in 2.4 kernels

Introduction

This document is a brief description of how to get up and running with the ext3 journalling filesystem on 2.4 kernels.

ext3 was written by Dr Stephen C. Tweedie for 2.2 kernels. The filesystem was ported to 2.4 kernels by Peter Braam, Andreas Dilger, Andrew Morton and, of course, Stephen Tweedie. Ted Ts'o supports the all-important e2fsprogs utilities, as well as providing ext3 feature work and design advice. Alexander Viro has contributed to ext3's directory searching code.

Please send any comments on this document to Andrew Morton.

Please send any queries, questions or bug reports on this software to the ext3 users' mailing list. Instructions for subscribing to this list are at https://listman.redhat.com/mailman/listinfo/ext3-users/

Status

Across July, ext3 development has slowed as we head toward a 1.0 release.

As of kernel 2.4.7, the ext3 patch is quite stable and performs well. Testing has been on x86 SMP. Please send any success or failure reports for other architectures to the ext3-users list.

One outstanding problem is disk quotas. There are several known sources of deadlocks in the 2.4.7 quota code, and ext3 adds one more source. The quota code in the -ac kernels is very different, and once that gets merged into Linus' tree we shall continue development and testing of quota code for ext3.

This is not to say that ext3+quotas crashes all over the place - but if you push the filesystem hard enough for long enough, the code will lock up and you will need to reboot to reestablish operation on the affected filesystem.

We only test quota code against the -ac kernels - this is supported and works well.

Installation

1. Download the latest kernel patch from http://www.zip.com.au/~akpm/linux/ext3/

2. cd /usr/src/linux

3. gunzip < ~/ext3-2.4-0.x.y.patch.gz | patch -p1

4. make menuconfig

   Under the filesystems menu, select ext3.
   Please also select "JBD debugging support", as it will produce useful diagnostics if something goes wrong.

   You shouldn't normally select "Buffer head tracing" - it uses a lot of memory. However, if you do see `assertion failures' from ext3, please see if you can reproduce them with buffer tracing enabled before reporting them - that will provide much useful information.

   The filesystem may be compiled into the kernel or built as a module. Building it into the kernel can simplify the gathering of diagnostic information if something fails.

5. Build and install the kernel.

Other software

You will need the latest util-linux package from http://www.kernel.org/pub/linux/utils/util-linux/ . The changes in mount are described below.

You will need to download version 1.25 or later of e2fsprogs from http://e2fsprogs.sourceforge.net/.

Converting ext2 filesystems

An ext2 filesystem may be converted to ext3 by creating a journal file on it. To do this, run

    tune2fs -j /dev/hdXX

on the target filesystem (which may be mounted). The filesystem is now ext3 capable. This means that it can be mounted as type ext3. To do this, change your /etc/fstab appropriately and then unmount and mount the filesystem again. To mount the root filesystem as ext3, the easiest thing is probably to just reboot.

Creating new ext3 filesystems

Simply run

    mke2fs -j /dev/hdXX

to create a new ext3 filesystem on that device.

Switching between ext2 and ext3

ext3 filesystems may still be mounted by ext2 as long as they have been cleanly unmounted. ext2 will refuse to mount ext3 filesystems which have not been cleanly shut down, because there is live data still in the journal which ext2 does not know how to deal with.

The e2fsck application from e2fsprogs can perform journal replay, so running

    e2fsck -fy /dev/hdXX

on a damaged ext3 filesystem will repair it, allowing ext2 to mount it.

ext3 software will refuse to mount an ext2 filesystem - at present there must be a journal file on the filesystem.
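
As a rough illustration of the rules above, the state of a filesystem can be read from the "Filesystem features:" line printed by dumpe2fs -h (dumpe2fs is part of e2fsprogs). The sketch below is not part of ext3 or e2fsprogs: the function name classify_features is made up for this example, and it assumes that the has_journal and needs_recovery feature names appear on that line.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with e2fsprogs): reads the output of
# `dumpe2fs -h` on stdin and reports how the filesystem can be mounted.
classify_features() {
    # Keep only the text after the "Filesystem features:" label.
    features=$(sed -n 's/^Filesystem features:[[:space:]]*//p')
    case " $features " in
        # Live data in the journal: ext2 will refuse to mount this.
        *" needs_recovery "*) echo "ext3-needs-recovery" ;;
        # Journal present and clean: mountable as ext3 or ext2.
        *" has_journal "*)    echo "ext3-clean" ;;
        # No journal: ext3 will refuse to mount this.
        *)                    echo "ext2" ;;
    esac
}

# Usage (device name is a placeholder):
#   dumpe2fs -h /dev/hdXX 2>/dev/null | classify_features
```

A result of ext3-needs-recovery corresponds to the case where e2fsck -fy must be run (or the filesystem mounted as ext3) before ext2 can use it.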

LILO options for the root filesystem

If your root filesystem is ext3, an ext3-capable kernel will, by default, mount it using ext3. This may be overridden via the following LILO option:

    LILO: linux rootfstype=ext2

You may provide mount options to the root filesystem via LILO using the rootflags option. For example:

    LILO: linux rootflags=data=journal

Non-LILO bootloaders

The LILO bootloader doesn't know about filesystems - it uses a pre-prepared list of blocks to locate and load the operating system image into memory.

However, other (smarter?) software such as SILO (SPARC) and yaboot (PPC, built on Open Firmware) have filesystem drivers in them, and they know how to directly open and load an ext2 file.

This can be a problem if the boot filesystem is ext3, and it has suffered an unclean shutdown. When ext3 is in this state it is not compatible with ext2 - it needs recovery to be performed. This incompatibility is recorded in the filesystem's superblock, and a fully ext2-compatible bootloader implementation should complain and refuse to open files on the filesystem.

This is, of course, not what we want to happen. The system won't boot!

Versions of yaboot prior to 1.3.5 will refuse to boot from a "needs recovery" filesystem. Version 1.3.5 and later support ext3 via libext2fs.

SILO also has the correct compatibility checks, and booting from a "needs recovery" ext3 filesystem will cause SILO to complain about "too many symlinks", or something else inappropriate.

To avoid this serious problem you will need to ensure that your boot filesystem is of type ext2, not ext3 (or patch SILO to defeat the compatibility checks?).

Making things seamless

One problem with switching back and forth between ext2-only and ext3-enhanced kernels is the need to tell the kernel what sort of filesystem to mount all your devices with. This usually involves playing games with /etc/fstab.

The latest version of mount recognises ext3 and can automatically choose the ext3 filesystem type.
The version of fsck in e2fsprogs-1.23 and later can also do this if the fstype is auto. Here's the state of play:

* If mount is not told the target fstype, and it detects ext3, it will try ext3 and then ext2.

* If mount is told fstype auto then it will detect ext3 and will try ext3 and then ext2.

* If fsck is told fstype auto then it will autodetect the type of the filesystem and run the appropriate checker (which is fsck.ext2 for both ext2 and ext3).

It is recommended that you use the latest stable version of e2fsprogs, and that you use fstype auto in /etc/fstab.

NOTE: You must be using e2fsprogs-1.23 or later for fstype auto to work correctly!

NOTE: If you are using a recent Red Hat distribution and if you have built your own util-linux from the official tarball you may have problems with mount failing to mount filesystems. This is because Red Hat have added the "-O" option to their version of mount. This option is used in their /etc/rc.d/rc.sysinit, and this causes the standard mount to fail with "unrecognised option -O". The fix is to edit /etc/rc.d/rc.sysinit and remove any instances of "-O no_netdev".

NOTE: Using filesystem type auto for the root filesystem confuses /bin/df, and causes it to not print out information for the root filesystem. Fix: always specify the root filesystem as ext3 in /etc/fstab.

Filesystem check intervals

A feature of e2fsck is that it will regularly force a check of a filesystem even if the filesystem is marked clean. Typically, this happens on every twentieth mount or every 180 days, whichever comes first.

This still happens with ext3, and is quite possibly not what you want to happen - one of the reasons you chose ext3 was to avoid the downtime which is caused by a long fsck. So it is a good idea to turn this feature off for ext3. Use the command

    tune2fs -i 0 -c 0 /dev/hdXX

to disable the checking.

NOTE: this means that it is your responsibility to periodically schedule downtime for the manual checking of disks.
In many Linux distributions this is most easily done by creating a file called /forcefsck and rebooting.

External journals

As of version 0.9.5, ext3 supports the placement of the journal on a separate device. It is intended that this be a magnetic disk, or an NVRAM device. NVRAM devices may be simulated by using Andrew Tridgell's trivial RAM disk driver trd, which is in the ext3 CVS repository.

You will need a very recent e2fsprogs: version 1.23-WIP-0727 or later.

To install trd:

    mknod /dev/trd b 240 0
    insmod trd.o trd_size=50000

(For a 50 megabyte device)

To create an external journal:

    mke2fs -O journal_dev /dev/trd

To create an ext3 filesystem on /dev/hda5 which uses the external journal device:

    mke2fs -J device=/dev/trd /dev/hda5
    mount /dev/hda5 /mnt/some/place -t ext3

NOTE! You'll need to specify the filesystem type (-t ext3) because the automatic filesystem type detection in mount(8) doesn't recognise ext3 with external journals.

Of course you may use a partition on a real disk as the external journal device - just replace /dev/trd above with /dev/hdXX.

NOTE: using a RAM disk driver to simulate an NVRAM device should ONLY be done for testing. Doing so will lose the benefits of journalling at recovery time (you will always get a full fsck of the filesystem), and in fact you will lose data and have an increased chance of filesystem corruption after a crash.

A HOWTO

Rajesh Fowkar has prepared an ext3 installation HOWTO. It is available at http://www.symonds.net/~rajesh/howto/ext3/index.html.
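
As a recap of the /etc/fstab advice given in the sections above, an fstab might look roughly like the fragment below. This is only a sketch: /dev/hda1 and /dev/hda2 and the /home mount point are placeholders invented for the example, while /dev/hda5 and /mnt/some/place are taken from the external journal example. The root filesystem is listed as ext3 explicitly (to keep /bin/df happy), an ordinary filesystem uses auto, and the filesystem with an external journal is listed as ext3 because mount cannot autodetect it.

```
# device     mount point      type  options   dump  fsck pass
/dev/hda1    /                ext3  defaults  1     1
/dev/hda2    /home            auto  defaults  1     2
/dev/hda5    /mnt/some/place  ext3  defaults  1     2
```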