The Filesystems HOWTO is about filesystems and accessing filesystems from various operating systems. Although this document has been put together to the best of my knowledge, it may, and probably does, contain mistakes. If you find a mistake or outdated information, please let me know. I will try to keep this document up to date and as error-free as possible. Contributions are also welcome, so if you want to write anything about filesystems, please contact me via e-mail.
Before you read this HOWTO it's recommended to read Stein Gjoen's Disk-HOWTO (you can obtain it from http://sunsite.unc.edu/LDP/HOWTO/ ).
This HOWTO can be obtained from http://penguin.cz/~mhi/fs/ or http://metalab.unc.edu/filesystems/howto/.
If you are a Japanese user, you might be interested to know that FUJIWARA Teruyoshi has translated this HOWTO into Japanese. It is available at http://www.linux.or.jp/JF/JFdocs/Filesystems-HOWTO.html. The SGML source file can be downloaded from ftp://ftp.linet.gr.jp/pub/JF/sgml/Filesystems-HOWTO.sgml.gz.
The Filesystems HOWTO, Copyright (c) 1999 Martin Hinner < mhi@penguin.cz>.
This HOWTO is a free document; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This HOWTO is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this document or GNU CC; if not, write to the: Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
You may want to join the Filesystems mailing list. It is intended to be a good source of information for both end users and developers, so if you have anything to do with filesystems, join ;-) To subscribe, send e-mail to < majordomo@penguin.cz> and in the BODY (not the subject) of the message put (without quotes): "subscribe fs-l".
To join the Linux kernel filesystems mailing list linux-fsdev@vger.rutgers.edu, send e-mail to listserv@vger.rutgers.edu. Put "subscribe linux-fsdev" in the message body.
To join the technical FreeBSD filesystems mailing list freebsd-fs@FreeBSD.org, send e-mail to majordomo@FreeBSD.org. Put "subscribe freebsd-fs" in the message body.
The Filesystems collection is an FTP/WWW site providing useful information about filesystems and filesystem-related programs and drivers. It lives at http://metalab.unc.edu/filesystems/, or FTP-only at ftp://metalab.unc.edu/pub/docs/filesystems/.
The original "Filesystems access HOWTO" was written by Georgatos Photis (see his homepage at http://students.ceid.upatras.gr/~gef/). This HOWTO contains a lot of information from his webpage. Thanks, Gef.
FUJIWARA Teruyoshi <fujiwara@linux.or.jp> translated this HOWTO to Japanese.
Other people who have contributed or helped me (directly or indirectly) with this HOWTO are, in alphabetical order:
This is a filesystem accessibility "map", ordered alphabetically by operating system. You may find this list a little chaotic; that is because the Linux sgmltools do not support tables.
AS YOU CAN SEE, THIS `MAP' IS NOT YET COMPLETE. I WILL TRY TO FINISH IT IN THE NEAR FUTURE.
FreeBSD: BSD FFS | Ext2 | HPFS | NTFS
Linux: AFFS | BeFS | BFS | Ext2 FS | BSD FFS | HPFS | Qnx4 FS | Xia
NetBSD: BSD FFS | FAT12/16 | ISO9660
NetWare 2.x: NWFS-286
NetWare 3.x, 4.x: NWFS-386 | ISO9660
NetWare 5.x: NWFS-386 | NSS | ISO9660
OS/2: Ext2 FS | FAT12/16/32 | HPFS | HPFS | ISO 9660 | JFS | VFAT
QNX 4: FAT12/16 | ISO 9660 | Qnx4 FS
SCO OpenServer: AFS | DTFS | EAFS | HTFS | ISO 9660 | S51K
SCO UnixWare: BFS | DTFS | ISO 9660 | System V | VxFS
Some contiguous filesystems: BFS, ISO9660 and extensions.
(todo)
Some FAT filesystems: FAT12/16/32, VFAT and the NetWare filesystem.
(todo)
(todo)
Some 'extent' filesystems: EFS and VxFS.
(todo)
Some filesystems which use B+ trees: HFS, NSS, Reiser FS and Spiralog filesystem.
File systems update their structural information (called metadata) by synchronous writes. Each metadata update may require many separate writes, and if the system crashes during the write sequence, the metadata may be left in an inconsistent state.
At the next boot the filesystem check utility (called fsck) must walk through the metadata structures, examining and repairing them. This operation takes a very long time on large filesystems. Moreover, the disk may not contain sufficient information to correct the structure, which results in misplaced or removed files.
A journaling file system uses a separate area called a log or journal. Before metadata changes are actually performed, they are logged to this separate area. The operation is then performed. If the system crashes during the operation, there is enough information in the log to "replay" the log record and complete the operation.
This approach does not require a full scan of the file system, yielding a very quick filesystem check on large file systems, generally a few seconds for a multiple-gigabyte file system. In addition, because all information for the pending operation is saved, no removals or lost-and-found moves are required. The disadvantage of journaling filesystems is that they are slower than other filesystems.
Some journaling filesystems: BeFS, HTFS, JFS, NSS, Spiralog filesystem, VxFS and XFS.
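The log-then-apply sequence described above can be illustrated with a toy model. This is only a sketch under loud assumptions: the "metadata" is a Python dictionary, the journal is an in-memory list, and the class and method names (JournaledStore, begin, apply, replay) are hypothetical. No real journaling filesystem is implemented this way; the point is only the ordering of the steps.

```python
class JournaledStore:
    """Toy model: metadata is a dict, the journal is a list of records."""

    def __init__(self):
        self.metadata = {}  # stands in for the on-disk metadata structures
        self.journal = []   # stands in for the separate log area

    def begin(self, updates):
        # Step 1: log the intended changes before touching the metadata.
        record = {"updates": dict(updates), "done": False}
        self.journal.append(record)
        return record

    def apply(self, record, crash_after=None):
        # Step 2: perform the operation, one metadata write at a time.
        # crash_after simulates a crash after that many writes.
        for i, (key, value) in enumerate(record["updates"].items()):
            if crash_after is not None and i >= crash_after:
                raise RuntimeError("simulated crash")
            self.metadata[key] = value
        record["done"] = True

    def replay(self):
        # Recovery: redo every logged operation that did not complete.
        # Only the journal is scanned, never the whole metadata.
        for record in self.journal:
            if not record["done"]:
                self.metadata.update(record["updates"])
                record["done"] = True
        self.journal.clear()


def demo():
    store = JournaledStore()
    record = store.begin({"inode 12 size": 4096, "block bitmap bit 77": 1})
    try:
        store.apply(record, crash_after=1)  # crash between the two writes
    except RuntimeError:
        pass
    before = dict(store.metadata)  # inconsistent: only one write landed
    store.replay()
    return before, dict(store.metadata)
```

Calling demo() returns the half-updated state seen right after the simulated crash and the completed state after replay. Recovery cost depends on the length of the journal rather than on the size of the filesystem, which is why the check is quick.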
This is useful for creating space for new operating systems, reorganising disk usage, copying data between hard disks, and "disk imaging" - replicating installations over many computers.
Parted has support for these operations:
Filesystem    detect  create  resize  copy  check
ext2            *       *       *1     *2     *3
fat             *       *       *4     *4     *
linux-swap      *       *       *      *      *
NOTES:
(1) The start of the partition must stay fixed for ext2.
(2) The partition you copy to must be bigger (or exactly the same size) as the partition you copy from.
(3) Limited checking is done when the filesystem is opened. This is the only checking at the moment. All commands (including resize) will gracefully fail, leaving the filesystem intact, if there are any errors in the file system (and the vast majority of errors in general).
(4) The size of the new partition, after resizing or copying, is restricted by the cluster size for fat (mainly affects FAT16). This is worse than you think, because you don't get to choose your cluster size (it's a bug in Windows, but you want compatibility, right?).
So, in practice, you can always shrink your partition (because Parted can shrink the cluster size), but you may not be able to grow the partition to the size you want. If you don't have any problems with using FAT32, you will always be able to grow the partition to the size you want.
Summary: you can always shrink your partition. If you can't use FAT32 for some reason, you may not be able to grow your partition.
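A rough calculation shows why the cluster size limits the size of a FAT16 partition. This sketch is not taken from Parted. It assumes the commonly documented FAT16 bounds of 4085 to 65524 data clusters and ignores the space taken by the boot sector, the FATs and the root directory; the function names are made up for illustration.

```python
# Assumption: a FAT16 volume holds between 4085 and 65524 data clusters.
FAT16_MIN_CLUSTERS = 4085
FAT16_MAX_CLUSTERS = 65524


def fat16_size_range(cluster_size):
    """Return (smallest, largest) data area in bytes for a cluster size."""
    return (FAT16_MIN_CLUSTERS * cluster_size,
            FAT16_MAX_CLUSTERS * cluster_size)


def fits_fat16(data_bytes, cluster_size):
    """True if a data area of data_bytes works with this cluster size."""
    smallest, largest = fat16_size_range(cluster_size)
    return smallest <= data_bytes <= largest


if __name__ == "__main__":
    for size in (2048, 4096, 8192, 16384, 32768):
        smallest, largest = fat16_size_range(size)
        print(f"{size:6d}-byte clusters: {smallest} .. {largest} bytes")
```

With 32768-byte clusters the upper bound is just under 2 GB, so a FAT16 partition cannot grow past that however much free disk space follows it, which is consistent with the advice above to use FAT32 when growing.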
Because I use only Intel x86 machines, any contributions (or non-x86 machine donation ;-) ) are very welcome. If you can provide any useful information, don't hesitate to mail me.
(todo)
(todo)
The UnixWare VTOC (Volume Table Of Contents) divides a disk partition into 16 logical partitions. The Linux kernel supports the UnixWare VTOC; you must select "UnixWare slices support (EXPERIMENTAL)" and recompile your kernel. Another way of reading a UnixWare disklabel is to use the GPL port of the prtvtoc(1) command, which is in the vxtools package.
(todo)
(todo)
Linux implementation is available here:
For more information about Veritas Volume Manager see http://www.veritas.com/.
See also: VxFS (Veritas Journaling Filesystem).
Logical Volume Manager is available in OS/2 WarpServer 5. It allows you to create linear volumes on several disks/partitions. Some people say that it's compatible with IBM AIX Logical Volume Manager.
StackVM is CrosStor's volume manager. Using StackVM the administrator can combine multiple physical disk slices into a single logical device known as a vdisk (short for virtual disk). The physical disks can be combined to form a concatenation, RAID 0 (stripe), RAID 1 (mirror), RAID 4 or RAID 5. In addition, a single disk partition can be subdivided into multiple simple vdisks. For more information see the CrosStor homepage at http://www.crosstor.com/.
NetWare volumes are used for NWFS-386 filesystem.
Windows 95/98 and Windows NT/2000 store long filenames on FAT in special directory entries that have the ReadOnly, Hidden, System and Volume attributes all set, so if you access a FAT volume from DOS you don't see these "files". These special entries have this mad structure:
byte        sequence number for slot
string(10)  first 5 characters in name
byte        attribute byte
byte        always 0
byte        checksum for 8.3 alias
string(12)  6 more characters in name
word        starting cluster number, 0 in long slots
string(4)   last 2 characters in name
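The field list above adds up to one 32-byte directory slot, which can be unpacked as in the following sketch. Some details are assumptions rather than statements from this HOWTO: that the name characters are 16-bit little-endian Unicode (two bytes per character, as the field sizes suggest), that an unused tail is terminated by 0x0000 and padded with 0xFFFF, and that the four attributes have the usual bit values 0x01, 0x02, 0x04 and 0x08. The function name is hypothetical.

```python
import struct

# 1 + 10 + 1 + 1 + 1 + 12 + 2 + 4 = 32 bytes, little-endian, no padding.
SLOT = struct.Struct("<B10sBBB12sH4s")

# Assumption: ReadOnly | Hidden | System | Volume attribute bits.
LONG_NAME_ATTR = 0x01 | 0x02 | 0x04 | 0x08


def parse_long_slot(raw):
    """Decode one long-filename slot into its fields."""
    seq, part1, attr, _zero, checksum, part2, cluster, part3 = SLOT.unpack(raw)
    if attr != LONG_NAME_ATTR:
        raise ValueError("not a long-filename slot")
    # Assumption: 5 + 6 + 2 characters of 16-bit little-endian Unicode.
    chars = (part1 + part2 + part3).decode("utf-16-le")
    # Assumption: the name ends at the first 0x0000; 0xFFFF is padding.
    name_part = chars.split("\x00", 1)[0].rstrip("\uffff")
    return {
        "sequence": seq,
        "checksum": checksum,
        "cluster": cluster,  # 0 in long slots, according to the table
        "name_part": name_part,
    }
```

Each slot carries at most 13 characters, so a longer name is spread over several slots that have to be joined in sequence-number order.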
Problems occur when you delete or modify a file with a long name from a system without VFAT support, because only the DOS 8+3 entry will be deleted or modified. Scandisk from Windows 95/98 can repair this problem.
Linux has its own FAT extension, called UMSDOS, which gives you long filenames, permissions and owners, links and special devices on a FAT partition. Each directory contains a file named "--linux-.---", in which the long names and other necessary fields are stored. For more information see the file /usr/src/linux/Documentation/filesystems/umsdos.txt. The author of the Linux umsdos driver is Jacques Gelinas < jacques@solucorp.qc.ca> and it is currently maintained by Matija Nalis < mnalis@jagor.srce.hr>.
OS/2 Warp versions 3, 4 and 5 store long filenames and extended attributes on a FAT volume in the files "\ea data. sf" and "\wp root. sf" (both files are in the root directory of the filesystem). AFAIK there is no known implementation of OS/2 EAs for any other OS. If you can supply any information about the EA structure, don't hesitate to mail it to me.
Star LFN is an emulator that allows programs running under DOS 4.0 or above to use the long filename functions present in Windows'95 DOS boxes. Currently, it can only read and write long filenames from and into a system+hidden file, which means you can neither read nor write real Windows'95 long filenames. For more information see http://sta.c64.org/starlfn.html.
Some people say that Microsoft has released a driver called LFNDOS that provides the Microsoft Long Filename API under DOS. If you know where this driver can be downloaded, please send me e-mail.
Under Windows95, a DOS program can use long filenames by calling a set of interrupt functions, which Windows provides. For example, COMMAND.COM will allow long filenames when run as a DOS Prompt from Windows, but not if you restart in MS-DOS mode. Other programs such as EDIT.COM and all DJGPP programs use long filenames if available.
The author probably won't be releasing any more versions of fsresize, because he is working on parted - a Partition Magic clone. It will be able to resize, copy, create and check filesystems/partitions.
Good HPFS links:
iHPFS makes it possible for OS/2 users to use their HPFS partitions when they boot plain DOS. The HPFS partition is assigned a drive letter, and can be accessed like any DOS drive. iHPFS is restricted to read-only access.
This program is no longer being developed, because the author doesn't use OS/2. If you are willing to maintain the program, let him know.
If you have a kernel with HPFS support, say "Y"es to 'OS/2 HPFS filesystem support' in the Filesystems submenu. Then recompile the kernel using 'make dep bzImage', reboot and try to mount your HPFS partition (e.g. mount /dev/hda2 /mnt -t hpfs).
References:
NTPwd contains command line tools to access an NTFS partition; it is a DOS port of the driver used by Linux. It also contains a small utility to change the NT password.
The author now works for Be Inc, so you will not see many more updates to his NTFS and ext2 filesystem support on the web. The drivers will be pulled into future BeOS releases.
The Extended filesystem (ext fs), second extended filesystem (ext2fs) and third extended filesystem (ext3fs) were designed and implemented on Linux by Rémy Card, Laboratoire MASI--Institut Blaise Pascal, < card@masi.ibp.fr>, Theodore Ts'o, Massachusetts Institute of Technology, < tytso@mit.edu> and Stephen Tweedie, University of Edinburgh, < sct@redhat.com>.
This is the old filesystem used in early Linux systems.
The Second Extended File System is probably the most widely used filesystem in the Linux community. It provides standard Unix file semantics and advanced features. Moreover, thanks to the optimizations included in the kernel code, it is robust and offers excellent performance.
Since Ext2fs has been designed with evolution in mind, it contains hooks that can be used to add new features. Some people are working on extensions to the current filesystem: access control lists conforming to the Posix semantics, undelete, and on-the-fly file compression.
Ext2fs was first developed and integrated in the Linux kernel and is now actively being ported to other operating systems. An Ext2fs server running on top of the GNU Hurd has been implemented. People are also working on an Ext2fs port in the LITES server, running on top of the Mach microkernel and in the VSTa operating system. Last, but not least, Ext2fs is an important part of the Masix operating system, currently under development by one of the authors.
The Second Extended File System has been designed and implemented to fix some problems present in the first Extended File System. Our goal was to provide a powerful filesystem, which implements Unix file semantics and offers advanced features.
Of course, we wanted Ext2fs to have excellent performance. We also wanted to provide a very robust filesystem in order to reduce the risk of data loss in intensive use. Last, but not least, Ext2fs had to include provision for extensions to allow users to benefit from new features without reformatting their filesystem.
Ext2fs supports the standard Unix file types: regular files, directories, device special files and symbolic links.
Ext2fs is able to manage filesystems created on really big partitions. While the original kernel code restricted the maximal filesystem size to 2 GB, recent work in the VFS layer has raised this limit to 4 TB. Thus, it is now possible to use big disks without the need to create many partitions.
Ext2fs provides long file names. It uses variable length directory entries. The maximal file name size is 255 characters. This limit could be extended to 1012 if needed.
Ext2fs reserves some blocks for the super user (root). Normally, 5% of the blocks are reserved. This allows the administrator to recover easily from situations where user processes fill up filesystems.
In addition to the standard Unix features, Ext2fs supports some extensions which are not usually present in Unix filesystems.
File attributes allow users to modify the kernel behavior when acting on a set of files. One can set attributes on a file or on a directory. In the latter case, new files created in the directory inherit these attributes.
BSD or System V Release 4 semantics can be selected at mount time. A mount option allows the administrator to choose the file creation semantics. On a filesystem mounted with BSD semantics, files are created with the same group id as their parent directory. System V semantics are a bit more complex: if a directory has the setgid bit set, new files inherit the group id of the directory and subdirectories inherit the group id and the setgid bit; in the other case, files and subdirectories are created with the primary group id of the calling process.
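The two sets of rules can be written down as a small decision function. This is a sketch of the rules as stated in the paragraph above, not the kernel code; the function and parameter names are hypothetical. The text does not say whether the setgid bit is propagated under BSD semantics, so the sketch assumes that it is not.

```python
def creation_group(bsd_semantics, dir_gid, dir_setgid, process_gid, is_dir):
    """Return (group id, setgid bit inherited) for a newly created object.

    bsd_semantics: True if the filesystem is mounted with BSD semantics
    dir_gid:       group id of the parent directory
    dir_setgid:    True if the parent directory has the setgid bit set
    process_gid:   primary group id of the calling process
    is_dir:        True when creating a subdirectory, False for a file
    """
    if bsd_semantics:
        # BSD: always take the group of the parent directory.
        # Assumption: the setgid bit itself is not propagated.
        return dir_gid, False
    if dir_setgid:
        # System V with a setgid parent: inherit the group;
        # subdirectories also inherit the setgid bit.
        return dir_gid, is_dir
    # System V otherwise: use the caller's primary group.
    return process_gid, False
```

For example, with System V semantics, a subdirectory created in a setgid directory owned by group 50 gets group 50 and the setgid bit, while the same call in a directory without setgid gets the caller's own group.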
BSD-like synchronous updates can be used in Ext2fs. A mount option allows the administrator to request that metadata (inodes, bitmap blocks, indirect blocks and directory blocks) be written synchronously to the disk when they are modified. This can be useful to maintain strict metadata consistency, but it leads to poor performance. Actually, this feature is not normally used, since in addition to the performance loss associated with using synchronous updates of the metadata, it can cause corruption in the user data which will not be flagged by the filesystem checker.
Ext2fs allows the administrator to choose the logical block size when creating the filesystem. Block sizes can typically be 1024, 2048 and 4096 bytes. Using big block sizes can speed up I/O since fewer I/O requests, and thus fewer disk head seeks, need to be done to access a file. On the other hand, big blocks waste more disk space: on the average, the last block allocated to a file is only half full, so as blocks get bigger, more space is wasted in the last block of each file. In addition, most of the advantages of larger block sizes are obtained by Ext2 filesystem's preallocation techniques.
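The space trade-off can be put into numbers. The sketch below only restates the "half a block per file on average" estimate from the paragraph above; the function names and the figure of 100000 files are made up for illustration.

```python
def tail_waste(file_size, block_size):
    """Unused bytes in the last block allocated to one file."""
    return -file_size % block_size


def expected_total_waste(num_files, block_size):
    """Estimate assuming the last block of each file is half full."""
    return num_files * block_size // 2


if __name__ == "__main__":
    for block_size in (1024, 2048, 4096):
        wasted = expected_total_waste(100000, block_size)
        print(f"{block_size}-byte blocks: about {wasted} bytes wasted")
```

Under this estimate, each doubling of the block size doubles the space lost to partly filled last blocks, for the same set of files.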
Ext2fs implements fast symbolic links. A fast symbolic link does not use any data block on the filesystem. The target name is not stored in a data block but in the inode itself. This policy can save some disk space (no data block needs to be allocated) and speeds up link operations (there is no need to read a data block when accessing such a link). Of course, the space available in the inode is limited so not every link can be implemented as a fast symbolic link. The maximal size of the target name in a fast symbolic link is 60 characters. We plan to extend this scheme to small files in the near future.
Ext2fs keeps track of the filesystem state. A special field in the superblock is used by the kernel code to indicate the status of the file system. When a filesystem is mounted in read/write mode, its state is set to ``Not Clean''. When it is unmounted or remounted in read-only mode, its state is reset to ``Clean''. At boot time, the filesystem checker uses this information to decide if a filesystem must be checked. The kernel code also records errors in this field. When an inconsistency is detected by the kernel code, the filesystem is marked as ``Erroneous''. The filesystem checker tests this to force the check of the filesystem regardless of its apparently clean state.
Always skipping filesystem checks may sometimes be dangerous, so Ext2fs provides two ways to force checks at regular intervals. A mount counter is maintained in the superblock. Each time the filesystem is mounted in read/write mode, this counter is incremented. When it reaches a maximal value (also recorded in the superblock), the filesystem checker forces the check even if the filesystem is ``Clean''. A last check time and a maximal check interval are also maintained in the superblock. These two fields allow the administrator to request periodical checks. When the maximal check interval has been reached, the checker ignores the filesystem state and forces a filesystem check.
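Taken together, the state field, the mount counter and the check interval lead to a decision that can be sketched as follows. The field names are hypothetical stand-ins for the superblock fields described above, not the real on-disk names, and treating an interval of 0 as "disabled" is an assumption of this sketch.

```python
from dataclasses import dataclass

CLEAN, NOT_CLEAN, ERRONEOUS = "clean", "not clean", "erroneous"


@dataclass
class Superblock:
    state: str             # CLEAN, NOT_CLEAN or ERRONEOUS
    mount_count: int       # read/write mounts since the last check
    max_mount_count: int   # forced check when this count is reached
    last_check: float      # time of the last check, seconds since epoch
    check_interval: float  # maximal time between checks, in seconds


def must_check(sb, now):
    """Decide at boot time whether the checker should run."""
    if sb.state != CLEAN:
        return True  # not unmounted cleanly, or the kernel saw an error
    if sb.mount_count >= sb.max_mount_count:
        return True  # mount counter reached its maximal value
    # Assumption: an interval of 0 disables time-based checks.
    if sb.check_interval > 0 and now - sb.last_check >= sb.check_interval:
        return True  # maximal check interval has been reached
    return False
```

A filesystem marked clean is therefore still checked once either the mount counter or the elapsed time passes its limit.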
An attribute allows the users to request secure deletion on files. When such a file is deleted, random data is written in the disk blocks previously allocated to the file. This prevents malicious people from gaining access to the previous content of the file by using a disk editor.
Last, new types of files inspired from the 4.4 BSD filesystem have recently been added to Ext2fs. Immutable files can only be read: nobody can write or delete them. This can be used to protect sensitive configuration files. Append-only files can be opened in write mode but data is always appended at the end of the file. Like immutable files, they cannot be deleted or renamed. This is especially useful for log files which can only grow.
The physical structure of Ext2 filesystems has been strongly influenced by the layout of the BSD filesystem. A filesystem is made up of block groups. Block groups are analogous to BSD FFS's cylinder groups. However, block groups are not tied to the physical layout of the blocks on the disk, since modern drives tend to be optimized for sequential access and hide their physical geometry from the operating system.
,---------+---------+---------+---------+---------,
| Boot    | Block   | Block   |   ...   | Block   |
| sector  | group 1 | group 2 |         | group n |
`---------+---------+---------+---------+---------'
Each block group contains a redundant copy of crucial filesystem control information (the superblock and the filesystem descriptors) and also contains a part of the filesystem (a block bitmap, an inode bitmap, a piece of the inode table, and data blocks). The structure of a block group is represented in this table:
,---------+---------+---------+---------+---------+---------,
| Super   | FS      | Block   | Inode   | Inode   | Data    |
| block   | desc.   | bitmap  | bitmap  | table   | blocks  |
`---------+---------+---------+---------+---------+---------'
Using block groups is a big win in terms of reliability: since the control structures are replicated in each block group, it is easy to recover from a filesystem where the superblock has been corrupted. This structure also helps to get good performance: by reducing the distance between the inode table and the data blocks, it is possible to reduce the disk head seeks during I/O on files.
In Ext2fs, directories are managed as linked lists of variable length entries. Each entry contains the inode number, the entry length, the file name and its length. By using variable length entries, it is possible to implement long file names without wasting disk space in directories.
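Walking such a list of variable-length entries might look like the sketch below. The exact byte layout is an assumption, not something this HOWTO specifies: a 32-bit inode number, a 16-bit entry length and an 8-bit name length, all little-endian, followed by the name. The byte after the name length is skipped, since it is either the high byte of a 16-bit name length or a file type code depending on the filesystem revision. Treating inode 0 as an unused entry and rounding entries up to a multiple of 4 bytes are also assumptions, and the helper names are hypothetical.

```python
import struct

# Assumed header: inode (u32), entry length (u16), name length (u8), 1 skipped byte.
HEADER = struct.Struct("<IHBx")


def pack_entry(inode, name, rec_len=None):
    """Build one entry; rec_len may exceed the minimum to fill a block."""
    raw_name = name.encode()
    minimum = (HEADER.size + len(raw_name) + 3) & ~3  # assumed 4-byte alignment
    rec_len = rec_len or minimum
    body = raw_name.ljust(rec_len - HEADER.size, b"\0")
    return HEADER.pack(inode, rec_len, len(raw_name)) + body


def iter_entries(block):
    """Yield (inode, name) for every used entry in a directory block."""
    offset = 0
    while offset + HEADER.size <= len(block):
        inode, rec_len, name_len = HEADER.unpack_from(block, offset)
        if rec_len < HEADER.size:
            break  # corrupt entry; stop rather than loop forever
        if inode != 0:  # assumption: inode 0 marks an unused entry
            start = offset + HEADER.size
            yield inode, block[start:start + name_len].decode(errors="replace")
        offset += rec_len  # the entry length links to the next entry
```

The entry length acts as the "next" pointer of the list: a short name costs only a few bytes, and an entry can be removed by adding its length to that of the entry before it.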
In Linux, the Ext2fs kernel code contains many performance optimizations, which tend to improve I/O speed when reading and writing files.
Ext2fs takes advantage of the buffer cache management by performing readaheads: when a block has to be read, the kernel code requests the I/O on several contiguous blocks. This way, it tries to ensure that the next block to read will already be loaded into the buffer cache. Readaheads are normally performed during sequential reads on files and Ext2fs extends them to directory reads, either explicit reads (readdir(2) calls) or implicit ones (namei kernel directory lookup).
Ext2fs also contains many allocation optimizations. Block groups are used to cluster together related inodes and data: the kernel code always tries to allocate data blocks for a file in the same group as its inode. This is intended to reduce the disk head seeks made when the kernel reads an inode and its data blocks.
When writing data to a file, Ext2fs preallocates up to 8 adjacent blocks when allocating a new block. Preallocation hit rates are around 75% even on very full filesystems. This preallocation achieves good write performance under heavy load. It also allows contiguous blocks to be allocated to files, which speeds up future sequential reads.
These two allocation optimizations produce a very good locality of:
Ext3 supports the same features as Ext2, but also includes journaling. You can download a pre- version from ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/.
The author now works for Be Inc, so you will not see many more updates to his ext2 and NTFS filesystem support on the web. The drivers will be pulled into future BeOS releases.
All Macintosh storage devices except floppy disks are partitioned into one or more volumes. Volumes can contain four kinds of items: files, directories, directory threads and file threads. Each item is described by a catalog record which is analogous to a Unix inode. Catalog records are organized in the on-disk catalog B-Tree. Directory contents are derived from searching the catalog B-Tree. Only a file can occupy space outside of its catalog record.
A Macintosh "file" contains two components, or forks. The resource fork is an indexed file containing code segments, menu items, dialog boxes, etc. The data fork has the "stream of bytes" semantics of a Unix file contents. Each fork is comprised of one or more extents or contiguous runs of blocks. An extent descriptor encodes an extent's starting block and length into a 32bit quantity. The first extent record (three extent descriptors) of each fork is a part of the file's catalog record. Any further extent records are kept in the extents overflow B-Tree.
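An extent record of three descriptors can be packed and unpacked as in this sketch. The paragraph above only says that a descriptor is a 32-bit quantity holding a starting block and a length; splitting it into two 16-bit big-endian halves (start first, then block count) is an assumption here, and the function names are hypothetical.

```python
import struct

# Assumed layout: three (start block, block count) pairs of 16-bit
# big-endian integers, 12 bytes per extent record.
EXTENT_RECORD = struct.Struct(">6H")


def pack_extent_record(extents):
    """Pack up to three (start, count) extents, zero-filling the rest."""
    if len(extents) > 3:
        raise ValueError("an extent record holds at most three extents")
    padded = list(extents) + [(0, 0)] * (3 - len(extents))
    return EXTENT_RECORD.pack(*[value for pair in padded for value in pair])


def unpack_extent_record(raw):
    """Return the three (start, count) pairs stored in a record."""
    values = EXTENT_RECORD.unpack(raw)
    return [(values[i], values[i + 1]) for i in range(0, 6, 2)]


def fork_blocks(extents):
    """Expand extents into the ordered list of blocks that make up a fork."""
    blocks = []
    for start, count in extents:
        blocks.extend(range(start, start + count))
    return blocks
```

A fork that needs more than three contiguous runs does not fit in one record, which is where the extents overflow B-Tree mentioned above comes in.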
In addition to file and B-Tree extents, a volume also contains two boot blocks, a volume information block, and a free space bitmap. There is a remarkable amount of redundancy in the on-disk data structures, which improves crash recovery. While not strictly a part of the filesystem, it should be noted that several catalog record fields are reserved for the exclusive use of Finder, a program which handles user access to the filesystem and automatically maintains associations between applications and data files. Thus, HFS must also maintain this Finder info.
Every file and directory on an HFS volume has an identification number, similar to an inode number in the Unix filesystem. However, a file or directory is named by its parent's identification number and the file or directory's file name, which is a 32 character string that can contain nulls. This combination is the search key to the volume's catalog B-Tree. The catalog B-Tree differs from a traditional B-Tree structure in that all the nodes at each level of the B-Tree are linked together to form a doubly linked list and all of the records are in the leaf nodes. These variations permit accessing many items in the same directory by traversing the leaves using the linked list. Strictly speaking, the HFS B-Trees are a variant of B+-Trees although Apple's technical documentation calls them B*-Trees.
Each directory, including the root directory, contains its directory thread, which has the empty filename. The directory thread record contains the name of the directory and the id of the parent of the directory. Similarly, file threads contain the name of a file and the id of the directory they are in. While every directory must contain a directory thread, file threads are very uncommon. In fact, both are examples of HFS redundancy: for undamaged trees, threads are not strictly necessary. Both file and directory records contain 32 bytes of information used by Finder. The first three extent descriptors for the catalog B-Tree are kept in the volume information block. If the catalog B-Tree file grows beyond three extents, the remaining extent descriptors are kept in the extents overflow.
The HFS and HFS+ (also called Sequoia) filesystems are well documented. The best source of technical information about HFS is the Inside Macintosh series of books. Look at http://developer.apple.com/techpubs/mac/Files/Files-99.html. The HFS+ filesystem is described in Technote 1150, available online at http://developer.apple.com/technotes/tn/tn1150.html. A lot of information is also available in other technotes. These links are collected by Paul H. Hargrove:
HFS/2 lets OS/2 users seamlessly read and write files on diskettes formatted with the Hierarchical File System, the file system used by Macintosh computers. With HFS/2, Macintosh diskettes can be used just as if they were regular diskettes.
This program is no longer being developed, because the author doesn't use OS/2. If you are willing to maintain the program, let him know.
The hfsutils package contains a set of command-line utilities such as hformat, hmount, hdir, hcopy, etc. They allow read-write access of files and directories on HFS volumes.
Useful ISO-9660 links:
Extensions allowing long filenames and Unix-style symbolic links.
Useful RockRidge links:
Joliet is a Microsoft extension to the ISO 9660 filesystem that allows Unicode characters to be used in filenames. This is a benefit when handling internationalization. Like the Rock Ridge extensions, Joliet also allows long filenames.
Hybrid CDs contain three filesystems on one disc: ISO9660/RR, Joliet and HFS. Such CD-ROMs are accessible under DOS, Unix, Macintosh and Windows 9x/NT. All three filesystems use the same data; only the metadata are stored on the disc three times.
(todo)
The Acorn Disc Filing System is the standard filesystem of the RiscOS operating system which runs on Acorn's ARM-based Risc PC systems and the Acorn Archimedes range of machines.
Linux kernel 2.1.x+ supports this filesystem. The author of the Linux filesystem implementation is Russell King < rmk@arm.uk.linux.org>.
The Fast File System (FFS) is the common filesystem used on hard disks by Amiga(tm) systems since AmigaOS Version 1.3 (34.20).
Linux kernel 2.1.x+ supports this filesystem. The author of the Linux filesystem implementation is Ray Burr < ryb@nightmare.com>.
BeFS is the journaling filesystem used in BeOS. For more information about BeFS see the book Practical File System Design with the Be File System or the BeFS Linux driver source code.
The UnixWare BFS filesystem type is a special-purpose filesystem. It was designed for loading and booting the UnixWare kernel. BFS was designed as a contiguous filesystem. BFS supports only one (root) directory and you can create only regular files; no subdirectories or special files such as devices or sockets can be created.
For more information about BFS see http://uw7doc.sco.com/FS_admin/_The_bfs_File_System_Type.html.
You can access BFS filesystem from Linux:
There is also my old implementation, which is now obsolete. My plan is to port this code to FreeBSD:
This is the new name for the High Throughput Filesystem (HTFS). For more information see the CrosStor homepage at http://www.crosstor.com.
The goals in designing the Desktop File System were influenced by an impression of what the environment was like for small computer systems. DTFS compresses the data stored in regular files to reduce disk space requirements (directories remain uncompressed). Compression is performed a page at a time and occurs 'on-the-fly'. DTFS supports LZW and no-compression, but you can add your own algorithms. Some space is saved by not pre-allocating inodes. Any disk block is fair game to be allocated as an inode. Each inode is stored as a B+tree. For more information see the DTFS USENIX paper (you can download it from ftp://ftp.crosstor.com/pub/DTFS/papers/).
Read/Write commercial driver available from CrosStor for UnixWare and SUN Solaris:
The Enhanced Filing System project aims to create a new filing system for Linux, and eventually other OSs, which will allow the administrator to define mountable "file systems" on a set of block devices (either hard drives or partitions). The aim is to allow file systems to be added to or removed from the partition set while the system is running, and to allow partitions to be added to a set (or removed, if the remaining partitions have enough space to contain all the data) while the system is running. The two main aims are to allow a number of mountable file systems to share the same pool of storage space (i.e. have the user home dirs on the same drive as the news spool but have separate accounting for them), and to allow the easy addition of more hard drives to provide more space.
Some other features that authors want to implement are logging/journaling, support for as many OSs as possible (although all work will be initially done on Linux), and quotas in the FS so we don't need to waste ages running a silly quotacheck program at boot - the logging should avoid quotacheck the same way it avoids fsck! They want to be able to boot a system with 10gig of news spread over 4 hard drives with full quotas AFTER a power failure with less than 20 seconds for mounting file systems!
Homepage of Enhanced FS is at http://www.coker.com.au/~russell/enh/. Contact Russell Coker < russell@coker.com.au> for more information.
The Extent File System (efs) is Silicon Graphics' early block-device filesystem, widely used on pre-6.0 versions of IRIX. Since 6.0, xfs has been bundled with IRIX and users are being encouraged to migrate to xfs filesystems. IRIX support for efs will be read-only in versions of IRIX beyond 6.5, however efs is still very much in use on SGI software distribution CDs.
There are two kernel modules for Linux to access the EFS filesystem.
Original efsmod is also available:
Useful links:
This is the native filesystem for most BSD unixes (FreeBSD, NetBSD, OpenBSD, Sun Solaris, ...).
See also: SFS, secure filesystem, UFS.
This is a UNIX(tm) operating system style file system designed for the RS/6000 SP(tm) server. It allows applications on multiple nodes to share file data. GPFS supports very large file systems and stripes data across multiple disks for higher performance. GPFS is based on a shared disk model which provides lower overhead access to disks not directly attached to the application nodes and uses a distributed locking protocol to provide full data coherence for access from any node. It offers many of the standard AIX(tm) file system interfaces allowing most applications to execute without modification or recompiling. These capabilities are available while allowing high speed access to the same data from all nodes of the SP system, and providing full data coherence for operations occurring on the various nodes. GPFS attempts to continue operation across various node and component failures assuming that sufficient resources exist to continue.
This is the second hfs that appears in this howto. It is used in older HP-UX versions.
Useful links:
Read/Write commercial driver available from CrosStor:
A log-structured filesystem implementation for Linux called d(t)fs:
There will also be a dtfs mailing list that will be announced on the homepage. For more information you can have a look at: http://www.xss.co.at/mailman/listinfo.cgi/dtfs
MFS is the original Macintosh filesystem. It has been replaced by HFS / HFS+. If you can provide further information, please mail me.
This is the native Minix filesystem. It was also used in the first versions of Linux.
NWFS is the native filesystem of the Novell NetWare OS. It is a modified FAT-based filesystem. Two variants of this filesystem exist: the 16-bit NWFS 286 is used in NetWare 2.x, while NetWare 3.x, 4.x and 5 use the 32-bit NWFS 386.
(todo)
(todo)
This is a new 64-bit journaling filesystem using balanced tree algorithms. It is used in Novell NetWare 5.
This is the native filesystem of OpenVMS and VMS.
This filesystem is used in QNX. Two major versions of the filesystem exist: version 2 is used by QNX 2 and version 4 by QNX 4. QNX 4 doesn't support version 2 and vice versa.
The QNX4 filesystem is now accessible from Linux 2.1.x+. Say "Y"es to 'QNX filesystem support' when configuring the kernel.
Reiserfs is a file system using a variant on classical balanced tree algorithms. The results when compared to the ext2fs conventional block allocation based file system running under the same operating system and employing the same buffering code suggest that these algorithms are more effective for large files and small files not near node size in time performance, become less effective in time performance and more significantly effective in space performance as one approaches files close to the node size, and become markedly more effective in both space and time as file size decreases substantially below node size (4k), reaching order of magnitude advantages for file sizes of 100 bytes. The improvement in small file space and time performance suggests that we may now revisit a common OS design assumption that one should aggregate small objects using layers above the file system layer.
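To make the space argument concrete, here is a rough sketch of how much space is lost when every file must occupy a whole number of blocks. It assumes the 4k node size quoted above and ignores inodes and other metadata; the function names and the sample file sizes are illustrative only and are not taken from the Reiserfs measurements.

```python
BLOCK_SIZE = 4096  # the 4k node/block size quoted in the text


def allocated_bytes(file_size, block_size=BLOCK_SIZE):
    """Bytes consumed if a file must occupy whole blocks (assumed model)."""
    if file_size == 0:
        return 0
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def slack_bytes(file_size, block_size=BLOCK_SIZE):
    """Bytes allocated to the file but not holding any file data."""
    return allocated_bytes(file_size, block_size) - file_size


if __name__ == "__main__":
    for size in (100, 1000, 4096, 5000):
        print(size, allocated_bytes(size), slack_bytes(size))
```

Under this simplified model a 100-byte file still takes a full 4096-byte block, so 3996 bytes are slack. This is the kind of overhead that packing small files together is meant to avoid.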
Useful links:
Sony's incremental packet-writing filesystem.
The author of the Linux RomFS implementation is Janos Farkas <chexum@shadow.banki.hu>. For more information see the /usr/src/linux/Documentation/filesystems/romfs.txt file.
The sfs filesystem type is a variation of the FFS filesystem type. The boot block, superblock, storage blocks, and free blocks for the sfs filesystem type are, at the administrative level, identical to those for FFS. The inodes differ from FFS inodes, however. Each odd-numbered inode is reserved for security information, which includes Access Control List information. I'm not sure if SFS has any other abilities though.
SFS links:
Spiralog is a 64-bit high-performance filesystem for OpenVMS. Spiralog combines log-structured technology with more traditional B-tree technology to provide a general abstraction. The B-tree mapping mechanism uses write-ahead logging to give stability and recoverability guarantees.
Spiralog-related links at Digital:
The homepage of the System V Linux project is at http://www.knm.org.pl/prezes/sysv.html. The maintainer of this project is <kgb@manjak.knm.pl.org>.
The Acer Fast Filesystem is used on SCO Open Server. It is similar to the System V Release 4 filesystem, but it uses bitmaps instead of a chained free list of blocks.
The AFS filesystem can be 'extended' to handle file names up to 255 characters, but directory entries still have 14-char names. This filesystem type is used on SCO Open Server.
This filesystem is used in UnixWare. It's probably System V compatible, but I haven't verified it yet. For more information see http://uw7doc.sco.com/FS_admin/_The_s5_File_System_Type.html.
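The difference between the two free-space schemes can be sketched roughly as follows. This is a toy in-memory model, not the actual AFS or System V on-disk layout: the class names are made up for illustration, and the free list is simplified to one "next" pointer per free block, whereas System V keeps batches of free block numbers chained through free blocks.

```python
class BitmapAllocator:
    """Free space tracked as one bit per block (1 = in use)."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.bits = bytearray((nblocks + 7) // 8)

    def is_used(self, n):
        return bool(self.bits[n >> 3] & (1 << (n & 7)))

    def alloc(self):
        # Scan for the first clear bit; any block's state can be
        # inspected directly without walking a chain.
        for n in range(self.nblocks):
            if not self.is_used(n):
                self.bits[n >> 3] |= 1 << (n & 7)
                return n
        raise RuntimeError("no free blocks")

    def free(self, n):
        self.bits[n >> 3] &= ~(1 << (n & 7)) & 0xFF


class FreeListAllocator:
    """Free space tracked as a chain: each free block names the next one."""

    def __init__(self, nblocks):
        # next_free[n] stands in for the pointer stored inside free block n.
        self.next_free = {
            n: (n + 1 if n + 1 < nblocks else None) for n in range(nblocks)
        }
        self.head = 0 if nblocks > 0 else None

    def alloc(self):
        if self.head is None:
            raise RuntimeError("no free blocks")
        n = self.head
        self.head = self.next_free.pop(n)  # follow the chain
        return n

    def free(self, n):
        # A freed block is pushed onto the front of the chain.
        self.next_free[n] = self.head
        self.head = n
```

With the chained list, allocation order depends on the order in which blocks were freed; with the bitmap, the allocator can look at any block's state directly and can, in principle, prefer blocks near a given location.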
This filesystem type is used on Version 7 Unix for PDP-11 machines.
Philips' standard for encoding disc and track data on audio CDs.
There is a Linux UDF filesystem driver:
Note: People often incorrectly call the BSD Fast Filesystem UFS. FFS and UFS are *different* filesystems. All modern Unixes use the FFS filesystem, not UFS! UFS was used in early BSD versions. You can download source code at http://minnie.cs.adfa.edu.au/TUHS/.
Useful links:
See also: BSD FFS
The V7 Filesystem was used in the Seventh Edition of the UNIX Time Sharing System (about 1980). For more information see the 7th Ed. source code, which is available from the Unix Archive: http://minnie.cs.adfa.edu.au/TUHS/.
This is a commercial filesystem developed by Veritas Inc. You can see it in HP-UX, SCO UnixWare, Solaris and probably other systems. It has very interesting features: extent-based allocation, journaling, access control lists (ACLs), support for large files up to 2 terabytes, online backup (snapshot filesystem), BSD-style quotas and many more.
Three versions of VxFS are available:
Version 1: This is the original VxFS, not commonly in use.
Version 2: Support for filesets and dynamic inode allocation.
Version 4: Latest version, supports large files and quotas.
Note that the HP-UX, Solaris and UnixWare versions use slightly different structures, so you may not be able to read a VxFS filesystem when you connect it to a different system.
VxFS related links:
See also: VxVM (Veritas volume manager) and journaling filesystems.
Unix command-line utilities for accessing VxFS versions 2 and 4 are available under the GNU GPL:
I (mhi) also plan to write a VxFS Linux kernel driver.
AFAIK, Rodney Ramdas <rodney@quicknet.nl> is working on a VxFS driver for FreeBSD. I don't know the current status of his project, so if you want more info, contact him directly.
XFS(tm) is the next-generation file system for Silicon Graphics[TM] systems, from desktop workstations to supercomputers. XFS provides full 64-bit file capabilities that scale easily to handle extremely large files and file systems that grow to 1 terabyte. The XFS file system integrates volume management, guaranteed rate I/O, and journaling technology for fast, reliable recovery. File systems can be backed up while still in use, significantly reducing administrative overhead.
XFS is designed for very high performance; sustained throughput in excess of 300MB per second has been demonstrated on CHALLENGE systems. The XFS file system scales in performance to match the CHALLENGE MP architecture. Traditional files, directories, and file systems have reduced performance as they grow in size. With the XFS file system, there is no performance penalty. For example, XFS directories have been tested with up to 32 million files in a single directory.
XFS is a journalled file system. It logs changes to the inodes, directories and bitmaps to the disk before the original entries are updated. Should the system crash before the updates are done they can be recreated using the log and updated as intended.
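The log-then-update idea described above can be sketched in a few lines. This is only a toy in-memory model written for this HOWTO, not XFS's log format: the class and method names are hypothetical, the "disk" is a Python dict, and the crash is simulated with an exception. Real journals also have to deal with transaction IDs, ordering and partially written log records.

```python
class JournaledStore:
    """Toy metadata store that logs changes before applying them."""

    def __init__(self):
        self.log = []   # stands in for the on-disk log
        self.data = {}  # stands in for inodes, directories and bitmaps

    def update(self, changes, crash_after=None):
        # 1. Write the intended changes to the log first.
        entry = dict(changes)
        self.log.append(entry)
        # 2. Only then update the original entries, one at a time.
        for i, (key, value) in enumerate(entry.items()):
            if crash_after is not None and i >= crash_after:
                raise RuntimeError("simulated crash")
            self.data[key] = value
        # 3. All entries updated: the log record is no longer needed.
        self.log.remove(entry)

    def recover(self):
        # After a crash, replay whatever is still in the log so that
        # half-finished updates are completed as intended.
        for entry in self.log:
            self.data.update(entry)
        self.log.clear()


if __name__ == "__main__":
    store = JournaledStore()
    try:
        store.update({"inode 7": "size=10", "dir /": "+a"}, crash_after=1)
    except RuntimeError:
        pass
    print(store.data)  # only the first change reached the data
    store.recover()
    print(store.data)  # both changes are present after replaying the log
```

Because replaying a log record simply rewrites the same values, recovery is safe to run even if some of the changes had already reached the data before the crash.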
XFS uses a space manager to allocate disk space for the file system and control the inodes. It uses a namespace manager to control allocation of directory files. These managers use B-tree indexing to store file location information, significantly decreasing the access time needed to retrieve file information.
Inodes are created as needed and are not restricted to a particular area on a disk partition. XFS tries to position the inodes close to the files and directories they reference. Very small files, such as symbolic links and some directories, are stored as part of the inode, to increase performance and save space. Large directories use B-tree indexing within the directory file to speed up directory searches, additions and deletions.
Useful XFS links:
An XFS Linux port covered by the GNU General Public License is available from SGI Inc.:
This filesystem was developed to replace the old Minix filesystem in Linux. The author of this filesystem is Frank Xia <qx@math.columbia.edu>.
(todo: www.crosstor.com)
This HOWTO is not about network filesystems, but I should mention them.
Here is a brief list of those I know:
This protocol is used in the Windows world.
( TODO: http://www.cs.auckland.ac.nz/~pgut001/sfs/index.html )
I haven't yet seen any good page on the net about writing DOS filesystem drivers (network redirectors). The best sources are Ralf Brown's interrupt list and the iHPFS source code.
The Microsoft IFS kit page ( http://www.microsoft.com/ddk/IFSkit/) is probably the best way to get into NT filesystem development (even though the kit costs $1K).
For more information about writing FS drivers for Windows NT see http://www.ing.umu.se/~bosse/ by <bosse@acc.umu.se>.