CertCities.com -- The Ultimate Site for Certified IT Professionals


 Notes from Underground  
James Ervin


 Unclean!
The dawning of a new age in Linux journaling filesystems.
by James Ervin  
10/1/2000 -- Microsoft issued a white paper one year ago criticizing Linux's high-availability capabilities. Many assertions were made, among them that "Linux lacks a commercial-quality journaling file system."

This statement from Microsoft is confusing on several levels. First, it ignores the fact that journaling filesystems are of several stripes. Second, journaling filesystems are not a panacea--data loss can result from causes journaling cannot address. Lastly, there is more to these filesystems than the ability to journal. NTFS's tendency to fragment, for instance, is as troubling as ext2fs' lack of journaling capabilities.

Even so, the core of Microsoft's criticism carries validity in that Linux journaling filesystems aren't at the level they should be--even one year after that white paper was written. But change is coming. Let's take a closer look at the current state of Linux journaling filesystems, and what's promised in the near future.

Current State of Journaling
First things first: Why care about journaling? The need for journaling stems from the fact that disk size increases more rapidly than disk speed. Unclean system shutdowns nearly always necessitate a filesystem check. The traditional Unix tool for this function is the fsck command. On Windows NT/2000, it's CHKDSK.EXE. The larger the filesystem, the longer a check takes. For large filesystems, integrity checks can take unacceptably long, hours even. Given the increasing disparity between disk size and speed, this situation will only worsen.

Journaling filesystems avoid intolerably long filesystem checks by logging changes to the filesystem in an area separate from the data itself. Following an unclean shutdown, the log is replayed, allowing failed writes to complete. By avoiding file-by-file checks, journaling reduces the time required for a filesystem check by several orders of magnitude.
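
To make the log-then-replay idea concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not the on-disk format of any real filesystem: the class name, the dictionary standing in for the data area, and the simulated crash are all invented for illustration.

```python
# Toy write-ahead journal. Everything here (names, structures, the simulated
# crash) is hypothetical and only illustrates the log-then-replay idea.

class ToyJournaledStore:
    def __init__(self):
        self.blocks = {}    # "on-disk" data area: block number -> contents
        self.journal = []   # separate log area: list of logged transactions

    def log_transaction(self, updates):
        """Record the intended changes in the journal before touching the data area."""
        self.journal.append(dict(updates))

    def checkpoint(self, crash_after=None):
        """Apply journaled updates to the data area; optionally 'crash' partway."""
        applied = 0
        for txn in self.journal:
            for block_no, contents in txn.items():
                if crash_after is not None and applied == crash_after:
                    return False  # simulated power loss: data area half-updated
                self.blocks[block_no] = contents
                applied += 1
        self.journal.clear()
        return True

    def replay(self):
        """Recovery: re-apply every journaled transaction. Re-applying is harmless
        because each entry just sets a block to a known value."""
        for txn in self.journal:
            self.blocks.update(txn)
        self.journal.clear()


store = ToyJournaledStore()
store.log_transaction({1: "inode: size=2", 2: "dir entry: report.txt"})
store.checkpoint(crash_after=1)   # crash after only one block reaches the data area
half_done = dict(store.blocks)    # snapshot of the inconsistent state
store.replay()                    # recovery reads only the journal, not every file
```

The point of the sketch is that recovery work is proportional to the size of the journal, not to the size of the filesystem, which is where the time savings over a full check come from.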

There are multiple approaches to journaling. Metadata logging, the most prevalent form, logs only structural changes (directory and allocation updates, but not file contents). The alternative is data logging, often called "true" journaling, where both metadata and data are logged. Because all data is written to disk twice, data logging reduces performance considerably, especially on large files, but it increases confidence in data integrity. Metadata-only logging provides better performance but has a higher potential for data corruption. Both offer poorer performance than non-journaled filesystems.
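
The cost difference between the two approaches can be sketched with simple arithmetic. The function below is a deliberately crude model: its name, the mode strings, and the example sizes are assumptions for illustration, and it ignores batching, caching, and everything else a real filesystem does.

```python
def bytes_written(file_size, metadata_size, mode):
    """Return (journal_bytes, data_area_bytes) for one file write in a toy model.

    mode "metadata": only the structural update goes through the journal.
    mode "data":     the file contents go through the journal as well.
    """
    if mode == "metadata":
        journal = metadata_size
    elif mode == "data":
        journal = metadata_size + file_size
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Whatever was journaled still has to be written to its final location.
    data_area = metadata_size + file_size
    return journal, data_area


MIB = 1024 * 1024
for mode in ("metadata", "data"):
    # Hypothetical example: a 100 MiB file with 4 KiB of metadata updates.
    journal, data_area = bytes_written(100 * MIB, 4096, mode)
    print(mode, "total bytes written:", journal + data_area)
```

In this model, data logging roughly doubles the bytes written for a large file, while metadata logging adds only the small metadata record to the journal.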

Neither of these journaling approaches should be confused with log-structured filesystems, where the entire filesystem itself is a log: all new data is appended to the end of the log, and earlier blocks of data are invalidated as necessary. Log-structured filesystems have the advantage of dramatically increased write performance (because all writes happen sequentially), and can potentially provide for multiple-level undos, but are not prevalent as general-purpose filesystems.
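
A log-structured design can be sketched in a few lines of Python. This is an illustrative toy, assuming an in-memory list as the "disk" and a dictionary as the index; the class and method names are hypothetical and do not correspond to any particular implementation.

```python
class ToyLogStructuredStore:
    def __init__(self):
        self.log = []        # the whole "disk": an append-only list of (name, data)
        self.latest = {}     # name -> position of the live record in the log

    def write(self, name, data):
        """Never overwrite in place: append a new record and repoint the index.
        The previous record for this name stays in the log but is now stale."""
        self.log.append((name, data))
        self.latest[name] = len(self.log) - 1

    def read(self, name):
        """Return the most recently written contents for this name."""
        return self.log[self.latest[name]][1]

    def history(self, name):
        """Older versions remain in the log until something reclaims them,
        which is what makes multiple-level undo possible in principle."""
        return [data for n, data in self.log if n == name]


fs = ToyLogStructuredStore()
fs.write("notes.txt", "draft 1")
fs.write("todo.txt", "buy disks")
fs.write("notes.txt", "draft 2")   # appended at the end; "draft 1" becomes stale
```

Every write lands at the end of the log, which is why writes are sequential, and the stale "draft 1" record is still available through `history("notes.txt")`.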

The de facto standard for Linux filesystems, at the moment, is the venerable ext2fs (Second Extended Filesystem). It provides no journaling, has some tendency to fragment, and possesses other limitations under certain conditions (such as a 2GB file size limit on 32-bit systems, although the 2.4 kernel will lift this and other restrictions). In some respects, Linux's difficulties, including those of the filesystem, are neglected in the rush to make Linux replicate the functionality of other operating systems. For instance, Linux can act as a fileserver for everything from Windows to Macintosh machines, and some sort of support exists for nearly every conceivable filesystem; even relatively arcane formats such as Amiga's Fast File System have representation on Linux. Now that Linux is coming into its own as a server and future desktop operating system, though, people are making improvements to Linux's native capabilities, rather than improving its mimicry.

The Next Generation
There are several filesystems in various stages of development, all of which have a shot at replacing ext2fs.

The first contestant, ext3fs (ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/), layers journaling capabilities atop the familiar ext2 Linux filesystem. In fact, ext3fs' developers promise that one will be able to migrate between the two effortlessly. Currently, ext3fs supports data logging; metadata-only logging is promised for a future release.

Two more potential players are XFS and JFS. The XFS project is a Linux port of the famed Silicon Graphics filesystem of the same name, regarded by some as the most advanced filesystem in the world. Similarly, IBM also has released the source code to its aptly named JFS (i.e., journaling filesystem) under the GNU General Public License (GPL). Note, however, that neither performs data logging (see section above). In my opinion, SGI's commitment to Linux development is stronger than IBM's, and only XFS seems likely to develop on its own and provide a fount of ideas for ext3fs developers. Neither filesystem is close to completion.

Conceptually, I think the most interesting contender is ReiserFS, named after its lead developer, Hans Reiser. To grossly simplify, ReiserFS is a metadata-logging filesystem with an emphasis on increased small file performance. Balanced tree algorithms are used to aggregate small files; thus, ReiserFS avoids the waste inherent in traditional block-allocated filesystems where blocks remain partly empty if filled with a file smaller than the block size. B-tree based filesystems also improve searching by imposing some structure on data as it is written, rather than randomly allocating space (XFS uses a different type of balanced tree in some areas, including directory allocation).
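
The waste from partly empty blocks is easy to quantify. The sketch below compares whole-block allocation with an idealized packing of small files into shared blocks. The function names, file sizes, and 4096-byte block size are assumptions chosen for illustration, and the packed figure is a lower bound that ignores tree and per-item overhead, so it says nothing about ReiserFS's actual numbers.

```python
import math

def block_allocated_bytes(file_sizes, block_size):
    """Space used when every file occupies a whole number of blocks."""
    return sum(math.ceil(size / block_size) * block_size for size in file_sizes)

def packed_bytes(file_sizes, block_size):
    """Space used if file contents could be packed end to end into shared blocks.
    Idealized lower bound: ignores any indexing or per-item overhead."""
    return math.ceil(sum(file_sizes) / block_size) * block_size

small_files = [100, 700, 1500, 300, 50]   # hypothetical file sizes in bytes
BLOCK = 4096                              # assumed block size

print("one block per file:", block_allocated_bytes(small_files, BLOCK))
print("packed together:   ", packed_bytes(small_files, BLOCK))
```

With these made-up sizes, five files totaling 2,650 bytes occupy five blocks under whole-block allocation but would fit in one block if packed.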

For More Information...
  • Filesystems How-To
  • Getting to Know the Solaris Filesystem, Part 1
  • Joe Pranevich: Wonderful World of Linux 2.4 (Final Draft)
  • Journaling the Linux ext2fs Filesystem
  • LFS: A Log-Structured Filesystem for Linux
  • Microsoft's Linux Myths
  • ReiserFS
  • XFS
The future vision for ReiserFS is even more impressive, with integration of keyword and database search capabilities (functions traditionally performed by specialized database or indexing applications) into the filesystem itself. To clarify: Searching on most filesystems involves traversing a hierarchical naming system, /usr, /usr/local, /usr/local/bin and so on, where the filenames imply next to nothing about their content. Searching for data within a file is a different and much more performance-intensive task. This latter search type, however, is exactly what relational databases perform routinely. Most databases of note avoid this problem by writing their data to large files or raw disk partitions (think Oracle) within which the database software uses its own logic. Incorporating database-like functionality into the filesystem is, by Reiser's own admission, a tall order. If accomplished, though, it would further unify the Unix namespace, and could be as important as the groundbreaking notion of treating everything in Unix (I/O devices, processes and so on) as part of the filesystem, an innovation that immensely increased Unix's utility.
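
The difference between the two kinds of search can be sketched as follows. This is only an illustration of name-based lookup versus a keyword index, using a nested dictionary as a stand-in directory tree. The tree contents and function names are invented, and nothing here reflects how ReiserFS plans to implement its search features.

```python
# Hypothetical directory tree: dicts are directories, strings are file contents.
tree = {"usr": {"local": {"bin": {"backup.sh": "rsync nightly backup of home"}},
                "share": {"notes.txt": "journal replay notes and backup plan"}}}

def lookup(tree, path):
    """Name-based lookup: walk one directory per path component."""
    node = tree
    for part in path.strip("/").split("/"):
        node = node[part]
    return node

def walk(tree, prefix=""):
    """Yield (path, contents) for every file in the tree."""
    for name, node in tree.items():
        path = f"{prefix}/{name}"
        if isinstance(node, dict):
            yield from walk(node, path)
        else:
            yield path, node

def build_keyword_index(tree):
    """Content-based lookup needs an index built from file contents, the kind
    of structure usually maintained by a database or indexing application."""
    index = {}
    for path, contents in walk(tree):
        for word in set(contents.split()):
            index.setdefault(word, set()).add(path)
    return index

index = build_keyword_index(tree)
print(lookup(tree, "/usr/local/bin/backup.sh"))
print(sorted(index["backup"]))
```

The path lookup never reads file contents, while the keyword index has to read every file once to build, which hints at why content search is the more expensive problem.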

We can't address all the pertinent issues in this space, and it's too early to predict a winner, but some speculation is possible. I think ReiserFS is the most intellectually stimulating contender, but whether this is apparent or relevant to the average administrator remains to be seen. Additionally, ReiserFS is not yet fully optimized for large files, something critical to the scientific computing community. Nonetheless, ReiserFS developers promise performance improvements over ext2fs at all file sizes, and it is the only reliable Linux journaling filesystem at the moment--several major sites, including sourceforge.net, are using it in production. XFS and ext3fs are in similar stages of development, though XFS may have a higher potential for serious scalability, something Linux in the datacenter needs. Ext3fs promises to be the simplest of all to implement and to offer performance similar to ext2fs, and has the added benefit of familiarity. Since ext3fs developer Stephen Tweedie and ReiserFS developer Hans Reiser are both regular contributors to the Linux kernel community, we can assume both ext3fs and ReiserFS will show up in the kernel relatively soon, though there is still significant contention on the linux-kernel mailing list as to whether or not either will be incorporated into the 2.4 release. Some distributions (SUSE) already carry ReiserFS. Meanwhile, the XFS project has its work cut out for it excising SGI-specific code.

Even so, XFS may be the key to Linux's penetration into the high-end server market simply because of the backing of SGI. Despite its schizophrenic commitment to dual Irix/MIPS and Linux/Intel development, SGI has a presence in the datacenter market that Linux largely lacks; the exceptions are cutting-edge research institutions that find it profitable to invest in large Linux clusters rather than large multiprocessor servers because of the low price point. In some ways, the co-opting of Linux by various vendors, SGI included, is disturbing, yet I believe the time and money these corporations have to spend will provide Linux with a significant boost in markets it has yet to conquer.

What would you like to see in the next generation of Linux journaling filesystems? Post your comments below or enter our Forums.

 


James Ervin is alone among his coworkers in enjoying Michelangelo Antonioni films, but in his more lucid moments suspects that they're not entirely wrong.

 




There are 15 CertCities.com user Comments for “Unclean!”
11/5/03: bart from baarn, the netherlands, europe says: Good article, thanks man! Novell announced today that will buy Suse for 210 milion dollar. Now I think you are right about ReiserFs to be the possible winner!