Earlier this year we entered a new era of hard disk storage when manufacturers rolled out a single 1TB hard disk drive.
One hard disk drive manufacturer is planning to deliver 4TB hard disk drives in 2011. While we all expect our personal libraries of music, videos, and documents to increase over the next few years, the entertainment industry is expecting to see a 1000% increase in total digital storage. According to analysis by Coughlin Associates, more than six exabytes of digital storage will be used for archiving, content conversion, and content preservation by 2012. An exabyte is a billion gigabytes in decimal terms. Another way to visualise this is in terms of DVDs: taking a DVD as 4GB, one exabyte is equivalent to 250 million DVDs, and six exabytes would equal 1.5 billion DVDs.
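The DVD comparison above is simple arithmetic, and it can be checked with a few lines of Python. This is only a sketch of the conversion, assuming decimal units and the 4GB DVD size used above; the names `GB`, `EB`, and `dvds_per_exabytes` are made up for this illustration.

```python
# Decimal (SI) units, as used in the article: 1 GB = 10**9 bytes, 1 EB = 10**18 bytes.
GB = 10**9
EB = 10**18

DVD_CAPACITY_GB = 4  # the "4GB size" DVD assumed in the article


def dvds_per_exabytes(exabytes: int, dvd_gb: int = DVD_CAPACITY_GB) -> int:
    """Return how many DVDs of dvd_gb gigabytes are needed to hold the given exabytes."""
    return (exabytes * EB) // (dvd_gb * GB)


print(EB // GB)              # gigabytes in one exabyte: 1000000000
print(dvds_per_exabytes(1))  # 250000000
print(dvds_per_exabytes(6))  # 1500000000
```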
Transmitted computer data isn’t far behind. Networking giant Cisco Systems, Inc. has published a forecast of the amount of data that will flow through the Internet in the very near future. The report projects that total Internet traffic will nearly double every two years, and that consumer IP traffic will surpass business traffic and reach 18 exabytes per month by 2011. Global Internet video (excluding peer-to-peer usage) was estimated at approximately 120 petabytes per month in 2006.
As Internet IP traffic grows, storage needs will also grow. We’ll move from the terabyte era to the exabyte era in a matter of years.
Your users or clients are dealing with an explosion of data growth. The challenge to IT professionals right now is managing all of this data and improving file access performance. Why do we say “improving”? Maintaining the status quo is not good enough. Large data storage devices are storing millions or even billions of files. With data storage systems growing exponentially, the tools we use to access that data need to progress as well. Beyond simply storing large data sets, what file system options are available? And when an unforeseen data loss occurs with one of these behemoths, who can provide data recovery?
Over the years, Linux has become the operating system of choice for many IT professionals. In the Linux environment, there are many different file systems available. With all the choices, selecting the right file system for users or clients can be challenging. Read on to explore some of the things to consider. (Read here first to learn more about Linux operating systems.)
In the past, there wasn’t a lot of choice when it came to file systems. The operating system only offered one or two choices for a file system and the file system was usually so transparent that it was taken for granted.
Over the years, programmers have contributed to the development of new and existing file systems for Linux. Linux operating systems offer a variety of choices for the organization and management of data files on hard disk drives. File systems are interchangeable within the Linux operating system by design; this is part of the portability of the operating system. The interchangeability is provided by a layer called the Virtual File System (VFS) inside the Linux kernel (the fundamental core of the operating system). There is a great deal of discussion in Linux communities regarding the positive and negative aspects of each file system type. The following table lists, in no particular order, basic choices of Linux file systems and their commonly used aliases.
Current choices of Linux file systems are as follows. At the end of this article there are links to information that describes these file systems.
The following is a list of special Linux file systems that require additional configuration or that are owned by specific companies:
There are a lot of choices of Linux file systems for workstations and servers. Where does one start? Here are some things to consider.
The best way to answer the above questions is to perform research and testing. The goal should be to determine the performance and reliability of each file system under consideration. Use applications that test and benchmark the file systems being considered (here are some utilities to do that). Then begin using the system normally, logging timing and performance. One writer for the Linux Gazette has benchmarked the most popular file systems; read his findings here.
Other recommended tests involve simulating high volume file environments and then reproducing power failures. How long does it take for the volume to become ready, or ‘mount’? How long does it take File System Check (FSCK) to work through the file system when there are errors? To test file data integrity, use an MD5 Hash Generator on a group of files before testing, then perform the above tests and generate the hashes again to make sure the files remain the same. An MD5 Hash Generator applies the MD5 algorithm to create a signature, or “fingerprint”, of a file or set of files; the fingerprint is unique for practical purposes, so a changed fingerprint shows that a file has suffered internal data corruption.
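The integrity check described above can be sketched in a few lines of Python using the standard `hashlib` module. This is a minimal illustration rather than a finished tool; the function names `md5_of_file`, `build_manifest`, and `changed_files` are hypothetical and chosen only for this example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of one file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to its MD5 fingerprint."""
    root = Path(root)
    return {
        str(p.relative_to(root)): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before, after):
    """List files whose fingerprint differs, or that exist in only one manifest."""
    names = set(before) | set(after)
    return sorted(n for n in names if before.get(n) != after.get(n))
```

In practice, one would build a manifest of the test directory before the simulated power failure, let the volume remount (and FSCK run if needed), build a second manifest, and pass both to `changed_files`. An empty result means every file kept its fingerprint.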
Testing the storage and performance of large files is important because nearly all Linux file systems fragment the files that are stored. Getting benchmarks for large file storage helps determine which file system best handles user or client needs.
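A very rough large-file timing test can be sketched in Python as shown here. This is an illustrative sketch, not a replacement for a dedicated benchmarking utility: the function name `time_large_file` and its parameters are assumptions made for this example, and the read timing will usually be flattered by the operating system's cache because the file has just been written.

```python
import os
import time


def time_large_file(path, size_mb=64, block_kb=1024):
    """Write then read back a file of about size_mb megabytes and time both passes."""
    block = os.urandom(block_kb * 1024)      # one block of random data, reused
    blocks = (size_mb * 1024) // block_kb    # number of blocks to write

    start = time.perf_counter()
    with open(path, "wb") as handle:
        for _ in range(blocks):
            handle.write(block)
        handle.flush()
        os.fsync(handle.fileno())            # ask the OS to push the data to the device
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with open(path, "rb") as handle:
        while handle.read(block_kb * 1024):  # read sequentially until end of file
            pass
    read_seconds = time.perf_counter() - start

    return {
        "bytes": blocks * block_kb * 1024,
        "write_seconds": write_seconds,
        "read_seconds": read_seconds,
    }
```

Running the same call against a path on each candidate file system, several times and with file sizes close to the real workload, gives comparable figures. A single run on a small file says very little.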
The above suggestions for testing simulate extreme cases, and it may be that users or systems will never reach the limits of the testing. However, to make the best choice among Linux file systems, each candidate must be tested to learn what it can and cannot handle.
Perhaps users or clients do not realise they are using a version of Linux. For example, many Digital Video Recorders (DVRs) use a Linux file system variant. A small Network Attached Storage (NAS) device for the home or small office network may also have a version of Linux on it. Future mobile phones may be running a Linux operating system simply because of its ease of design and flexibility. To sum up, software developers are using elements of different Linux file systems for new products.
The proliferation of Linux file systems is due to the open-source nature and general public licensing of these designs. No one person or company owns them, so their growth and improvement are not limited by any single owner.
Despite improvements, however, there will always be unforeseen data loss occurrences where either the hard disk drive will malfunction or crash, or errant data corruption will occur and the file system will no longer be mountable. This is where a professional data recovery service is needed.
Kroll Ontrack has been successfully recovering data from Unix and Linux file systems for many years and our unique approach sets us apart from other data recovery companies.
What makes Kroll Ontrack the choice for data disasters? Companies choose us because of our experience, dedication to research and development, and quality recoveries. We know that data recovery is a science, a discipline that requires trained experts. Using a company that claims to specialise in data recovery but relies on off-the-shelf recovery tools does not guarantee success. Software developers are also needed on staff to customise the recovery tools for your specific file system needs, because Unix/Linux file system variants are very common.
We research and study these file systems and design a suite of tools to recover the data. We take the side of the customer and do all we can to recover quality data, providing the best solution to data loss.
© 2007 KrollOntrack