Sparse files

What is a sparse file?

"A sparse file is a file where space has been allocated but not actually filled with data. This space is not written to the file system. Instead, brief information about these empty regions is stored, which takes up much less disk space. These regions are only written to disk at their actual size when data is written to them. The file system transparently converts reads from empty sections into blocks filled with zero bytes at runtime." [1]

In other words: a sparse file occupies less space on disk than its reported size suggests.

This is often seen with databases: for example, MySQL Cluster REDO log files and some Oracle tablespace files are created as sparse files.

But first let us create such a sparse file:

# dd if=/dev/zero of=sparsefile count=0 obs=1 seek=100G

# ls -lah sparsefile
-rw-r--r-- 1 oli users 100G 2007-10-24 11:18 sparsefile

# df -h .
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda9             5.0G  3.5G  1.2G  75% /home

Funny: How can I have a 100 Gbyte file on a 5 Gbyte device? This already shows the problem...

Let us now see how to find the real size of the file, so that we can tell whether a file is going to cause trouble:

# du -ks sparsefile
0       sparsefile

In reality this file occupies 0 kbyte on disk.
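The comparison between reported size and real size can also be scripted. The following is only a minimal sketch, assuming GNU coreutils stat (format sequences %s for the apparent size, %b for the number of allocated blocks and %B for the size of one such block). The function name sparse_info is made up for this example.

```shell
#!/bin/sh
# sparse_info FILE (hypothetical helper): print the apparent size and the
# allocated size of FILE in bytes. Assumes GNU stat.
sparse_info() {
    # %s = apparent size, %b = allocated blocks, %B = bytes per block
    info=$(stat -c '%s %b %B' -- "$1") || return 1
    # Split the three numbers into $1, $2 and $3.
    set -- $info
    echo "apparent=$1 allocated=$(($2 * $3))"
}
```

Called as sparse_info sparsefile, it prints both numbers on one line. A file whose allocated size is far below its apparent size is most likely sparse.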

Or an example from MySQL Cluster (ll is a common alias for ls -l; the -s option adds the allocated size as the first column):

# ll -h D9/DBLQH/S?.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 18:02 D9/DBLQH/S0.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S1.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S2.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S3.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S4.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S5.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S6.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S7.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S8.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S9.FragLog

# ll -hs D9/DBLQH/S?.FragLog
612K -rw-r--r-- 1 mysql dba 16M 2008-01-16 18:02 D9/DBLQH/S0.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S1.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S2.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S3.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S4.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S5.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S6.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S7.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S8.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S9.FragLog
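To find such files in a whole directory tree without reading ls output by eye, something like the following could be used. It is a sketch under assumptions: it relies on GNU find (-printf with %s for the apparent size in bytes and %b for the allocated size in 512-byte blocks), the function name list_sparse is invented here, and file names containing tabs or newlines are not handled.

```shell
#!/bin/sh
# list_sparse DIR (hypothetical helper): list files under DIR whose allocated
# size is smaller than their apparent size.
# Output columns (tab separated): apparent bytes, allocated bytes, path.
list_sparse() {
    find "$1" -type f -printf '%s\t%b\t%p\n' |
    awk -F'\t' '$2 * 512 < $1 { printf "%.0f\t%.0f\t%s\n", $1, $2 * 512, $3 }'
}
```

For the example above this would be called as list_sparse D9/DBLQH. Note that file systems with compression or with small files stored inline can also report fewer allocated bytes than the apparent size, so a hit is a hint and not a proof.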

Why are sparse files dangerous?

In production environments we want our systems to behave predictably, so we monitor them. Sparse files make this a bit trickier: besides free disk space and used disk space, there is now disk space that may become used in the near or far future, namely when the empty regions of the sparse files are filled with data.

What can we do about it?

Right now: not much, until the software vendor provides a way to avoid sparse files.

  • Calculate the expected disk space (quantity structure).
  • Monitor your system properly.
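One possible way to put both points into a monitoring check is sketched below. It is not a finished monitoring plugin. It assumes GNU find and a POSIX df, the function name sparse_headroom is made up, and hard-linked files are counted once per link. The idea is to add up the space that sparse files could still claim and to compare it with the space that is currently available.

```shell
#!/bin/sh
# sparse_headroom DIR (hypothetical helper): compare the space that files under
# DIR could still claim (apparent size minus allocated size) with the space
# available on the file system of DIR.
# Returns 0 if everything would still fit, 1 otherwise.
sparse_headroom() {
    # Sum up the not yet allocated bytes of all files (%b = 512-byte blocks).
    pending=$(find "$1" -xdev -type f -printf '%s %b\n' |
        awk '{ gap = $1 - $2 * 512; if (gap > 0) sum += gap }
             END { printf "%.0f\n", sum }')
    # df -Pk reports the available space in units of 1024 bytes (4th column).
    avail=$(df -Pk "$1" | awk 'NR == 2 { printf "%.0f\n", $4 * 1024 }')
    echo "pending=$pending available=$avail"
    # Compare in awk to stay independent of the integer width of the shell.
    awk -v p="$pending" -v a="$avail" 'BEGIN { exit !(p <= a) }'
}
```

A cron job or monitoring agent could call sparse_headroom /home and raise an alarm on a non-zero return code. With the 100 Gbyte sparse file from above on the 5 Gbyte device, the check would fail, which is exactly the situation we want to be warned about.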

Literature

  1. Sparse files on Wikipedia (en)
  2. Sparse files on Wikipedia (de)

Comments

Hello Shinguz, NTFS is sparse file-capable too. However, there are no tools to deal with it in a convenient way. I have written one. Check the "SparseChecker" (http://www.opalapps.com/sparse_checker/sparse_checker.html). The current version is free. I guess my post is relevant here because MySQL is available for Windows too. And for huge preallocated database files, sparse regions (regions full of continuous zeroes) are a known problem, as they are for disk and memory images of virtual machines and for the preallocated files of download managers (pending downloads). Best regards, Oleh P.S. It would be interesting and valuable to hear your professional opinion about the SparseChecker, as well as whether it makes sense to port it to Linux.