Linux: Creating a compressed image of a disk

📄 Wiki page | 🕑 Last updated: Aug 1, 2022

Let's say you want to make a full, compressed image of a flash drive "/dev/sdc" (you want to back up everything - including the partition table and data on partitions).

Since our device "/dev/sdc" is also a file (like everything else on Linux), we can pass it directly to the gzip command. Either as an argument:

gzip -c /dev/sdc > /path_to_image.gz

or through stdin redirection:

gzip </dev/sdc >/path_to_image.gz
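Because gzip treats the device like any other file, the command can be rehearsed on a regular file before pointing it at a real disk. The following is only a sketch under that assumption: the scratch directory and the fake_disk.img stand-in are made up for illustration and are not part of the original setup.

```shell
# Rehearse the imaging command on a regular file instead of a real device.
# workdir and fake_disk.img are hypothetical stand-ins for /dev/sdc.
workdir=$(mktemp -d)
src="$workdir/fake_disk.img"
head -c 1048576 /dev/urandom > "$src"    # 1 MiB of sample data

# Same gzip invocation as above, with the stand-in file as input
gzip -c "$src" > "$workdir/image.gz"

# Test the archive, then confirm it decompresses back to the source bytes
if gzip -t "$workdir/image.gz" && gunzip -c "$workdir/image.gz" | cmp -s - "$src"; then
    image_status="ok"
else
    image_status="corrupt"
fi
echo "image check: $image_status"
```

The same check (gzip -t followed by a byte comparison) is also a reasonable sanity test after imaging a real device, as long as the device has not changed in the meantime.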

While this works (and it's actually pretty efficient), we can make things a bit nicer by using the dd command for the reading part:

dd

dd is a very useful tool for working with disks and images, but be careful when you're working with it. Something as simple as accidentally swapping the input and output devices can easily result in data loss.

Its original name comes from "Data Definition", but it's sometimes called "Disk Destroyer" for that reason :)

Let's change our original example to use dd:

dd if=/dev/sdc bs=128K status=progress | gzip -c >/path_to_image.gz

if means "input file". This ties back to the "everything is a file" philosophy mentioned earlier: dd works the same way whether the input is a device file or a regular file on the filesystem.
status=progress tells dd to show a running progress report (for the reading part).
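Since dd does not care whether its input is a device or a regular file, the pipeline can be tried on a small sample file first. This is a minimal sketch, assuming GNU dd; fake_disk.img and dd.log are hypothetical names, and status=progress is left out because the sample is too small to need it.

```shell
# Try the dd | gzip pipeline on a regular file (hypothetical stand-in for /dev/sdc).
workdir=$(mktemp -d)
src="$workdir/fake_disk.img"
head -c 1048576 /dev/urandom > "$src"    # 1 MiB of sample data

# dd reads in 128 KiB blocks; its summary goes to stderr and is kept in dd.log.
# LC_ALL=C keeps the summary text untranslated.
LC_ALL=C dd if="$src" bs=128K 2> "$workdir/dd.log" | gzip -c > "$workdir/image.gz"

# 1 MiB read in 128 KiB blocks is 8 full records
records_in=$(grep 'records in' "$workdir/dd.log")
echo "$records_in"

# The image should decompress back to exactly the bytes dd read
if gunzip -c "$workdir/image.gz" | cmp -s - "$src"; then
    pipeline_status="ok"
else
    pipeline_status="mismatch"
fi
echo "pipeline check: $pipeline_status"
```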

Also, we're using a block size of 128K for performance reasons. dd has a default block size of 512 bytes, mostly for historical reasons, but there is another reason to keep it small:

Read errors

In case your device contains errors, you may get a message like this:

dd: error reading ‘/dev/sdc’: Input/output error

If you still want to make an image of such a device, I recommend doing something like this:

dd if=/dev/sdc conv=noerror,sync iflag=fullblock | gzip -c >/path_to_image.gz

I'll write a new post with more details on this, but for now remember that a read error can cost you the whole block it falls in, so with larger block sizes that can be a lot of data. That's why this command leaves the block size at its 512-byte default.

Basically, conv=noerror,sync tells dd to keep reading after an error and to pad each short or unreadable block with zeros, so data offsets stay in sync.

iflag=fullblock is something that I've never seen in tutorials, but from my experience, it's very important not to end up with smaller images due to read() calls returning early on read errors.
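The padding behaviour of conv=sync can be observed without a failing disk. The sketch below assumes GNU dd and uses a hypothetical 1000-byte regular file as input; it does not simulate real read errors, it only shows that a short final block is padded with zeros up to the block size.

```shell
# Show how conv=sync pads a short block, using a regular file as a
# hypothetical stand-in for a device (no real read errors are simulated).
workdir=$(mktemp -d)
src="$workdir/short_input.bin"
out="$workdir/padded_output.bin"
head -c 1000 /dev/urandom > "$src"       # one full 512-byte block plus 488 bytes

dd if="$src" of="$out" bs=512 conv=noerror,sync iflag=fullblock 2>/dev/null

src_size=$(wc -c < "$src")
out_size=$(wc -c < "$out")
echo "input: $src_size bytes, output: $out_size bytes"   # output is 1024 bytes

# The first 1000 bytes are unchanged; the remaining 24 are zero padding
if head -c 1000 "$out" | cmp -s - "$src"; then
    prefix_status="intact"
else
    prefix_status="changed"
fi
echo "original data: $prefix_status"
```

One consequence of this padding is that an image taken with conv=sync can be slightly larger than the source whenever the source size is not a multiple of the block size.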

If your device contains errors, this combination of flags will give you the best results. From my experience, the results were very similar to ddrescue (another tool you can use in such scenarios).

Backing up partitions

Backing up partitions works exactly the same way; just replace "/dev/sdc" with your partition:

dd if=/dev/sdc1 bs=128K status=progress | gzip -c >/path_to_image.gz

Restoring images

For restoring images, we can use dd, but this time with an "output file" argument (of):

gunzip -c /path_to_image.gz | dd of=/dev/sdc
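Before restoring onto a real device, the full backup and restore cycle can be rehearsed with regular files. This is a sketch under stated assumptions: fake_disk.img and restored_disk.img are hypothetical stand-ins for the source and target devices.

```shell
# Rehearse a backup/restore round trip with regular files standing in for devices.
workdir=$(mktemp -d)
disk="$workdir/fake_disk.img"            # hypothetical stand-in for the source device
restored="$workdir/restored_disk.img"    # hypothetical stand-in for the target device
head -c 524288 /dev/urandom > "$disk"    # 512 KiB of sample data

# Back up: read with dd, compress with gzip
dd if="$disk" bs=128K 2>/dev/null | gzip -c > "$workdir/image.gz"

# Restore: decompress and let dd write the stream to the target
gunzip -c "$workdir/image.gz" | dd of="$restored" bs=128K 2>/dev/null

# The restored copy should be byte-for-byte identical to the original
if cmp -s "$disk" "$restored"; then
    restore_status="identical"
else
    restore_status="different"
fi
echo "restore check: $restore_status"
```

With a real device as the target, double-check the of= argument before running the restore, since dd overwrites the target without asking.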

Backing up to another machine over ssh

Instead of writing to a local file, we can easily just pipe the stream over ssh:

dd if=/dev/sdc bs=128K status=progress | gzip | ssh myhost "dd of=/path_to_image.gz"
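The shape of this pipeline can be dry-run locally before involving a second machine. In the sketch below, REMOTE is a hypothetical variable introduced only for illustration: it defaults to "sh -c", which runs the command string on the local machine, and setting REMOTE="ssh myhost" would send the same string to the remote host instead. The file names are also assumptions.

```shell
# Dry run of the "pipe the image to another machine" pattern.
# REMOTE is a hypothetical wrapper: "sh -c" runs the command string locally;
# REMOTE="ssh myhost" would run it on the remote host instead.
REMOTE=${REMOTE:-sh -c}
workdir=$(mktemp -d)
src="$workdir/fake_disk.img"             # hypothetical stand-in for /dev/sdc
dest="$workdir/remote_image.gz"          # hypothetical path on the receiving side
head -c 262144 /dev/urandom > "$src"     # 256 KiB of sample data

# Read, compress, and hand the stream to the receiving side
dd if="$src" bs=128K 2>/dev/null | gzip -c | $REMOTE "dd of='$dest' 2>/dev/null"

# Read the image back through the same channel and compare it with the source
if $REMOTE "gunzip -c '$dest'" | cmp -s - "$src"; then
    transfer_status="match"
else
    transfer_status="mismatch"
fi
echo "transfer check: $transfer_status"
```

Leaving $REMOTE unquoted is deliberate here, so that "ssh myhost" splits into a command and its argument.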

Restoring images over ssh

And, of course, we could restore the remote image on a local machine:

ssh myhost "gunzip -c /path_to_image.gz" | dd of=/dev/sdc

Or, inversely - restore the local image on a remote machine:

gunzip -c /path_to_image.gz | ssh myhost "dd of=/dev/sdc"
