Linux: Creating a compressed image of a disk
BetterWays.dev wiki page | Last updated: Aug 1, 2022
Let's say you want to make a full, compressed image of a flash drive "/dev/sdc" (you want to back up everything - including the partition table and data on partitions).
Since our device "/dev/sdc" is also a file (like everything else on Linux), we can give it directly to the gzip command, either as an argument:
gzip -c /dev/sdc > /path_to_image.gz
or through stdin redirection:
gzip </dev/sdc >/path_to_image.gz
While this works (and it's actually pretty efficient), we can make things a bit nicer by using the dd command for the reading part.
dd is a very useful tool for working with disks and images, but be careful when you're working with it - something as simple as accidentally swapping the input and output devices can easily result in data loss.
Its name originally comes from "Data Definition", but it's sometimes called "Disk Destroyer" for that reason :)
Let's change our original example to use dd:
dd if=/dev/sdc bs=128K status=progress | gzip -c >/path_to_image.gz
if means "input file" - this also refers to the "everything is a file" philosophy we mentioned earlier: everything works transparently the same way whether the file is a device file or a regular file on the filesystem.
status=progress tells dd to show us a nice progress status (for the reading part).
Also, we're using a block size of 128K for performance reasons. dd has a default block size of 512 bytes, mostly for historical reasons, but there is another reason to keep blocks small:
In case your device contains errors, you may get a message like this:
dd: error reading ‘/dev/sdc’: Input/output error
If you still want to make an image of such a device, I recommend doing something like this:
dd if=/dev/sdc conv=noerror,sync iflag=fullblock | gzip -c >/path_to_image.gz
I'll write a new post with more details on this, but for now remember that if you're not careful, you could end up with whole blocks messed up (and with larger block sizes, that can be a lot of data).
conv=noerror,sync tells dd to continue reading on errors and to pad incomplete blocks with zeros, so data offsets stay in sync.
iflag=fullblock is something that I've never seen in tutorials, but in my experience it's very important: it makes dd keep reading until each input block is full, so you don't end up with a smaller image when read() calls return early on read errors.
In my experience, this combination of flags gives the best results with dd on a device that contains errors, and the images were very similar to what ddrescue (another tool you can use in such scenarios) produces.
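To see what the zero padding from conv=sync does without needing a failing disk, here is a minimal sketch that uses a regular scratch file as a stand-in for the device. The file names and the 1000-byte size are made up for illustration; only bs and conv=sync come from the commands above.

```shell
# Scratch directory and a 1000-byte stand-in for a device (hypothetical names).
workdir=$(mktemp -d)
head -c 1000 /dev/urandom > "$workdir/fake_device"

# Read in 512-byte blocks. The last block holds only 488 bytes,
# so conv=sync pads it with zeros up to a full 512 bytes.
dd if="$workdir/fake_device" of="$workdir/padded.img" bs=512 conv=sync 2>/dev/null

# Two blocks of 512 bytes each end up in the output.
padded_size=$(wc -c < "$workdir/padded.img")
echo "padded image size: $padded_size"
```

The same mechanism applies to a block that fails to read when noerror is also set: it is replaced by a zero-filled block of the full block size. This is why a smaller block size loses less data around a bad spot.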
Backing up partitions
Backing up partitions works exactly the same way; just replace "/dev/sdc" with your partition:
dd if=/dev/sdc1 bs=128K status=progress | gzip -c >/path_to_image.gz
For restoring images, we can use dd again, but this time with an "output file" argument (of):
gunzip -c /path_to_image.gz | dd of=/dev/sdc
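Before pointing these commands at a real device, it can help to rehearse the whole round trip on scratch files. The following is a sketch under that assumption: the file names are hypothetical, a 1 MiB file of random data stands in for /dev/sdc, and the gzip -t and cmp checks are additions for verification rather than part of the original commands.

```shell
# Scratch files standing in for the device and the restore target (hypothetical names).
workdir=$(mktemp -d)
head -c 1048576 /dev/urandom > "$workdir/fake_device"

# Back up: the same pipeline as before, with a file in place of /dev/sdc.
dd if="$workdir/fake_device" bs=128K 2>/dev/null | gzip -c > "$workdir/image.gz"

# Check the archive's integrity before trusting it.
gzip -t "$workdir/image.gz" && echo "archive OK"

# Restore into a second scratch file and compare it byte for byte with the source.
gunzip -c "$workdir/image.gz" | dd of="$workdir/restored" bs=128K 2>/dev/null
cmp "$workdir/fake_device" "$workdir/restored" && echo "restore matches original"
```

Because dd treats device files and regular files the same way, the only change needed for real use is the if= and of= paths, which is also exactly where the input/output mix-up mentioned earlier can happen.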
Backing up to another machine over ssh
Instead of writing to a local file, we can easily just pipe the stream over ssh:
dd if=/dev/sdc bs=128K status=progress | gzip | ssh myhost "dd of=/path_to_image.gz"
Restoring images over ssh
And, of course, we could restore the remote image on a local machine:
ssh myhost "gunzip -c /path_to_image.gz" | dd of=/dev/sdc
Or, inversely - restore the local image on a remote machine:
gunzip -c /path_to_image.gz | ssh myhost "dd of=/dev/sdc"