kaashif's blog

Programming, with some mathematics on the side

How to Wipe a Disk

2013-11-06

This article is not only about disk wiping; it will hopefully also teach you something about using some GNU command line tools. This tutorial was written on my ThinkPad, which runs Debian, so the output should be pretty similar to what you'd get on Ubuntu, Mint or any other Debian- or Ubuntu-based system. Basically, if you're using something non-standard, you already know that you are, since those types of operating systems are few and far between.

Getting into the right environment

Since you can't reliably wipe a disk with the OS which is on the disk you'll be wiping, you need some way to run an OS from something other than the hard drive. Enter live CDs, DVDs, and USB drives. You can download a disk image and write it to a drive any way you want to; it doesn't matter, as long as the disk boots into a GNU userland of some sort. I recommend Debian; you can get a selection of disk images here. Burn a disc, write it to USB or whatever. After you do that, insert it into your PC and reboot. Make sure the removable media of your choice is higher in the BIOS boot order than the hard drive, or this won't work. When you reboot, you will hopefully be confronted with a bootloader menu with several options. Pick the one which sounds most like "Try before you install" or "Live DVD", and you will be put into a GNU/Linux environment, hopefully with some sort of shell prompt. You are now ready to execute some commands!

Which drive are we wiping?

The first thing you need to do is find out which disk you need to wipe. Your USB drive may be one of the drives detected by the OS, so it's important you wipe the right thing. Even if you only have one drive, it's best to check which drive you're wiping just to make sure. The command to list block devices is lsblk. It is used as follows:

$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 232.9G  0 disk
`-sda1   8:1    0 232.9G  0 part
sdb      8:16   1  14.4G  0 disk
`-sdb1   8:17   1   739M  0 part /

Here we see two disks: sda and sdb. With the Linux kernel, drives are given names based on what type of drive they are and the order in which they were detected. In this case, the 250 GB hard drive (shown as 232.9G because lsblk counts in binary units) was detected first, so its name ends in "a". Most disks, including SATA, SCSI and USB drives, are given the "sd" prefix; really old IDE hard drives and some other device types (NVMe drives, for example) are named differently. The USB drive was detected second, so its name is "sdb". So we have established that we want to wipe the drive "sda". Note that we are targeting the drive itself, not any of its partitions ("sdaX").
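As a small sketch of that naming rule, the helper below checks whether an sd-style name refers to a partition rather than a whole disk. The name is_partition_name is my own invention, not a standard tool, and it only looks at the string: it assumes the sdX / sdXN scheme described above and knows nothing about other naming schemes.

```shell
# Hypothetical helper: succeeds if the name looks like an sd-style
# partition (sda1, sdb12), fails for a whole disk (sda) or anything else.
is_partition_name() {
    case "$1" in
        sd[a-z]*[0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Try it on the four names from the lsblk output.
for name in sda sda1 sdb sdb1; do
    if is_partition_name "$name"; then
        echo "$name: partition"
    else
        echo "$name: whole disk"
    fi
done
```

A check like this is only a guard against typos; lsblk remains the thing to trust when deciding which drive to wipe.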

But how do we get to that drive? This is a question which really exposes the convenient and useful nature of the Unix philosophy. Specifically, the part which says all programs should aim to represent as much as possible using files. With this in mind, it's logical to say that the drive "sda" is represented somewhere on the filesystem. It happens to be in the directory "/dev", with all the other device files. Essentially, we will have to wipe the file "/dev/sda", which is, for all intents and purposes, the disk.
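To see that device files really are entries on the filesystem, here is a rough sketch using the shell's built-in file tests. The function name describe_node is made up for this example. On the live system I'd expect it to report "block device" for /dev/sda, but that depends on your hardware, so the example sticks to paths that exist almost everywhere.

```shell
# Hypothetical helper: report what kind of filesystem entry a path is.
describe_node() {
    if [ -b "$1" ]; then
        echo "block device"
    elif [ -c "$1" ]; then
        echo "character device"
    elif [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ]; then
        echo "regular file"
    else
        echo "other or missing"
    fi
}

describe_node /dev/null   # a character device on Linux
describe_node /dev        # a directory
```

Running something like this against the drive you picked is a cheap sanity check before doing anything destructive.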

Wiping the disk

To wipe a disk, we have to first consider what we actually want to do to the disk. Of course, we want to erase the data on that disk permanently. Since erasing it by replacing the partitions doesn't actually erase the data (it is still there, it just cannot be read without restoring the partition table), we have to actually overwrite all of the data with some other data. We could write lots of random data, but remember that generating random noise takes time and CPU power. Instead, let's just overwrite it all with zeroes, which is very simple. Linux users do things like this so often that the devs saw fit to add a virtual device consisting entirely of zeroes, at "/dev/zero". You can think of this as a disk of infinite size consisting entirely of zeroes. Our aim has changed from the initial "I want to wipe a hard drive" to "I want to overwrite /dev/sda with data from /dev/zero".
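If you want to convince yourself that /dev/zero behaves like this, you can read a few bytes from it without touching any disk. This is just a quick look using head, od and wc; nothing here is written anywhere.

```shell
# Read the first 8 bytes from /dev/zero and print them in hexadecimal.
# Every byte should show up as 00.
head -c 8 /dev/zero | od -An -tx1

# Ask for a million bytes instead; /dev/zero never runs out.
head -c 1000000 /dev/zero | wc -c
```

However many bytes you ask for, you get that many zeroes back, which is exactly what we want from the source of our overwrite.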

There is a command to copy and convert data, and that program is dd. We'd use it to copy data from /dev/zero to /dev/sda as follows. Be careful, because the following command will irreversibly overwrite all of the data on your primary hard drive. Make sure you have the right drive and that you really want to do this.

$ sudo dd if=/dev/zero of=/dev/sda

You won't see any output unless there's an error, so don't worry about dd's complete silence. A lack of errors means it's working! Now, you should switch to a different terminal, and let dd run on its own. You can do this by simply opening another terminal window, if you're using a GUI, or by pressing CTRL-ALT-F2 to switch to the 2nd virtual terminal, if you're in text-only mode.
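If you'd rather rehearse before pointing dd at a real drive, the same overwrite can be tried on an ordinary file. The sketch below is only a practice run: rehearse_wipe and scratch.img are names I made up, and the 1 MiB size is arbitrary. It fills a scratch file with random data, overwrites it with zeroes, and prints how many non-zero bytes are left.

```shell
# Hypothetical practice run: wipe a scratch file instead of a disk.
rehearse_wipe() {
    workdir=$(mktemp -d) || return 1
    img="$workdir/scratch.img"

    # 256 blocks of 4096 bytes (1 MiB) of random data stand in for old contents.
    dd if=/dev/urandom of="$img" bs=4096 count=256 2>/dev/null

    # Overwrite the same 256 blocks with zeroes. conv=notrunc overwrites the
    # file in place, and count= tells dd where to stop.
    dd if=/dev/zero of="$img" bs=4096 count=256 conv=notrunc 2>/dev/null

    # Delete every zero byte and count whatever is left over.
    tr -d '\000' < "$img" | wc -c

    rm -r "$workdir"
}

echo "non-zero bytes left: $(rehearse_wipe)"
```

The count= argument matters here because a regular file would otherwise keep growing until the filesystem filled up. A real disk has a fixed size, so there dd simply stops with a "No space left on device" message when it reaches the end.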

What to do while you wait

Now is as good a time as any to tell you about several useful features of Unix systems which let you find and learn about commands without resorting to the internet (they'd probably point you to this anyway). I am talking about the Unix manual pages. They are accessed through the man command, followed by the name of the program you want to learn more about. For example:

$ man man
MAN(1)               Manual pager utils              MAN(1)

NAME
    man - an interface to the on-line reference manuals

And a lot more information. You can exit the manual pager by pressing "q". This is not all you can do while you wait. If you read the man page for "dd", you might find a way to make it print how much it has copied. The man page is intended to be a reference for experienced users, so don't worry if you don't understand it.

Processes are not only known by the human-readable names of their programs, they are also known by Process IDs, or PIDs. When a program is run, it is assigned a PID. On Linux these are normally handed out in increasing order, so the higher the PID, the later in the boot process or interactive session the program was usually started. Once the numbers reach the system's maximum they wrap around, and PIDs freed by terminated programs get reused, so this is a rule of thumb rather than a guarantee. To find out the PID of the dd process you ran earlier, we can use the program pgrep, which takes a pattern to match against program names and outputs the numerical PIDs of every process that matches.

$ pgrep -x dd

It's safe to assume that the most recently started instance of dd is the one we just started - the one with the highest PID. Now that we know its PID, we can start sending it signals. The program to send a signal to a running program is kill, which is a bit of a misnomer, because not all of the signals it's capable of sending actually kill the process. In this next command, substitute "$pid" with the PID of your dd process.

$ sudo kill -USR1 $pid

This command won't output anything in the terminal you run it in. Instead, it sends a signal to dd to make it print out how much it has copied. This information would be in the terminal that dd was run from. If you're using a GUI terminal, switch to the window dd was run from. If you're in text-only mode, switch back to the 1st virtual terminal by pressing CTRL-ALT-F1.
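To see for yourself that a signal sent with kill doesn't have to be fatal, here is a tiny self-contained experiment that has nothing to do with dd. A throwaway shell installs a handler for USR1 and then signals itself; usr1_demo is just a name I picked for the example.

```shell
# Hypothetical demo: a throwaway shell traps USR1, then signals itself.
# The handler runs and the shell carries on instead of dying.
usr1_demo() {
    sh -c '
        trap "echo received USR1" USR1
        kill -USR1 $$
        echo "still running"
    '
}

usr1_demo
```

The catch is that this only works because the program chose to handle USR1, as GNU dd does. A program with no handler for USR1 is terminated by it, so don't go sending it to arbitrary processes.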

Closing remarks

I hope you've learnt about more than just how to wipe a disk, although that is a useful skill, too. If you want to learn about Unix using the built-in system tools, you always have your trusty man pages, but also another new tool: apropos. It takes a list of keywords and searches the names and short descriptions of the man pages for commands which match your description of what you want to do. For example:

$ apropos extract archive
unrar (1)           - extract files from rar archives
unar (1)             - extract archive file content

It generally tends to output many results, not all of which are commands. Remember "man man", the manual page for man? It had a list of section numbers and what they mean. We can see these section numbers in the search results for apropos, in the brackets just after the program name. To save you some effort...

$ man man
1   Executable programs or shell commands
2   System calls (functions provided by the kernel)
3   Library calls (functions within program libraries)
4   Special files (usually found in /dev)
5   File formats and conventions eg /etc/passwd
6   Games
7   Miscellaneous (including macro packages and conventions), e.g. man(7), groff(7)
8   System administration commands (usually only for root)

So out of all the results in that list, only the ones in sections 1 and 8 are usable as programs from the shell prompt. Using these tools, your knowledge of Unix can grow without resorting to searching the internet for hacks other people have made. Even after you know the ins and outs of a program, the man pages are still useful for when you can't remember the order the arguments go in, or the switch to make a program change behaviour, or things along those lines.
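If the list from apropos gets long, you can filter it down to those two sections yourself. This is a rough sketch: only_commands is a made-up name, the sample lines are typed in by hand rather than copied from a real run, and the pattern assumes the "name (section) - description" layout shown earlier.

```shell
# Hypothetical helper: keep only apropos-style lines whose section
# starts with 1 or 8, e.g. "(1)", "(8)" or "(1p)".
only_commands() {
    grep -E '^[^(]*\((1|8)[^)]*\)'
}

# Hand-written sample lines in the apropos output format.
printf '%s\n' \
    'unrar (1)            - extract files from rar archives' \
    'passwd (5)           - the password file' \
    'mount (8)            - mount a filesystem' | only_commands
```

In real use you would pipe apropos into it, as in "apropos extract archive | only_commands". I believe the apropos shipped with man-db can also restrict sections itself with its -s option, but check "man apropos" on your system before relying on that.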