Question: How do you recover a deleted file that's been overwritten?
Answer: With the greatest of difficulty.
The blog post following this one in a few days' time is, ironically, about backing things up. However, I actually ended up losing the entire post during the upload process to my server (it replaced both the source and destination files with an empty file!). I'd already saved it to disk, but still almost lost it anyway...
Recovery of deleted files is awkward at best. It relies on the fact that when you 'delete' something, the data isn't erased from disk at all - the file system just deallocates the sectors it was taking up and returns them to the pool of free space, which is then re-used at will.
The most important thing to remember when you've just lost something is to not touch anything. Shut down your computer, and, if you're not confident enough yourself, call someone who knows what they're doing to help you out.
The best way to recover a file is to boot into a live CD. This is a CD (or flash drive) that holds one (or more!) entire operating systems. This way, nothing more is written to the disk containing the deleted file, which could otherwise overwrite the very data you're trying to get back.
After fiddling about with this (I had to update my bootable flash drive, as Ubuntu 15.10 is out of support and I couldn't download the extundelete tool, which I'll mention shortly), I found that I'd hit a dead-end.
I was using the extundelete tool (installed with sudo apt install extundelete), and it claimed that it couldn't restore the file because it had been reallocated. Here's the command I used:
sudo extundelete --restore-file /absolute/path/to/file /dev/sda7
I suspect that it was getting confused because I had a file by that name on disk that was now empty.
Anyway, after doing something else for a while, I had an idea. Since my blog posts are just text files on disk, shouldn't it be on my disk somewhere? Could I locate it at all?
As it turns out, the answer is yes. Remembering a short sentence from the post I'd just written, I started a brute-force search of my disk:
sudo dd if=/dev/sda7 | strings | grep -i "AWS S3"
This has several components to it. Explain Shell is great at providing an explanation of each bit in turn. Here's a short summary:
- dd: This reads in the entire contents of a partition and pushes it into the following command. Find the partition name with lsblk.
- strings: This extracts all runs of printable characters from the input stream.
- grep: This searches (case-insensitively with -i) for a specified string in the input.
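If you'd like to see the pipeline work before pointing it at a real partition, here is a minimal sketch that runs it against a small scratch file instead. The scratch file, the zero padding and the sample sentence are all stand-ins I've made up for illustration; no sudo is needed because nothing touches a real disk.

```shell
# Build a scratch "image": zero padding stands in for the rest of a partition,
# and the sentence in the middle stands in for the lost text (both hypothetical).
img=$(mktemp)
{
  head -c 4096 /dev/zero
  printf '\nTo start, we will need an AWS S3 bucket\n'
  head -c 4096 /dev/zero
} > "$img"

# Same three stages as the real command, reading the scratch file
# instead of /dev/sda7. dd's transfer statistics are silenced.
found=$(dd if="$img" 2>/dev/null | strings | grep -i "aws s3")
echo "$found"

# Tidy up the scratch file.
rm -f "$img"
```

The search term is lower-case on purpose, to show -i matching the upper-case text buried in the file.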
I started to get results - a whole line from the blog post that had supposedly been deleted and overwritten! This wasn't really enough though. Taking a longer snippet to reduce the noise in the output, I tried again:
sudo dd if=/dev/sda7 | strings | grep -i -C100 "To start, we'll need an AWS S3 bucket"
This time, I added -C100. This tells grep that I want to see 100 lines before and after any lines that contain the specified search string.
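To make the effect of the context flag concrete, here's a tiny sketch using -C1 on five made-up lines (the sample text is hypothetical and just stands in for the output of strings):

```shell
# Five sample lines; only the middle one matches the search string.
# -C1 asks grep for one line of context either side of each match.
context=$(printf 'line 1\nline 2\nAWS S3 bucket\nline 4\nline 5\n' | grep -i -C1 "aws s3")
echo "$context"
```

This should print the matching line along with "line 2" and "line 4", but not the outermost two lines.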
With this, I managed to recover enough of the blog post to quickly re-edit and upload it. It did appear to remove blank lines and the back-ticks at the end of a code block, but they are easy to replace.
Note to self: Always copy first when crossing file system boundaries, and delete later. Don't move all in one go!
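As a rough sketch of what that might look like in practice, the function below copies a file, checks that the copy matches, and only then removes the original. The name safe_move is something I've made up for illustration (it isn't a standard command), and it assumes the destination is a file path rather than a directory.

```shell
# safe_move SRC DEST - hypothetical helper: copy, verify, then delete.
safe_move() {
  src=$1
  dest=$2
  # Copy first; give up (leaving the original alone) if the copy fails.
  cp -- "$src" "$dest" || return 1
  # Check the copy is byte-for-byte identical before touching the original.
  cmp -s -- "$src" "$dest" || return 1
  # Only now is it safe to remove the original.
  rm -- "$src"
}
```

For example, safe_move draft.md /mnt/server/draft.md (both paths hypothetical) would leave draft.md in place if either the copy or the comparison failed.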