In this article we will talk about foremost, a very useful open source forensic utility that can recover deleted files using a technique called data carving. The utility was originally developed by the United States Air Force Office of Special Investigations and can recover several file types; support for additional file types can be added by the user via the configuration file. The program can also work on partition images produced by dd or similar tools.
In this tutorial you will learn:
- How to install foremost
- How to use foremost to recover deleted files
- How to add support for a specific file type
Software Requirements and Conventions Used
Category | Requirements, Conventions or Software Version Used |
---|---|
System | Distribution-independent |
Software | The “foremost” program |
Other | Familiarity with the command line interface |
Conventions | # – requires given linux commands to be executed with root privileges, either directly as the root user or by use of the sudo command; $ – requires given linux commands to be executed as a regular non-privileged user |
Installation
Since foremost is already present in the repositories of all the major Linux distributions, installing it is very easy. All we have to do is use our distribution's package manager. On Debian and Ubuntu, we can use apt:
$ sudo apt install foremost
Recent versions of Fedora use the dnf package manager, the successor of yum. The name of the package is the same:
$ sudo dnf install foremost
If we are using Arch Linux, we can use pacman to install foremost. The program can be found in the distribution “community” repository:
$ sudo pacman -S foremost
Basic usage
No matter which file recovery tool or process you are going to use, it is recommended to make a low-level backup of the hard drive or partition before you begin, to avoid accidentally overwriting data. With a backup in place, you can retry the recovery even after an unsuccessful attempt. The dd command can be used to perform such a low-level backup of a hard drive or partition.
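As a rough sketch of that backup step, the function below copies a partition (or any readable file) into an image file with dd and then compares the two. The name image_partition and the paths in the usage line are placeholders chosen for illustration; they are not something foremost provides.

```shell
#!/bin/sh
# Sketch: copy a partition (or any readable file) to an image file with dd.
# image_partition is a hypothetical helper name; adjust the paths to your system.
image_partition() {
    src=$1   # e.g. /dev/sdb1 (reading a device normally requires root)
    dst=$2   # image file, stored on a DIFFERENT device than the source
    # bs=64k only sets the copy block size; it does not change the data.
    dd if="$src" of="$dst" bs=64k 2>/dev/null || return 1
    # cmp exits non-zero if the image differs from the source.
    cmp -s "$src" "$dst"
}

# Usage (placeholder paths, run as root):
#   image_partition /dev/sdb1 /mnt/backup/sdb1.img
```

The resulting image file can then be given to foremost in place of the device, so the original partition is never touched again.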
The foremost utility tries to recover and reconstruct files on the basis of their headers, footers and data structures, without relying on filesystem metadata. This forensic technique is known as file carving. The program supports various file types, such as:
- jpg
- gif
- png
- bmp
- avi
- exe
- mpg
- wav
- riff
- wmv
- mov
- ole
- doc
- zip
- rar
- htm
- cpp
The most basic way to use foremost is to provide a source to scan for deleted files; it can be either a partition or an image file, such as those generated with dd. Let’s see an example. Imagine we want to scan the /dev/sdb1 partition. Before we begin, a very important thing to remember is to never store retrieved data on the same partition we are retrieving the data from, to avoid overwriting deleted files still present on the block device. The command we would run is:
$ sudo foremost -i /dev/sdb1
By default, the program creates a directory called output inside the directory we launched it from and uses it as the destination. Inside this directory, a subdirectory is created for each supported file type we are attempting to retrieve. Each subdirectory will hold the files of the corresponding type obtained from the data carving process:
output
├── audit.txt
├── avi
├── bmp
├── dll
├── doc
├── docx
├── exe
├── gif
├── htm
├── jar
├── jpg
├── mbd
├── mov
├── mp4
├── mpg
├── ole
├── pdf
├── png
├── ppt
├── pptx
├── rar
├── rif
├── sdw
├── sx
├── sxc
├── sxi
├── sxw
├── vis
├── wav
├── wmv
├── xls
├── xlsx
└── zip
When foremost completes its job, empty directories are removed. Only the ones containing files are left on the filesystem: this lets us immediately know which types of files were successfully retrieved. By default the program tries to retrieve all the supported file types; to restrict the search, we can use the -t option and provide a comma-separated list of the file types we want to retrieve. In the example below, we restrict the search to gif and pdf files:
$ sudo foremost -t gif,pdf -i /dev/sdb1
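Once a run has finished, the per-type subdirectories can be summarised with a few lines of shell. The sketch below assumes the layout described above (one subdirectory per file type next to audit.txt); summarize_output is a hypothetical helper name, not a foremost feature.

```shell
#!/bin/sh
# Sketch: summarise a foremost output directory, one line per file type.
# summarize_output is a hypothetical helper, not part of foremost.
summarize_output() {
    outdir=${1:-output}
    for dir in "$outdir"/*/; do
        [ -d "$dir" ] || continue          # pattern matched no subdirectory
        ftype=$(basename "$dir")
        # Count regular files directly inside the type directory.
        count=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
        printf '%s: %s\n' "$ftype" "$count"
    done
}

# Usage: summarize_output ./output
```

Because audit.txt is a regular file rather than a directory, it is skipped by the `*/` pattern and does not appear in the summary.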
Specifying an alternative destination
As we already said, if a destination is not explicitly declared, foremost creates an output directory inside our current working directory. What if we want to specify an alternative path? All we have to do is use the -o option and provide that path as its argument. If the specified directory doesn’t exist, it is created; if it exists but is not empty, the program complains:
ERROR: /home/egdoc/data is not empty
Please specify another directory or run with -T.
To solve the problem, as suggested by the program itself, we can either use another directory or re-launch the command with the -T option. If we use the -T option, the output directory specified with the -o option is timestamped. This makes it possible to run the program multiple times with the same destination. In our case the directory that would be used to store the retrieved files would be:
/home/egdoc/data_Thu_Sep_12_16_32_38_2019
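The rule described above (a missing directory is created, a non-empty one is rejected) can be checked in advance from a script. This is only a sketch of that rule: output_dir_ok is a hypothetical helper, and foremost still performs its own check when it runs.

```shell
#!/bin/sh
# Sketch: pre-check an output directory according to the rule in the text.
# output_dir_ok is a hypothetical helper; foremost does its own check too.
output_dir_ok() {
    dir=$1
    [ -e "$dir" ] || return 0      # missing: foremost would create it
    [ -d "$dir" ] || return 1      # exists but is not a directory
    [ -z "$(ls -A "$dir")" ]       # acceptable only when empty
}

# Usage (paths taken from the examples in this article):
#   if output_dir_ok /home/egdoc/data; then
#       sudo foremost -i /dev/sdb1 -o /home/egdoc/data
#   else
#       sudo foremost -T -i /dev/sdb1 -o /home/egdoc/data
#   fi
```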
The configuration file
The foremost configuration file can be used to specify file formats not natively supported by the program. Inside the file we can find several commented examples showing the syntax that should be used to accomplish the task. Here is an example involving the png type (the lines are commented since the file type is supported by default):
# PNG (used in web pages)
# (NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)
# png y 200000 \x50\x4e\x47? \xff\xfc\xfd\xfe
The information needed to add support for a file type is, from left to right and separated by tab characters: the file extension (png in this case), whether the header and footer are case sensitive (y), the maximum file size in bytes (200000), the header (\x50\x4e\x47?) and the footer (\xff\xfc\xfd\xfe). Only the footer is optional and can be omitted.
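To avoid typing the tab separators by hand, an entry in this layout can be generated with printf. The sketch below simply follows the field order described above; conf_entry is a hypothetical helper name, and the values in the usage line are the ones from the png example.

```shell
#!/bin/sh
# Sketch: print one tab-separated configuration entry in the field order
# described in the text. conf_entry is a hypothetical helper name.
conf_entry() {
    ext=$1              # file extension, e.g. png
    case_sensitive=$2   # y or n
    max_size=$3         # maximum file size in bytes
    header=$4           # header signature, e.g. \x50\x4e\x47?
    footer=${5:-}       # optional footer signature
    if [ -n "$footer" ]; then
        printf '%s\t%s\t%s\t%s\t%s\n' \
            "$ext" "$case_sensitive" "$max_size" "$header" "$footer"
    else
        printf '%s\t%s\t%s\t%s\n' \
            "$ext" "$case_sensitive" "$max_size" "$header"
    fi
}

# Usage: the signatures are single-quoted so the backslashes stay literal.
#   conf_entry png y 200000 '\x50\x4e\x47?' '\xff\xfc\xfd\xfe'
```

The signatures are passed through `%s`, so printf writes the `\x..` sequences as literal text instead of interpreting them.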
If the path of the configuration file is not explicitly provided with the -c option, a file named foremost.conf is searched for in the current working directory and used if present. If it is not found, the default configuration file, /etc/foremost.conf, is used instead.
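The lookup order just described can be mirrored in a small function, which is handy for checking which file a given invocation would pick up. This sketch only restates the order given in the text; resolve_conf is a hypothetical helper and does not query foremost itself.

```shell
#!/bin/sh
# Sketch: report which configuration file would be used, following the
# order described in the text. resolve_conf is a hypothetical helper.
resolve_conf() {
    explicit=${1:-}                    # path that would be passed with -c
    if [ -n "$explicit" ]; then
        printf '%s\n' "$explicit"
    elif [ -f ./foremost.conf ]; then  # file in the current working directory
        printf '%s\n' ./foremost.conf
    else
        printf '%s\n' /etc/foremost.conf
    fi
}

# Usage:
#   resolve_conf                  # no -c given
#   resolve_conf /tmp/my.conf     # path that would be passed with -c
```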
Adding the support for a file type
By reading the examples provided in the configuration file, we can easily add support for a new file type. In this example we will add support for flac audio files. Flac (Free Lossless Audio Codec) is a non-proprietary audio format that compresses audio without quality loss. First of all, we know that a file of this type begins with the bytes 66 4C 61 43 00 00 00 22 in hexadecimal form (the first four are fLaC in ASCII), and we can verify it by using a program like hexdump on a flac file:
$ hexdump -C blind_guardian_war_of_wrath.flac|head
00000000  66 4c 61 43 00 00 00 22  12 00 12 00 00 00 0e 00  |fLaC..."........|
00000010  36 f2 0a c4 42 f0 00 4d  04 60 6d 0b 64 36 d7 bd  |6...B..M.`m.d6..|
00000020  3e 4c 0d 8b c1 46 b6 fe  cd 42 04 00 03 db 20 00  |>L...F...B.... .|
00000030  00 00 72 65 66 65 72 65  6e 63 65 20 6c 69 62 46  |..reference libF|
00000040  4c 41 43 20 31 2e 33 2e  31 20 32 30 31 34 31 31  |LAC 1.3.1 201411|
00000050  32 35 21 00 00 00 12 00  00 00 54 49 54 4c 45 3d  |25!.......TITLE=|
00000060  57 61 72 20 6f 66 20 57  72 61 74 68 11 00 00 00  |War of Wrath....|
00000070  52 45 4c 45 41 53 45 43  4f 55 4e 54 52 59 3d 44  |RELEASECOUNTRY=D|
00000080  45 0c 00 00 00 54 4f 54  41 4c 44 49 53 43 53 3d  |E....TOTALDISCS=|
00000090  32 0c 00 00 00 4c 41 42  45 4c 3d 56 69 72 67 69  |2....LABEL=Virgi|
As you can see the file signature is indeed what we expected. Here we will assume a maximum file size of 30 MB, or 30000000 Bytes. Let’s add the entry to the file:
flac y 30000000 \x66\x4c\x61\x43\x00\x00\x00\x22
The footer signature is optional so here we didn’t provide it. The program should now be able to recover deleted flac files. Let’s verify it. To test that everything works as expected I previously placed, and then removed, a flac file from the /dev/sdb1 partition, and then proceeded to run the command:
$ sudo foremost -i /dev/sdb1 -o $HOME/Documents/output
As expected, the program was able to retrieve the deleted flac file (it was the only file on the device, on purpose), although it did not keep its original name: the recovered file is named after the 512-byte block offset at which it was found (20482 × 512 = 10486784, the file offset reported in audit.txt). The original filename cannot be retrieved because, as we know, file metadata is stored in the filesystem, and not in the file itself:
/home/egdoc/Documents
└── output
    ├── audit.txt
    └── flac
        └── 00020482.flac
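A quick way to confirm that a recovered file really is of the expected type is to compare its first bytes with the signature used in the configuration entry. The sketch below does that with od; starts_with_hex is a hypothetical helper, and the path in the usage line is the one from this example.

```shell
#!/bin/sh
# Sketch: succeed if FILE begins with the bytes given as a hex string.
# starts_with_hex is a hypothetical helper name.
starts_with_hex() {
    file=$1
    # Normalise the expected signature: drop spaces, lower-case the digits.
    want=$(printf '%s' "$2" | tr -d ' ' | tr 'A-F' 'a-f')
    nbytes=$(( ${#want} / 2 ))
    # od prints the first nbytes as lower-case hex; -v disables the "*"
    # shorthand for repeated lines, -An removes the offset column.
    got=$(od -An -v -tx1 -N "$nbytes" "$file" | tr -d ' \n')
    [ "$got" = "$want" ]
}

# Usage:
#   starts_with_hex /home/egdoc/Documents/output/flac/00020482.flac \
#       '66 4C 61 43 00 00 00 22' && echo "flac signature found"
```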
The audit.txt file contains information about the actions performed by the program, in this case:
Foremost version 1.5.7 by Jesse Kornblum, Kris Kendall, and Nick Mikus
Audit File

Foremost started at Thu Sep 12 23:47:04 2019
Invocation: foremost -i /dev/sdb1 -o /home/egdoc/Documents/output
Output directory: /home/egdoc/Documents/output
Configuration file: /etc/foremost.conf
------------------------------------------------------------------
File: /dev/sdb1
Start: Thu Sep 12 23:47:04 2019
Length: 200 MB (209715200 bytes)

Num      Name (bs=512)         Size      File Offset     Comment

0:      00020482.flac         28 MB        10486784
Finish: Thu Sep 12 23:47:04 2019

1 FILES EXTRACTED

flac:= 1
------------------------------------------------------------------

Foremost finished at Thu Sep 12 23:47:04 2019
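The Name (bs=512) column header suggests that the numeric part of a recovered file name is the position of the file in 512-byte blocks, and the values in this audit file agree with that reading. The sketch below converts a name back into a byte offset under that assumption; name_to_offset is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: byte offset implied by a recovered file name, ASSUMING the
# numeric part of the name is a block number (512-byte blocks by default).
name_to_offset() {
    base=${1%%.*}                                   # 00020482.flac -> 00020482
    blocks=$(printf '%s' "$base" | sed 's/^0*//')   # strip zeros (avoid octal)
    blocks=${blocks:-0}                             # an all-zero name means block 0
    echo $(( $blocks * ${2:-512} ))
}

# Usage: prints 10486784, the File Offset shown in the audit file above.
#   name_to_offset 00020482.flac
```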
Conclusion
In this article we learned how to use foremost, a forensic program able to retrieve deleted files of various types. We learned that the program works by using a technique called data carving, and relies on file signatures to achieve its goal. We saw an example of the program's usage and we also learned how to add support for a specific file type using the syntax illustrated in the configuration file. For more information about the program's usage, please consult its manual page.