In this article we will talk about foremost, a very useful open source forensic utility which is able to recover deleted files using the technique called data carving. The utility was originally developed by the United States Air Force Office of Special Investigations, and is able to recover several file types (support for specific file types can be added by the user, via the configuration file). The program can also work on partition images produced by dd or similar tools.

In this tutorial you will learn:
  • How to install foremost
  • How to use foremost to recover deleted files
  • How to add support for a specific file type
foremost-manual
Foremost is a forensic data recovery program for Linux used to recover files using their headers, footers, and data structures through a process known as file carving.

Software Requirements and Conventions Used

Software Requirements and Linux Command Line Conventions
Category Requirements, Conventions or Software Version Used
System Distribution-independent
Software The "foremost" program
Other Familiarity with the command line interface
Conventions # - requires given linux commands to be executed with root privileges either directly as a root user or by use of sudo command
$ - requires given linux commands to be executed as a regular non-privileged user

Installation

Since foremost is already present in all the major Linux distributions repositories, installing it is a very easy task. All we have to do is to use our favorite distribution package manager. On Debian and Ubuntu, we can use apt:

$ sudo apt install foremost

In recent versions of Fedora, we use the dnf package manager to install packages, the dnf is a successor of yum. The name of the package is the same:

$ sudo dnf install foremost

If we are using ArchLinux, we can use pacman to install foremost. The program can be found in the distribution "community" repository:

$ sudo pacman -S foremost

$$$ Looking for LINUX ADMINISTRATOR ! $$$

BLUE SKY STUDIOS are looking for Linux Administrator to maintain and support the Studio's 450+ production Linux workstations, including daily interactions with the Studio’s digital animation artists.
LOCATION: Greenwich, Connecticut, USA

APPLY NOW

Basic usage

WARNING
No matter which file recovery tool or process your are going to use to recover your files, before you begin it is recommended to perform a low level hard drive or partition backup, hence avoiding an accidental data overwrite !!! In this case you may re-try to recover your files even after unsuccessful recovery attempt. Check the following dd command guide on how to perform hard drive or partition low level backup.

The foremost utility tries to recover and reconstruct files on the base of their headers, footers and data structures, without relying on filesystem metadata. This forensic technique is known as file carving. The program supports various types of files, as for example:

  • jpg
  • gif
  • png
  • bmp
  • avi
  • exe
  • mpg
  • wav
  • riff
  • wmv
  • mov
  • pdf
  • ole
  • doc
  • zip
  • rar
  • htm
  • cpp

The most basic way to use foremost is by providing a source to scan for deleted files (it can be either a partition or an image file, as those generated with dd). Let's see an example. Imagine we want to scan the /dev/sdb1 partition:  before we begin, a very important thing to remember is to never store retrieved data on the same partition we are retrieving the data from, to avoid overwriting delete files still present on the block device. The command we would run is:

$ sudo foremost -i /dev/sdb1

By default, the program creates a directory called output inside the directory we launched it from and uses it as destination. Inside this directory, a subdirectory for each supported file type we are attempting to retrieve is created. Each directory will hold the corresponding file type obtained from the data carving process:

output
├── audit.txt
├── avi
├── bmp
├── dll
├── doc
├── docx
├── exe
├── gif
├── htm
├── jar
├── jpg
├── mbd
├── mov
├── mp4
├── mpg
├── ole
├── pdf
├── png  
├── ppt
├── pptx
├── rar
├── rif
├── sdw
├── sx
├── sxc
├── sxi
├── sxw
├── vis
├── wav
├── wmv
├── xls
├── xlsx
└── zip

When foremost completes its job, empty directories are removed. Only the ones containing files are left on the filesystem: this let us immediately know what type of files were successfully retrieved. By default the program tries to retrieve all the supported file types; to restrict our search, we can, however, use the -t option and provide a list of the file types we want to retrieve, separated by a comma. In the example below, we restrict the search only to gif and pdf files:

$ sudo foremost -t gif,pdf -i /dev/sdb1
In this video we will test the forensic data recovery program Foremost to recover a single png file from /dev/sdb1 partition formatted with the EXT4 filesystem.


Specifying an alternative destination

As we already said, if a destination is not explicitly declared, foremost creates an output directory inside our cwd. What if we want to specify an alternative path? All we have to do is to use the -o option and provide said path as argument. If the specified directory doesn't exist, it is created; if it exists but it's not empty, the program throws a complain:

ERROR: /home/egdoc/data is not empty
 	Please specify another directory or run with -T.

To solve the problem, as suggested by the program itself, we can either use another directory or re-launch the command with the -T option. If we use the -T option, the output directory specified with the -o option is timestamped. This makes possible to run the program multiple times with the same destination. In our case the directory that would be used to store the retrieved files would be:

/home/egdoc/data_Thu_Sep_12_16_32_38_2019

The configuration file

The foremost configuration file can be used to specify file formats not natively supported by the program. Inside the file we can find several commented examples showing the syntax that should be used to accomplish the task. Here is an example involving the png type (the lines are commented since the file type is supported by default):

# PNG   (used in web pages)
#	(NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)
#  	png	y	200000	\x50\x4e\x47?	\xff\xfc\xfd\xfe

The information to provide in order to add support for a file type, are, from left to right, separated by a tab character: the file extension (png in this case), whether the header and footer are case sensitive (y), the maximum file size in Bytes (200000), the header (\x50\x4e\x47?) and and the footer (\xff\xfc\xfd\xfe). Only the latter is optional and can be omitted.

If the path of the configuration file it's not explicitly provided with the -c option, a file named foremost.conf is searched and used, if present, in the current working directory. If it is not found the default configuration file, /etc/foremost.conf is used instead.

Adding the support for a file type

By reading the examples provided in the configuration file, we can easily add support for a new file type. In this example we will add support for flac audio files. Flac (Free Lossless Audio Coded) is a non-proprietary lossless audio format which is able to provide compressed audio without quality loss. First of all, we know that the header of this file type in hexadecimal form is 66 4C 61 43 00 00 00 22 (fLaC in ASCII), and we can verify it by using a program like hexdump on a flac file:

$ hexdump -C
blind_guardian_war_of_wrath.flac|head
00000000  66 4c 61 43 00 00 00 22  12 00 12 00 00 00 0e 00  |fLaC..."........|
00000010  36 f2 0a c4 42 f0 00 4d  04 60 6d 0b 64 36 d7 bd  |6...B..M.`m.d6..|
00000020  3e 4c 0d 8b c1 46 b6 fe  cd 42 04 00 03 db 20 00  |>L...F...B.... .|
00000030  00 00 72 65 66 65 72 65  6e 63 65 20 6c 69 62 46  |..reference libF|
00000040  4c 41 43 20 31 2e 33 2e  31 20 32 30 31 34 31 31  |LAC 1.3.1 201411|
00000050  32 35 21 00 00 00 12 00  00 00 54 49 54 4c 45 3d  |25!.......TITLE=|
00000060  57 61 72 20 6f 66 20 57  72 61 74 68 11 00 00 00  |War of Wrath....|
00000070  52 45 4c 45 41 53 45 43  4f 55 4e 54 52 59 3d 44  |RELEASECOUNTRY=D|
00000080  45 0c 00 00 00 54 4f 54  41 4c 44 49 53 43 53 3d  |E....TOTALDISCS=|
00000090  32 0c 00 00 00 4c 41 42  45 4c 3d 56 69 72 67 69  |2....LABEL=Virgi|

As you can see the file signature is indeed what we expected. Here we will assume a maximum file size of 30 MB, or 30000000 Bytes. Let's add the entry to the file:

flac    y       30000000    \x66\x4c\x61\x43\x00\x00\x00\x22

The footer signature is optional so here we didn't provide it. The program should now be able to recover deleted flac files. Let's verify it. To test that everything works as expected I previously placed, and then removed, a flac file from the /dev/sdb1 partition, and then proceeded to run the command:

$ sudo foremost -i /dev/sdb1 -o $HOME/Documents/output

As expected, the program was able to retrieve the deleted flac file (it was the only file on the device, on purpose), although it renamed it with a random string. The original filename cannot be retrieved because, as we know, files metadata is contained in the filesystem, and not in the file itself:

/home/egdoc/Documents
└── output
    ├── audit.txt
    └── flac
        └── 00020482.flac


The audit.txt file contains information about the actions performed by the program, in this case:

Foremost version 1.5.7 by Jesse Kornblum, Kris
Kendall, and Nick Mikus
Audit File

Foremost started at Thu Sep 12 23:47:04 2019
Invocation: foremost -i /dev/sdb1 -o /home/egdoc/Documents/output
Output directory: /home/egdoc/Documents/output
Configuration file: /etc/foremost.conf
------------------------------------------------------------------
File: /dev/sdb1
Start: Thu Sep 12 23:47:04 2019
Length: 200 MB (209715200 bytes)

Num	 Name (bs=512)	       Size	 File Offset	 Comment

0:	00020482.flac 	      28 MB 	   10486784
Finish: Thu Sep 12 23:47:04 2019

1 FILES EXTRACTED

flac:= 1
------------------------------------------------------------------

Foremost finished at Thu Sep 12 23:47:04 2019

Conclusion

In this article we learned how to use foremost, a forensic program able to retrieve deleted files of various types. We learned that the program works by using a technique called data carving, and relies on files signatures to achieve its goal. We saw an example of the program usage and we also learned how to add the support for a specific file type using the syntax illustrated in the configuration file. For more information about the program usage, please consult its manual page.

ARE YOU LOOKING FOR A LINUX JOB?
Submit your RESUME, create a JOB ALERT or subscribe to RSS feed on LinuxCareers.com.
LINUX CAREER NEWSLETTER
Subscribe to NEWSLETTER and receive latest news, jobs, career advice and tutorials.
DO YOU NEED ADDITIONAL HELP?
Get extra help by visiting our LINUX FORUM or simply use comments below.