The basic concept involved with the use of the osquery application is the “tabular abstraction” of many aspects of the operating system, such as processes, users, etc. The data is stored in tables which can be queried using SQL
syntax, directly via the osqueryi
shell, or via the osqueryd
daemon.
In this tutorial we will see how to install the application, how to run basic queries, and how to use FIM
(File Integrity Monitoring) as part of your Linux system administration job.
In this tutorial you will learn:
- How to install osquery
- How to list the available tables
- How to perform queries from the osqueryi shell
- How to use the osqueryd daemon to monitor file integrity
Software Requirements and Conventions Used
- Basic knowledge of SQL concepts
- Root permissions to perform administrative tasks
Category | Requirements, Conventions or Software Version Used |
---|---|
System | Distribution-independent |
Software | Osquery |
Other | |
Conventions | # – requires given linux commands to be executed with root privileges either directly as a root user or by use of sudo command$ – requires given linux commands to be executed as a regular non-privileged user |
Installation
We have basically two option to install osquery
: the first consists into downloading the appropriate package for our system from the official website; the second, usually preferred, is to add the osquery repository to our distribution software sources. Here we will briefly explore both options.
Installing via package
From the official osquery website is possible to download signed deb
and rpm
packages, or more generic tarballs. As a first thing we select the version we want to install, then we download a package.
The advice is to select the latest available version (4.1.2 at the moment of writing). Once the package is downloaded we can install it using our distribution package manager. For example, to install the software on a Fedora system (assuming the package is located in our current working directory), we would run:
$ sudo dnf install ./osquery-4.1.2-1.linux.x86_64.rpm
Using a repository
As an alternative we can add the rpm
or deb
repository to our distribution. If we are using a rpm-based distribution, we can run the following commands to accomplish the task:
$ curl -L https://pkg.osquery.io/rpm/GPG | sudo tee /etc/pki/rpm-gpg/RPM-GPG-KEY-osquery $ sudo yum-config-manager --add-repo https://pkg.osquery.io/rpm/osquery-s3-rpm.repo $ sudo yum-config-manager --enable osquery-s3-rpm-repo $ sudo yum install osquery
With the linux commands above, we add the gpg pulic key used to sign the packages to our system, then we add the repository. Finally, we install the osquery package. Notice that yum
, in recent versions of Fedora and CentOS/RHEL is just a symbolic link to dnf
, so when we invoke the former the latter is used instead.
If we are running a Debian-based distribution, instead, we can add the deb repository to our software sources by running:
$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 1484120AC4E9F8A1A577AEEE97A80C63C9D8B80B $ sudo add-apt-repository 'deb [arch=amd64] https://pkg.osquery.io/deb deb main' $ sudo apt-get update $ sudo apt-get install osquery
Once we the package is installed, we can take a look at the software basic usage.
Basic usage
Osquery allow us to monitor various aspects of an operating system adopting a “tabular abstraction”, using a SQL syntax similar to the one used on sqlite
databases. The queries are run on tables which abstracts various operating system aspects, such as processes and services.
We can run the queries directly using the osqueryi
interactive shell, or we can schedule them via the osqueryd
daemon. Here is an example of a query to list all the available tables (the complete list with the tables description can also be found
online):
$ osqueryi osquery> .tables => acpi_tables => apt_sources => arp_cache => atom_packages => augeas => authorized_keys => block_devices => carbon_black_info => carves => chrome_extensions => cpu_time => cpuid => crontab => curl => curl_certificate => deb_packages => device_file => device_hash => device_partitions => disk_encryption => dns_resolvers => docker_container_labels => docker_container_mounts => docker_container_networks => docker_container_ports => docker_container_processes => docker_container_stats => docker_containers => docker_image_labels => docker_images => docker_info => docker_network_labels => docker_networks => docker_version => docker_volume_labels => docker_volumes => ec2_instance_metadata => ec2_instance_tags => elf_dynamic => elf_info => elf_sections => elf_segments => elf_symbols => etc_hosts => etc_protocols => etc_services => file => file_events => firefox_addons => groups => hardware_events => hash => intel_me_info => interface_addresses => interface_details => interface_ipv6 => iptables => kernel_info => kernel_integrity => kernel_modules => known_hosts => last => listening_ports => lldp_neighbors => load_average => logged_in_users => magic => md_devices => md_drives => md_personalities => memory_array_mapped_addresses => memory_arrays => memory_device_mapped_addresses => memory_devices => memory_error_info => memory_info => memory_map => mounts => msr => npm_packages => oem_strings => opera_extensions => os_version => osquery_events => osquery_extensions => osquery_flags => osquery_info => osquery_packs => osquery_registry => osquery_schedule => pci_devices => platform_info => portage_keywords => portage_packages => portage_use => process_envs => process_events => process_file_events => process_memory_map => process_namespaces => process_open_files => process_open_sockets => processes => prometheus_metrics => python_packages => routes => rpm_package_files => rpm_packages => selinux_events => shadow => shared_memory => shell_history => smart_drive_info => smbios_tables => socket_events => ssh_configs => sudoers => suid_bin => syslog_events => system_controls => system_info => time => ulimit_info => uptime => usb_devices => user_events => user_groups => user_ssh_keys => users => yara => yara_events => yum_sources
Running the osqueryi
command we enter the interactive shell; from it, we can issue our queries and instructions. Here is another example of a query, this time to list all the running processes pid
and name
. The query is performed on the process
table (the output of the query has been truncated for convenience):
osquery> SELECT pid, name FROM processes;
+-------+------------------------------------+
| pid | name |
+-------+------------------------------------+
| 1 | systemd |
| 10 | rcu_sched |
| 10333 | kworker/u16:5-events_unbound |
| 10336 | kworker/2:0-events |
| 11 | migration/0 |
| 11002 | kworker/u16:1-kcryptd/253:0 |
| 11165 | kworker/1:1-events |
| 11200 | kworker/1:3-events |
| 11227 | bash |
| 11368 | osqueryi |
| 11381 | kworker/0:0-events |
| 11395 | Web Content |
| 11437 | kworker/0:2-events |
| 11461 | kworker/3:2-events_power_efficient |
| 11508 | kworker/2:2 |
| 11509 | kworker/0:1-events |
| 11510 | kworker/u16:2-kcryptd/253:0 |
| 11530 | bash |
[...] |
+-------+------------------------------------+
It’s even possible to perform queries on joined tables using the JOIN
statement, just like we use to do in relational databases. In the example below we perform a query on the processes
table, joined with the users
one via the uid
column:
osquery> SELECT processes.pid, processes.name, users.username FROM processes JOIN
users ON processes.uid = users.uid;
+-------+-------------------------------+------------------+
| pid | name | username |
+-------+-------------------------------+------------------+
| 1 | systemd | root |
| 10 | rcu_sched | root |
| 11 | migration/0 | root |
| 11227 | bash | egdoc |
| 11368 | osqueryi | egdoc |
| 13 | cpuhp/0 | root |
| 14 | cpuhp/1 | root |
| 143 | kintegrityd | root |
| 144 | kblockd | root |
| 145 | blkcg_punt_bio | root |
| 146 | tpm_dev_wq | root |
| 147 | ata_sff | root |
[...]
| 9130 | Web Content | egdoc |
| 9298 | Web Content | egdoc |
| 9463 | gvfsd-metadata | egdoc |
| 9497 | gvfsd-network | egdoc |
| 9518 | gvfsd-dnssd | egdoc |
+-------+-------------------------------+------------------+
File Integrity Monitoring (FIM)
Until now we used osquery
via the interactive shell: osqueryi
. To use FIM
(File Integrity Monitoring), we want to use the osqueryd
daemon instead. Via the configuration file, we provide a list of the files we want to monitor. Events such as attribute changes involving the specified files and directories, are recorded in the file_events
table. The daemon runs a query on this table after a specified interval of time and notify in the logs when new records are found. Let’s see a configuration example.
Configuration setup
The main configuration file for osquery is /etc/osquery/osquery.conf
. The file doesn’t exists by default, so we must create it. The configuration is provided in Json
format. Suppose we want to monitor all the files and directories under /etc
; here is how we would configure the application:
{
"options": {
"disable_events": "false"
},
"schedule": {
"file_events": {
"query": "SELECT * FROM file_events;",
"interval": 300
}
},
"file_paths": {
"etc": [
"/etc/%%"
],
},
}
Let’s analyze the configuration above. First of all, in the options
section, we set disable_events
to "false"
, in order to enable file events.
After that, we created the schedule
section: inside this section we can describe and create various named scheduled queries. In our case we created a query which selects all columns from the file_events
table, which is meant to be executed every 300
seconds (5 minutes).
After scheduling the query, we created the file_paths
section, where we specified the files to be monitored. In this section, each key represents the name of a set of files to be monitored (a category in the osquery jargon). In this case the “etc” key references a list with only one entry, /etc/%%
.
What the %
symbol stands for? When specifying file paths we can use standard (*
) or SQL (%
) wildcards. If a single wildcard is provided, it selects all files and directories existing at the specified level. If a double wildcard is provided it selects all files and folders recursively. For example, the /etc/%
expression matches all files and folders one level under /etc
, while /etc/%%
matches all files and folders under /etc
recursively.
If we need to, we can also exclude specific files from the path we provided, using the exclude_paths
section in the configuration file. In the section we can only reference categories defined in the file_paths
section, (“etc” in this case). We provide the list of files to be excluded:
"exclude_paths": {
"etc": [
"/etc/aliases"
]
}
Just as an example, we excluded the /etc/aliases
file from the list. Here is how our final configuration looks like:
{
"options": {
"disable_events": "false"
},
"schedule": {
"file_events": {
"query": "SELECT * FROM file_events;",
"interval": 20
}
},
"file_paths": {
"etc": [
"/etc/%%"
]
},
"exclude_paths": {
"etc": [
"/etc/aliases"
]
}
}
Starting the daemon
With our configuration in place, we can start the osqueryd
daemon:
$ sudo systemctl start osqueryd
To make the daemon start automatically at boot we must run:
$ sudo systemctl enable osqueyd
Once the the daemon is running, we can check our configuration works. Just as an example, we will modify the permissions of the /etc/fstab
file, changing them from 644
to 600
:
$ sudo chmod 600 /etc/fstab
We can now verify the change to the file has been recorded by reading the /var/log/osquery/osqueryd.results.log
file. Here is the last line of the file (beautified):
{
"name":"file_events",
"hostIdentifier":"fingolfin",
"calendarTime":"Mon Dec 30 19:57:31 2019 UTC",
"unixTime":1577735851,
"epoch":0,
"counter":0,
"logNumericsAsNumbers":false,
"columns": {
"action":"ATTRIBUTES_MODIFIED",
"atime":"1577735683",
"category":"etc",
"ctime":"1577735841",
"gid":"0",
"hashed":"0",
"inode":"262147",
"md5":"",
"mode":"0600",
"mtime":"1577371335",
"sha1":"",
"sha256":"",
"size":"742",
"target_path":"/etc/fstab",
"time":"1577735841",
"transaction_id":"0",
"uid":"0"
},
"action":"added"
}
In the log above, we can clearly see that an ATTRIBUTES_MODIFIED
action (Line 10) occurred on the target_path
“/etc/fstab” (Line 23), which is part of the “etc” category
(Line 12). It’s important to notice that if we query the file_events
table from the osqueryi
shell, we will see no rows, since the osqueryd
daemon, and osqueryi
don’t communicate.
Conclusions
In this tutorial we saw the basic concepts involved in the use of the osquery
application, which abstracts various operating system concepts using tabular data we can query using SQL syntax. We saw how to install the application, how to perform basic queries using the osqueryi
shell, and finally how to setup file monitoring using the osqueryd
daemon. We just scratched the surface of what the application can do; as always, the advice is to take a look at the project documentation for a more in depth knowledge.