Title
#core
Antoinette

Antoinette

02/05/2020, 5:19 PM
It got kind of difficult to find solid information about the profile script and the stats it returns so I dug and documented. Dropping it here in case anyone else wants it The osquery
profile.py
script uses utils.py in
tools/tests/
which uses python’s
psutil
library to collect process stats for osqueryi as its running given queries.  The script turns 5 stats: Utilization (U): Utilization is calculated by taking the average of non-0 results of the
cpu_percent(interval=1)
function in
psutils.Process()
. This value can be greater than 100% for processes with threads running across different CPUs. The script sets an interval of 1 meaning that the function compares process time to system CPU times elapsed before and after the 1 second interval. This is a blocking call.  CPU time (C): CPU time uses the
psutils.Process()
's
cpu_times()
function. It returns a named tuple containing user, system, children_user, system_user, and iowait • user: time spent in user mode. • system: time spent in kernel mode. • children_user: user time of all child processes (always 0 on Windows and macOS). • system_user: user time of all child processes (always 0 on Windows and macOS). • iowait: (Linux) time spent waiting for blocking I/O to complete. This value is excluded from user and system times count (because the CPU is not doing any work). The profile script adds user and system together for the CPU Time output. Duration (D): Duration is calculated by taking the subtracting
start_time - 2
from the current time. The start time is set before the script starts the osqueryi process to run the query. (I’m not very sure about why the 2 is there. I don’t know how useful of a metric this is when compared to the cpu time, but someone might have better insights.) fds (F): Uses the
num_fds()
function and returns the file descriptors used by the osqueryi process during query execution Memory (M): Uses the
memory_info_ex()
command which is deprecated. Psutils documentation suggests using
memory_info()
instead. The function returns a named tuple and the script uses the
rss
value in the tuple. RSS stands for resident set size and is the non-swapped physical memory used by the process. This should match the RES column in
top
.
theopolis

theopolis

02/05/2020, 9:56 PM
Sorry for the lack of documentation and thanks for putting this explanation together, this rocks!
10:04 PM
The
start_time - 2
is odd, it comes from an esoteric option used by the profile script called
--profile_delay
https://github.com/osquery/osquery/blob/master/tools/analysis/profile.py#L72 This causes osquery to wait before and after running the code under test https://github.com/osquery/osquery/blob/master/osquery/main/main.cpp#L71 The rational for this, if I recal correctly, was that some test exercises would exit very quickly and we wanted to provide a buffer of time for the python process inspection.
Antoinette

Antoinette

02/05/2020, 10:05 PM
ooooooh