# general
g
🧵 on the `unified_log` table
It looks like we are limited to 100 rows no matter what `timestamp` clause we use. How can we get all of the results out of this table? Or is the recommendation to run it every 60 or 30 seconds or something?
```
osquery> select count(*) from unified_log where subsystem="com.apple.SoftwareUpdate" AND timestamp > (SELECT unix_time from time) - 3600;
count(*) = 100
```
s
The `max_rows` column is how many rows it'll fetch from the underlying API.
But it’s worth noting the unified log is huge. I don’t think you can get everything out of it with a simple select.
So it has an awkward not-quite-event model.
The suggestion is to set `timestamp` to `-1` so it behaves like it's evented, with a suitable `max_rows` and query frequency.
How large `max_rows` can be depends a lot on your specific installation environment.
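For instance, raising the cap past the apparent default of 100 for the same search (a sketch; whether 1000 is sustainable depends on your environment):

```
select count(*) from unified_log
  where subsystem = 'com.apple.SoftwareUpdate'
    and timestamp > (select unix_time from time) - 3600
    and max_rows = 1000;
```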
g
Can you expand on the "behave like it's evented" statement? So we should just set the `timestamp` to `-1` and call it every x seconds?
With the `asl` table we scheduled our query for five minutes, which worked okay.
s
The `unified_log` is unique as a table. If you query it normally, it searches the log. The log is very large, thus there is a `max_rows` parameter; otherwise an exploratory `select * from unified_log` would overwhelm something. If you include `timestamp = -1` it behaves like a log follower. It tracks the last returned timestamp, and appends that. This feature was contributed to allow someone to pull the whole log. It's a bit like it's evented, but implemented very differently. (It's somewhat beta, it may not work quite right, and it may change, etc.)

As for what you should do… I don't know. I imagine upping `max_rows` to as high as your pipeline can handle, setting `timestamp` to `-1`, and then fetching on a suitable interval.
b
> If you include `timestamp = -1` it behaves like a log follower. It tracks the last returned timestamp, and appends that.
What do you mean when you say "the last returned timestamp"? Do we have to run the query multiple times in a row in order to get more and more (hence appended) results?
In any case, I can't figure out how to get any results from `unified_log` right now:
```
select * from unified_log WHERE process="kernel" AND timestamp=-1 AND max_rows=1000;
```
Console shows results for this as of 10s ago, but I'm not seeing them in the table.
z
cc @Daniel Bretón Suárez who wrote the table
IIRC the use case Daniel had in mind was to be able to continually query for the full unified log. Just looked at the code, and Seph's syntax is slightly off: what you need is `timestamp > -1`. If you schedule that query with a sufficiently high `max_rows`, you should eventually get the entire log and then be essentially "streaming" new results as they come in.
(If your `max_rows` is too small, then the log would grow faster than you were collecting it.)
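E.g., Brandon's query above should start returning rows with the operator switched from `=` to `>` (untested sketch):

```
select * from unified_log
  where process = 'kernel' and timestamp > -1 and max_rows = 1000;
```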
d
@Brandon Kurtz as Zach and Joseph said, if you add the condition `timestamp > -1`, the table uses a mode that saves the last log entry it has returned, so the next time you query the table, it will return logs starting from the entry after the last one it returned. Imagine we have entries like:

| ***last (0)*** | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 |

```
select * from unified_log where timestamp > -1 and max_rows = 10;
```

It returns the entries ***1st to 10th***:

| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 ***last*** | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 |

And again:

```
select * from unified_log where timestamp > -1 and max_rows = 10;
```

It returns the entries ***11th to 20th***:

| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 ***last*** | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 |

If the OS deletes entries (we can't control that), you will still be getting sequential logs, but you could lose some:

| ***last (20)*** | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 |

My advice is to set `max_rows` so that eventually the number of rows returned is less than `max_rows`. I mean, if `max_rows = 100` and you're always getting 100 entries, it means there were more entries in the buffer. If `max_rows = 500` and you eventually get, say, 450 entries, you know you have "emptied" the buffer.
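Putting that together, a scheduled query along these lines is one way to run it; the query name, interval, and `max_rows` value here are illustrative placeholders, not recommendations:

```
{
  "schedule": {
    "unified_log_follower": {
      "query": "select * from unified_log where timestamp > -1 and max_rows = 500;",
      "interval": 300
    }
  }
}
```

If a run returns well under 500 rows, that interval/`max_rows` pairing is keeping up; if it pins at 500 every time, raise one or the other.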
b
Is it possible to go “backwards” with the results or will the pointer moving through the “buffer” always go forward?
I.e. if I want to see historical info, I need to record the results of the query myself in some external system?
z
You can still run a historical search if you leave out the `timestamp` column. If you want the entire log, you would have to record it in an external system.
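For example, something like this (a sketch; the `process` filter is just carried over from Brandon's query, and results are still capped by `max_rows`):

```
select * from unified_log where process = 'kernel' and max_rows = 1000;
```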