# macos
j
Hi all, I would like to check the status of the work being done in https://github.com/osquery/osquery/pull/6904 . Is there any chance of seeing this happen in the short term, or is it just waiting for somebody with time to make the final changes?
m
Yeah, it's lacking someone to carry it over the finish line
j
I see, thanks for the answer! It looks so close! 😄
a
👍 2
s
That macadmins extension is an exec. Might be okay but be cautious there.
a
I'm not certain what the downside of an exec is in general. Yes, we've discussed the pseudo core tenets of osquery being "performance as a feature" and "never shell out", and there is a non-zero impedance/context-switch cost, but what brings to mind the need for caution?
s
Philosophy aside, it can have high resource utilization and unexpected failure modes. So it's something to be wary of, but sometimes one pragmatically accepts it.
Launcher tries to avoid execing, but there’s a wealth of info, which we need to exec to get. So we do. Just gotta be aware of the risks
👍 2
p
I can take a stab at finishing this, if this feature is still desired/wanted. I may make a few changes though 😄
🙌 1
a
Considering TOB removed their version a while back, it would be great to get this finished and accepted into core. Thanks!
s
It would be great if someone picked up pr 6904. I’m not sure where it was left.
There was never an extension or osquery release with the functionality in that PR.
p
This was mentioned/brought up in office hours today. I think I have this fixed; it's a small change and should address all comments. I also have tests for it, and I was able to run it multiple times over all 20 million+ log entries on my Mac with no errors. I'm hoping to comment/offer fixes/suggestions this week/weekend. Unless someone else has already fixed it, then no worries.
🆒 1
s
@puffycid Are you saying you have a current version of https://github.com/osquery/osquery/pull/6904 or that you have a current version of something exec based?
If you have a current version of #6904, I’d love to see a PR!
p
It fixes the API version of #6904, with no exec usage. It's a small ~10-line change. I'm hoping to comment/suggest changes later this week, unless a new PR is preferred? I've never edited an existing PR before
Whatever is easier
s
New PR seems easier?
👍 1
Editing existing ones is hard.
👍 1
p
I pushed a PR adding this to osquery; it should address all issues/comments from the original PR 6904. Sorry for the delay (work/life stuff). I should have much more availability now and can quickly address any issues/comments on the PR I've opened (in addition to the others I opened). Also, as an FYI/shameless plug: I'm also trying to implement raw UAL parsing support for osquery at https://github.com/puffyCid/osquery/tree/unifiedlogs. The benefit of raw parsing is that it will support older macOS versions, and it is slightly more forensics-focused (raw parsing vs. API). It's ~60% done and still has lots of work left, but I wanted to mention it in case others want to monitor/contribute/comment. I'm slowly chipping away at it between the other osquery features I'm working on.
👌 1
j
Hi all, I wanted to ask about https://github.com/osquery/osquery/pull/7259. Since I see it is pending review, is there any chance it will make it into the next osquery version? Thanks!
m
Hi Juan -- that one has been reviewed but the feedback from the TSC has been: as implemented, this table allows a user to
select * from log
in a way that might cause osquery to run too long / collect too much / get its worker process terminated by its watchdog process. I think the consensus was that we needed some way to prevent this. I'm just relaying what the conversation was and the status of that.
👍 2
d
Puffycid asked how to implement the WHERE clause here: https://osquery.slack.com/archives/CP2RAJMU3/p1638367793007900?thread_ts=1634774133.001500&cid=CP2RAJMU3
The bad news is that they're not around anymore (the Slack account is disabled for some reason). I will happily finish the task (which is VERY close), but I don't want to step on anyone's work, and I don't know how to reach Puffycid. Any ideas?
Regarding the task itself, I would require filtering by timestamp. Requiring a filter by level (severity) could be a good idea too. Notice that error and fault messages have additional info attached; on the other hand, default and info messages are probably the most common. This way we don't mix large messages with lots of rows.
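A sketch of the kind of bounded query being discussed, using Python's sqlite3 against a mock table (the `timestamp`, `level`, and `message` column names are assumptions here; the real table schema may differ):

```python
import sqlite3

# Mock unified_log table to illustrate bounding a query by both a
# time window and a severity filter. Hypothetical schema, not osquery.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unified_log (timestamp INTEGER, level TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO unified_log VALUES (?, ?, ?)",
    [
        (100, "default", "routine event"),
        (150, "error", "something failed (extra info attached)"),
        (200, "info", "informational"),
        (250, "fault", "hard failure (extra info attached)"),
    ],
)

# Only error/fault rows inside [100, 300) are returned.
bounded = conn.execute(
    "SELECT timestamp, level FROM unified_log "
    "WHERE timestamp >= ? AND timestamp < ? AND level IN ('error', 'fault')",
    (100, 300),
).fetchall()
print(bounded)  # [(150, 'error'), (250, 'fault')]
```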
m
Querying a log filtered by timestamp is a good way to bound the query, but what if it were also paginated (you get up to some number of log entries)? I am just thinking of how many REST APIs work, where a service does not return everything all at once. https://www.moesif.com/blog/technical/api-design/REST-API-Design-Filtering-Sorting-and-Pagination/
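The pagination idea can be sketched as keyset pagination over a mock log table (hypothetical schema, plain sqlite3, not osquery code): each query resumes after the last timestamp seen and returns at most one page of rows.

```python
import sqlite3

# Mock log table with 10 entries, timestamps 0..9.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (timestamp INTEGER PRIMARY KEY, message TEXT)")
conn.executemany(
    "INSERT INTO log VALUES (?, ?)", [(t, f"entry {t}") for t in range(10)]
)

def fetch_page(conn, after_ts, page_size):
    # Resume strictly after the last row of the previous page,
    # the way a REST API hands out a "next page" cursor.
    return conn.execute(
        "SELECT timestamp, message FROM log "
        "WHERE timestamp > ? ORDER BY timestamp LIMIT ?",
        (after_ts, page_size),
    ).fetchall()

pages = []
cursor = -1
while True:
    page = fetch_page(conn, cursor, 4)
    if not page:
        break
    pages.append(page)
    cursor = page[-1][0]  # last timestamp seen becomes the next cursor

print([len(p) for p in pages])  # [4, 4, 2]
```

Each round trip is bounded, so no single query has to materialize the whole log.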
d
I like it. Paginating could be a smart way to limit the number of logs returned. I'm thinking we should order and pack the rows by timestamp, establishing a limit on the number of rows (maybe configurable by flags). A forensics tool would want a lot of logs and could iterate; in the case of continuous queries, I don't know whether this could cause the watchdog to raise alarms. However, an aggregation tool may get the logs via a scheduled query, so we must guarantee by design that no logs are lost; it must get all the logs that have been generated since the last query.
m
There's a
LIMIT
in SQL but I'm not sure if it's applied before osquery gathers the data, or after. If after, then osquery still tries to get all data first, which is what we're trying to avoid. I like the idea of a new CLI flag. This could be in a blueprint issue, as it probably applies equally to other types of log fetch queries.
s
yeah
LIMIT
applies after
it only works to stop further work when a
JOIN
is involved, because then sqlite will call the table logic multiple times
so the results are given incrementally
That being said, I would think there's a way to have sqlite communicate that limit to the table, and then the table can respect it
I would maybe go that route instead of a flag. That way the limit is more dynamic and can work for multiple tables
It's not as easy as adding a flag though, since it would involve changing some of the internals
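The LIMIT-applies-after behavior can be illustrated with a plain-Python stand-in for a table plugin's generate() (a conceptual sketch, not the osquery internals): all rows are materialized before the limit is applied, so LIMIT saves no work unless the table itself is told about it.

```python
# Count how many rows the "table" actually produces.
calls = {"rows_generated": 0}

def generate_all():
    # Stands in for a table plugin's generate(): it has no idea a
    # LIMIT exists, so it walks the entire log store.
    out = []
    for ts in range(1000):
        calls["rows_generated"] += 1
        out.append({"timestamp": ts})
    return out

def query_with_limit(limit):
    rows = generate_all()  # all the work happens here
    return rows[:limit]    # the limit is applied after the fact

result = query_with_limit(5)
print(len(result), calls["rows_generated"])  # 5 1000

# If the limit were pushed down to the table (the hypothetical change
# being discussed), generation could stop early:
def generate_bounded(limit):
    out = []
    for ts in range(1000):
        out.append({"timestamp": ts})
        if len(out) == limit:
            break
    return out

print(len(generate_bounded(5)))  # 5
```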
d
So we are expecting the user to query the table like this: select * from unified_log where timestamp >= T0 and timestamp < T1 limit L
In the case of scheduled queries, is it up to osquery to ensure that no logs are lost, or is it up to the user?
TODO:
0. Investigate whether there is a way to apply the limit constraint before gathering all the rows. If not, shall we implement it in a new blueprint?
1. Add required=True to the timestamp column.
2. Is it possible to set a required limit? If not, is it interesting enough to create a blueprint?
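What required=True on the timestamp column amounts to can be sketched in plain Python (a hypothetical stand-in for the osquery table API, not the real C++ interface): the table refuses to produce rows unless the query supplied a timestamp constraint.

```python
# Mock log store: 100 entries with timestamps 0..99.
LOG = [{"timestamp": t, "message": f"entry {t}"} for t in range(100)]

def generate(constraints):
    # constraints: e.g. {"timestamp": (t0, t1)}. With a required
    # column, a query with no timestamp bound is rejected up front
    # instead of scanning the whole log.
    if "timestamp" not in constraints:
        raise ValueError("timestamp constraint is required")
    t0, t1 = constraints["timestamp"]
    return [row for row in LOG if t0 <= row["timestamp"] < t1]

rows = generate({"timestamp": (10, 20)})
print(len(rows))  # 10
```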
s
“In the case of scheduled queries, is it up to osquery to ensure that no logs are lost, or is it up to the user?” I think it depends on what you mean here and on what the query is. Not knowing, I would say both in some cases, and you could argue osquery only in others.
What I mean is: first of all, remember that table data is generated on the fly. When a scheduled query runs, it calls the table logic, which starts collecting data from the system into memory. If osquery crashes or gets killed, the collected data is lost, but the data in the system might still be there and unchanged. Obviously it's not guaranteed to be unchanged, so the next query may not get the same data again. After the table logic has run, though, the results are either written immediately to local log files and/or written/buffered into RocksDB, so that they can be sent via TLS and are not lost if the TLS service is temporarily unavailable or if osquery crashes.
Now, going back to the unified_log query: which logs are extracted depends on the query itself, not on osquery. I see now that you previously stated “so we must guarantee by design no logs are lost, so it must get all the logs that have been generated since the last query.” There's no such functionality in osquery. In that specific case it's up to the user to provide timestamps that encompass the missing time range, which also means a scheduled query is not exactly ideal because the query is fixed. You could express the time range as relative to the current time, so if the scheduled query runs every 30 minutes, you get 1 hour's worth of logs. Though if anything happens to osquery for more than 30 minutes and it can't extract the logs, you will have a hole.
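The overlapping-window suggestion at the end can be sketched like this (a 30-minute schedule with a 60-minute lookback, in plain Python; just to show that consecutive windows overlap, so a single delayed run leaves no hole):

```python
from datetime import datetime, timedelta

def window(run_time, lookback=timedelta(minutes=60)):
    # Each run fetches the last `lookback` worth of logs,
    # relative to the time the query runs.
    return (run_time - lookback, run_time)

# Three runs on a 30-minute schedule: 12:00, 12:30, 13:00.
runs = [datetime(2022, 1, 1, 12, 0) + timedelta(minutes=30) * i for i in range(3)]
windows = [window(r) for r in runs]

# Each window starts before the previous one ends, so coverage is
# continuous; a delay longer than 30 minutes would open a hole.
overlaps = all(later[0] < earlier[1] for earlier, later in zip(windows, windows[1:]))
print(overlaps)  # True
```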
d
ok, I understand, osquery is not responsible for that. Thank you for the clarification
Regarding the optimization of the LIMIT clause, I've tried to get the limit value from the context, but the context does not contain it. The context definition: https://github.com/osquery/osquery/blob/5b68086569d325c0ef776038ea3caf475519f821/osquery/core/tables.h#L451
I've traced the steps of a single query, and I think the context is filled in by sqlite3.
It would be great if we could provide the table with the limit clause so it can perform deeper optimization, not only for this topic in particular but for everything that requires it. I imagine this is a completely different blueprint...
s
Yeah, when I mentioned it I was actually supposing that sqlite exposed that and we were just not using or properly receiving it. I took a quick look yesterday and couldn't find a way to do it ^^'.
It seems it's not exposed
d
It seems we won't be able to do it dynamically 😞 Shall we explore the possibility of setting a limit in the configuration?
I added a blueprint with an idea to limit the number of rows while also providing a method for aggregators to get all the information when scheduling a query. Please review and comment: https://github.com/osquery/osquery/issues/7591
I created a pull request, but I'm currently working on the CLA with my organization, which will take some days to resolve
🍻 1
👌 1
a
Thank you for going the distance with it!
❤️ 1
m
Added it to the 5.4 milestone
🎉 1
d
Where should I document this new pseudo-event mechanism? Would it be okay if I wrote some lines in the table description? I mean, in the schema