#macos

Juan Alvarez

06/04/2021, 4:46 PM
Hi all, I would like to check the status of the work being done in https://github.com/osquery/osquery/pull/6904. Is there any chance of seeing this happen in the short term, or is it just waiting for somebody with time to make the final changes?

Mike Myers

06/04/2021, 6:25 PM
yea it's lacking someone to carry it over the finish line

Juan Alvarez

06/04/2021, 6:42 PM
I see, thanks for the answer! It looks so close! 😄

allister

06/07/2021, 12:48 AM

seph

06/08/2021, 8:13 PM
That macadmins extension is an exec. Might be okay but be cautious there.

allister

06/09/2021, 12:45 AM
I am not certain of the downside of an exec in general. Yes, we've discussed pseudo 'core tenets' of osquery being "performance as a feature" and "never shell out", and there is a non-zero 'impedance/context switch' cost, but what brings to mind the need for caution?

seph

06/09/2021, 1:45 AM
Philosophy aside, it can have high resource utilization and unexpected failure modes. So it’s something to be wary of, but sometimes one pragmatically accepts it.
1:46 AM
Launcher tries to avoid execing, but there’s a wealth of info, which we need to exec to get. So we do. Just gotta be aware of the risks

puffycid

06/23/2021, 3:42 AM
I can take a stab at finishing this, if this feature is still wanted. I may make a few changes though 😄

allister

06/23/2021, 4:43 AM
Considering TOB removed their version a while back, it would be great to get this finished and accepted in core, thanks!

seph

06/23/2021, 3:38 PM
It would be great if someone picked up PR 6904. I’m not sure where it was left.
3:39 PM
There was never an extension or osquery release with the functionality in that PR.

puffycid

08/03/2021, 5:49 PM
This was brought up in the office hours today. I have this fixed, I think. It's a small change and should address all comments. I also have tests for it. I was able to run it multiple times on all 20 million+ log entries on my Mac with no errors. I'm hoping to comment/offer fixes/suggestions this week/weekend, unless someone else already fixed this. Then no worries.

seph

08/03/2021, 6:05 PM
@puffycid Are you saying you have a current version of https://github.com/osquery/osquery/pull/6904 or that you have a current version of something exec based?
6:05 PM
If you have a current version of #6904, I’d love to see a PR!

puffycid

08/03/2021, 6:10 PM
It fixes the API version of #6904. No exec usage. It's a small ~10 line change. I'm hoping to comment/suggest changes later this week, unless a new PR is preferred? I've never edited an existing PR before.
6:10 PM
Whatever is easier

seph

08/03/2021, 6:10 PM
New PR seems easier?
6:11 PM
Editing existing ones is hard.

puffycid

08/16/2021, 2:25 AM
I pushed a PR adding this to osquery. It should address all issues/comments in the original PR 6904. Sorry for the delay (work/life stuff). I should have much more availability now and can quickly address any issues/comments on the PR I've opened (in addition to others I opened). Also, as an FYI/shameless plug, I wanted to mention that I am also trying to implement raw UAL parsing support in osquery at https://github.com/puffyCid/osquery/tree/unifiedlogs. The benefit of raw parsing is that it will support older macOS versions and it is slightly more forensics-focused (raw parsing vs. API). It's ~60% done and still needs lots of work, but I wanted to mention it in case others want to monitor/contribute/comment. I'm slowly chipping away at it between other osquery features I'm working on.

Juan Alvarez

10/29/2021, 10:35 AM
Hi all, I wanted to check on https://github.com/osquery/osquery/pull/7259. Since I see it is pending review, is there any chance this is coming in the next osquery version? Thanks!

Mike Myers

10/29/2021, 7:21 PM
Hi Juan -- that one has been reviewed but the feedback from the TSC has been: as implemented, this table allows a user to select * from log in a way that might cause osquery to run too long / collect too much / get its worker process terminated by its watchdog process. I think the consensus was that we needed some way to prevent this. I'm just relaying what the conversation was and the status of that.

Daniel Bretón Suárez

05/03/2022, 3:25 PM
Puffycid asked how to implement the WHERE clause here: https://osquery.slack.com/archives/CP2RAJMU3/p1638367793007900?thread_ts=1634774133.001500&cid=CP2RAJMU3
The bad news is that Puffycid is not around anymore (the Slack account is disabled for some reason). I will happily finish the task (which is VERY close), but I don't want to disrespect anyone's work and I don't know how to reach Puffycid. Any ideas? Regarding the task itself, I would require filtering by timestamp. Requiring a filter on level (severity) could be a good idea too. Note that error and fault messages have additional info attached, whereas default and info messages are probably the most common, so this way we don't mix large messages with lots of rows.

Mike Myers

05/03/2022, 8:58 PM
Querying a log filtered by timestamp is a good way to bound the query, but what if it were also paginated (you get up to some # of log entries)? I am just thinking of how many REST APIs work, where a service does not return everything all at once. https://www.moesif.com/blog/technical/api-design/REST-API-Design-Filtering-Sorting-and-Pagination/

Daniel Bretón Suárez

05/04/2022, 3:42 PM
I like it. Paginating could be a smart way to limit the number of logs returned. I'm thinking that we should order and pack the rows by timestamp, establishing a row limit (maybe configurable by flags). A forensic tool would want to get a lot of logs and could iterate. In the case of continuous queries, I don't know if this could cause the watchdog to raise the alarm. However, an aggregation tool may get the logs within a scheduled query so we must guarantee by design no logs are lost, so it must get all the logs that have been generated since the last query.

Mike Myers

05/04/2022, 4:01 PM
There's a LIMIT in SQL, but I'm not sure if it's applied before osquery gathers the data, or after. If after, then osquery still tries to get all data first, which is what we're trying to avoid. I like the idea of a new CLI flag. This could be in a blueprint issue, as it probably applies equally to other types of log fetch queries.

Stefano Bonicatti

05/04/2022, 6:45 PM
yeah LIMIT applies after
6:46 PM
it only works to stop further work when a JOIN is involved, because then sqlite will call the table logic multiple times
6:46 PM
so the results are given incrementally
6:48 PM
that being said, I would think that there’s a way to have sqlite communicate that limit to the table, and then the table can respect it
6:48 PM
I would go that route maybe, instead of a flag. That way the limit is more dynamic and can work for multiple tables
6:49 PM
not as easy as adding a flag though, since it would involve changing some of the internals
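The point above, that LIMIT is applied only after the table has produced its rows, can be illustrated with a small Python sketch. This is a toy model and not osquery or sqlite code: `FakeLogTable`, `query_limit_after`, and `query_limit_pushed_down` are hypothetical names invented for the illustration, and the "pushed down" variant only models the idea of the table receiving the limit, which the thread does not confirm is possible.

```python
from typing import Dict, List, Optional

Row = Dict[str, int]


class FakeLogTable:
    """Hypothetical stand-in for a table plugin; not osquery code."""

    def __init__(self, total_entries: int) -> None:
        self.total_entries = total_entries
        self.rows_generated = 0  # how many rows the table had to build

    def generate(self, limit: Optional[int] = None) -> List[Row]:
        # Without a limit the table walks every entry it has.
        # The optional `limit` models a table that is told the limit up front.
        count = self.total_entries if limit is None else min(limit, self.total_entries)
        rows = [{"timestamp": i} for i in range(count)]
        self.rows_generated += len(rows)
        return rows


def query_limit_after(table: FakeLogTable, limit: int) -> List[Row]:
    # Behaviour described in the thread: gather everything, then truncate.
    return table.generate()[:limit]


def query_limit_pushed_down(table: FakeLogTable, limit: int) -> List[Row]:
    # Hypothetical alternative: the limit reaches the table before it collects data.
    return table.generate(limit=limit)


eager = FakeLogTable(total_entries=100_000)
pushed = FakeLogTable(total_entries=100_000)
rows_a = query_limit_after(eager, 10)
rows_b = query_limit_pushed_down(pushed, 10)

# Both return the same 10 rows, but the amount of work differs.
print(len(rows_a), eager.rows_generated)
print(len(rows_b), pushed.rows_generated)
```

In this model both calls return the same rows; only the `rows_generated` counter shows how much data the table had to collect, which is the cost the watchdog concern is about.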

Daniel Bretón Suárez

05/05/2022, 7:40 AM
So we are expecting the user to query the table like this:
select * from unified_log where timestamp >= T0 and timestamp < T1 limit L
In the case of scheduled queries, is it up to osquery to ensure that no logs are lost or is it up to the user?
TODO:
0. Investigate if there is a way to apply the limit constraint before getting all the rows. If not, shall we implement it in a new blueprint?
1. Add required=True to the timestamp column.
2. Is it possible to set a required limit? If not, is it interesting enough to create a blueprint?
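As a rough sketch of how a client could page through a bounded time range with the query shape above, the following uses Python's built-in sqlite3 module with an ordinary in-memory table standing in for unified_log. The table contents, the `message` column, the `pages` helper, and the added `order by timestamp` are assumptions made for the illustration. It also assumes timestamps are unique; real log entries can share a timestamp, in which case a tie-breaker would be needed to avoid skipping rows.

```python
import sqlite3
from typing import Iterator, List, Tuple

# Same shape as the query in the message, plus an ORDER BY so pages are stable.
PAGE_QUERY = (
    "select timestamp, message from unified_log "
    "where timestamp >= ? and timestamp < ? "
    "order by timestamp limit ?"
)


def pages(
    conn: sqlite3.Connection, t0: int, t1: int, limit: int
) -> Iterator[List[Tuple[int, str]]]:
    """Yield pages of at most `limit` rows covering [t0, t1)."""
    cursor = t0
    while cursor < t1:
        rows = conn.execute(PAGE_QUERY, (cursor, t1, limit)).fetchall()
        if not rows:
            break
        yield rows
        # Assumes unique timestamps: resume just past the last row returned.
        cursor = rows[-1][0] + 1


# Toy data: an ordinary table standing in for the real unified_log table.
conn = sqlite3.connect(":memory:")
conn.execute("create table unified_log (timestamp integer, message text)")
conn.executemany(
    "insert into unified_log values (?, ?)",
    [(t, f"entry {t}") for t in range(100, 125)],
)

all_pages = list(pages(conn, 100, 120, 8))
print([len(p) for p in all_pages])  # [8, 8, 4]
```

The 20 entries in the range [100, 120) come back as pages of 8, 8, and 4 rows, and entries at 120 and later are left for a later range.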

Stefano Bonicatti

05/05/2022, 8:27 AM
“In the case of scheduled queries, is it up to osquery to ensure that no logs are lost or is it up to the user?” I think it depends on what you mean here and what the query is. Not knowing, I would say both in some cases, and you could argue osquery only in some others. What I mean is, first of all, remember that table data is generated on the fly. When a scheduled query runs, it calls the table logic, which starts collecting data from the system into memory. If osquery crashes or gets killed, the collected data is lost, but the data in the system might still be there and unchanged. Obviously it’s not guaranteed that it’s unchanged and that a later query will get the same data again. After the table logic has run, though, the results are written immediately to local log files and/or written/buffered into RocksDB, so that they can be sent via TLS and are not lost if the TLS service is temporarily unavailable or if osquery crashes.
Now going back to the unified_log query: which logs are extracted depends on the query itself, not on osquery. I see now that previously you stated “so we must guarantee by design no logs are lost, so it must get all the logs that have been generated since the last query.” There’s no such functionality in osquery. In that specific case it is up to the user to provide timestamps that cover the missing time range, which also means that a scheduled query is not exactly ideal because the query is fixed. You could express the time range relative to the current time, so that if the scheduled query runs every 30 minutes, you get 1 hour's worth of logs each time. Though if anything happens to osquery for more than 30 minutes and it can’t extract the logs, you will have a hole.
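The overlapping-window arithmetic described above can be checked with a short Python sketch. It is only a model of the reasoning, not osquery behaviour: `find_holes` is a hypothetical helper, times are plain minute counts, and each successful run is assumed to cover the half-open range [run time - window, run time).

```python
from typing import List, Optional, Tuple


def find_holes(run_times: List[int], window: int) -> List[Tuple[int, int]]:
    """Return uncovered (start, end) ranges between successful runs.

    Each run at time t is assumed to collect logs from [t - window, t).
    """
    holes: List[Tuple[int, int]] = []
    covered_until: Optional[int] = None
    for t in sorted(run_times):
        start = t - window
        if covered_until is not None and start > covered_until:
            # This run's window begins after the previous coverage ended.
            holes.append((covered_until, start))
        covered_until = t if covered_until is None else max(covered_until, t)
    return holes


# Runs every 30 minutes with a 60-minute window: no holes.
print(find_holes([0, 30, 60, 90], window=60))

# One missed run (60): the 60-minute window still reaches back to 30.
print(find_holes([0, 30, 90], window=60))

# Two missed runs (60 and 90): minutes 30 to 60 are never collected.
print(find_holes([0, 30, 120, 150], window=60))
```

With a 30-minute schedule and a 60-minute window there are 30 minutes of slack, so a single missed run is absorbed, while a longer outage leaves a range that no run ever queries.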

Daniel Bretón Suárez

05/05/2022, 2:22 PM
ok, I understand, osquery is not responsible for that. Thank you for the clarification
2:43 PM
Regarding the optimization of the LIMIT clause, I've tried to get the limit value from the context, but the context does not contain it. The context definition: https://github.com/osquery/osquery/blob/5b68086569d325c0ef776038ea3caf475519f821/osquery/core/tables.h#L451
2:45 PM
I've followed the steps of a single query and I think the context is filled in by sqlite3.
10:52 AM
It would be great if we could provide the table with the limit clause so it can perform a deeper optimization, not only for this topic in particular but for all that require it. I imagine this is a completely different blueprint...

Stefano Bonicatti

05/06/2022, 10:55 AM
Yeah, when I mentioned it I was actually supposing that sqlite was exposing that but we were not using it or properly receiving it. I had a quick look yesterday and I couldn’t find a way to do it ^^'.
10:55 AM
It seems it’s not exposed

Daniel Bretón Suárez

05/09/2022, 8:54 AM
It seems we won't be able to do it dynamically 😞 Shall we explore the possibility of setting a limit in the configuration?
10:50 AM
I added a blueprint with an idea to limit the number of rows, but also provide a method for aggregators to get all the information when scheduling a query. Please, review and comment https://github.com/osquery/osquery/issues/7591
9:42 AM
I created a pull request, but I'm currently working on the CLA with my organization, which will take some days to resolve

allister

05/20/2022, 9:42 AM
Thank you for going the distance with it!

Daniel Bretón Suárez

05/23/2022, 10:36 AM

Mike Myers

05/24/2022, 5:12 PM
Added it to the 5.4 milestone

Daniel Bretón Suárez

05/26/2022, 12:52 PM
Where should I document this new pseudo-event mechanism? Would it be ok if I wrote some lines in the table description? I mean, in the schema