#macos

nyanshak

04/30/2020, 9:32 PM
👋 Hey - the OpenBSM audit system is crashing for some users in my fleet on macOS 10.15.4 (reproduced on various 4.x.y osquery versions up to 4.3.0). I can detect this problem by looking for <some_timestamp>.crash_recovery log files in /var/audit. When the audit system crashes, osquery stops receiving events from the process_events table. After a reboot, process_events starts flowing again, since the audit subsystem is restarted.
1. (temporary fix) Is there a way to make the audit subsystem recover without rebooting the machine? The audit man page (man audit) suggests that sudo audit -i should reinitialize the system. However, running it doesn't clear out the crash_recovery file, and process_events still aren't processed afterwards, even after restarting osquery.
2. (troubleshooting) Are there any good tools that can parse the binary audit log files? I'm trying to find meaningful leads on why it crashed.
3. Has anyone else run into this, and do you have any suggestions?
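A minimal Python sketch of the detection step described above, assuming the crash markers are regular files in /var/audit whose names end in .crash_recovery. The function name find_crash_recovery_files is hypothetical, and on a real Mac the directory is normally readable only by root, so the script would need to run with sudo.

```python
from pathlib import Path


def find_crash_recovery_files(audit_dir="/var/audit"):
    """Return the sorted *.crash_recovery files found directly in audit_dir.

    Returns an empty list if the directory does not exist. A PermissionError
    is deliberately not caught, so an unprivileged run fails loudly instead
    of silently reporting "no crashes".
    """
    directory = Path(audit_dir)
    if not directory.is_dir():
        return []
    return sorted(
        entry
        for entry in directory.iterdir()
        if entry.is_file() and entry.name.endswith(".crash_recovery")
    )
```

Calling find_crash_recovery_files() on a host and checking whether the result is non-empty gives the same signal as looking in the folder by hand; the length of the list is the number of crash files on that host.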
9:50 PM
2) On macOS, praudit can be used to view them. 🙂 So hey, that answers that question. If anyone has a Linux version or knows how to read them on Linux, that would also be helpful.
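For scripting this, a rough Python sketch that shells out to praudit. It assumes praudit is on the PATH and that its -x option (XML output) is available, as it is in the OpenBSM version shipped with macOS. The helper names praudit_command and dump_trail are made up for illustration.

```python
import subprocess


def praudit_command(trail_path, xml=True):
    """Build the praudit argument list for one audit trail file."""
    cmd = ["praudit"]
    if xml:
        cmd.append("-x")  # ask praudit for XML instead of its default text format
    cmd.append(str(trail_path))
    return cmd


def dump_trail(trail_path):
    """Run praudit on trail_path and return its output as a string.

    Raises subprocess.CalledProcessError if praudit exits non-zero, and
    FileNotFoundError if praudit is not installed.
    """
    result = subprocess.run(
        praudit_command(trail_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Reading files under /var/audit generally requires root, so dump_trail would need to run with sudo on a real host.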

theopolis

05/01/2020, 12:42 AM
Interesting, so is it osquery’s usage of OpenBSM that influences the crash? Do you know if it then crashes for every process using audit/does it stop the logging on the system or just for osquery?

terracatta

05/01/2020, 12:45 AM
Here is an additional data point. I currently run osquery with the disable_audit flag set to true, and I have many crash files in my /var/audit folder just from this month
12:51 AM
My audit_control file also has not been modified since the OS was first installed earlier this year

billcobbler

05/01/2020, 2:09 AM
We're not actually able to determine whether osquery's usage is the cause. The data in the crash_recovery files does not indicate what caused the crash, just that a crash happened and that it recovered. In the failure state, events continue to be written to the .not_terminated file, but log volume is severely reduced and contains only these event types:
• SecSrvr AuthEngine
• user authentication
Examples of those two events, with user info redacted:
<record version="11" event="user authentication" modifier="0" time="Thu Apr 30 16:28:19 2020" msec=" + 158 msec" >
<subject audit-uid="502" uid="502" gid="20" ruid="502" rgid="20" pid="11031" sid="100011" tid="2686386 0.0.0.0" />
<text>Verify password for record type Users &apos;user1&apos; node &apos;/Local/Default&apos;</text>
<return errval="failure: Unknown error: 255" retval="5000" />
<identity signer-type="1" signing-id="com.apple.opendirectoryd" signing-id-truncated="no" team-id="" team-id-truncated="no" cdhash="0x1f5920de3532b6fae4f8050f2c7f507b5bbe838a" />
</record>

<record version="11" event="SecSrvr AuthEngine" modifier="0" time="Thu Apr 30 17:22:08 2020" msec=" + 661 msec" >
<subject audit-uid="-1" uid="0" gid="0" ruid="0" rgid="0" pid="16775" sid="100000" tid="2701830 0.0.0.0" />
<text>begin evaluation</text>
<return errval="success" retval="0" />
<identity signer-type="1" signing-id="com.apple.authd" signing-id-truncated="no" team-id="" team-id-truncated="no" cdhash="0xda52fe385f41ebc0f7fb14140bea0dfc97ac5644" />
</record>
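
To check a trail for this reduced state in bulk, here is a small Python sketch that counts event types in XML like the records above. It assumes the input is a bare sequence of <record> elements (no XML declaration or enclosing root, which full praudit -x output may include and which would need stripping first). The names count_events and looks_reduced are hypothetical, and the two-event set is taken only from the observation in this thread.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Event types reported in this thread as the only ones seen after a crash.
REDUCED_STATE_EVENTS = {"SecSrvr AuthEngine", "user authentication"}


def count_events(records_xml):
    """Count <record> elements by their event attribute.

    records_xml is a string holding one or more <record> elements; it is
    wrapped in a synthetic root so ElementTree can parse it as one document.
    """
    root = ET.fromstring(f"<records>{records_xml}</records>")
    return Counter(record.get("event") for record in root.iter("record"))


def looks_reduced(counts):
    """True if there is at least one event and all are in the reduced set."""
    return bool(counts) and set(counts) <= REDUCED_STATE_EVENTS
```

A trail that still contains process execution events would make looks_reduced return False, which is a cheap way to separate healthy hosts from ones stuck in the failure state.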

nyanshak

05/01/2020, 2:51 PM
I'm not saying it's osquery's fault per se. I just am not that familiar with macOS internals and struggling to understand how to dig into it properly. And I hoped maybe there was someone around with a bit deeper expertise in this area. But regardless of it being caused by osquery or not, it certainly affects osquery and our usage of it a fair bit. I actually have seen it on other macOS versions, but the majority of our mac fleet is on 10.15.4. Seen (in decreasing order of frequency on our fleet): 10.15.4, 10.14.6, 10.15.3, 10.15.5, 10.13.6, 10.14.3, 10.15.1, 10.15.2 But also like... the vast majority of our fleet is on 10.15.4 and 10.14.6 so it's hard to say if it shows up on all versions and if so, if they're the same cause...
7:38 PM
https://github.com/osquery/osquery/issues/6431 Raised this for tracking purposes
11:11 PM
@terracatta - do you have any other process running that would read the audit socket that might trigger this crash? Trying to get more info about what's going on 🤷

terracatta

05/06/2020, 11:59 PM
@nyanshak I just went over to my wife's iMac, which is only used for web browsing and has a few games and Adobe programs on it, and it has about 15 crash reports in it
11:59 PM
never has run osquery or any other security software

nyanshak

05/07/2020, 2:30 AM
When you say "15 crash reports", what do you mean? Like /var/audit/*.crash_recovery files? Or something else?

terracatta

05/07/2020, 2:34 AM
Like /var/audit/*.crash_recovery, yes

nyanshak

05/07/2020, 2:35 AM
Oh interesting, the hosts I've checked have each had only one crash file. I didn't check all my hosts though