#general
Dan Achin

12/03/2020, 9:10 PM
Hello all, I'm wondering if anything is logged regarding the specific metrics tripped when a query is denylisted. We wrote some intensive queries (hashing files recursively) specifically to get them denylisted so that we could make sure we were catching those in our Splunk env. Now that we have queries getting denied, I want to understand specifically why they were denied, and if that is logged somewhere that's going to be the quickest way. Or is the recommended way to do this to run the profiling script?
zwass

12/03/2020, 10:51 PM
Schedule a query (or live query) to the osquery_schedule table and look at the denylisted column. Though I do see you say "why they were denied" and I don't have a good answer for that.
Dan Achin

12/03/2020, 11:35 PM
@zwass - that's how we are finding them, we have a pack that runs
SELECT * FROM osquery_schedule WHERE denylisted='1';
11:37 PM
and we have all of this in Splunk. I can see which hosts have denied queries and which queries were denied. I was just hoping something might get logged by the watchdog that said 'denied, over RAM allocation' or something like that
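One way to get closer to the "why" from that same pack is to pull the per-query performance counters alongside the denylisted flag. A hedged sketch; the column names (average_memory, user_time, system_time) come from the osquery_schedule table schema and the units are worth verifying against your osquery version:

```sql
-- Hypothetical extension of the pack query above: surface the counters
-- the watchdog cares about, so Splunk can show roughly how much CPU and
-- memory each denied query was consuming before it got denylisted.
SELECT
  name,
  query,
  executions,
  denylisted,
  user_time,       -- cumulative user-space CPU time
  system_time,     -- cumulative kernel CPU time
  average_memory   -- average memory used per execution
FROM osquery_schedule
WHERE denylisted = '1';
```

This doesn't log the watchdog's exact reason, but a denylisted query with a large average_memory versus one with large CPU times at least hints at which limit was tripped.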
zwass

12/04/2020, 12:01 AM
There is some logging around that (https://github.com/osquery/osquery/blob/master/osquery/core/watcher.cpp#L527-L537) but I'm not sure it will tell you which query was executing.
Dan Achin

12/04/2020, 12:06 AM
Cool, thanks for that.
12:10 AM
We really wanted an easy way to tell whether a denylisted query exceeded its CPU or RAM allocation. We might have to start playing around with the profiler script, since we aren't even logging locally; we're currently only using the TLS logger. Or we could add the filesystem logger plugin, I guess... need to discuss with the team. Thanks again
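For reference, the allocations the watchdog enforces are controlled by flags along these lines. A sketch of an osquery flags file; the values shown are illustrative, so check the defaults for your osquery version:

```
# osquery.flags -- watchdog tuning (values here are illustrative, not defaults to copy)
--watchdog_level=0               # 0 = normal, 1 = restrictive, -1 = disable the watchdog
--watchdog_memory_limit=350      # worker resident memory ceiling, in MB
--watchdog_utilization_limit=10  # worker CPU utilization limit
```

Knowing which thresholds are in effect makes it easier to interpret the counters from osquery_schedule when a query gets denied.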
12:41 AM
Actually, @zwass - can you confirm whether that watchdog logging should also be sent via the TLS logger, or would it only be written if the filesystem logger were enabled?
zwass

12/04/2020, 1:23 AM
I do believe it should go to TLS but haven't looked at that in a while. Let me know if you can confirm.
Dan Achin

12/04/2020, 5:56 PM
@zwass, will do. We definitely haven't seen it in Splunk, but let me look directly in Fleet logs a bit later
zwass

12/04/2020, 5:57 PM
This would be in the "status" logs. Are you also forwarding those to Splunk?
Dan Achin

12/04/2020, 6:04 PM
yep
6:04 PM
and I looked at status logs for the hostnames / uuids with denied queries and didn't see anything, but maybe we have a field we need to extract
6:30 PM
well, I'm not finding anything in the logs when I grep for deny, exceeded, or limits
zwass

12/04/2020, 6:59 PM
Hmm, perhaps I'm wrong on that front. I could easily get nerd-sniped into digging into this for the next few hours, but I have other things I gotta get to!
Dan Achin

12/04/2020, 7:03 PM
haha. well, I see nothing in status.log about it
7:04 PM
i think we'll enable the FS plugin on a couple hosts and see if we can find it there
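For those test hosts, that could look something like this in the flags file. A sketch; the path is illustrative, and --logger_plugin takes a comma-separated list so the existing TLS forwarding should keep working:

```
# osquery.flags -- add the filesystem logger alongside TLS (illustrative)
--logger_plugin=tls,filesystem
--logger_path=/var/log/osquery   # results and INFO/WARNING/ERROR files land here
```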
8:22 PM
@zwass - I'm thinking that making sure this logging is written via the TLS logger could be considered an issue, or at least an enhancement. Do you think it would get any traction if I opened a GitHub issue on it?
zwass

12/04/2020, 8:23 PM
Yeah I think documenting it in an issue makes sense. As far as traction, not sure.
Dan Achin

12/04/2020, 8:52 PM
🙂
8:52 PM
ok, thanks
10:58 PM
@zwass, just thought I'd share what we found when looking into this deeper. I had one of our guys who knows C/C++ take a look at the code, and it turns out that it does a syslog() and then immediately makes a WARNING call to the osquery logging system. So even though we haven't quite figured out why we don't see these in the /var/log/osquery/osquery.WARNING logs on fleet, we can still find the information we need in /var/log/messages on the hosts that have denylisted queries, so that's something. Once we enable the filesystem plugin on some test hosts, I'm guessing we'll start to see the data in the locally generated osquery.WARNING files, though I still think there needs to be a way to get these written to the TLS logger as well.
zwass

12/13/2020, 6:02 PM
Thank you for the update. I imagine this has something to do with the RocksDB connection. Maybe the worker is holding that connection and the watcher is the one logging the reason?
Dan Achin

12/14/2020, 5:44 PM
yes, I think it's definitely the parent process logging it (i.e. the watcher)
8:17 PM
also, unfortunately, this is right in the code:
// Since the watchdog cannot use the logger plugin the error message
// should be logged to stderr and to the system log.
Looks like we might not be able to use TLS to capture this, but I'm thinking the syslog plugin could be an option, though that's going to increase our Splunk usage a lot.
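If we go the syslog route, enabling it would presumably just be another entry in the logger list. A sketch, with the same caveat about verifying flag behavior on our osquery version:

```
# osquery.flags -- also route osquery logs through syslog (illustrative)
--logger_plugin=tls,syslog
```

The trade-off is volume: everything routed through syslog ends up in whatever is already shipping /var/log/messages to Splunk, hence the usage concern.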
6:10 PM
@zwass - looks like I'll probably have to change my GitHub issue to ask whether the watcher can be updated to write to logger plugins.