#kolide
defensivedepth

08/06/2020, 4:30 PM
Fleet 2.6.0 / Osquery 4.4.0 on W10 with Launcher v0.11.11. I have a scheduled query for windows_events, but am getting the following errors in the W10 application logs:
caller=level.go:63 level=info caller=extension.go:494 err="sending string logs: writing logs: transport error sending logs: rpc error: code = Internal desc = grpc: error while marshaling: proto: field \"kolide.agent.LogCollection.Log.Data\" contains invalid UTF-8"
Any thoughts as to where to look next?
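(Editor's note: the error says the protobuf string field kolide.agent.LogCollection.Log.Data was handed bytes that are not valid UTF-8, so marshaling fails before anything is sent. A minimal Go sketch of how one might spot such log lines follows. The function name and sample data are hypothetical and are not taken from launcher.)

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// invalidLogIndexes is a hypothetical helper: it returns the positions of
// log lines that are not valid UTF-8 and would therefore be rejected when
// placed into a protobuf string field.
func invalidLogIndexes(logs []string) []int {
	var bad []int
	for i, line := range logs {
		if !utf8.ValidString(line) {
			bad = append(bad, i)
		}
	}
	return bad
}

func main() {
	// Made-up sample data; the second line contains a stray 0xff byte.
	logs := []string{
		`{"name":"windows_events","data":"ok"}`,
		"{\"name\":\"windows_events\",\"data\":\"bad \xff byte\"}",
	}
	fmt.Println(invalidLogIndexes(logs)) // prints [1]
}
```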
zwass

08/06/2020, 4:32 PM
@seph is this the same UTF8 bug we've seen occasionally over quite some time?
4:32 PM
IIRC it seemed to be that osquery outputs invalid UTF8
defensivedepth

08/06/2020, 4:33 PM
I wasn't sure if this error was related to not getting windows_events logs or something else...
zwass

08/06/2020, 4:36 PM
I think it could definitely be related to that
seph

08/06/2020, 5:17 PM
There are a handful of utf8 issues in this ecosystem. Osquery sometimes screws up the encoding. This may be a rocksdb issue. There is some info about it at https://github.com/kolide/launcher/pull/481 and some more at https://github.com/osquery/osquery/issues/5288
5:17 PM
There is also another chunk of Windows string encoding issues. The Windows string functions have been slowly getting reworked.
5:18 PM
I haven’t looked at this in a while though
defensivedepth

08/06/2020, 5:23 PM
Looks like a major PR related to this was merged a couple weeks ago: https://github.com/osquery/osquery/pull/6338
5:24 PM
And this issue looks like what I am running into: https://github.com/kolide/launcher/issues/445
seph

08/06/2020, 5:25 PM
Yeah, that’s the Windows rework I alluded to. (thanks for finding the PR)
5:27 PM
We never really came to a good solution for how to handle bad utf8 that osquery emits
defensivedepth

08/06/2020, 5:28 PM
So that windows rework should make it more unlikely that osquery would emit bad utf8, but not impossible. And when it does, errors like this will show up and the scheduled query will be blocked until the offending log is removed from rocks... Is that accurate?
seph

08/06/2020, 5:30 PM
Probably. I haven’t thought about this in a while, but that sounds right.
5:31 PM
I think either:
* status quo
* launcher reworks the logs
* launcher drops the logs
* server (fleet in this case) accepts them
* server drops them
5:31 PM
Pretty sure the SaaS side accepts them.
5:32 PM
Less sure what the right approach here is. I’d review/accept a PR to launcher to drop a log on marshalling error. Or attempt to repair it. Not sure how trivial that logic would be. Repair is easy, but the conditional on potential failure seems harder.
defensivedepth

08/06/2020, 5:33 PM
From an ops side, I would rather see the server (fleet) accept it, and then let me fix it within the parsing pipeline
seph

08/06/2020, 5:34 PM
Feel free to comment on any of the linked Kolide GitHub issues. This is unlikely to bubble up my list, but the comments would be noted.
defensivedepth

08/06/2020, 5:34 PM
Will do
seph

08/06/2020, 5:34 PM
My guess is having fleet pass it along unmodified is relatively hard. (A bunch of the plumbing there is typed.)
defensivedepth

08/06/2020, 5:35 PM
hmmmmm
seph

08/06/2020, 5:35 PM
Not sure though. I don’t work much on fleet
defensivedepth

08/06/2020, 5:36 PM
@zwass thoughts?
zwass

08/06/2020, 7:18 PM
I don't think the logs ever hit Fleet. It's an issue with gRPC encoding on the client side. See my comment https://github.com/kolide/launcher/issues/445#issuecomment-601867302
seph

08/06/2020, 7:19 PM
Question is what we should do. Launcher could attempt to repair, but we were dicey about that too
zwass

08/06/2020, 7:19 PM
Oh
7:20 PM
Drop the offending character IMO
7:20 PM
Unless someone has a strategy for repairing it.
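(Editor's note: a minimal Go sketch of the "drop the offending character" idea, using the standard library's strings.ToValidUTF8. The helper name and sample string are hypothetical. This is not the code in launcher or in the linked PR.)

```go
package main

import (
	"fmt"
	"strings"
)

// dropInvalidUTF8 is a hypothetical helper that removes byte sequences that
// are not valid UTF-8, leaving the rest of the string untouched.
func dropInvalidUTF8(s string) string {
	// An empty replacement drops the bad bytes entirely. Passing "\uFFFD"
	// instead would leave a visible marker where data was lost.
	return strings.ToValidUTF8(s, "")
}

func main() {
	raw := "event \xff\xfedata" // made-up sample with two invalid bytes
	fmt.Printf("%q\n", dropInvalidUTF8(raw)) // prints "event data"
}
```

Dropping is lossy: the original bytes cannot be recovered downstream, which is one reason a repair step like this was debated rather than applied unconditionally.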
zwass

08/06/2020, 7:21 PM
Heh I voted against it
seph

08/06/2020, 7:22 PM
I know 😛
zwass

08/06/2020, 7:22 PM
I don't think I realized in that context that it prevents logs from sending entirely via Launcher
seph

08/06/2020, 7:22 PM
We might not have realized it at the time.
7:22 PM
Anyhow, that’s the code Nick wrote for herd. (extracted and dropped into launcher)
zwass

08/06/2020, 7:22 PM
Given that information, I'd say go with this strategy but only for logs that fail to send the first time with this error.
seph

08/06/2020, 7:23 PM
Want to comment? Using that error handling seems possible, but I’m not sure how yet. Haven’t read the code with that in mind
7:25 PM
(commented and re-opened)
zwass

08/06/2020, 7:26 PM
Do you prefer my comment on this issue or PR?
seph

08/06/2020, 7:26 PM
Probably not needed. I think I captured this sentiment in the PR comment. If you have anything additional feel free to add it
zwass

08/06/2020, 7:27 PM
Looks good. Thank you
7:27 PM
seph

08/06/2020, 7:28 PM
Unclear. There are probably multiple places that cause this
defensivedepth

08/06/2020, 8:26 PM
So a temp fix is to clear rocks and restart launcher?