I'm trying very hard to understand <https://github...
# general
I'm trying very hard to understand https://github.com/osquery/osquery/pull/6982 and what, exactly, it breaks, and I'm having some trouble.
I think that in <5.0,
select * from augeas where path LIKE '/etc/hosts%';
returns a bunch of stuff but in >=5.0 it's going to return nothing.
I can't see anything else that looks breaking-ish. Anyone have any clues to dole out? 🙂
I am, in particular, deeply confused by this part:
Copy code
       SQL("select * from augeas where path LIKE '/etc/hosts/%'").rows().size(),
AFAICT, that should get converted to
match /files/etc/hosts/*
, which totally works and does not return zero results.
It's always possible that I introduced a bug. I'll try to look harder sometime in the next couple days.
Oh hello PR author. 😄
I don't actually even know if our (internal) customers use LIKEs with augeas; I was just trying to understand the change for producing a report on 5.0 in general.
Okay, I’m trying to remember all the things.
The old version of this table basically ran a
match /files/*
and let sqlite filter the output. This presented two issues. The bigger issue was that there was no way to access things outside the files tree. (eg,
). Second, and smaller, was that the “return everything” approach just feels wrong to me. I think practically speaking the performance was okay, but it feels like there are dragons about. I’d usually rather call underlying APIs narrowly.
Wildcard handling between the two was problematic. And TBH, I’m not sure I got it right.
In augeas, wildcards are tree based.
is a single level, and
is recursive. But in sql, wildcards are simple strings
And there’s some magic in how
is treated.
The end results is, I think: •
is converted to `/files/etc/hosts/%’. So augeas returns data. But sql filters it. (because the augeas return is is missing that trailing slash) •
is converted to
which augeas has no matches for, because it’s a weird postfix search. •
is converted to
which is a full recursion return, and it will get passed back through the sql filter
I don’t think it’s very meaningful to wildcard a file. Wildcarding a directory is more meaningful. Compare
select * from augeas where path LIKE '/etc/%';
select * from augeas where path LIKE '/etc/%%';
Does that help any?
It does, thank you!
Some questions still though. 😄
 is converted to `/files/etc/hosts/%’.
^^ Why doesn't that get converted to
? Like, not "why did you make that decision?" but "where in the code does that happen?".
(because the augeas return is is missing that trailing slash)
^^ I didn't follow that part at alll.
I still don't feel like I have a handle on what's breaking. I understand the changes and why you made them and I think they make sense, but I can't come up with any examples of queries that work in 4.9.0 that'll break in 5.0
Oh, actually, in the PR thread there's:
This seems reasonable but we should mark it as an API change due to change with queries like 
select * from augeas where path LIKE '/etc/hosts%';
, where before this would full-scan and have SQL apply the 
; do I correctly understand that that query currrently returns stuff (which I just checked) but it won't in 5.0 because it gets converted to
match /files/etc/hosts*
If that's the only breaking example we have, that seems like it's no going to come up very much. 🙂 Which is yay.
> /etc/hosts/% is converted to `/files/etc/hosts/%’.
^^ Why doesn’t that get converted to /filles/etc/hosts/*?  Like, not “why did you make that decision?” but “where in the code does that happen?”
Typo, I mean to
And all the conversion is in
This is a specific case of the the only breaking example I know. Namely, I introduced
akin to the existing file pattern as single wild card, vs recursive. Thus breaking a couple of places
So if you did
path like '/etc/hosts/%
, it would map to
might work. (don’t remember). But the data that the table implementation returns would be
path: /etc/hosts
which won’t match the sql expression.