Stefano Bonicatti
10/10/2023, 6:18 PMStefano Bonicatti
10/10/2023, 6:19 PMStefano Bonicatti
10/10/2023, 6:19 PMQuery Planning In SQLite
Stefano Bonicatti
10/10/2023, 6:24 PMStefano Bonicatti
10/10/2023, 6:30 PMconstraint.first
remains the same between multiple constraint
and constraint.second
instead changes)Stefano Bonicatti
10/10/2023, 6:31 PMStefano Bonicatti
10/10/2023, 6:33 PMStefano Bonicatti
10/10/2023, 6:33 PMStefano Bonicatti
10/10/2023, 6:34 PMStefano Bonicatti
10/10/2023, 6:42 PMIN
where it would call the table multiple times for each value in the constraint; but now, only does a single query.Stefano Bonicatti
10/10/2023, 6:43 PMselect pid from processes WHERE name IN ("launchd", "zsh", "whatever");
osquery planner: xBestIndex Evaluating constraints for table: processes [index=0 column=1 term=0 usable=1]
osquery planner: xBestIndex Recording constraint set for table: processes [cost=1000000.000000 size=0 idx=1]
osquery planner: xOpen Opening cursor (1) for table: processes
osquery planner: xFilter Filtering called for table: processes [constraint_count=1 argc=0 idx=1]
osquery planner: Scanning rows for cursor (1)
osquery planner: xFilter processes generate returned row count:710
osquery planner: Closing cursor (1)
+-------+
| pid |
+-------+
| 1 |
| 40429 |
| 41257 |
| 47874 |
| 75073 |
| 75516 |
+-------+
osquery 4.0.2:
select pid from processes WHERE name IN ("launchd", "zsh", "whatever");
osquery planner: Opening cursor (0) for table: processes
osquery planner: Adding constraint to cursor (0): name = launchd
osquery planner: Scanning rows for cursor (0)
osquery planner: Adding constraint to cursor (0): name = whatever
osquery planner: Scanning rows for cursor (0)
osquery planner: Adding constraint to cursor (0): name = zsh
osquery planner: Scanning rows for cursor (0)
osquery planner: Closing cursor (0)
+-------+
| pid |
+-------+
| 1 |
| 40429 |
| 41257 |
| 47874 |
| 75073 |
| 75516 |
+-------+
Stefano Bonicatti
10/10/2023, 6:43 PMScanning rows for cursor
amountsStefano Bonicatti
10/10/2023, 7:05 PMIN
, if you do pass an index column (like pid for processes), it then starts calling xFilter multiple times again.
This is somewhat controlled by this part of code:
https://github.com/osquery/osquery/blob/8e8d596564e981ddac689e2d772f56fdf6ba25ec/osquery/sql/virtual_table.cpp#L764-L774Stefano Bonicatti
10/10/2023, 7:09 PMcontinue
; this is based on the fact that sqlite will always filter again the rows/results you give it. It's what makes things work when our tables do not have any index/primary key that we can query, and they always return all the rows.Stefano Bonicatti
10/10/2023, 7:26 PMIN
case on an index column, you pass 3 constraint values, xFilter will be called 3 times and each time you collect the constraint and don't call the table, only at the last call you pass the entire result.
This though it's tricky obviously and would only work if the constraint set that xFilter is called with contains 1 constraint, unless we know for sure that sqlite will call xFilter multiple times with the same set, and when the set changes, then we will never get the same set again in the same query.
Meaning, if the query doesn't only have a IN, but also an AND, and an OR with another list of constraints.. we want to group all the values from the IN + AND, and then group the constraint after the OR on their own.Stefano Bonicatti
10/10/2023, 7:27 PMStefano Bonicatti
10/10/2023, 8:10 PMStefano Bonicatti
10/10/2023, 8:11 PMThen, under the usual mode of handling IN operators, SQLite generates bytecode that invokes the xFilter() method once for each value on the right-hand side of the IN operator. Thus the virtual table only sees a single value from the right-hand side of the IN operator at a time.
In some cases, however, it would be advantageous for the virtual table to see all values on the right-hand of the IN operator all at once
Stefano Bonicatti
10/10/2023, 8:16 PMIN
constraint causes a single call to xFilter is not because https://osquery.slack.com/archives/C08VA3XQU/p1696964943771509?thread_ts=1696961882.399859&cid=C08VA3XQU causes the xFilter to be called once with all the values of the constraint; no, we cause the constraint to become unusable (because the cost is set to the maximum), and since there aren't any other constraints, then sqlite is forced to call xFilter once, with no constraint, which means the table will return all the rows.
But then when receiving those rows from the xFilter, sqlite will filter them as usual.Stefano Bonicatti
10/11/2023, 12:07 PMsqlite3_vtab_in
or changing how the query is formulated.
It should also be possible to do some internal caching and/or as I was saying yesterday collect all the constraints ourselves, but it's not something directly supported by sqlite; it definitely seems that sqlite expects virtual tables to be fast on their own with retrieving data.seph
seph
IN
, there is still complex boolean logic around AND
and OR
makes me think that we should be playing with some internal cache mechanism instead of trying to adjust the xFilter behavior