# core
p
Hi all👋 I just wanted to bring attention to this issue I opened (browser history support in core) about a week ago: https://github.com/osquery/osquery/issues/7177 There has been a little bit of discussion on the feature. I think the feature would be useful from a forensic perspective. Others mentioned possible privacy concerns. Just wanted to get further feedback (either here or on GitHub). As commented in the issue, if there is an absolute 0% chance of it being added, I can close the issue (or someone else can), no worries. But as I mentioned, I think the data would be valuable for analysts doing forensic analysis.
s
I understand the utility. But I tend to draw the line more towards privacy than having this table in code. It sounds like the general TSC sentiment is similar. While not 0%, I think chances are pretty low. I think there are several viable ways for people to get this functionality, as mentioned: extensions and ATC tables.
👍 1
t
Yeah, this is one of the reasons the extensions framework was created, so that these types of tables that provide forensic value but need heavy privacy scrutiny could exist but not within core. I realize this is tribal knowledge and I recommend we codify it in the contributing documentation. Browser history, email, messages caches, contact information, while forensically important, should be implemented in extensions.
osquery 2
p
Yeah, it seems privacy is a concern for some people. But IMO the privacy concern is user/organization specific? (some may not see a privacy concern) And I just think a committee blocking a useful feature is not really fair for users/organizations that would like this? (again IMO) But I understand that lines need to be drawn (otherwise osquery would be flooded by requested features). I still think browser history is a good feature.
As for ATC/extensions, the issue/problem I see with them is the overhead of deploying and managing extensions. I've never really seen osquery in a prod environment, so I'm not 100% familiar with how it's used, but I think the overhead of deploying and starting extensions would be difficult for organizations? (I've seen osquery talks about organizations using osquery, but I've never seen osquery used.) They need to deploy the extension to specific file paths, update config files, possibly update flags, make sure watchdog doesn't kill the extension, check permission issues, etc. There may be third-party tools out there that can deploy extensions, but I'm not sure how they work. Also, while browser history is important, I don't think organizations will go through the above steps to deploy such a small extension?
As mentioned in the issue, ATC only partially solves the problem: not all browsers use sqlite.
I would also be curious how many organizations that use osquery also use extensions (outside of products that use osquery, ex: launcher, orbit, etc.)? The only major extension that organizations may consider deploying alongside osquery would be the Trail of Bits NTFS extension (which should be core IMO, but that could be discussed later 😉).
Again, my viewpoint is mainly from a forensic perspective. Other organizations may use osquery for different purposes and may use different extensions (if any).
Also, browser info can already be indirectly acquired in osquery via the carver table:
1. carve the browser files
2. view them in a third-party tool
A browser table would just make that a one-step process. Perhaps adding an additional flag that disables the table by default, just like the carver table, would satisfy privacy concerns?
d
I believe the EU GDPR requires user consent for collecting browsing history, though IANAL. I would be cautious about adding something like this in core, though I think a compromise could work: tables like this are disabled by default, with the ability to pass a flag that enables them.
👍 1
a
I think it is unfortunate that ATC can be used to read user data by accessing the sqlite database of Chrome-based browsers, it was never intended to be used that way
I think it is well established that osquery should not implement features that are purposefully reading user information
The grey area is when something is really useful but can be exploited to leak user data
s
100%. I think the consensus of the TSC is what Alessandro is enumerating. I don’t know that we have a clear definition, it’s a bit of “we’ll know it when we see it”
a
which is sort of twisting what the table was originally intended for. I think it would be helpful to document these rules and potentially re-evaluate our decisions in the past if we see that something is too easy to abuse
like for example ATC could refuse to open Chrome databases (or censor parts of its content)
(just my personal opinion, provided as an example)
d
So even off by default would still not be acceptable?
s
I sorta see a couple of nebulous questions…
1. Where is the line that we won’t merge past?
2. Are there ways we could make compile or runtime options to change (1)?
3. Do we reevaluate what’s already in core?
I think we have a pretty good process for (1), though it probably looks very ad hoc. I don’t think we’ve had a good proposal for (2).
a
Adding that kind of feature behind a flag does not help much, since osquery is not able to inform the user on the machine that their browser history is being collected
it is too easy for someone who deploys it to just enable the flag and not tell users
and (IMHO) you risk damaging osquery's reputation
👆 1
s
One might imagine compile time options.
a
Maybe, but that is still not letting the user know that the data is being collected
s
I think it’s a bit unavoidable that site administrators can set things users might not like. I’m not sure how to resolve that.
d
I dont disagree.... however monitoring user networking traffic has been a cornerstone of security for many years
that was in reference to @alessandrogario RE: risk damaging osquery rep
a
that is true, but the amount of information you can gather by sniffing the network traffic (with HTTPS) is different compared to looking directly at the browser history
s
ebpf views into DNS. Browser internals. These are sharper knives.
d
ie Security Onion, the open source platform I work on has been in use since 2008, and most network traffic was not encrypted then
a
and yeah i think it's unavoidable that people will collect whatever they want, I just wish that osquery had a way to inform users before something like browser history is added
d
imho I dont think that is the responsibility of a security toolkit... but I could be convinced otherwise
just because its been that way for a long time doesnt mean there is not a better way to do it
a
I think that what I don't particularly like is that you either have no access or full access on the system
d
Agreed
a
I also think it is hard to implement secure permissions without baking them inside the security toolkit
Example:
1. Having osquery always offer full permissions, then building something on top of it that attempts to authenticate users and scan SQL queries looking for something that should not be in there
2. Having osquery do a handshake with an ID that only publishes X tables for that specific connection; attempting to use a table that is not within the pre-determined allowed ones will result in a sqlite error
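(A rough sketch of option 2, not how osquery works today. It uses Python's sqlite3 authorizer hook on ordinary stand-in tables; the connection ID, the `ALLOWED_TABLES` mapping and the `open_restricted` helper are all made up for illustration.)

```python
import sqlite3

# Hypothetical mapping from a connection ID to the tables it may read.
ALLOWED_TABLES = {"conn-analyst": {"processes"}}


def open_restricted(conn_id: str) -> sqlite3.Connection:
    """Return an in-memory database that denies reads outside the allowlist."""
    db = sqlite3.connect(":memory:")
    # Stand-in tables; real osquery tables are virtual tables backed by plugins.
    db.execute("CREATE TABLE processes (pid INTEGER, name TEXT)")
    db.execute("CREATE TABLE browser_history (url TEXT)")
    db.execute("INSERT INTO processes VALUES (1, 'init')")
    db.execute("INSERT INTO browser_history VALUES ('https://example.com')")

    allowed = ALLOWED_TABLES.get(conn_id, set())

    def authorizer(action, arg1, arg2, db_name, source):
        # For SQLITE_READ, arg1 is the table name and arg2 the column name.
        if action == sqlite3.SQLITE_READ and arg1 not in allowed:
            return sqlite3.SQLITE_DENY
        return sqlite3.SQLITE_OK

    # Installed last, so the setup statements above are not subject to it.
    db.set_authorizer(authorizer)
    return db
```

With this, a connection opened as `conn-analyst` can select from `processes`, while any statement that reads `browser_history` fails at prepare time with a sqlite error instead of returning rows.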
s
A long time ago I thought about adding some kind of signature based RBAC to queries along the lines of (2) there. But it would be a lot of work, for very little gain.
Practically speaking, osquery is installed in trusted environments.
a
That is true, but I would love to see the principle of least privilege applied here
s
But what does that mean? “Don’t ship things” is one meaningful answer. But if you tried to imagine something like RBAC for queries, it gets out of hand fast. And given the normal deployment, feels like complexity for little gain.
a
I'd like to have the user directly authorize which tables are allowed and which ones are not, with no way for the other end to even reference those tables
s
The disabled tables flag is somewhat in that direction
a
then you can easily tie that to a specific identity (user) if RBAC is needed at the administration level
maybe, I think I just don't like the fact that the user has to ask whoever deployed osquery to know what they are collecting
i would prefer if no one had to trust anyone here, the user is informed or allows a set of tables, and there is no way to go around that
s
Sure. I agree, kinda, in theory. But I’m not sure it’s really possible with how operating systems work.
a
it is too easy for someone to change the flag without notifying the user and there is also no way to audit that
it's not that hard, we don't have to attach all the tables all the time
if the permission only allows a specific table, we just build a sqlite database with only that table plugin
and if those permissions can only be changed on the client, then it's easier to determine if something that could potentially be misused should be merged or not
s
Sure, I can think about how to implement RBAC technically. But the larger system doesn’t feel right yet. Is it per connection? How does that work in osquery’s model? Is it per query? I could imagine sticking
where _signature = xxxx
for example. But that can get heavy. How does this stuff get managed? Do we compile a list of sensitive vs not? Or is this a giant runtime configuration? Who would actually use this?
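(To make the per-query idea a bit more concrete, here is a minimal sketch of signing and verifying query text with an HMAC. The key, `sign_query` and `verify_query` are hypothetical names, and nothing here says how the signature would be carried in a real query or how keys would be managed.)

```python
import hashlib
import hmac

# Hypothetical shared secret; a real deployment would need proper key management.
SECRET_KEY = b"example-key-not-for-production"


def sign_query(sql: str, key: bytes = SECRET_KEY) -> str:
    """Return a hex HMAC-SHA256 signature over the exact query text."""
    return hmac.new(key, sql.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_query(sql: str, signature: str, key: bytes = SECRET_KEY) -> bool:
    """Check the signature in constant time; any edit to the query invalidates it."""
    return hmac.compare_digest(sign_query(sql, key), signature)
```

A signature produced for one query does not verify against a different query, which is the property that would stop someone from quietly editing a saved query in a config file.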
a
I would do it per-connection, because per-query would not be really useful
s
per-connection is gnarly though, the connections are client -> server. So you’d have to add a server-auth layer to osquery. Which I think would be weird if you imagine a TLS server that had multiple users.
per-query handles the existing idioms cleanly, and lets things get saved into the config file.
a
then the tables can be categorized, with the worst possible scenario. For example, file carving would present "arbitrary file access"
then if the user does not allow that, file carving (and anything else under that category) gets disabled and won't even be available in the database namespace
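(A small sketch of the category idea, assuming made-up category names and using plain sqlite tables as stand-ins for table plugins. `TABLE_CATEGORY`, `TABLE_PLUGINS` and `build_database` are hypothetical; only tables in an allowed category are ever created, so the rest do not exist in the database namespace.)

```python
import sqlite3

# Hypothetical worst-case category for each table.
TABLE_CATEGORY = {
    "processes": "system_state",
    "carves": "arbitrary_file_access",
}

# Stand-ins for table plugins: each statement creates and fills one ordinary table.
TABLE_PLUGINS = {
    "processes": "CREATE TABLE processes AS SELECT 1 AS pid, 'init' AS name",
    "carves": "CREATE TABLE carves AS SELECT '/etc/passwd' AS path",
}


def build_database(allowed_categories: set) -> sqlite3.Connection:
    """Attach only the tables whose category the user has allowed."""
    db = sqlite3.connect(":memory:")
    for table, ddl in TABLE_PLUGINS.items():
        if TABLE_CATEGORY[table] in allowed_categories:
            db.execute(ddl)
    return db
```

If the user only allows `system_state`, a query against `carves` fails with sqlite's usual "no such table" error, because the table was never attached in the first place.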
s
What do you see as the mechanism for “user allows”?
a
I think the problem is that a query can be easily edited by whoever has access to it; a query could have a number of required access as metadata sure
but it shouldn't be validated there
ideally it would be a popup (I realize this could be an unpopular opinion) if it is a machine with a real user
otherwise a file that osquery can't touch and that the user can open to verify what's allowed and what's banned
i think I am talking about the user having to explicitly consent to specific types of data collections
with no way to go behind the user's back and obtain that information regardless of their consent
s
I think the problem is that a query can be easily edited by whoever has access to it; a query could have a number of required access as metadata sure
Oh, the auth tokens would have to have signed the query. Obviously 😛
otherwise a file that osquery can’t touch and that the user can open to verify what’s allowed and what’s banned
I think you’re asking for things not really possible in current OS design
a
That would be better, but it's still taking vs allowing/asking for
Permissions are not hard to implement, currently we attach all tables to the sqlite database but it does not have to be that way
s
I don’t think I agree with that assessment. At least, I think there are different parts. One part is about how you grant what permissions when. The other is closer to how you implement it. I guess I’m talking about how to implement it.
a
the user can decide to only attach one or two tables, i.e. we don't load the table plugins in the database that is connected to the TLS
s
I don’t think there’s a good solution for how to manage the asking/permission assignment. I don’t see popups as being feasible
a
those tables will simply not exist, and save for a sqlite exploit the operator on the other end has no way to push a config/flagfile update to access things they are not authorized to access
s
Except… if you’re imagining it inside the same osquery process, they do exist. But not for that connection. So you’re ultimately trusting the osquery configuration to maintain that isolation.
a
What do you mean? There should be one database per connection, and no databases outside of that
(and user permissions should not be stored in the same osquery configuration that can be pushed remotely)
s
How would a connection seek additional privileges?
Anyhow… I think we’re way off in the weeds.
I don’t know if we want to think about a better defined line about what we want in core vs not. Or a mechanism to have tables we don’t routinely build.
p
Looks like there has been lots of discussion on this. My thoughts/comments/questions:
Osquery reputation: it was mentioned that browser history would damage its reputation. Just wanted to add/counter that I think the feature would enhance its reputation 😄
User notifications: I'm not really sure what this means? Are we talking about alerting an end user when a query is executed, via an email, laptop notification, or something else? I've never really seen a security agent do notifications; usually third-party tools (ex: remote management servers) would handle that? And since a variety of remote servers/web apps can use osquery, wouldn't it be better to let the third-party server decide how it alerts users (if it even does)?
I've always considered osquery to be neutral in regards to how organizations use it, and that the goal of osquery is to show as much relevant data as possible and let downstream (other organizations) decide what the policies are in regards to queries, notifications, etc.
I'm not really sure what the benefit of allowing a user to allow/deny tables would be? Unless you're talking about allowing the user to choose which tables to allow on their system during the initial agent install? (Ex: they allow files, event logs, and processes, and disallow everything else?) Or are you talking about allowing the user to disallow/cancel specific queries at runtime/schedule time? (Ex: an analyst schedules an event log query on a user's laptop. The user receives a notification, then they have the option to disallow/cancel the query.) Couldn't that allow an attacker/compromised user to cancel any query they want?
Again, a lot of stuff was discussed, and I may be misunderstanding stuff. I'm hoping that adding a flag that disables a browser history table by default (similar to the carver table) would increase the chance of the feature being accepted?
a
ah well this was more about user consent and transparency, while trying to reduce the chance for admins to abuse the visibility they have on systems that have users on them
👍 1
it's mostly a side-topic
👍 1
z
In regards to ethics, I don't believe that reaching out to the user for consent about every action a security/admin tool would like to take is the appropriate level of engagement for organization-owned/managed devices. I do think it's important to enable better transparency of in particular (1) what the potential capabilities are on the device (2) which capabilities have been used, when, how, and by whom. As we've started adding RBAC features to Fleet, we have been considering how to monitor and restrict access to certain features (possibly down to the table level). I'd be interested in a conversation about how osquery could assist with this kind of access control, but my feeling is that it mostly needs to be managed by a server that has a better view of users/authorization.
We are also discussing these issues 1 and 2 with the terms [scope transparency](https://github.com/fleetdm/fleet/issues/465) and [audit transparency](https://github.com/fleetdm/fleet/issues/466).
m
like for example ATC could refuse to open Chrome databases (or censor parts of its content)
this is a cool idea and something that hadn't occurred to me. sounds like it could be a lot of work?
p
IMO (again from a forensic viewpoint), any sort of forensic tool that refuses to load files or data based on a small committee's decisions would be a negative. In regards to censoring data, for a forensic analyst this would be a massive issue, as it could misinform an analyst and investigations could be misled. If loading arbitrary files via ATC is an issue, I think it would be better to remove it than to censor data. Refusing to load data would be weird and anti-user IMO. But displaying incorrect/censored info is so much worse.
Sorry, I realize osquery is used for a lot of stuff, so maybe censoring/refusing to load data would be nice to have for other use cases, which is fine. I'm just pointing out the forensic implications. I do agree the idea of refusing/censoring data may be cool as a technical capability, but as a forensic capability it would not be great. Just my 2 cents.
m
If redacted by core, I think it could be gotten at with extensions
Which could make osquery’s scope more explicit. Like, instead of “no browser history access... unless you’re clever 😉” it could become: “No browser history access.... unless you install puffycid’s popular browser history extension 😮”
p
That's an interesting thought, but I'm not sure how that would be viewed by other organizations/analysts? Basically it would be "osquery won't show you this stuff; if you want to see it you need to deploy this random extension". Again, I haven't really seen how osquery is used in prod, but my very small experience with extensions is that they require quite a bit of overhead. As I mentioned earlier, for extensions an organization would need to deploy the extension to specific file paths, update config files, possibly update flags, make sure watchdog doesn't kill the extension, check permission issues, etc. There may be third-party tools out there that can deploy extensions, but I'm not sure how they work. It's a lot of overhead just to "uncensor data" that an analyst may want to see. And if there's an active attacker, requiring an analyst to deploy an extension to view the "censored" data would probably not be ideal?
m
That def doesn’t sound ideal. Maybe an organization could decide to roll out with the extension installed from the get-go?
p
I guess that could work. I would be curious how often third-party extensions are used alongside osquery (other than orbit, launcher, or any product that uses osquery). The only big third-party extension I'm aware of would be the Trail of Bits NTFS extension. Again, that's from a forensic perspective. Maybe non-forensic perspectives see value in or use other extensions. Also, it kind of seems silly to have to deploy a security tool that actively censors stuff and then deploy a second tool alongside it that uncensors the security tool? It seems a little silly and odd? lol
👍 1
m
It reminds me of how in a Node.js app, you might install dependencies with npm or yarn. That has helped Node core remain lightweight. Instead of including a core library for parsing xml, users pick the one they want
p
I like the discussion about this issue/feature. I hope I'm not sounding like a jerk/overly critical about this feature or anyone's comments/concerns about it. Everyone has made good comments about it. Just trying to provide a forensic perspective/viewpoint, that's all. And I will respect any final decision that is made, regardless. Thanks
❤️ 1
g
Fwiw our organization uses osquery precisely because it isn’t creepy. And I managed to write an extension and I’m not that smrt so it’s not that hard to do.
😆 1
p
Yep, writing an extension to parse browser history probably would not be that difficult (except for Edge and IE, the ESE format is pretty complex 🙂). I think the complex/challenging part is deploying the extension to a huge number of systems (ex: 10,000 laptops) and making sure all the flags, permissions, configs, and watchdog are updated? It sounds like your organization's extension deployment is pretty straightforward (which is awesome!), but I think that capability could vary from organization to organization?
a
It should not be hard to update the packaging scripts to include arbitrary files in the installer
and it's something people have expressed interest for
in fact, it should be really easy right now; combined with the ability to merge multiple cpp extensions into a single binary, I think extensions are a viable alternative for things that don't fit osquery core
👍 1
With the latest changes:
1. osquery/osquery: generates a package data file (.zip) containing binaries (e.g. osqueryd) and control data (e.g. files required by WiX)
2. osquery/osquery-packaging: takes the package data file and creates the final package (msi, nupkg, deb, rpm...)
It should be trivial to update osquery-packaging to add a setting that includes your own .cmake that installs additional files
then you can just download the package data from the artifacts/website and just re-run the packaging code to include your additional files
This is more effective than merging this kind of functionality into core and will likely also benefit everyone
bonus feature: if you have a table that comes from an extension (it's blocked in core, you can't make it work) you can have writable tables
and support INSERT/DELETE/UPDATE statements
p
I agree that third-party companies that use osquery could bundle extensions in their deployment script/app bundles and ship to customers. But I think most of those companies would only use their own extensions? Or maybe really powerful ones like the NTFS extension (which should be in core 😉). But once osquery is deployed, the only way to add an extension is through an upgrade or redeploying the whole agent again, which is OK, but kind of odd/more work for an organization? Ideally, IMO, an osquery web app/management application would handle extension deployments, and all an organization has to do is click "install extension" and the management app handles everything. But I'm not aware of any web app that offers it; I think something like that would be difficult. I still think it's best to include it in core, as it is still less overhead. I think that organizations that use osquery for forensics/incident response (IR) would see value in browser history data when investigating attacker activity.
a
I don't think we are considering merging this into core, an extension will have to be used for that
But once osquery is deployed, the only way to add an extension is through an upgrade or redeploying the whole agent again
This is not true, and using an extension actually provides more upgrade options. If it is in core, you have to redeploy everything. If it is an extension, you have a chance to just upgrade that instead of the whole agent
👍 1
Once osquery-packaging has been patched (should be easy) to support file inclusion, I don't think anyone would even notice if it bundled extensions
We can help out with the osquery-packaging changes, so that you can ideally easily add your extensions to the .msi/.deb/.rpm/etc packages
p
Just to confirm: when you say bundle extensions, you're talking about osquery packages distributed outside the osquery org itself? My understanding was that the osquery package on the website will still just be osquery, no extensions (which is good IMO). Changing the packaging scripts sounds like a good idea that others may like. It doesn't really solve the issue completely from my perspective 🙃 I don't use osquery in prod or for work; I only work on it in my personal time and use it personally. I also don't think I would bundle and distribute an osquery bundle on my own. Orgs should probably get osquery from the official site, or from their vendor if purchasing it? And if I did decide to distribute an osquery package, I think it would be easy to just include the tables natively IMO 😅 But it sounds like others would benefit from a more robust packaging script, and I agree it would be nice to have and there are benefits to it. I still think browser history should be in core 😁
Anyone can close the issue if you think it has been discussed enough. Or if a final discussion can happen during office hours, that would be fine (we can also continue to chat here) and a final decision could be made. Unfortunately office hours conflict with my day, otherwise I would like to join them 😀
a
If you bundle an extension you can keep the digital signatures from the osquery foundation; if you include the tables in core and build from scratch you will lose them
😮 1
p
That's good to know! And a good point!
I read the office hours notes and watched the video. It looks like a good discussion happened, but there was no consensus?... which is fine :) Unfortunately the office hours time conflicts with my schedule and I won't ever be able to join them (even though I would like to). If osquery would like to hold a special office hours/discussion, I could join that? Anytime on weekends, or weekdays after 5pm EST, would work for me. Or if one or two TSC members (and anyone else) want to have a small discussion/video chat about the browser history support, that works for me too. Or we can continue to chat here? Or if TSC has decided to not implement/accept the feature, that is fine too, just let me know :) I saw some notes on tribal knowledge and overall goals; I agree with most (except the browser history part 😀). Some comments/thoughts I have:
Browser history - From a forensic/incident response perspective, browser history is a valuable artifact that can help an analyst investigating a system for malicious activity. It's an artifact that may be able to help answer the question "what happened on this system, what did the attacker do?". From the office hours/chat it was mentioned that there was concern for abuse.
Would adding a flag similar to the carver table partially address this issue? Also, wouldn't the remote management software be responsible for maintaining table access and monitoring for abuse? Osquery is just an agent, so wouldn't it be best if osquery had the capability to grab data that is relevant, but left it to the management software/organization/company to handle which table is sensitive?

For example, a management tool could allow the table to be used or enabled if an "incident" has been declared or malware is detected on a system, so that the osquery analyst can collect the data. But if no malware or "incident" is active, the tool could disable the table or not allow it to be queried.

Though this idea would be tool/organization specific? 
I've always considered osquery to be a "neutral" agent/security software that lets other companies decide how it is used, whether for forensics, device management/visibility, vulnerability management, etc., and lets companies decide their own policies on its features.
 
email - While email could be forensically valuable, it doesn't really provide answers on "what happened on this system, what did the attacker do?", with the exception of maybe phishing evidence. IMO any sort of email parsing would be best as an extension. For example, Outlook files (PST/OST) are often huge (I've seen 6GB - 20+GB file sizes). Parsing those files would be difficult, would likely require a third-party library like libpff, and would return huge amounts of data.

messages caches - Again, while it could be forensically valuable, it's also very application specific. There are probably 100+ messaging applications, and creating tables for each application is unreasonable IMO. I think the carver table would be best to handle message caches. Just carve the file and view it in another tool.

contact information - I don't think there is real forensic value in contact info, so I'm OK with not including the table/feature, or at least I don't really see a need for it if there are privacy concerns. I think the carver could handle getting this type of info.

Private Key Data - Again, I think the carver table would handle this type of data; a dedicated table is probably not needed.
I appreciate the discussion about this issue/feature, it's been great!👍
a
I think the consensus is:
- No to the table in core
- The current workaround (ATC) to get the data anyway can stay
👍 1
😢 1
(note: the person suggesting to update ATC to prevent it from accessing browsing history in the video was me)
f
interesting, @alessandrogario you were advocating for limiting the ATC function?
a
I know it has been there forever, and has been used to read chrome profiles for a long time; I think it's strange that a browser_history table is banned (could allow osquery to filter/limit what is exposed) but using ATC to do the same (with full profile access) is allowed
f
I always viewed ATC configurations as akin to extensions in a way.
a
This is just my personal (as a normal community member) opinion but this does not feel like a grey area to me; despite the intent that osquery is trying to convey (allowing explicit fetching browser history VS offering it anyway without explaining how to in the docs) the end result is the same EDIT: trying to fix my poor grammar
f
The one consideration worth mentioning with regards to the Chrome History sqlite DB is that it does not just contain browser history but things like Download History
this can be useful when tightly scoped in the actual atc configuration, eg. omit any download entry which does not match the following source_url: https://production.database.backups/%
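(As an illustration of that kind of tight scoping, here is a sketch against a made-up, simplified `downloads` table. The schema and the `source_url` column name are assumptions taken from the message above, not Chrome's actual schema, and this is plain sqlite rather than a real ATC configuration.)

```python
import sqlite3

# Hypothetical, simplified stand-in for a browser downloads table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE downloads (source_url TEXT, target_path TEXT)")
db.executemany(
    "INSERT INTO downloads VALUES (?, ?)",
    [
        ("https://production.database.backups/dump.sql", "/tmp/dump.sql"),
        ("https://example.com/cat.png", "/tmp/cat.png"),
    ],
)

# The scoping lives in the query itself: unrelated downloads are never returned.
SCOPED_QUERY = (
    "SELECT source_url, target_path FROM downloads "
    "WHERE source_url LIKE 'https://production.database.backups/%'"
)
rows = db.execute(SCOPED_QUERY).fetchall()
```

Only the download from the backup host comes back; the unrelated download stays out of the result set.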
a
The problem I see is that the same could be said for a core table that returns the same data; since the information is the same, why is one allowed and the other one is not?
f
I believe it is no different from Carving or Extensions, the core collection capabilities of osquery should aim to collect the least privacy-concerning data possible.
a
I think that the distinction that osquery has historically made between things that are allowed in core vs things that are not is based on whether it can be easily abused & turned on without user consent
f
Extending that collection capability should be at the discretion and effort of the Security/IT team within an organization
I fundamentally agree with the statement that users deserve better transparency regarding what is happening on their device.
But I believe the ATC:browser history example represents more of a slippery slope of cat and mouse.
Like I can pull all of the phone numbers you have contacted with timestamps if you are on a Mac and have linked facetime/messages using core tables.
👆 1
So should we go ahead and ban the utility tables that would allow that creepy behavior? How does that work when it is something as basic as the file table and reading the names of directories/files.
a
Yes, there are other features that can be abused; my sentiment is more generic and not just focused on ATC
f
In an ideal world, everyone would be well-versed in SQL and could review/audit the contents of any query run on their device as well as the data that query transmitted.
a
I don't think that would work, in order to be effective it needs support for permissions
f
You want to pilfer my browsing history? Great, I will know you did it, and you can explain why you felt obligated to do that, and maybe I will look for another company to work for where they aren't creeps.
a
i.e. having a document somewhere that states what is being collected does not really mean a lot if it is easy for that document to be different than the actual truth
f
I think permissions are once again a game of cat and mouse, they require incredible amounts of domain expertise to understand whether you are actually protecting yourself from every hypothetical abuse.
a
Do you think it is not worth trying to protect resources if possible? Even starting with just documentation would be helpful
I think there are things that are worth pursuing even if they look hard at a first glance, we have many experts in the community that can help out
I think it's a really cool step that this (transparency) information was included, but I don't think it's effective since it's advisory and not mandatory EDIT: Previous post deleted, context is now missing
f
Sorry need to clean up those screenshots to omit my serial numbers 😅
💥 1
I think protecting resources is important but I think it should be done through exhaustive transparency resources. That was the major motivator for Kolide's updated Privacy Center:
a
Quoting my reply before the message I was responding to was deleted: transparency is not protection, and protections should be mandatory (i.e. enforced) instead of advisory
Meaning that this still allows someone to go behind the UI and do things not covered/documented in the privacy center
f
a
I think a good analogy here of mandatory vs advisory could be:
• Advisory file locking: processes need to cooperate. One can decide to ignore it (or misbehave) and accidentally (or on purpose) bypass the lock
• Mandatory file permissions (or mandatory locking): enforced by the kernel (in this case the kernel would be osquery). The program at the other end could decide to bypass the protection, but the kernel would return an error and prevent the operation
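(A toy sketch of the same distinction in code. All names here are hypothetical; in a real system the enforced path would have to be the only way to reach the data, which a single Python module cannot guarantee.)

```python
ALLOWED = {"processes"}  # hypothetical user-approved tables


def advisory_is_allowed(table: str) -> bool:
    """Advisory: callers are expected to ask first, but nothing forces them to."""
    return table in ALLOWED


def read_table_unchecked(table: str) -> str:
    """A data source with no enforcement of its own."""
    return f"rows from {table}"


def read_table_enforced(table: str) -> str:
    """Mandatory: the check sits inside the access path and cannot be skipped."""
    if table not in ALLOWED:
        raise PermissionError(f"table {table!r} is not permitted")
    return read_table_unchecked(table)
```

A caller that skips `advisory_is_allowed` still gets data from `read_table_unchecked`, while `read_table_enforced` raises an error for the same table no matter what the caller intended.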