Title
#general
t

tokcum

06/13/2022, 10:36 AM
Hi, we are facing the challenge to understand if software found by osquery is legitimate or not. Legitimate means to us: defined on a list of approved software (DSL). We could put this DSL into the query, however, we are wondering if osquery could contact an external source to gather DSL. What are you thoughts? Has anybody solved this and how? I'm not looking for a ready-to-deploy solution but helpful hints how we could approach this.
j

Jason

06/13/2022, 12:17 PM
I would probably use the fleet API to populate your saved query / policy. You could script something to pull the allowed list from your source and then "push" that into Fleet via the API. A SOAR tool like Tines.io could probably help with that too.
i

Ibra

06/13/2022, 12:22 PM
Hi @Jason and @tokcum I would also be interested, is there a guide that explains how to implement these authorized software checklists and a guide on how to connect it to fleet? it would also be convenient to send an automatic email as soon as unauthorized software is installed.
j

Jason

06/13/2022, 12:23 PM
No guide really, but Tines.io is free for 3 playbooks and you could definitely build a single playbook ("Story") in that that pulls from something like a Google Sheet and then assembles the query and pushes that into Fleet. A bit of work, but not monumental at all.
i

Ibra

06/13/2022, 12:31 PM
the steps to connect tines.io to fleet? the steps to connect tines.io to fleet? i use the free version installed on internal server
defensivedepth

defensivedepth

06/13/2022, 12:54 PM
Using subquerys, you could also use the
curl
table to pull down data from a trusted site, parse out the list & use that in a subsequent query.
j

Jason

06/13/2022, 1:05 PM
Thats a cool idea
i

Ibra

06/13/2022, 1:07 PM
unfortunately I am not expert, the only thing I have done is instalre fleet on centos (https://fleetdm.com/docs/deploying/server-installation) and fleetctl for generating msi packages for windows pc, can you tell me how to implement this part of osquery and how to have these custom queries besides the ones in the demo?
s

seph

06/13/2022, 3:58 PM
A lot here depends on what’s easy for you, and how you might distribute this list. At the core, you’re trying to merge information from a table, with some allowlist. So you need to expose that allowlist to osquery, and then join the two. Offhand, possibilities are things like:1. curl table 2. plist table (mac only) 3. distribute a sqlite database, and use an ATC table 4. kolide launcher ships tables for json and some other formats Last time someone talked about this on slack, I think they ended up with (3).
t

tokcum

06/13/2022, 5:19 PM
Thank you so much for these insights. Today I worked on the query itself and I would like to share with you the result. It might look overly complicated to you at first glance, however, while implementing this I realized that we will need different kinds of lists to keep the results actionable.
WITH whitelisted_software_1 AS (
    SELECT 'Microsoft Visual C++ % Redistributable %' AS pattern
  ),
  whitelisted_publisher_1 AS (
    SELECT 'Microsoft Corporation' AS pattern
  ),
  whitelisted_software AS ( 
    SELECT 'Adobe Acrobat DC (64-bit)' AS name, 'Adobe' as publisher, '22.001.20117' AS version, 'ge' AS mode, 'whitelist' AS list
    UNION
    SELECT 'Fleet osquery' AS name, 'Fleet Device Management (<http://fleetdm.com|fleetdm.com>)' as publisher, '0.5.0' AS version, 'eq' AS mode, 'whitelist' AS list
    UNION
    SELECT 'Local Administrator Password Solution' AS name, 'Microsoft Corporation' as publisher, '6.2.0.0' AS version, 'eq' AS mode, 'whitelist' AS list
    UNION
    SELECT 'Notepad++ (64-bit x64)' AS name, 'Notepad++ Team' as publisher, '8.0' AS version, 'ge' AS mode, 'whitelist' AS list
  ),
  graylisted_software AS (
    SELECT 'Cisco Webex Meetings' AS name, '0' AS version, 'ge' AS mode, 'graylist' AS list
  ),
  blacklisted_software AS (
    SELECT 'Foxit PDF Reader' AS name, '0' AS version, 'ge' AS mode, 'blacklist' AS list
  )
SELECT p.name,
       p.publisher,
       p.version,
       p.install_date,
       COALESCE( black.list, gray.list, white.list ) AS list
FROM programs p,
     whitelisted_software_1,
     whitelisted_publisher_1
LEFT JOIN whitelisted_software white
ON p.name = white.name
AND p.publisher = white.publisher
LEFT JOIN graylisted_software gray
ON p.name = gray.name
LEFT JOIN blacklisted_software black
ON p.name = black.name
WHERE
  p.name NOT LIKE whitelisted_software_1.pattern /* Software is whitelisted by pattern */
  AND
  p.publisher NOT LIKE whitelisted_publisher_1.pattern /* Publisher is whitelisted by pattern */
  AND (
    white.list IS NULL /* Software is not whitelisted */
    OR white.list IS NOT NULL /* Software is whitelisted but version does not comply */
      AND (
        ( white.mode = 'eq' AND p.version != white.version )
        OR
        ( white.mode = 'ge' AND p.version < white.version )
      )
    OR gray.list IS NOT NULL /* Software is graylisted */
      AND (
        ( gray.mode = 'eq' AND p.version = gray.version )
        OR
        ( gray.mode = 'ge' AND p.version >= gray.version )
      )
    OR black.list IS NOT NULL /* Software is blacklisted */
      AND (
        ( black.mode = 'eq' AND p.version = black.version )
        OR
        ( black.mode = 'ge' AND p.version >= black.version )
      )
  )
I'll try the curl approach starting tomorrow to be able to define the list externally to the query.
6:27 PM
@defensivedepth I got the JSON file via curl but how could I parse this? Today, I'm using fleet's osquery. It doesn't seem to support dissecting JSON. Do I miss something or do I've to take kolides launcher, as proposed by @seph?
6:52 PM
Got it. We have to use json_each and json_extract functions from sqlite to parse the JSON. This is how my JSON looks like:
{
  "whitelist": [
    {
      "name": "Adobe Acrobat DC (64-bit)",
      "publisher": "Adobe",
      "version": "22.001.20117"
    },
    {
      "name": "Fleet osquery",
      "publisher": "Fleet Device Management (<http://fleetdm.com|fleetdm.com>)",
      "version": "0.5.0"
    }
  ]
}
And this is the corresponding query:
SELECT json_extract( value, '$.name' ) AS name, json_extract( value, '$.publisher' ) AS publisher, json_extract( value, '$.version' ) AS version FROM curl, json_each(curl.result, '$.whitelist') WHERE URL = "https://..."
Great! Thank you all.
s

seph

06/13/2022, 9:39 PM
The
kolide_json
is for reading json files from disk. To parse a curl response, json extract seems pretty good.
i

Ibra

06/14/2022, 1:59 PM
is it possible to make the witelist with the list of allowed software using mysql database and not sqlite? can anyone point me to the steps they have taken? is it possible to deny the user installation if the software is not in the witelist and send an automatic email to the administrator?
2:00 PM
@tokcum
2:00 PM
@Keith Swagler @Jason
s

seph

06/14/2022, 2:00 PM
Why are you @ messaging people?
2:01 PM
osquery neither has a mysql client built in, nor does it have a native mechanism to block a software installation.
t

tokcum

06/14/2022, 3:31 PM
@Ibra please make yourself familiar with what osquery is and also what it is not.
3:39 PM
P.S. The results can now easily filtered by list or installed version compared to defined version according to mode stated in JSON file.
i

Ibra

06/14/2022, 3:42 PM
Hi tokcum thanks for your helpfulness, sorry but I'm not too familiar with osquery, I just installed it and I'm still learning about it: my question was more referring not to the json file format but from the complete system you use to implement this function. I would like to understand the systems used and how to connect them to osquery/fleet server via the api or if you used other way
s

seph

06/14/2022, 4:25 PM
I can't absorb that query, but that seems like the same direction I'd develop in. I suspect you could drop the
mode
field, and extend the dsl to have a pattern field, and a ge field, and whatever else.
t

tokcum

06/14/2022, 5:38 PM
@Ibra I suggest you make yourself familiar with osquery first. Use the cli with osqueryi at first. If you feel comfortable and want to go further look at fleet. Their documentation is really helpful and getting started with their containers from docker hub is straight forward. If you are not familiar with docker: you could also start fleet from it's binary. For testing / learning this should suffice.
8:27 AM
Thanks @seph for sharing your thoughts. I was able to simplify the query I came up with yesterday. Again, I would like to share the results with the community. Black-, gray- and whitelist of software products are defined in a single JSON file (dsl = defined software list). Here is how it looks like: it supports different lists, allows to use patterns and to define how versions are compared.
{
  "dsl": [
    {
      "name": "Foxit %",
      "publisher": "%",
      "version": "%",
      "mode": "pattern",
      "list": "blacklist",
      "description": ""
    },
    {
      "name": "Cisco Webex %",
      "publisher": "%",
      "version": "%",
      "mode": "pattern",
      "list": "graylist",
      "description": ""
    },
    {
      "name": "7-Zip %",
      "publisher": "Igor Pavlov",
      "version": "19.00.00.0",
      "mode": "pattern",
      "list": "whitelist",
      "description": ""
    },
    {
      "name": "Adobe Acrobat DC (64-bit)",
      "publisher": "Adobe",
      "version": "22.001.20117",
      "mode": "ge",
      "list": "whitelist",
      "description": ""
    }
  ]
}
And this is the corresponding query filtered for records not being fully whitelisted according to definition in JSON file. When you copy and paste this, please adjust URL to JSON file.
WITH
  patterns AS (
    SELECT json_extract( value, '$.name' ) AS name,
           json_extract( value, '$.publisher' ) AS publisher,
	       json_extract( value, '$.version' ) AS version,
           json_extract( value, '$.mode' ) AS mode,
		   json_extract( value, '$.description' ) AS description,
           json_extract( value, '$.list' ) AS list
      FROM curl,
	       json_each(curl.result, '$.dsl')
	  WHERE URL = "https://..."
	  AND mode = "pattern"
  ),
  records AS ( 
    SELECT json_extract( value, '$.name' ) AS name,
	       json_extract( value, '$.publisher' ) AS publisher,
		   json_extract( value, '$.version' ) AS version,
		   json_extract( value, '$.mode' ) AS mode,
   		   json_extract( value, '$.description' ) AS description,
		   json_extract( value, '$.list' ) AS list
    FROM curl,
	     json_each(curl.result, '$.dsl')
	WHERE URL = "https://..."
	AND mode != "pattern"
  )
SELECT p.name,
       p.publisher,
       p.version,
       p.install_date,
	   COALESCE( patterns.name, records.name ) AS l_name,
	   COALESCE( patterns.publisher, records.publisher ) AS l_publisher,
	   COALESCE( patterns.version, records.version ) AS l_version,
	   COALESCE( patterns.mode, records.mode ) AS l_mode,
       COALESCE( patterns.description, records.description ) AS l_description,
	   COALESCE( patterns.list, records.list ) AS l_list
FROM programs p
LEFT JOIN patterns
  ON p.name LIKE patterns.name
 AND p.publisher LIKE patterns.publisher
 AND p.version LIKE patterns.version
LEFT JOIN records
  ON p.name = records.name
 AND p.publisher = records.publisher
WHERE
  l_list IS NULL
  OR l_list IN ( "graylist", "blacklist" )
  OR (
    l_list = "whitelist"
	AND (
      ( l_mode = 'eq' AND p.version != l_version )
      OR
      ( l_mode = 'ge' AND p.version < l_version )
    )
  )
I'm not 100% satisfied yet, but its a good enough 1.0. @All : let me know your thoughts. 😃