We send our osquery data to Firehose; Glue then crawls it, and we query it via Athena.
Osquery sends JSON, but that JSON is a bit different for each osquery task. This makes Athena angry, since Glue crawls the data and creates tables based on how it thinks things look. For example, the "columns" key in the JSON differs between tasks, so the generated struct won't work for all the results osquery sends, and Athena throws an error. We have a workaround at the moment where we just handle "columns" as a string, but then of course we can't search in it the same way.
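To illustrate the mismatch, here is a minimal Python sketch. The two records, the query names, and the column names are made up for the example; only the general shape (a "columns" object whose keys depend on the query) reflects what we see. The helper names `columns_schema` and `stringify_columns` are hypothetical, and the second one only mimics the effect of our string workaround; it is not our actual pipeline code.

```python
import json

# Hypothetical osquery result records; names and column values are invented.
records = [
    {
        "name": "pack_processes",
        "hostIdentifier": "host-a",
        "columns": {"pid": "123", "path": "/usr/bin/ssh"},
    },
    {
        "name": "pack_usb_devices",
        "hostIdentifier": "host-a",
        "columns": {"vendor": "Acme", "model": "Disk"},
    },
]


def columns_schema(record):
    """Return the set of field names found in the record's "columns" object."""
    return frozenset(record["columns"])


def stringify_columns(record):
    """Mimic the workaround: keep "columns" as a single JSON string."""
    out = dict(record)
    out["columns"] = json.dumps(record["columns"], sort_keys=True)
    return out


# Each query yields a different struct shape, so one inferred struct
# cannot describe every row.
schemas = {columns_schema(r) for r in records}
print(len(schemas))

# With the workaround, every row has the same type for "columns" (a string),
# but the individual fields are no longer addressable as struct members.
flattened = [stringify_columns(r) for r in records]
print(flattened[0]["columns"])
```

With "columns" as a string the table schema is stable, but every lookup inside it has to parse the JSON again, which is the part I would like to avoid.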
Do you have any better solutions for this? Am I missing something? How do people usually handle this?