# kolide
b
noob here trying to get set up. Ran out of ideas, just wondering what I should do to troubleshoot further. I have a great demo running on my laptop with osquery agents, fleet, and elk stack, but now I'm trying to implement it in the real world. I'm betting on some acl/firewall/security group type issue, but I'm not having much luck finding any indication of why the following happens.
• haproxy in front of fleet, just doing tcp passthrough
• fleet running in aws ec2 - verified connection with redis and mysql
• mysql is aws rds
• redis is aws elasticache
In the fleet ui I can see and interact with most of it. I am using a wildcard ssl cert, but it seems to be working - a remote agent enrolled and looks to be working as expected. The osquery logs on the remote agent do not give any indication that anything is wrong (nothing that jumps out at me).
• main problem is: `live_query` and `run` just hang and eventually result in `net::ERR_HTTP2_PROTOCOL_ERROR`
• fleetctl just hangs on `--query "SELECT * FROM osquery_info"` as well; even with --timeout and --debug set, I see nothing
• as far as I can tell everything else is working fine. fleet is behind a tcp passthrough haproxy.
• also tried accessing via an ssh tunnel direct to port 8080, still seeing the same problem there too.
z
Does the proxy support websockets?
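If it helps, here is a rough sketch for checking whether a websocket upgrade makes it through whatever sits in front of fleet. It only sends a bare upgrade handshake over TLS and looks for a `101` status. The function names are made up for illustration, and the host and path are placeholders you would need to fill in (the path should be whatever websocket URL your browser's network tab shows for the live query).

```python
import base64
import os
import socket
import ssl


def build_upgrade_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.1 WebSocket upgrade request (RFC 6455)."""
    # The key is 16 random bytes, base64-encoded, as the RFC requires.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",  # blank line terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def is_upgrade_accepted(response: bytes) -> bool:
    """Return True if the raw response's status code is 101."""
    status_line = response.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    return len(parts) >= 2 and parts[1] == b"101"


def check_websocket(host: str, port: int = 443, path: str = "/",
                    timeout: float = 5.0) -> bool:
    """Open a TLS connection and attempt a WebSocket upgrade (hypothetical helper)."""
    context = ssl.create_default_context()
    # Offer only HTTP/1.1 via ALPN so the server does not negotiate HTTP/2.
    context.set_alpn_protocols(["http/1.1"])
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path))
            return is_upgrade_accepted(tls.recv(4096))
```

Calling `check_websocket("fleet.example.com", 443, "/your/websocket/path")` (placeholder values) against the proxy and then directly against fleet should show whether the upgrade is being dropped somewhere in between. A `False` result can also just mean the path is wrong or the endpoint wants extra headers, so treat it as a hint, not proof.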
b
just wanted to follow up... removed the proxy from the picture and still have the same issue (spent most of yesterday trying various haproxy suggestions with no success). pretty sure I had the proxy config right; tcp passthrough is supposed to be agnostic to the underlying protocol. at any rate, to rule it out, I moved fleet to a public interface. osqueryd clients look like they are working as expected: I'm seeing scheduled checks executing, result logs, etc., and no errors in either the osquery logs or the fleet logs. It's got to be something we do to our ubuntu18 image, or something I missed somewhere: something security/hardening related that just isn't presenting itself in an obvious way. also ensured iptables was flushed and security groups were wide open - still no dice.
confirmed - went back and used an untouched image from the aws community and voila, it works. now to hunt down what the heck we're doing in our "hardened" image.
z
Good hunting. Please let us know what you find so that we can learn from you 🙂