11/06/2019, 9:55 PM
could anyone help me reason through what i think is a bug in the provided init.d script? it looks like
service osqueryd stop
is supposed to wait for all of the processes to die but doesn't. i'm pretty sure i see the issue, but a set of eyes familiar with the script would be great. i'll thread here with some details
9:57 PM
it looks like at https://github.com/osquery/osquery/blob/e6fe15eb49660725e65dba1549932ed96e0a8c6e/tools/deployment/osqueryd.initd#L105 we loop on kill -0 to check whether the daemon PID still exists, and https://github.com/osquery/osquery/blob/e6fe15eb49660725e65dba1549932ed96e0a8c6e/tools/deployment/osqueryd.initd#L107 is supposed to pkill the whole tree after 5 seconds if the PID checked on line 105 is still running
9:59 PM
unfortunately it looks like in some cases the daemon dies before the worker processes, so the check on the daemon PID passes while a worker is still running:
$ sudo service osqueryd stop;ps aux | grep osq
Stopping osqueryd (via systemctl):                         [  OK  ]
root     21243 99.0  0.0 930504 65780 ?        SNl  21:58   0:15 /usr/bin/osqueryd
10:00 PM
i'm assuming we should use pgrep -g $PID on line 105 instead of kill -0, so that we check whether any process in the group is still running and not just the daemon (that change does fix this issue); does that seem to match the original intent?
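for reference, here's a minimal sketch of the kind of wait loop i have in mind. it is not the upstream script: stop_group is a hypothetical helper name, the 5 second default just mirrors the timeout mentioned above, and it assumes the daemon PID is also the process group ID and that procps pgrep/pkill are installed:

```shell
# Hypothetical helper, NOT the upstream init script.
# Wait up to $2 seconds (default 5) for every process in process
# group $1 to exit, then SIGKILL whatever is left in the group.
stop_group() {
  pgid=$1
  timeout=${2:-5}
  elapsed=0
  # pgrep -g exits 0 while at least one process in the group exists,
  # so the loop keeps going even after the daemon itself is gone.
  while pgrep -g "$pgid" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      pkill -KILL -g "$pgid"
      break
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}
```

with kill -0 "$PID" as the loop condition, the loop ends as soon as the daemon exits, even if a worker in the same group is still alive; checking the whole group is the only difference here.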