# fleet
t
Hey, I’m setting up gitops with GHA and trying to figure out how to secure the API tokens. Our Fleet is self-hosted on GCP with something pretty close to the reference architecture (Cloud Run, Cloud SQL, Memorystore, LB with mTLS to Cloudflare). First attempt was to put a gitops token in an environment and only allow access to it from the main branch, and otherwise expose a read-only token to be used for dry runs in PRs. This didn’t work though, as it seems even in dry run mode `fleetctl` submits a POST to the backend:
Run ./gitops.sh
+ FLEET_GITOPS_DIR=.
+ FLEET_GLOBAL_FILE=./default.yml
+ FLEETCTL=fleetctl
+ FLEET_DRY_RUN_ONLY=false
+ FLEET_DELETE_OTHER_TEAMS=true
+ grep -Exq '^org_settings:.*' ./default.yml
+ compgen -G ./teams/no-team.yml
+ sort
+ perl -nle 'print $1 if /^name:\s*(.+)$/' ./teams/no-team.yml
+ uniq -d
+ grep . -cq
+ args=(-f "$FLEET_GLOBAL_FILE")
+ for team_file in "$FLEET_GITOPS_DIR"/teams/*.yml
+ '[' -f ./teams/no-team.yml ']'
+ args+=(-f "$team_file")
+ '[' true = true ']'
+ args+=(--delete-other-teams)
+ fleetctl gitops -f ./default.yml -f ./teams/no-team.yml --delete-other-teams --dry-run
Error: applying no-team scripts: POST /api/latest/fleet/scripts/batch received status 403 forbidden: forbidden
Since the main gitops token is very powerful, I want to avoid exposing it to arbitrary workloads: it’s trivial for an insider to extract it by modifying the workflow file, and they could then make arbitrary changes via the API. But I’m not sure how to achieve this if read-only tokens can’t be used for dry runs. Any suggestions? The alternatives I’ve considered so far:
1. Use a separate workflow file, put the secret in GCP Secret Manager, and expose it to the workload only if it authenticates via OIDC, with an assertion on the specific workflow used to prevent exfiltrating the token. Adds lots of indirection and complexity.
2. Require approval before running the workflow, but that feature is gated behind GitHub Enterprise.
3. A custom OIDC endpoint that pulls the repo and validates the workflow hash before generating short-lived access tokens for a set service account, but that’s also on the complex side.
All of these would also need to validate the gitops.sh script though, otherwise an insider could just modify that to exfiltrate the token 🤔
I guess the ideal solution would be a dedicated role that is only allowed to perform gitops dry runs and nothing else, as I wouldn’t want it to be able to run “observer can run” queries and similar. But I don’t think a role like that exists, right?
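For illustration, the checks described in alternatives 1 and 3 could look roughly like the sketch below. It is not taken from Fleet or GitHub tooling. It assumes a trusted token broker has already verified the OIDC token's signature and extracted the `workflow_ref` and `ref` claims, and that a known-good SHA-256 digest of gitops.sh is pinned somewhere the PR author cannot change. The repository name, workflow path, and function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of checks a trusted token broker might run before releasing the
# gitops token. Assumes the OIDC token signature was already verified.

# Hypothetical pinned values; replace with your own repository and workflow.
TRUSTED_WORKFLOW_REF="example-org/fleet-gitops/.github/workflows/gitops.yml@refs/heads/main"
TRUSTED_GIT_REF="refs/heads/main"

# Returns 0 only if both claims match the pinned values exactly.
# $1: workflow_ref claim, $2: ref claim
claims_allowed() {
  [ "$1" = "$TRUSTED_WORKFLOW_REF" ] && [ "$2" = "$TRUSTED_GIT_REF" ]
}

# Returns 0 only if the file's SHA-256 digest equals the pinned digest.
# $1: path to the file (for example gitops.sh), $2: expected hex digest
file_matches_sha256() {
  local actual
  actual="$(sha256sum "$1" 2>/dev/null | cut -d' ' -f1)"
  [ -n "$actual" ] && [ "$actual" = "$2" ]
}
```

Both checks only help if they run somewhere the PR author cannot edit, such as the broker itself. Running them inside the PR-triggered workflow would let an insider remove them along with everything else.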
k
You're correct that a role like that doesn't currently exist, but we could certainly discuss it!
I haven't dug too much into the feasibility of extracting the secret; my impression was that GH wouldn't expose the secret variables. Digging into that a bit.
t
To explain one vector for extracting the secret: if a contributor to the repo can create a PR that causes the dry run job to run, they can modify the `gitops.sh` script to add a line that posts the token to a URL they control (without logging), and then force push over that commit to remove it from the git history, leaving some innocuous change instead. It’s possible to find the original commit from the GitHub logs, but if the attacker doesn’t use the token immediately and instead saves it for a couple of weeks or months before exploiting it, identifying where the token leaked could be very hard. One could also add a remote login step like https://github.com/mxschmitt/action-tmate after `fleetctl` has been authenticated. This would look very innocent in logs, but could be used to grab the contents of `~/.fleet/config` without leaving any log entries that can pinpoint the secret being exfiltrated. One could also use branch protection to reduce the number of people with permissions to start workflows on matching branches and restrict the environment secret to the same branches, but I’d like the ability for anyone to propose changes and have them validated by the dry runs.
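One partial mitigation for the vector above is to refuse to hand out any token when a PR touches files that control how the token is used. The sketch below is only an illustration under assumptions: the path list is hypothetical (it assumes gitops.sh sits at the repository root, as the log output suggests), and the function has to run from trusted code rather than from the PR's own checkout.

```shell
#!/usr/bin/env bash
# Reads changed file paths on stdin, one per line. Returns 0 if any path is
# a file that can influence how the token is handled, 1 otherwise.
# The path patterns are assumptions; adjust them to the repository layout.
touches_sensitive_paths() {
  local path
  while IFS= read -r path || [ -n "$path" ]; do
    case "$path" in
      .github/workflows/*|.github/actions/*|gitops.sh) return 0 ;;
    esac
  done
  return 1
}
```

A trusted caller could feed it the output of `git diff --name-only "$BASE_SHA" "$HEAD_SHA"`, where both variables are placeholders for the PR's base and head commits, and skip the dry run or require manual review when it returns 0. This narrows the attack surface but does not replace a dedicated dry-run-only role.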
k
Locking down the actual run makes a lot of sense, but it definitely makes more sense for the dry-run to run on push to surface issues faster.
I don't see any existing Feature Requests for a GitOps read-only user. I'd definitely recommend submitting one in GitHub! https://github.com/fleetdm/fleet/issues/new/choose If you share the ticket with me, I can share it with some team members for you.
t
I think this is the same issue, which ended up with the same conclusion, that dry run can’t practically be used in a secure context atm: https://github.com/fleetdm/fleet/issues/28367
Would you prefer a new issue that more clearly targets a read-only/dry-run-only role request, with a reference to the previous discussions, or should I jump into the existing discussion with a +1 and some elaboration?
Took the liberty of opening the issue since I had already written most of it before finding the other one, but mentioned the older issue to at least keep the link: https://github.com/fleetdm/fleet/issues/29839 Feel free to close or merge if that simplifies tracking and (hopefully) resolution.