#fleet

KK

05/24/2021, 11:43 AM
Morning everyone, I'm trying to stream my ECS Fargate Fleet's osquery logs to Firehose, but my containers are failing to initialize with the error message:
Error initializing service: initializing osquery logging: create firehose status logger: create Firehose writer: describe stream arn:aws:firehose:xx:xx:deliverystream/test_stream: NoCredentialProviders: no valid providers in chain. Deprecated.
My assumption is that the ECS task will first assume the role provided in FLEET_FIREHOSE_STS_ASSUME_ROLE_ARN using its default credentials, which would be the ECS task role that it was configured to run with. Once assumed, the task will then be able to call DescribeDeliveryStream using the newly granted role. However, based on the code here, my guess is that the task could not find the default credentials(?). I'd like to avoid passing the access keys to the task. Could anyone please take a look and see where I went wrong? These are the environment variables/actions that I have configured so far:
• FLEET_OSQUERY_RESULT_LOG_PLUGIN: firehose
  FLEET_OSQUERY_STATUS_LOG_PLUGIN: firehose
  FLEET_FIREHOSE_REGION: xx
  FLEET_FIREHOSE_RESULT_STREAM: arn:aws:firehose:xx:xx:deliverystream/test_stream
  FLEET_FIREHOSE_STATUS_STREAM: arn:aws:firehose:xx:xx:deliverystream/test_stream
  FLEET_FIREHOSE_STS_ASSUME_ROLE_ARN: arn:aws:iam:xx:role/firehoseRole
• An ECS task role permission to assume the role arn:aws:iam:xx:role/firehoseRole
• A new IAM role arn:aws:iam:xx:role/firehoseRole with permissions to call firehose:DescribeDeliveryStream and firehose:PutRecordBatch against arn:aws:firehose:xx:xx:deliverystream/test_stream
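
For reference, the two permission policies described in this message can be sketched as Python dicts. This is only an illustration under assumptions: the ARNs are the redacted placeholders from the message, the function names are hypothetical, and the thread does not show the actual policy documents or the trust policy on firehoseRole (which would also have to allow the task role to assume it).

```python
import json

# Redacted placeholders copied from the message above, not real ARNs.
STREAM_ARN = "arn:aws:firehose:xx:xx:deliverystream/test_stream"
FIREHOSE_ROLE_ARN = "arn:aws:iam:xx:role/firehoseRole"


def task_role_assume_policy(role_arn):
    """Hypothetical policy for the ECS task role: allow assuming role_arn."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "sts:AssumeRole", "Resource": role_arn}
        ],
    }


def firehose_write_policy(stream_arn):
    """Hypothetical policy for firehoseRole: the two actions named in the thread."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "firehose:DescribeDeliveryStream",
                    "firehose:PutRecordBatch",
                ],
                "Resource": stream_arn,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(task_role_assume_policy(FIREHOSE_ROLE_ARN), indent=2))
    print(json.dumps(firehose_write_policy(STREAM_ARN), indent=2))
```

Printing both documents side by side makes it easier to spot a typo in an ARN or action name before attaching them to the roles.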

zwass

05/24/2021, 3:17 PM
@nyanshak @billcobbler any ideas what might be going on here? I know y'all use the assume role.

nyanshak

05/24/2021, 3:21 PM
err... I use osquery's assume role but not fleet, so I'm not very familiar with that aspect. I also don't use ECS Fargate 😐 So... not very familiar with fleet's errors / what they mean here. But based on the info provided, I would assume that this should work. Maybe one of the things you think is configured is not configured as you expect (e.g., a typo). But afaict, the config looks okay.
3:22 PM
I don't know if there's maybe an issue specific to fargate or fleet's code though

KK

05/24/2021, 3:46 PM
Possibly related to a troubleshooting hiccup of mine, but I'm now seeing that my ECS tasks cannot pull the Docker Hub fleet image and are failing to start with the following "Stopped reason":
CannotPullContainerError: ref pull has been retried 1 time(s): failed to copy: httpReaderSeeker: failed open: unexpected status code <https://registry-1.docker.io/v2/fleetdm/fleet/manifests/sha256:a6694d267a20c0f656304c6392efbec9f9bb88a2499f61c0e6f0dab2>...
and
CannotPullContainerError: inspect image has been retried 5 time(s): httpReaderSeeker: failed open: unexpected status code <https://registry-1.docker.io/v2/fleetdm/fleet/manifests/sha256:a6694d267a20c0f656304c6392efbec9f9bb88a2499f61c0e6f0dab25e993a20>: 4...
Has anyone encountered the issue before? My tasks run on a private subnet with a route to a NAT gateway. My execution role and role policy follow what was suggested by AWS:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "ecs-tasks.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}

Zach Zeid

05/24/2021, 4:46 PM
you could be running into docker hub's rate limiting

KK

05/24/2021, 4:49 PM
@Zach Zeid that seemed to be the case. I lowered the desired count to 0 and bumped it back up, and the task has no issues pulling the image now. Appreciate your comment!
5:42 PM
As for the original post, I managed to resolve the error by adding the firehose:DescribeDeliveryStream and firehose:PutRecordBatch permissions directly to the ECS task role. The task role still contains the permission to assume the firehoseRole role that grants the same permissions, which is likely redundant. I'm now looking into whether I can do without the firehoseRole completely and point FLEET_FIREHOSE_STS_ASSUME_ROLE_ARN directly to the ECS task role.
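
A quick way to sanity-check a task role policy like the one described here is to confirm that it allows both Firehose actions. The sketch below is a simplified illustration, not an IAM evaluator: it only matches action names literally (no wildcards, Deny statements, or resource checks), and the helper name and example policy are hypothetical.

```python
REQUIRED_ACTIONS = {
    "firehose:DescribeDeliveryStream",
    "firehose:PutRecordBatch",
}


def missing_actions(policy, required=REQUIRED_ACTIONS):
    """Return the required actions that no Allow statement lists literally."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear unwrapped
        statements = [statements]
    granted = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        granted.update(actions)
    return set(required) - granted


# Hypothetical task role policy mirroring the fix described in the thread.
example_task_role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "firehose:DescribeDeliveryStream",
                "firehose:PutRecordBatch",
            ],
            "Resource": "arn:aws:firehose:xx:xx:deliverystream/test_stream",
        }
    ],
}

if __name__ == "__main__":
    print(missing_actions(example_task_role_policy))
```

An empty result means both actions are listed; anything else names what still needs to be added to the task role.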

nyanshak

05/24/2021, 5:43 PM
(not being super familiar with this aspect) I would generally assume that STS is optional and that if you already have the correct permissions on the ECS task role, you could forgo setting FLEET_FIREHOSE_STS_ASSUME_ROLE_ARN entirely.
5:43 PM
but I would love to get confirmation of that
5:44 PM
basically - you're already in AWS, able to give yourself all the right permissions, no reason to use STS assume role if you don't have to
5:44 PM
obv there are cases where that's not true, but ... 🤷‍♀️

KK

05/24/2021, 6:18 PM
I can confirm my ECS task can successfully push to Firehose without the ARN env var. Good call @nyanshak!
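
The resulting configuration can be sketched as the `environment` list of an ECS container definition, with the assume-role variable left out unless an ARN is supplied. The function name is hypothetical and the values are the redacted placeholders from the thread; that Fleet then uses the task role's credentials is what the thread reports, not something this sketch verifies.

```python
def fleet_firehose_env(region, result_stream, status_stream, assume_role_arn=None):
    """Build name/value pairs in the shape ECS container definitions expect."""
    env = {
        "FLEET_OSQUERY_RESULT_LOG_PLUGIN": "firehose",
        "FLEET_OSQUERY_STATUS_LOG_PLUGIN": "firehose",
        "FLEET_FIREHOSE_REGION": region,
        "FLEET_FIREHOSE_RESULT_STREAM": result_stream,
        "FLEET_FIREHOSE_STATUS_STREAM": status_stream,
    }
    # Only set the STS variable when a separate role really must be assumed.
    if assume_role_arn:
        env["FLEET_FIREHOSE_STS_ASSUME_ROLE_ARN"] = assume_role_arn
    return [{"name": name, "value": value} for name, value in env.items()]


if __name__ == "__main__":
    stream = "arn:aws:firehose:xx:xx:deliverystream/test_stream"  # placeholder
    for pair in fleet_firehose_env("xx", stream, stream):
        print(pair["name"], "=", pair["value"])
```

Keeping the variable optional covers both cases mentioned above: same-account setups where the task role already has the permissions, and setups where a separate role is still required.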