Problem

I have deployed an ECS cluster and am running a job orchestration platform on it. One of the platform's containers uses the Python Docker API to pull an image from our private ECR repo and run a job inside a container started from that image. Once the job is running, it fails because it cannot find the assume-role credentials configured inside the container in /root/.aws/config with credential_source=EcsContainer. The failure occurs when the code first tries to call S3.

Why might this be happening? The credential source is defined in the container, so why is it not found?

Details

Error

......

The above exception was caused by the following exception:
botocore.exceptions.CredentialRetrievalError: Error when retrieving credentials from EcsContainer: No credentials found in credential_source referenced in profile default
  File "/usr/local/lib/python3.6/site-packages/dagster/core/execution/plan/utils.py", line 42, in solid_execution_error_boundary
    yield
  File "/usr/local/lib/python3.6/site-packages/dagster/utils/__init__.py", line 383, in iterate_with_context
    next_output = next(iterator)
  File "/usr/local/lib/python3.6/site-packages/dagster/core/execution/plan/compute_generator.py", line 65, in _coerce_solid_compute_fn_to_iterator
    result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
  File "/opt/dagster/app/solids/files.py", line 33, in stream_url_to_s3
    with smart.open(f's3://{s3_bucket}/{s3_key}', 'wb', transport_params=tp) as s3location:
  File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 235, in open
    binary = _open_binary_stream(uri, binary_mode, transport_params)
  File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 398, in _open_binary_stream
    fobj = submodule.open_uri(uri, mode, transport_params)
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 224, in open_uri
    return open(parsed_uri['bucket_id'], parsed_uri['key_id'], mode, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 308, in open
    writebuffer=writebuffer,
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 757, in __init__
    _initialize_boto3(self, client, client_kwargs, bucket, key)
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 528, in _initialize_boto3
    client = boto3.client('s3', **init_kwargs)
  File "/usr/local/lib/python3.6/site-packages/boto3/__init__.py", line 91, in client
    return _get_default_session().client(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/boto3/session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "/usr/local/lib/python3.6/site-packages/botocore/session.py", line 826, in create_client
    credentials = self.get_credentials()
  File "/usr/local/lib/python3.6/site-packages/botocore/session.py", line 431, in get_credentials
    'credential_provider').load_credentials()
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1962, in load_credentials
    creds = provider.load()
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1395, in load
    return self._load_creds_via_assume_role(self._profile_name)
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1410, in _load_creds_via_assume_role
    role_config, profile_name
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1566, in _resolve_source_credentials
    credential_source, profile_name
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1623, in _resolve_credentials_from_source
    'in profile %s' % profile_name
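
The message on the first line of the traceback is what botocore reports when the provider behind credential_source returns no credentials. For EcsContainer, botocore's container provider only attempts a lookup when AWS_CONTAINER_CREDENTIALS_RELATIVE_URI or AWS_CONTAINER_CREDENTIALS_FULL_URI is present in the process environment. A minimal diagnostic sketch to run inside the job container follows; the helper name find_container_credential_vars is hypothetical, and whether a missing variable is the actual cause here is an assumption, not something the details above confirm.

```python
import os

# Environment variables that botocore's container credential provider checks.
CONTAINER_CREDENTIAL_VARS = (
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
)


def find_container_credential_vars(environ=None):
    """Return the container-credential variables that are set and non-empty.

    Hypothetical helper for diagnosis only; it does not contact any endpoint.
    """
    environ = os.environ if environ is None else environ
    return {
        name: environ[name]
        for name in CONTAINER_CREDENTIAL_VARS
        if environ.get(name)
    }


if __name__ == "__main__":
    found = find_container_credential_vars()
    if found:
        print("Container credential variables present:", found)
    else:
        print("Neither variable is set, so EcsContainer has nothing to load.")
```

If the check reports that neither variable is set in the job container but at least one is set in the container that launches it, the job container was probably started without the environment that the ECS agent injects into task containers.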

Configuration

Container Role:

  EcsTaskRole:
    Type: AWS::IAM::Role
    Properties:
      Description: The role assumed by the containers, allowing them to call AWS services.
      RoleName: !Sub ecs-task-trans-role-development
      AssumeRolePolicyDocument:
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - ecs-tasks.amazonaws.com
          Action:
          - sts:AssumeRole
      Policies:
      - PolicyName: !Sub 's3-access-${EnvironmentName}-${AWS::StackName}'
        PolicyDocument:
          Statement:
          - Effect: Allow
            Action:
              - s3:*
            Resource:
              - "*"

/root/.aws/config in the container:

[default]
role_arn = arn:aws:iam::<my account>:role/ecs-task-trans-role-development
credential_source = EcsContainer

There is no /root/.aws/credentials file, because the point of assuming a role from the config file is to retrieve temporary credentials (see https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-role.html).
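
To rule out a problem with the config file itself, the profile can be parsed the same way an INI reader would see it. This is a rough sketch using only the standard library; describe_profile is a hypothetical helper, and it only approximates how botocore reads the file (named profiles are assumed to use the "profile <name>" section form).

```python
import configparser


def describe_profile(config_text, profile="default"):
    """Return the assume-role settings of a profile, or None if it is absent.

    Hypothetical helper: an approximation of what botocore reads, for
    inspection only.
    """
    # Interpolation is disabled so that '%' characters are taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # In ~/.aws/config the default profile is "[default]"; other profiles
    # are written as "[profile name]".
    section = profile if profile == "default" else f"profile {profile}"
    if not parser.has_section(section):
        return None
    return {
        "role_arn": parser.get(section, "role_arn", fallback=None),
        "credential_source": parser.get(section, "credential_source", fallback=None),
        "source_profile": parser.get(section, "source_profile", fallback=None),
    }


if __name__ == "__main__":
    sample = (
        "[default]\n"
        "role_arn = arn:aws:iam::<my account>:role/ecs-task-trans-role-development\n"
        "credential_source = EcsContainer\n"
    )
    print(describe_profile(sample))
```

With the file shown above, the profile resolves to role_arn plus credential_source = EcsContainer and no source_profile, which matches the code path in the traceback (assume role, then resolve credentials from the named source).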

Partial TaskDefinition:


  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      ...
      ContainerDefinitions:
          ...
          MountPoints:
            - ContainerPath: "/var/run/docker.sock"
              SourceVolume: docker_sock
              ReadOnly: true
            - ContainerPath: "/root/.docker"
              SourceVolume: docker_dir
              ReadOnly: true
            - ContainerPath: "/usr/bin/docker-credential-ecr-login"
              SourceVolume: docker_creds
              ReadOnly: true

What I have tried

  1. Using the taskExecutionRole rather than the container role.
  2. Exporting AWS_PROFILE=default in the container.
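
One option not in the list above is to forward the ECS credential variables from the launching container to the job container when it is started through the Docker socket. The sketch below assumes the launcher uses the Docker SDK for Python and that the ECS credentials endpoint is reachable from the job container's network, neither of which is confirmed by the details given. build_job_environment and run_job are hypothetical names.

```python
import os

# Variables the ECS agent sets in task containers and botocore looks for.
CONTAINER_CREDENTIAL_VARS = (
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
)


def build_job_environment(environ=None, extra=None):
    """Copy the container-credential variables that exist, plus any extras."""
    environ = os.environ if environ is None else environ
    job_env = {
        name: environ[name]
        for name in CONTAINER_CREDENTIAL_VARS
        if name in environ
    }
    job_env.update(extra or {})
    return job_env


def run_job(image, command):
    """Start the job container with the forwarded variables (hypothetical)."""
    # Imported here so build_job_environment can be used without the SDK.
    import docker

    client = docker.from_env()
    return client.containers.run(
        image,
        command,
        environment=build_job_environment(),
        detach=True,
    )
```

Even with the variables forwarded, the profile above would still call sts:AssumeRole on role_arn using the task role's credentials, so the role's trust policy may also need to allow that; this sketch does not address it.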
nickewound