Similar to what is described in this article[0], the company I work for uses a bastion AWS account to store IAM users, and other AWS accounts to separate the different running environments (prod, dev, etc.). This matters because we have multiple AWS accounts, and in a few cases several of them need access to a single S3 bucket.

One way to enable this is to set a bucket policy that allows access to the bucket through the S3 endpoint of a particular AWS account's VPC.

  1. Bucket Policy for data-warehouse

    {
        "Sid": "access-from-dev-VPCE",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::data-warehouse",
            "arn:aws:s3:::data-warehouse/*"
        ],
        "Condition": {
            "StringEquals": {
                "aws:sourceVpce": "vpce-d95b05b0"
            }
        }
    }
    
  2. Role policy for role EMRRole

    {
        "Sid": "AllowRoleToListBucket",
        "Effect": "Allow",
        "Action": "s3:ListBucket",
        "Resource": [
            "arn:aws:s3:::data-warehouse"
        ]
    },
    {
        "Sid": "AllowRoleToGetBucketObjects",
        "Effect": "Allow",
        "Action": [
            "s3:GetObject",
            "s3:GetObjectVersion"
        ],
        "Resource": "arn:aws:s3:::data-warehouse/*"
    }
    

Unfortunately, this doesn't work until I explicitly set the ACL on each object to grant full control to the owner of the AWS account I'm accessing it from. If I don't, I get:

fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden

The EMR instance I'm running this on has the correct role:

[hadoop@ip-10-137-221-91 tmp]$  aws sts get-caller-identity
{
    "Account": "1234567890",
    "UserId": "AROAIGVIL6ZDI6SR87KXO:i-0eaf8a5ca52876835",
    "Arn": "arn:aws:sts::1234567890:assumed-role/EMRRole/i-0eaf8a5ca52876835"
}

The ACL for an object in the data-warehouse bucket looks like this:

aws s3api get-object-acl --bucket=data-warehouse --key=content_category/build=2017-11-23/part0000.gz.parquet
{
    "Owner": {
        "DisplayName": "aws+dev",
        "ID": "YXJzdGFyc3RhcnRzadc6frYXJzdGFyc3RhcnN0"
    },
    "Grants": [
        {
            "Grantee": {
                "Type": "CanonicalUser",
                "DisplayName": "aws+dev",
                "ID": "YXJzdGFyc3RhcnRzadc6frYXJzdGFyc3RhcnN0"
            },
            "Permission": "FULL_CONTROL"
        }
    ]
}

With the above ACL, the dev AWS account can read the object, but another AWS account, say prod, cannot read it until it has been added as a "Grantee".
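To make the grantee reasoning concrete, here is a minimal Python sketch that checks whether a given canonical user ID appears in an ACL document shaped like the `get-object-acl` output above. The function name `account_can_read` and the prod ID are hypothetical, and the check models only the object ACL: it ignores group grantees, bucket policies, and IAM policies, all of which also affect real access decisions.

```python
def account_can_read(acl, canonical_user_id):
    """Return True if the ACL grants READ or FULL_CONTROL to the canonical user."""
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if (
            grantee.get("Type") == "CanonicalUser"
            and grantee.get("ID") == canonical_user_id
            and grant.get("Permission") in ("READ", "FULL_CONTROL")
        ):
            return True
    return False


# Canonical ID of the dev account, copied from the ACL output above.
DEV_ID = "YXJzdGFyc3RhcnRzadc6frYXJzdGFyc3RhcnN0"
# Hypothetical placeholder for the prod account's canonical ID.
PROD_ID = "hypothetical-prod-canonical-id"

acl = {
    "Owner": {"DisplayName": "aws+dev", "ID": DEV_ID},
    "Grants": [
        {
            "Grantee": {
                "Type": "CanonicalUser",
                "DisplayName": "aws+dev",
                "ID": DEV_ID,
            },
            "Permission": "FULL_CONTROL",
        }
    ],
}

print(account_can_read(acl, DEV_ID))   # True
print(account_can_read(acl, PROD_ID))  # False
```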

My question: Is there a way to read/write objects to an S3 bucket from multiple AWS accounts without having to set ACLs on each individual object?

Note: we use Spark to write to S3 via s3a.

[0] https://engineering.coinbase.com/you-need-more-than-one-aws-account-aws-bastions-and-assume-role-23946c6dfde3

c4urself
    What about `x-amz-acl: bucket-owner-full-control`? The uploading account has exclusive control of read permissions of objects without this, not the bucket owner account, so your bucket policy has no effect, since it is granting a privilege the bucket owner lacks authority to grant. – Michael - sqlbot Dec 13 '17 at 23:47

1 Answer

While I have not found a way around setting ACLs on a per-object basis, there is a way to enforce that ACLs are correctly set on upload using a Bucket Policy. This example policy shows how to allow an AWS account to upload objects to your bucket and requires that the bucket owner is granted full control of all uploaded objects:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowSourceAccount0123456789ToPutObjects",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::0123456789:root"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::data-warehouse/*"
        },
        {
            "Sid": "RequireAllUploadedObjectsToAssignFullControlToBucketOwner",
            "Effect": "Deny",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::data-warehouse/*",
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-acl": "bucket-owner-full-control"
                }
            }
        }
    ]
}
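Since the question involves several source accounts, the same policy shape can be generated rather than hand-edited. The following is a minimal sketch, not a definitive tool: the helper name `build_bucket_policy` and the second account ID are hypothetical, and it simply emits one Allow statement per account followed by the single Deny statement from the policy above.

```python
import json


def build_bucket_policy(bucket, account_ids):
    """Build a policy dict: one Allow per source account plus one shared Deny."""
    resource = f"arn:aws:s3:::{bucket}/*"
    statements = [
        {
            "Sid": f"AllowSourceAccount{account_id}ToPutObjects",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
            "Action": "s3:PutObject",
            "Resource": resource,
        }
        for account_id in account_ids
    ]
    # Explicit deny for any upload that does not carry the canned ACL header.
    statements.append(
        {
            "Sid": "RequireAllUploadedObjectsToAssignFullControlToBucketOwner",
            "Effect": "Deny",
            "Principal": {"AWS": "*"},
            "Action": "s3:PutObject",
            "Resource": resource,
            "Condition": {
                "StringNotEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
            },
        }
    )
    return {"Version": "2012-10-17", "Statement": statements}


# "9876543210" is a made-up second account ID used only for illustration.
policy = build_bucket_policy("data-warehouse", ["0123456789", "9876543210"])
print(json.dumps(policy, indent=4))
```

The printed JSON can be reviewed and then applied to the bucket by whatever mechanism is already in use.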

The key is the explicit deny, which checks for the x-amz-acl: bucket-owner-full-control header (mentioned by Michael - sqlbot in the comments) and fails any upload where it is not set. When uploading files with the AWS CLI, this means the --acl bucket-owner-full-control flag must be set.

Example:

aws s3 cp example-file.txt s3://data-warehouse/example-file.txt --profile aws-profile-name --acl bucket-owner-full-control
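For uploads made from code rather than the CLI, the same canned ACL can be passed through boto3. This is a hedged sketch: the wrapper name `upload_with_owner_control` is hypothetical, and it assumes the client passed in behaves like a boto3 S3 client, whose `upload_file` method accepts an `ExtraArgs` dictionary with an `ACL` entry.

```python
def upload_with_owner_control(s3_client, filename, bucket, key):
    """Upload a file and grant the bucket owner full control via a canned ACL."""
    s3_client.upload_file(
        filename,
        bucket,
        key,
        ExtraArgs={"ACL": "bucket-owner-full-control"},
    )


# Usage (requires boto3 and valid credentials; the profile name is a placeholder):
#   import boto3
#   session = boto3.Session(profile_name="aws-profile-name")
#   upload_with_owner_control(
#       session.client("s3"),
#       "example-file.txt",
#       "data-warehouse",
#       "example-file.txt",
#   )
```

For the Spark/s3a writes mentioned in the question, Hadoop's S3A connector documents an `fs.s3a.acl.default` property that accepts `BucketOwnerFullControl`. Whether it is honored depends on the Hadoop version in use, so treat it as something to verify rather than a confirmed fix.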

Hopefully AWS will provide a way to address ACLs more gracefully at some point.

0x574F4F54