
I can't figure out how AWS sets up its Docker "thin pool" on Elastic Beanstalk, or how it is getting filled. My Docker thin pool is filling up somehow and causing my apps to crash when they try to write to disk.

This is from inside the container:

df -h
/dev/xvda1                  25G  1.4G   24G   6%

The EBS does, in fact, have a 25 GB disk apportioned to it; du -sh / returns 1.6 GB.

Outside in EC2, it starts off innocuously enough... (via lvs)

LV          VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
docker-pool docker twi-aot--- 11.86g             37.50  14.65

However, the filesystem soon re-mounts as read-only. Via dmesg:

[2077620.433382] Buffer I/O error on device dm-4, logical block 2501385
[2077620.437372] EXT4-fs warning (device dm-4): ext4_end_bio:329: I/O error -28 writing to inode 4988708 (offset 0 size 8388608 starting block 2501632)
[2077620.444394] EXT4-fs warning (device dm-4): ext4_end_bio:329: I/O error
[2077620.473581] EXT4-fs warning (device dm-4): ext4_end_bio:329: I/O error -28 writing to inode 4988708 (offset 8388608 size 5840896 starting block 2502912)

[2077623.814437] Aborting journal on device dm-4-8.
[2077649.052965] EXT4-fs error (device dm-4): ext4_journal_check_start:56: Detected aborted journal
[2077649.058116] EXT4-fs (dm-4): Remounting filesystem read-only

Back out in EC2 instance-land, Docker reports this: (from docker info)

Pool Name: docker-docker--pool
Pool Blocksize: 524.3 kB
Base Device Size: 107.4 GB
Backing Filesystem: ext4
Data file:
Metadata file:
Data Space Used: 12.73 GB
Data Space Total: 12.73 GB
Data Space Available: 0 B
Metadata Space Used: 3.015 MB
Metadata Space Total: 16.78 MB
Metadata Space Available: 13.76 MB
Thin Pool Minimum Free Space: 1.273 GB

lvdisplay dumps this info:

  --- Logical volume ---
  LV Name                docker-pool
  VG Name                docker
  LV UUID                xxxxxxxxxxxxxxxxxxxxxxxxxxxx
  LV Write Access        read/write
  LV Creation host, time ip-10-0-0-65, 2017-03-25 22:37:38 +0000
  LV Pool metadata       docker-pool_tmeta
  LV Pool data           docker-pool_tdata
  LV Status              available
  # open                 2
  LV Size                11.86 GiB
  Allocated pool data    100.00%
  Allocated metadata     17.77%
  Current LE             3036
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2

What is this thin pool, why does it fill up, and how do I stop it from doing so? Also, if I have 20+ GB free from inside the container on my / volume, why does it stop new writes? As far as I can tell it is not connected to files that my programs are writing to.

Thank you!

std''OrgnlDave

6 Answers


The .ebextensions approach suggested by David Ellis worked for me. I'm unable to comment on his answer, but I wanted to add that you can create a new EBS volume instead of using a snapshot. To mount a 40 GB EBS volume, I used the following:

option_settings:
  - namespace: aws:autoscaling:launchconfiguration
    option_name: BlockDeviceMappings
    value: /dev/xvdcz=:40:true

See also this documentation, which has an example of mapping a new 100GB EBS volume to /dev/sdh.

The true at the end means "delete on terminate".

I created a new .ebextensions directory containing an ebs.config file with the above code, then zipped that directory together with my Dockerrun.aws.json. Note that the Dockerrun file must be at the top level of the zip, not inside a subdirectory.
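For reference, the resulting bundle looks roughly like this (the zip name is arbitrary; the part that matters is Dockerrun.aws.json at the top level):

my-app.zip
├── Dockerrun.aws.json        <- top level of the zip
└── .ebextensions/
    └── ebs.config            <- contains the option_settings block above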

To find where Elastic Beanstalk is mounting the volume, use lsblk on the failing instance. It was also /dev/xvdcz for me, so maybe that is the standard.

joko

We got hit by the same issue. The root cause seems to be Docker not mounting its storage engine (thin-provisioned devicemapper, the default in Elastic Beanstalk) with the discard option, so blocks freed by deleted files are never returned to the pool, which fills up until it breaks.

I wasn't able to find a definitive solution to this, but here is a workaround (see this comment) that I was able to use on affected instances:

docker ps -qa | xargs docker inspect --format='{{ .State.Pid }}' | xargs -IZ fstrim /proc/Z/root/
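If you want to check that this is really what is biting you, here is a rough before/after test (my own sketch, not part of the original workaround; it assumes the pool is named docker/docker-pool as in the question, and uses docker ps -q so only running containers are trimmed):

# run as root on the EC2 host, not inside a container
lvs docker/docker-pool        # note the Data% column
docker ps -q | xargs docker inspect --format='{{ .State.Pid }}' | xargs -IZ fstrim /proc/Z/root/
lvs docker/docker-pool        # Data% should drop if undiscarded blocks were the problem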
F.X.
    Thanks. I came to the same conclusion and ended up changing all data storage over to EBS. I think that's a little silly for truly transient/temporary files (that keep getting overwritten) but hey what can you o? – std''OrgnlDave May 16 '17 at 20:13
  • It turns out that a cronjob for this is in the EC2 documentation, but it is not mentioned in Beanstalk docs. On Beanstalk you would have to see if you could add a hook for a special crontab or something. – std''OrgnlDave May 19 '17 at 17:47
  • Oh, nice to know! Would you mind copying the link here as a reference? – F.X. May 19 '17 at 17:54
  • http://docs.aws.amazon.com/AmazonECS/latest/developerguide/troubleshooting.html search for "trim". Not exactly a straightforward mention of a very obvious thing – std''OrgnlDave May 19 '17 at 18:23
  • How do you persist this to your elastic beanstalk config? – Thomas Grainger Sep 19 '17 at 13:34
  • @ThomasGrainger .ebextensions files. One of the most pain-in-the-butt annoying possible creations in the world. They run on system bootup. – std''OrgnlDave Dec 01 '17 at 04:17
  • Does anyone have a link to the documentation above? Searching for "trim" doesn't seem to come up with anything, and we have the same issue (solved by the command above) – jleck Jan 22 '18 at 14:19

I followed the suggestions provided in the AWS documentation and everything is working now.
But I had to combine two solutions: increase the space and add a cronjob to remove old files.
Here's what I did.

First, I changed the xvdcz volume to use 50 GB instead of 12 GB. That's the storage we can see in docker system info. In my case it was always full because I upload lots of files every day.

.ebextensions/blockdevice-xvdcz.config

option_settings:
  aws:autoscaling:launchconfiguration:
    BlockDeviceMappings: /dev/xvdcz=:50:true
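Once the environment has rebuilt with the new launch configuration, you can confirm the bigger pool on the instance (my own quick check, not from the AWS docs; exact numbers will differ):

# the thin pool should now report roughly 50 GB instead of ~12 GB
sudo docker info | grep "Data Space"
sudo lvs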

After that, I added a cronjob to clean up my deleted files that were no longer in use. It was required because Docker still kept them for some reason. In my case once a day is enough. If you have more uploads than I do, you can configure the cronjob to run as many times as you need.

.ebextensions/cronjob.config

files:
    "/etc/cron.d/mycron":
        mode: "000644"
        owner: root
        group: root
        content: |
            0 23 * * * root /usr/local/bin/remove_old_files.sh

     "/usr/local/bin/remove_old_files.sh":
        mode: "000755"
        owner: root
        group: root
        content: |
            #!/bin/bash
            docker ps -q | xargs docker inspect --format='{{ .State.Pid }}' | xargs -IZ sudo fstrim /proc/Z/root/
            exit 0

commands:
    remove_old_cron:
        command: "rm -f /etc/cron.d/*.bak"

Source: https://docs.aws.amazon.com/pt_br/elasticbeanstalk/latest/dg/create_deploy_docker.container.console.html#docker-volumes


The AWS Elastic Beanstalk Docker "Environment Configuration" documentation explains how this works:

For improved performance, Elastic Beanstalk configures two Amazon EBS storage volumes for your Docker environment's EC2 instances. In addition to the root volume provisioned for all Elastic Beanstalk environments, a second 12GB volume named xvdcz is provisioned for image storage on Docker environments.

If you need more storage space or increased IOPS for Docker images, you can customize the image storage volume by using the BlockDeviceMapping configuration option in the aws:autoscaling:launchconfiguration namespace.

For example, the following configuration file increases the storage volume's size to 100 GB with 500 provisioned IOPS:

Example .ebextensions/blockdevice-xvdcz.config

option_settings:
  aws:autoscaling:launchconfiguration:
    BlockDeviceMappings: /dev/xvdcz=:100::io1:500

If you use the BlockDeviceMappings option to configure additional volumes for your application, you should include a mapping for xvdcz to ensure that it is created. The following example configures two volumes, the image storage volume xvdcz with default settings and an additional 24 GB application volume named sdh:

Example .ebextensions/blockdevice-sdh.config

option_settings:
  aws:autoscaling:launchconfiguration:
    BlockDeviceMappings: /dev/xvdcz=:12:true:gp2,/dev/sdh=:24
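Reading the two examples together, the mapping string appears to follow this pattern (my interpretation of the docs, not an official grammar; empty fields fall back to defaults):

# <device>=<snapshot-id>:<size in GB>:<delete on termination>:<volume type>:<iops>
/dev/xvdcz=:100::io1:500    # no snapshot, 100 GB, default deletion behaviour, io1 with 500 IOPS
/dev/xvdcz=:12:true:gp2     # no snapshot, 12 GB, delete on termination, gp2
/dev/sdh=:24                # extra 24 GB volume, everything else default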
JavaRocky

I beat my head against this problem for over a day and finally figured it out.

AWS uses the devicemapper backend and creates a 12 GB SSD volume that it mounts and uses for the Docker images. You have to override that volume through the Elastic Beanstalk extensions mechanism and deploy via the CLI (there is no way to do this through their GUI, unfortunately).

In the directory that holds your Dockerrun.aws.json file, create a directory called .ebextensions, and inside it create a file whose name ends in .config. I called mine 01.correctebsvolume.config. Then put the following contents in it:

option_settings:
  - namespace: aws:autoscaling:launchconfiguration
    option_name: BlockDeviceMappings
    value: /dev/xvdcz=snap-066cZZZZZZZZ:40:true:gp2

I ssh'ed into one of my failing boxes directly and found it was mounting /dev/xvdcz. It may be different for you. The snap-066cZZZZZZZZ needs to be a valid snapshot ID. I created an AMI image of the failing instance and used the snapshot that it created in the process. The 40 is how many GB the volume will be, so substitute whatever you need. I don't know what the true or the gp2 do, but they came from the AMI's block device data, so I kept them.

The magic namespace and option_name come from here in the documentation.

  • So...this mounts the root Docker volume on EBS instead of the thin pool? – std''OrgnlDave Dec 01 '17 at 04:16
  • The docker thinpool is set up to run on an EBS volume (of exactly 12GB). This replaces that volume with a larger one, and is the least-invasive way to get it working. –  Dec 02 '17 at 06:13
  • Oh, the thinpool configuration Amazon sets up is for 100GB, so that's the upper limit for this answer, and I'm not sure if that can be adjusted. –  Dec 02 '17 at 06:13

Just increasing the size of the disk will not solve the problem; it will just error out later. AWS recommends mapping a new disk to your container so that file creates/deletes do not affect the Docker thin pool layer.

I am currently looking into it. I haven't tested it yet, but the solution I came across is having this in my blockdevice.config:

commands:
  01mount:
    command: "mount /dev/sdh /tmp"
option_settings:
  aws:autoscaling:launchconfiguration:
    BlockDeviceMappings: /dev/xvda=:16:true:gp2,/dev/xvdcz=:12:true:gp2,/dev/sdh=:12:true:ephemeral0
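One caveat I'd check first (my own note, and an assumption rather than something from the AWS docs): mount only works if /dev/sdh already carries a filesystem, and I'm not sure Elastic Beanstalk formats the ephemeral volume for you, so a guard along these lines may be needed before the 01mount command:

commands:
  00format:
    # hypothetical extra step: create an ext4 filesystem only if the device
    # doesn't already have one (blkid exits non-zero when it finds nothing)
    command: "blkid /dev/sdh || mkfs -t ext4 /dev/sdh"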

Appreciate any comments.

neisantos