Why is the Alpine Docker image over 50% slower than the Ubuntu image?

I noticed that my Python application is much slower when running it on python:2-alpine3.6 than when running it without Docker on Ubuntu. I came up with two small benchmark commands, and there's a huge difference between the two operating systems, both when I run them on an Ubuntu server and when I use Docker for Mac.

$ BENCHMARK="import timeit; print(timeit.timeit('import json; json.dumps(list(range(10000)))', number=5000))"
$ docker run python:2-alpine3.6 python -c "$BENCHMARK"
7.6094589233
$ docker run python:2-slim python -c "$BENCHMARK"
4.3410820961
$ docker run python:3-alpine3.6 python -c "$BENCHMARK"
7.0276606959
$ docker run python:3-slim python -c "$BENCHMARK"
5.6621271420
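
For what it's worth, here is a less noise-prone variant of the same benchmark: timeit.repeat with the minimum of three runs (the only change from the commands above), which filters out one-off interference:

$ BENCHMARK="import timeit; print(min(timeit.repeat('import json; json.dumps(list(range(10000)))', number=5000, repeat=3)))"
$ docker run python:2-alpine3.6 python -c "$BENCHMARK"
$ docker run python:2-slim python -c "$BENCHMARK"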

I also tried the following 'benchmark', which doesn't use Python:

$ docker run -ti ubuntu bash
root@6b633e9197cc:/# time $(i=0; while (( i < 9999999 )); do (( i ++ )); done)

real    0m39.053s
user    0m39.050s
sys     0m0.000s
$ docker run -ti alpine sh
/ # apk add --no-cache bash > /dev/null
/ # bash
bash-4.3# time $(i=0; while (( i < 9999999 )); do (( i ++ )); done)

real    1m4.277s
user    1m4.290s
sys     0m0.000s
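
Note that this loop benchmarks bash itself as much as the OS, and each image ships its own bash build linked against its own libc. It's worth confirming the two bash versions match before comparing, e.g.:

$ docker run ubuntu bash -c 'bash --version | head -n1'
$ docker run alpine sh -c 'apk add --no-cache bash >/dev/null && bash --version | head -n1'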

What could be causing this difference?

Underyx

Posted 2017-06-15T15:23:09.283

Reputation: 631

@Seth look again: timing starts after bash is installed, inside the launched bash shell – Underyx – 2017-07-20T11:55:38.997

You should really run them both under perf(1) to determine where the time is spent. – R.. GitHub STOP HELPING ICE – 2020-01-29T20:27:50.577

Note that the result of that kind of benchmark can change dramatically from one version to another, and doesn't reflect in any way the actual performance of the underlying system. I know, I know, that sounds very handwavy, but here are the results on my system right now:

[jp@hex ~]$ docker run python python -c "$BENCHMARK"
5.577920467942022
[jp@hex ~]$ docker run python:alpine python -c "$BENCHMARK"
5.051835118094459

Alpine ends up being almost 10% faster, but I don't think it means anything :) – jpetazzo – 2020-02-16T17:42:25.340
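
One way to judge whether such runs differ beyond noise is to collect several samples per image and look at the spread; timeit.repeat plus the statistics module is enough for a rough check:

$ docker run python python -c "import timeit, statistics; r = timeit.repeat('import json; json.dumps(list(range(10000)))', number=5000, repeat=5); print(min(r), statistics.mean(r), statistics.stdev(r))"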

Answers

I've run the same benchmark as you did, using just Python 3:

$ docker run python:3-alpine3.6 python --version
Python 3.6.2
$ docker run python:3-slim python --version
Python 3.6.2

resulting in a difference of more than 2 seconds:

$ docker run python:3-slim python -c "$BENCHMARK"
3.6475560404360294
$ docker run python:3-alpine3.6 python -c "$BENCHMARK"
5.834922112524509

Alpine uses a different implementation of libc (the base system library), from the musl project. There are many differences between musl and glibc, so each library might perform better in certain use cases.
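
You can confirm which libc a given image uses: on the Debian-based image, ldd reports its glibc version, while on Alpine you can invoke musl's dynamic loader directly and it prints its own version (the loader path below assumes x86_64):

$ docker run python:3-slim sh -c 'ldd --version | head -n1'
$ docker run python:3-alpine3.6 sh -c '/lib/ld-musl-x86_64.so.1 2>&1 | head -n2'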

Here's an strace diff between those two commands. The output starts to differ at line 269. Of course there are different addresses in memory, but otherwise the traces are very similar. Most of the time is obviously spent waiting for the python command to finish.

After installing strace into both containers, we can obtain a more interesting trace (I've reduced the number of iterations in the benchmark to 10).
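
If you want to reproduce this: Docker's default seccomp profile blocks ptrace(2), so the containers need the SYS_PTRACE capability, and passing the benchmark in through the environment keeps the shell quoting manageable. Roughly (the package steps are each distribution's usual ones):

$ BENCHMARK="import timeit; print(timeit.timeit('import json; json.dumps(list(range(10000)))', number=10))"
$ docker run -e BENCHMARK="$BENCHMARK" --cap-add SYS_PTRACE python:3-slim sh -c 'apt-get update -qq && apt-get install -y -qq strace && strace -f python -c "$BENCHMARK"'
$ docker run -e BENCHMARK="$BENCHMARK" --cap-add SYS_PTRACE python:3-alpine3.6 sh -c 'apk add --no-cache strace && strace -f python -c "$BENCHMARK"'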

For example, glibc is loading libraries in the following manner (line 182):

openat(AT_FDCWD, "/usr/local/lib/python3.6", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, /* 205 entries */, 32768)   = 6824
getdents(3, /* 0 entries */, 32768)     = 0

The same operation under musl:

open("/usr/local/lib/python3.6", O_RDONLY|O_DIRECTORY|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
getdents64(3, /* 62 entries */, 2048)   = 2040
getdents64(3, /* 61 entries */, 2048)   = 2024
getdents64(3, /* 60 entries */, 2048)   = 2032
getdents64(3, /* 22 entries */, 2048)   = 728
getdents64(3, /* 0 entries */, 2048)    = 0

I'm not saying this is the key difference, but glibc reads the whole directory with one large (32 KiB) getdents() call, while musl issues several small (2 KiB) getdents64() calls, and reducing the number of I/O operations in core library routines might contribute to better performance. The diff also shows that executing the very same Python code can lead to slightly different system calls. Probably the most important gains would come from optimizing loop performance; I'm not qualified to judge whether the difference is caused by memory allocation or some other instruction.
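
Rather than eyeballing the full traces, strace's -c mode prints a per-syscall count and time summary, which makes differences like the getdents buffering above easy to quantify (same SYS_PTRACE caveat as before; run the equivalent for the Alpine image):

$ docker run -e BENCHMARK="$BENCHMARK" --cap-add SYS_PTRACE python:3-slim sh -c 'apt-get update -qq && apt-get install -y -qq strace && strace -c -f python -c "$BENCHMARK"'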

  • glibc with 10 iterations:

    write(1, "0.032388824969530106\n", 21) = 21
    
  • musl with 10 iterations:

    write(1, "0.035214247182011604\n", 21) = 21
    

musl is slower by about 0.0028 seconds in this run. As the difference grows with the number of iterations, I'd assume the difference lies in the memory allocation of JSON objects.

If we reduce the benchmark to solely importing json, we notice the difference is not that big:

$ BENCHMARK="import timeit; print(timeit.timeit('import json;', number=5000))"
$ docker run python:3-slim python -c "$BENCHMARK"
0.03683806210756302
$ docker run python:3-alpine3.6 python -c "$BENCHMARK"
0.038280246779322624

Loading Python libraries looks comparable. Generating the list produces a bigger difference:

$ BENCHMARK="import timeit; print(timeit.timeit('list(range(10000))', number=5000))"
$ docker run python:3-slim python -c "$BENCHMARK"
0.5666235145181417
$ docker run python:3-alpine3.6 python -c "$BENCHMARK"
0.6885563563555479

That leaves json.dumps() as the most expensive operation, which might point to differences in memory allocation between those libraries.
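
One way to test that is to time json.dumps() alone, building the list once in timeit's setup= argument so that list construction stays out of the measured loop:

$ BENCHMARK="import timeit; print(timeit.timeit('json.dumps(L)', setup='import json; L = list(range(10000))', number=5000))"
$ docker run python:3-slim python -c "$BENCHMARK"
$ docker run python:3-alpine3.6 python -c "$BENCHMARK"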

Looking at a libc benchmark, musl really is slightly slower in memory allocation (times in seconds):

                       |  musl  | glibc  |
-----------------------+--------+--------+
Tiny allocation & free |  0.005 |  0.002 |
-----------------------+--------+--------+
Big allocation & free  |  0.027 |  0.016 |
-----------------------+--------+--------+

I'm not sure what is meant by "big allocation", but musl is almost 2× slower, which might become significant when you repeat such operations thousands or millions of times.
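
The table comes from a libc benchmark suite (libc-bench from the musl project; see the comments below). Reproducing it should look roughly like this; the clone URL, build steps, and binary name are my assumptions, and each container needs a compiler toolchain:

$ docker run python:3-slim sh -c 'apt-get update -qq && apt-get install -y -qq gcc make git && git clone git://git.musl-libc.org/libc-bench && make -C libc-bench && ./libc-bench/libc-bench'
$ docker run python:3-alpine3.6 sh -c 'apk add --no-cache build-base git && git clone git://git.musl-libc.org/libc-bench && make -C libc-bench && ./libc-bench/libc-bench'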

Tombart

Posted 2017-06-15T15:23:09.283

Reputation: 982

Just a few corrections: musl is not Alpine's own implementation of glibc. First, musl is not a (re)implementation of glibc, but a different implementation of libc per the POSIX standard. Second, musl is not Alpine's own thing; it's a standalone, unrelated project, and it is not used only in Alpine. – Jakub Jirutka – 2017-10-25T19:56:37.280

Given that musl libc seems like a better, more standards-based*, not to mention newer, implementation, why does it seem to underperform glibc in these cases?

*cf. https://wiki.musl-libc.org/functional-differences-from-glibc.html

– Forest – 2018-10-19T13:41:45.930

Is the difference of 0.0028 seconds statistically significant? The relative deviation is only 0.0013% and you are taking 10 samples. What was the (estimated) standard deviation for those 10 runs (or even the max-min difference)? – Peter Mortensen – 2019-03-28T23:30:46.067

@PeterMortensen For questions regarding benchmark results, you should refer to the Eta Labs code: http://www.etalabs.net/libc-bench.html E.g. the malloc stress test is repeated 100k times. The results could be heavily dependent on library version, GCC version, and the CPU used, just to name a few aspects.

– Tombart – 2019-03-29T15:00:13.037

@Tombart: The empirical figures there (as opposed to the qualitative comparisons) are outdated by many years and probably should not be used to make decisions. – R.. GitHub STOP HELPING ICE – 2020-01-29T20:26:27.757