I'm writing a small Prometheus exporter in Go to publish network metrics for Docker containers.
There is a goroutine which gathers the values in the following way:
1. Gets all Docker containers using the Docker SDK
2. Locks the goroutine to its current OS thread
3. Remembers the current network namespace
4. For each container:
4.1. Switches to the container's network namespace (setns() syscall)
4.2. Reads the file /proc/net/netstat
4.3. Parses the content and makes it available in a shared map
5. Restores the namespace to the remembered one
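To make steps 2 to 5 concrete, here is a minimal sketch of the loop as described above. It is not my exact code and not a fix. The names `collect`, `parseNetstat`, the `pids` map (container ID to the container's init PID, as it could be obtained by inspecting each container) and the `setns` hook are hypothetical; the hook only exists so that the sketch compiles with the standard library alone. Opening /proc/thread-self/ns/net to remember the namespace is also an assumption.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// parseNetstat turns the header/value line pairs of /proc/net/netstat into
// a flat map keyed "<Prefix>_<Field>", e.g. "TcpExt_ListenDrops".
func parseNetstat(content string) (map[string]uint64, error) {
	out := make(map[string]uint64)
	lines := strings.Split(strings.TrimSpace(content), "\n")
	for i := 0; i+1 < len(lines); i += 2 {
		names := strings.Fields(lines[i])
		values := strings.Fields(lines[i+1])
		if len(names) == 0 || len(names) != len(values) || names[0] != values[0] {
			return nil, fmt.Errorf("malformed line pair starting at line %d", i+1)
		}
		prefix := strings.TrimSuffix(names[0], ":")
		for j := 1; j < len(names); j++ {
			v, err := strconv.ParseUint(values[j], 10, 64)
			if err != nil {
				return nil, fmt.Errorf("%s %s: %w", prefix, names[j], err)
			}
			out[prefix+"_"+names[j]] = v
		}
	}
	return out, nil
}

// collect mirrors steps 2-5. setns is a hypothetical hook that moves the
// calling thread into the network namespace referred to by fd.
func collect(pids map[string]int, setns func(fd int) error) (map[string]map[string]uint64, error) {
	// Step 2: setns only affects the calling thread, so pin the goroutine.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	// Step 3: remember the current namespace (path is an assumption).
	orig, err := os.Open("/proc/thread-self/ns/net")
	if err != nil {
		return nil, err
	}
	defer orig.Close()
	// Step 5: switch back when done (runs before orig is closed).
	defer setns(int(orig.Fd()))

	result := make(map[string]map[string]uint64)
	for id, pid := range pids {
		ns, err := os.Open(fmt.Sprintf("/proc/%d/ns/net", pid))
		if err != nil {
			continue // the container may have exited in the meantime
		}
		err = setns(int(ns.Fd())) // step 4.1
		ns.Close()
		if err != nil {
			return result, err
		}
		raw, err := os.ReadFile("/proc/net/netstat") // step 4.2
		if err != nil {
			return result, err
		}
		stats, err := parseNetstat(string(raw)) // step 4.3
		if err != nil {
			return result, err
		}
		result[id] = stats
	}
	return result, nil
}
```

In a real build the hook would wrap `unix.Setns(fd, unix.CLONE_NEWNET)` from golang.org/x/sys/unix, and the process needs enough privileges (root or CAP_SYS_ADMIN) to enter other namespaces.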
The server's request handler looks up the shared map and formats the data as Prometheus metrics.
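For completeness, the shared map and the formatting step look roughly like the sketch below. The type `store`, its methods, and the metric name prefix `container_netstat_` are hypothetical names chosen for illustration; the output follows the Prometheus text exposition format, with all lines of one metric grouped together.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// store is the shared map: container ID -> counter name -> value.
type store struct {
	mu   sync.RWMutex
	data map[string]map[string]uint64
}

// set replaces the counters of one container. The caller must not
// modify stats after handing it over.
func (s *store) set(id string, stats map[string]uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.data == nil {
		s.data = make(map[string]map[string]uint64)
	}
	s.data[id] = stats
}

// render formats a snapshot of the map as Prometheus text, sorted by
// metric name and then by container ID so the output is deterministic.
func (s *store) render() string {
	s.mu.RLock()
	defer s.mu.RUnlock()

	ids := make([]string, 0, len(s.data))
	seen := make(map[string]bool)
	var names []string
	for id, stats := range s.data {
		ids = append(ids, id)
		for name := range stats {
			if !seen[name] {
				seen[name] = true
				names = append(names, name)
			}
		}
	}
	sort.Strings(ids)
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		for _, id := range ids {
			if v, ok := s.data[id][name]; ok {
				// Container IDs are hex strings, so %q needs no extra escaping.
				fmt.Fprintf(&b, "container_netstat_%s{container=%q} %d\n", name, id, v)
			}
		}
	}
	return b.String()
}
```

The HTTP handler then only has to write the result of `render()` to the response, while the gathering goroutine calls `set` once per container.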
The problem is that it works as expected only about 20% of the time. It seems that the namespace switch doesn't take effect reliably, as if either the switch happens asynchronously or the content of /proc/net/netstat is cached somewhere. The rest of the time, the content of /proc/net/netstat comes from the "root" (starting) namespace or sticks to the namespace of one of the Docker containers.
Any suggestions where I could look further to make it work reliably? I'm puzzled by this behavior.