spark-submit
seems to require two-way communication with a remote Spark cluster in order to run jobs: the driver connects to the master, and the executors then connect back to the driver.
This is easy to configure between machines on the same network (10.x.x.x to 10.x.x.x and back), but it becomes confusing when Docker adds an extra layer of networking: traffic goes from 172.x.x.x through the host's 10.x.x.x to the cluster's 10.x.x.x, and the return path must somehow reach the 172.x.x.x container back through the host's 10.x.x.x.
Spark adds further complexity with its SPARK_LOCAL_IP
and SPARK_LOCAL_HOSTNAME
configuration parameters on the client side.
How should Docker networking be configured to allow this?
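For concreteness, here is a sketch of the kind of configuration I understand is involved. All IPs, ports, and the image and job names are made up for illustration; the point is that the driver has to advertise an address the executors can actually reach, bind inside the container, and pin its callback ports so they can be published:

```shell
# Hypothetical: Spark master at 10.0.0.2:7077; the driver runs in a
# container (172.17.0.x) on a host whose cluster-facing IP is 10.0.0.3.

# Publish the driver's RPC and block-manager ports so executors
# can connect back in through the host.
docker run --rm \
  -p 5001:5001 -p 5002:5002 \
  my-spark-client \
  spark-submit \
    --master spark://10.0.0.2:7077 \
    --conf spark.driver.host=10.0.0.3 \
    --conf spark.driver.bindAddress=0.0.0.0 \
    --conf spark.driver.port=5001 \
    --conf spark.blockManager.port=5002 \
    my_job.py
```

Here spark.driver.host is what gets advertised to the cluster (the host's IP, not the container's 172.x.x.x address), while spark.driver.bindAddress is what the driver binds to inside the container. Is this the intended approach, or is something like `--network host` the expected way to sidestep the address translation entirely?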