How do I set the executor IP in a Docker container?

For the last three days I have been trying to set up a Docker machine with three components: a Spark master, a Spark worker, and a driver (Java) application.

Everything works fine when the driver is started OUTSIDE of Docker. However, starting all three components inside Docker results in a port/firewall/host nightmare.

To keep it simple (at first), I am using docker-compose – here is my docker-compose.yml:

```yaml
driver:
  hostname: driver
  image: driverimage
  command: -Dexec.args="0 192.168.99.100" -Dspark.driver.port=7001 -Dspark.driver.host=driver -Dspark.executor.port=7006 -Dspark.broadcast.port=15001 -Dspark.fileserver.port=15002 -Dspark.blockManager.port=15003 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory
  ports:
    - 10200:10200 # Module REST port
    - 4040:4040   # Web UI (Spark)
    - 7001:7001   # Driver port (Spark)
    - 15001:15001 # Broadcast (Spark)
    - 15002:15002 # File server (Spark)
    - 15003:15003 # Block manager (Spark)
    - 7337:7337   # Shuffle? (Spark)
  extra_hosts:
    - sparkmaster:192.168.99.100
    - sparkworker:192.168.99.100
  environment:
    SPARK_LOCAL_IP: 192.168.99.100
    #SPARK_MASTER_OPTS: "-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    #SPARK_WORKER_OPTS: "-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    SPARK_JAVA_OPTS: "-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=15001 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"

sparkmaster:
  extra_hosts:
    - driver:192.168.99.100
  image: gettyimages/spark
  command: /usr/spark/bin/spark-class org.apache.spark.deploy.master.Master -h sparkmaster
  hostname: sparkmaster
  environment:
    SPARK_CONF_DIR: /conf
    MASTER: spark://sparkmaster:7077
    SPARK_LOCAL_IP: 192.168.99.100
    SPARK_JAVA_OPTS: "-Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    SPARK_WORKER_OPTS: "-Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    SPARK_MASTER_OPTS: "-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    #SPARK_WORKER_OPTS: "-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    #SPARK_JAVA_OPTS: "-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
  expose:
    - 7001
    - 7002
    - 7003
    - 7004
    - 7005
    - 7006
    - 7077
    - 6066
  ports:
    - 6066:6066
    - 7077:7077 # Master (main port)
    - 8080:8080 # Web UI
    #- 7006:7006 # Executor

sparkworker:
  extra_hosts:
    - driver:192.168.99.100
  image: gettyimages/spark
  command: /usr/spark/bin/spark-class org.apache.spark.deploy.worker.Worker -h sparkworker spark://sparkmaster:7077
  # volumes:
  #   - ./spark/logs:/log/spark
  hostname: sparkworker
  environment:
    SPARK_CONF_DIR: /conf
    SPARK_WORKER_CORES: 4
    SPARK_WORKER_MEMORY: 4g
    SPARK_WORKER_PORT: 8881
    SPARK_WORKER_WEBUI_PORT: 8081
    SPARK_LOCAL_IP: 192.168.99.100
    #SPARK_MASTER_OPTS: "-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    SPARK_JAVA_OPTS: "-Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    SPARK_MASTER_OPTS: "-Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    SPARK_WORKER_OPTS: "-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=15003 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
    #SPARK_JAVA_OPTS: "-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
  links:
    - sparkmaster
  expose:
    - 7001
    - 7002
    - 7003
    - 7004
    - 7005
    - 7006
    - 7012
    - 7013
    - 7014
    - 7015
    - 7016
    - 8881
  ports:
    - 8081:8081 # Web UI
    #- 15003:15003 # Block manager
    - 7005:7005 # Executor
    - 7006:7006 # Executor
```

I don't even know which ports are actually used, and so on. What I do know is my current problem: the driver can communicate with the master, the master can communicate with the worker, and I think the driver can communicate with the worker as well. But the driver cannot communicate with the executors. I also found out why. When I open the application web UI and switch to the Executors tab, it shows "Executor 0 – Address 172.17.0.1:7005".

So the problem is that the driver addresses the executor via the Docker gateway address, which does not work. I have tried several things (SPARK_LOCAL_IP, explicit hostnames, etc.), but the driver always tries to talk to the Docker gateway… Any idea how to make it possible for the driver to communicate with the executors/workers?

This is caused by a gap in the configuration options Spark provides. Spark binds to the hostname given in SPARK_LOCAL_HOSTNAME and propagates this exact hostname to the cluster. Unfortunately, this setup does not work when the driver sits behind NAT, for example in a Docker container.
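The propagation can be seen in the driver command line above: spark.driver.host is the name the driver hands to the master, and every node must be able to resolve it. A minimal sketch of the relevant JVM flags, using a hypothetical custom hostname mydriver (property names as already used in the question, Spark 1.x):

```yaml
# Sketch: driver JVM options. The advertised host name must be
# resolvable from every master/worker node, or executors cannot
# connect back to the driver.
command: >-
  -Dspark.driver.host=mydriver
  -Dspark.driver.port=7001
```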

You can work around this with the following settings (I have used this hack successfully):

  • Forward all necessary ports 1:1 (as you already do).
  • Use a custom hostname for the driver: set e.g. SPARK_LOCAL_HOSTNAME: mydriver.
  • On the master and worker nodes, add 192.168.99.100 mydriver to /etc/hosts so they can reach the Spark driver.
  • In the Docker container, map mydriver to 0.0.0.0. This makes the Spark driver bind to 0.0.0.0, so both the master and the workers can reach it:
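For the /etc/hosts step on the master and worker machines, the entry would look like this (using the Docker machine IP from the setup above):

```
192.168.99.100 mydriver
```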

To map mydriver inside the Docker container, just add the following lines to the driver service in docker-compose.yml:

```yaml
extra_hosts:
  - "mydriver:0.0.0.0"
```
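Putting it together, a sketch of the driver service with the workaround applied (image name and driver port taken from the question; the remaining ports and options stay as before):

```yaml
driver:
  image: driverimage
  hostname: mydriver
  environment:
    SPARK_LOCAL_HOSTNAME: mydriver   # the name the driver advertises to the cluster
  extra_hosts:
    - "mydriver:0.0.0.0"             # inside the container: bind to all interfaces
  ports:
    - 7001:7001                      # spark.driver.port, forwarded 1:1
```

The trick is that the same name resolves differently on each side: inside the container it means "bind everywhere", while on the master and worker hosts it resolves to the Docker machine's external IP.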