DataStax Enterprise on Docker: won't start because the hadoop/conf directory is not writable

I have followed DataStax's guide on best practices for running DSE in Docker, but even using all of the default install scripts and Dockerfiles that DataStax provides, I run into the following error.

Error log

Caused by: java.lang.RuntimeException: Failed to save custom DSE Hadoop config
    at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:310) ~[dse-hadoop-5.0.3.jar:5.0.3]
    at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:174) ~[dse-hadoop-5.0.3.jar:5.0.3]
    at com.datastax.bdp.ConfigurationWriterPlugin.onActivate(ConfigurationWriterPlugin.java:20) ~[dse-hadoop-5.0.3.jar:5.0.3]
    at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:377) ~[dse-core-5.0.3.jar:5.0.3]
    at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:306) ~[dse-core-5.0.3.jar:5.0.3]
    ... 7 common frames omitted
Caused by: java.io.IOException: Directory not writable: /opt/dse/resources/hadoop/conf
    at com.datastax.bdp.hadoop.mapred.CassandraJobConf.saveConfiguration(CassandraJobConf.java:466) ~[dse-hadoop-5.0.3.jar:5.0.3]
    at com.datastax.bdp.hadoop.mapred.CassandraJobConf.saveDseHadoopConfiguration(CassandraJobConf.java:345) ~[dse-hadoop-5.0.3.jar:5.0.3]
    at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:300) ~[dse-hadoop-5.0.3.jar:5.0.3]
    ... 11 common frames omitted
Unable to start DSE server: Unable to activate plugin com.datastax.bdp.ConfigurationWriterPlugin
com.datastax.bdp.plugin.PluginManager$PluginActivationException: Unable to activate plugin com.datastax.bdp.ConfigurationWriterPlugin
    at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:327)
    at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:259)
    at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:169)
    at com.datastax.bdp.plugin.PluginManager.preStart(PluginManager.java:77)
    at com.datastax.bdp.server.DseDaemon.preStart(DseDaemon.java:490)
    at com.datastax.bdp.server.DseDaemon.start(DseDaemon.java:462)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:583)
    at com.datastax.bdp.DseModule.main(DseModule.java:91)
Caused by: java.lang.RuntimeException: Failed to save custom DSE Hadoop config
    at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:310)
    at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:174)
    at com.datastax.bdp.ConfigurationWriterPlugin.onActivate(ConfigurationWriterPlugin.java:20)
    at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:377)
    at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:306)
    ... 7 more
Caused by: java.io.IOException: Directory not writable: /opt/dse/resources/hadoop/conf

The error is pretty self-explanatory, but trying to resolve it by adding some extra chmod calls to the Dockerfile got me nowhere.

Dockerfile

# Provided without any warranty, these files are intended
# to accompany the whitepaper about DSE on Docker and are
# not intended for production and are not actively maintained.

# Loosely based on docker-cassandra by the fine folk at Spotify
# -- https://github.com/spotify/docker-cassandra/
# Loosely based on cassandra-docker by the one and only Al Tobey
# -- https://github.com/tobert/cassandra-docker/

# base yourself on any ubuntu 14.04 image containing JDK8
# official Docker Java images are distributed with OpenJDK
# Datastax certifies its product releases specifically
# on the Oracle/Sun JVM, so YMMV with OpenJDK
FROM nimmis/java:oracle-8-jdk

# Avoid ERROR: invoke-rc.d: policy-rc.d denied execution of start.
RUN echo "#!/bin/sh\nexit 0" > /usr/sbin/policy-rc.d

RUN export DEBIAN_FRONTEND=noninteractive && \
    apt-get update && \
    apt-get -y install adduser \
        curl \
        lsb-base \
        procps \
        zlib1g \
        gzip \
        python \
        python-support \
        sysstat \
        ntp bash tree && \
    rm -rf /var/lib/apt/lists/*

# grab gosu for easy step-down from root
RUN curl -o /bin/gosu -SkL "https://github.com/tianon/gosu/releases/download/1.4/gosu-$(dpkg --print-architecture)" \
    && chmod +x /bin/gosu

# DSE tarball can be downloaded into the folder where the Dockerfile is
#   wget --user=$USER --password=$PASS http://downloads.datastax.com/enterprise/dse-5.0.0-bin.tar.gz
# you may want to replace dse-5.0.0-bin.tar.gz with the corresponding downloaded package name. When
# downloaded, please remove the version number part of the filename (or create a symlink), so the
# resulting file is named dse-bin.tar.gz (that way the docker file itself remains version independent).
#
# DataStax Agent debian package can be downloaded from
#   wget --user=$USER --password=$PASS http://debian.datastax.com/enterprise/pool/datastax-agent_6.0.0_all.deb
# you may want to replace the specific version with the corresponding downloaded package name. When
# downloaded, please remove the version number part of the filename (or create a symlink), so the
# resulting file is named datastax-agent_all.deb (that way the docker file itself remains version
# independent).
ADD dse.tar.gz /opt
ADD datastax-agent_all.deb /tmp

ENV DSE_HOME /opt/dse
RUN ln -s /opt/dse* $DSE_HOME

# keep data here
VOLUME /data
# and logs here
VOLUME /logs
VOLUME /opt/dse

# create a dedicated user for running DSE node
RUN groupadd -g 1337 cassandra && \
    useradd -u 1337 -g cassandra -s /bin/bash -d $DSE_HOME cassandra && \
    chown -R cassandra:cassandra /opt/dse*
RUN chmod r+w -R /opt/dse/

# install the agent
RUN dpkg -i /tmp/datastax-agent_all.deb

# starting node using custom entrypoint that configures paths, interfaces, etc.
COPY scripts/dse-entrypoint /usr/local/bin/
RUN chmod +x /usr/local/bin/dse-entrypoint
ENTRYPOINT ["/usr/local/bin/dse-entrypoint"]

# Running any other DSE/C* command should be done on behalf of the dse user.
# Perform that using a generic command launcher
COPY scripts/dse-cmd-launcher /usr/local/bin/
RUN chmod +x /usr/local/bin/dse-cmd-launcher

# link dse commands to the launcher
RUN for cmd in cqlsh dsetool nodetool dse cassandra-stress; do \
        ln -sf /usr/local/bin/dse-cmd-launcher /usr/local/bin/$cmd ; \
    done

# the detailed list of ports
# http://docs.datastax.com/en/datastax_enterprise/5.0/datastax_enterprise/sec/secConfFirePort.html
# Cassandra
EXPOSE 7000 9042 9160
# Solr
EXPOSE 8983 8984
# Spark
EXPOSE 4040 7080 7081 7077
# Hadoop
EXPOSE 8012 50030 50060 9290
# Hive/Shark
EXPOSE 10000
# Graph

The last piece that might hold the answer to this problem is the launch script used to actually start DSE when the container boots.

DSE launch script (invoked by the Docker container on startup)

#!/bin/sh
# Provided without any warranty, these files are intended
# to accompany the whitepaper about DSE on Docker and are
# not intended for production and are not actively maintained.

# Bind the various services
# These should be updated on every container start
if [ -z ${IP} ]; then
    IP=`hostname --ip-address`
fi
echo $IP > /data/ip.address

# create directories for holding the node's data, logs, etc.
create_dirs() {
    local base_dir=$1;
    mkdir -p $base_dir/data/commitlog
    mkdir -p $base_dir/data/saved_caches
    mkdir -p $base_dir/data/hints
    mkdir -p $base_dir/logs
}

# tweak the cassandra config
tweak_cassandra_config() {
    env="$1/cassandra-env.sh"
    conf="$1/cassandra.yaml"
    base_data_dir="/data"

    # Set the cluster name
    if [ -z "${CLUSTER_NAME}" ]; then
        printf " - No cluster name provided; skipping.\n"
    else
        printf " - Setting up the cluster name: ${CLUSTER_NAME}\n"
        regexp="s/Test Cluster/${CLUSTER_NAME}/g"
        sed -i -- "$regexp" $conf
    fi

    # Set the commitlog directory, and various other directories
    # These are done only once since the regexp matches will fail on subsequent
    # runs.
    printf " - Setting up directories\n"
    regexp="s|/var/lib/cassandra/|$base_data_dir/|g"
    sed -i -- "$regexp" $conf
    regexp="s/^listen_address:.*/listen_address: ${IP}/g"
    sed -i -- "$regexp" $conf
    regexp="s/rpc_address:.*/rpc_address: ${IP}/g"
    sed -i -- "$regexp" $conf

    # seeds
    if [ -z "${SEEDS}" ]; then
        printf " - Using own IP address ${IP} as seed.\n";
        regexp="s/seeds:.*/seeds: \"${IP}\"/g";
    else
        printf " - Using seeds: $SEEDS\n";
        regexp="s/seeds:.*/seeds: \"${IP},${SEEDS}\"/g"
    fi
    sed -i -- "$regexp" $conf

    # JMX
    echo "JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=127.0.0.1\"" >> $env
}

tweak_dse_in_sh() {
    # point C* logs dir to the created volume
    sed -i -- "s|/var/log/cassandra|/logs|g" "$1/dse.in.sh"
}

tweak_spark_config() {
    sed -i -- "s|/var/lib/spark/|/data/spark/|g" "$1/spark-env.sh"
    sed -i -- "s|/var/log/spark/|/logs/spark/|g" "$1/spark-env.sh"
    mkdir -p /data/spark/worker
    mkdir -p /data/spark/rdd
    mkdir -p /logs/spark/worker
}

tweak_agent_config() {
    [ -d "/var/lib/datastax-agent" ] && cat > /var/lib/datastax-agent/conf/address.yaml <<EOF
stomp_interface: ${STOMP_INTERFACE}
use_ssl: 0
local_interface: ${IP}
hosts: ["${IP}"]
cassandra_install_location: /opt/dse
cassandra_log_location: /logs
EOF
    chown cassandra:cassandra /var/lib/datastax-agent/conf/address.yaml
}

setup_node() {
    printf "* Setting up node...\n"
    printf " + Setting up node...\n"
    create_dirs
    tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
    tweak_dse_in_sh "$DSE_HOME/bin"
    tweak_spark_config "$DSE_HOME/resources/spark/conf"
    tweak_agent_config
    chown -R cassandra:cassandra /data /logs /conf
    # mark that we tweaked configs
    touch "$DSE_HOME/tweaked_configs"
    printf "Done.\n"
}

# if marker file doesn't exist, setup node
[ ! -f "$DSE_HOME/tweaked_configs" ] && setup_node

[ -f "/etc/init.d/datastax-agent" ] && /etc/init.d/datastax-agent start

exec gosu cassandra "$DSE_HOME/bin/dse" cassandra -f "$@"

Docker container command-line arguments

Here are the command-line arguments I use to start a single DSE instance via Docker:

#!/bin/bash
# Used to start a single DSE node that has both Spark and Cassandra running on it
OPSC_CONTAINER=$1

if [ -z "$OPSC_CONTAINER" ]; then
    echo "usage: start_docker_cluster.sh OPSCContainerName"
    echo "       OPSCContainerName    mandatory name of the container running OpsCenter"
    exit 1
fi

[ -z "$CLUSTER_NAME" ] && CLUSTER_NAME="Test_Cluster"

STOMP_INTERFACE=`docker exec $OPSC_CONTAINER hostname -I`

docker run -p 7080:7080 -p 4040:4040 -p 7077:7077 -p 9042:9042 \
    --link $OPSC_CONTAINER -d \
    -e CLUSTER_NAME="$CLUSTER_NAME" \
    -e STOMP_INTERFACE="$STOMP_INTERFACE" \
    --name dse dse -k -t

The -k and -t flags indicate that we are starting Hadoop and Spark for this container. I have also tried dropping the -t flag, and this configuration error still occurs without it.
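For clarity, the trailing `dse -k -t` in the `docker run` line above is the image name followed by container arguments, which the entrypoint forwards to `dse cassandra -f` via `"$@"`. A minimal sketch with a stand-in script (not the real dse binary) shows the mechanism:

```shell
# Stand-in for the container entrypoint, just to demonstrate how arguments
# after the image name flow through the entrypoint's "$@".
script="$(mktemp)"
cat > "$script" <<'EOF'
#!/bin/sh
# the real entrypoint ends with: exec gosu cassandra "$DSE_HOME/bin/dse" cassandra -f "$@"
echo "would run: dse cassandra -f $@"
EOF
chmod +x "$script"
"$script" -k -t   # prints: would run: dse cassandra -f -k -t
```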

What do I need to do to make the /opt/dse/resources/hadoop/conf directory writable so that DSE can start successfully?

I added chown -RHh cassandra:cassandra /opt/dse to the setup_node() section of the DSE launch script (the one invoked by the Docker container on startup), and that resolved the problem. See chown --help for more information on these options.
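As a quick illustration of what those chown options do (run here against a throwaway tree that mimics the /opt/dse -> /opt/dse-5.0.3 symlink layout from the Dockerfile, using the current user instead of cassandra):

```shell
# -R  recurse into directories
# -H  with -R, follow a symlink given on the command line, so the real
#     versioned tree behind the /opt/dse link is re-owned too
# -h  change the ownership of symlinks themselves rather than their targets
tmp=$(mktemp -d)
mkdir -p "$tmp/dse-5.0.3/resources/hadoop/conf"
ln -s "$tmp/dse-5.0.3" "$tmp/dse"

chown -RHh "$(id -u):$(id -g)" "$tmp/dse"

# the conf directory behind the symlink is now owned by the caller
stat -c '%u' "$tmp/dse-5.0.3/resources/hadoop/conf"
rm -rf "$tmp"
```

Without -H, a plain `chown -R` on /opt/dse would only touch the symlink itself, which is why the Dockerfile's chown/chmod calls never reached the real hadoop/conf directory.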

Note: I now get ERROR 04:15:04,789 SPARK-WORKER Logging.scala:74 - Failed to create work directory /var/lib/spark/worker, but at least my fix will get you past the initial problem.

setup_node() {
    printf "* Setting up node...\n"
    printf " + Setting up node...\n"
    create_dirs
    tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
    tweak_dse_in_sh "$DSE_HOME/bin"
    tweak_spark_config "$DSE_HOME/resources/spark/conf"
    tweak_agent_config
    tweak_dse_config "$DSE_HOME/resources/dse/conf"
    chown -R cassandra:cassandra /data /logs /conf
    chown -RHh cassandra:cassandra /opt/dse
    # mark that we tweaked configs
    touch "$DSE_HOME/tweaked_configs"
    printf "Done.\n"
}

Adding "chown -RHh cassandra:cassandra /opt/dse" to the entrypoint script resolved my problem of being unable to write to /opt/dse/resources/hadoop/conf.

Re: ERROR 04:15:04,789 SPARK-WORKER Logging.scala:74 - Failed to create work directory /var/lib/spark/worker

Check your spark-env.sh and review your directory mappings. In my case I had mounted two external volumes, /data and /logs, both owned by cassandra:cassandra.

# This is a base directory for Spark Worker work files.
if [ "x$SPARK_WORKER_DIR" = "x" ]; then
    export SPARK_WORKER_DIR="/data/spark/worker"
fi

if [ "x$SPARK_LOCAL_DIRS" = "x" ]; then
    export SPARK_LOCAL_DIRS="/data/spark/rdd"
fi

# This is a base directory for Spark Worker logs.
if [ "x$SPARK_WORKER_LOG_DIR" = "x" ]; then
    export SPARK_WORKER_LOG_DIR="/logs/spark/worker"
fi

# This is a base directory for Spark Master logs.
if [ "x$SPARK_MASTER_LOG_DIR" = "x" ]; then
    export SPARK_MASTER_LOG_DIR="/logs/spark/master"
fi
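If spark-env.sh still resolves to the stock /var/lib/spark and /var/log/spark locations (i.e. the sed rewrites in the entrypoint did not apply), one fallback is to pre-create and re-own those directories too. This is a sketch, not part of the original scripts: PREFIX and OWNER are parameterized here purely for illustration; in the real entrypoint PREFIX would be empty and OWNER would be cassandra:cassandra.

```shell
# Pre-create the default Spark worker/log directories and hand them to the
# service user, in case spark-env.sh was not rewritten before first start.
PREFIX="${PREFIX:-$(mktemp -d)}"
OWNER="${OWNER:-$(id -u):$(id -g)}"

mkdir -p "$PREFIX/var/lib/spark/worker" \
         "$PREFIX/var/lib/spark/rdd" \
         "$PREFIX/var/log/spark/worker"
chown -R "$OWNER" "$PREFIX/var/lib/spark" "$PREFIX/var/log/spark"
echo "Spark directories prepared under: $PREFIX"
```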

This video shows a fully functional DSE Enterprise setup running on Docker: https://vimeo.com/181393134

After doing this:

I added chown -RHh cassandra:cassandra /opt/dse to the setup_node() section of the DSE launch script (invoked by the Docker container on startup)

exactly as Max's answer describes, instead of his problem I got

Unable to activate plugin com.datastax.bdp.plugin.DseFsPlugin
(...)
java.io.IOException: Failed to create work directory: /var/lib/dsefs

So I had to change my setup_node() to this:

setup_node() {
    printf "* Setting up node...\n"
    printf " + Setting up node...\n"
    create_dirs
    tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
    tweak_dse_in_sh "$DSE_HOME/bin"
    tweak_spark_config "$DSE_HOME/resources/spark/conf"
    tweak_agent_config
    chown -R cassandra:cassandra /data /logs /conf
    mkdir /var/lib/dsefs
    chown -RHh cassandra:cassandra /opt/dse /var/lib/dsefs
    # mark that we tweaked configs
    touch "$DSE_HOME/tweaked_configs"
    printf "Done.\n"
}
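Since each of these failures was the same class of problem (a directory DSE needs to write being missing or owned by root), a small pre-flight check in the entrypoint can surface the next offender before the server hits it. This helper is my own addition, not part of the original scripts, and the paths in the commented call are the ones from this thread:

```shell
# Hypothetical helper: report any required directory that is missing or not
# writable by the current user, returning non-zero if any offender is found.
check_writable() {
    rc=0
    for d in "$@"; do
        if [ ! -d "$d" ] || [ ! -w "$d" ]; then
            echo "NOT writable: $d" >&2
            rc=1
        fi
    done
    return $rc
}

# In the entrypoint (run as the cassandra user) this might be called as:
#   check_writable /opt/dse/resources/hadoop/conf /var/lib/dsefs /data /logs
```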