pthread_create failed: Resource temporarily unavailable with MongoDB
I am currently running a standalone-mode Spark cluster with Docker on a physical machine with 16GB RAM, Ubuntu 16.04.1 x64.
RAM configuration of the Spark cluster containers: master 4g, slave1 2g, slave2 2g, slave3 2g
docker run -itd --net spark -m 4g -p 8080:8080 --name master --hostname master MyAccount/spark &> /dev/null
docker run -itd --net spark -m 2g -p 8080:8080 --name slave1 --hostname slave1 MyAccount/spark &> /dev/null
docker run -itd --net spark -m 2g -p 8080:8080 --name slave2 --hostname slave2 MyAccount/spark &> /dev/null
docker run -itd --net spark -m 2g -p 8080:8080 --name slave3 --hostname slave3 MyAccount/spark &> /dev/null
docker exec -it master sh -c 'service ssh start' > /dev/null
docker exec -it slave1 sh -c 'service ssh start' > /dev/null
docker exec -it slave2 sh -c 'service ssh start' > /dev/null
docker exec -it slave3 sh -c 'service ssh start' > /dev/null
docker exec -it master sh -c '/usr/local/spark/sbin/start-all.sh' > /dev/null
I have about 170GB of data in my MongoDB database. I run MongoDB on localhost with ./mongod, without replication or sharding.
I use the Stratio spark-mongodb connector.
I run the following command on the master container:
/usr/local/spark/bin/spark-submit --master spark://master:7077 --executor-memory 2g --executor-cores 1 --packages com.stratio.datasource:spark-mongodb_2.11:0.12.0 code.py
code.py:
from pyspark import SparkContext
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.sql("CREATE TEMPORARY VIEW tmp_tb USING com.stratio.datasource.mongodb OPTIONS (host 'MyPublicIP:27017', database 'firewall', collection 'log_data')")
df = spark.sql("select * from tmp_tb")
df.show()
I modified the ulimit values in /etc/security/limits.conf and /etc/security/limits.d/20-nproc.conf:
* soft nofile unlimited
* hard nofile 131072
* soft nproc unlimited
* hard nproc unlimited
* soft fsize unlimited
* hard fsize unlimited
* soft memlock unlimited
* hard memlock unlimited
* soft cpu unlimited
* hard cpu unlimited
* soft as unlimited
* hard as unlimited
root soft nofile unlimited
root hard nofile 131072
root soft nproc unlimited
root hard nproc unlimited
root soft fsize unlimited
root hard fsize unlimited
root soft memlock unlimited
root hard memlock unlimited
root soft cpu unlimited
root hard cpu unlimited
root soft as unlimited
root hard as unlimited
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63682
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
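As a back-of-the-envelope check (my own rough arithmetic, not from any log): with the default 8192 KB thread stack shown above, a 2g container has room for only a few hundred fully-committed thread stacks, which is one plausible way pthread_create can hit "Resource temporarily unavailable" even with nproc unlimited. This sketch assumes the worst case where every thread commits its entire default stack; real threads usually touch far less.

```python
# Worst-case estimate of how many thread stacks fit under a container memory limit.
# Assumes each thread commits its full default stack (ulimit -s), ignoring heap
# and other allocations, so the real ceiling is even lower.
stack_kb = 8192                      # "stack size (kbytes, -s) 8192" from ulimit -a
container_mem_kb = 2 * 1024 * 1024   # a 2g slave container (docker run -m 2g)
max_threads = container_mem_kb // stack_kb
print(max_threads)  # 256
```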
I also added the following to /etc/sysctl.conf:
kernel.pid_max=200000
vm.max_map_count=600000
Then, after rebooting, I ran the Spark program again.
I still get the following errors: pthread_create failed: Resource temporarily unavailable and com.mongodb.MongoException$Network: Exception opening the socket.
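For what it's worth, the kernel-level limits that pthread_create depends on can be read directly from procfs inside the failing container. This is a generic Linux check (standard /proc paths, nothing specific to this Spark setup) to confirm the sysctl changes actually took effect in the container's view:

```python
# Read the kernel limits relevant to pthread_create from procfs.
# Run inside the container that throws the error; paths are standard Linux procfs.
with open("/proc/sys/kernel/threads-max") as f:
    threads_max = int(f.read())      # kernel-wide cap on threads
with open("/proc/sys/kernel/pid_max") as f:
    pid_max = int(f.read())          # should reflect kernel.pid_max=200000 if the sysctl applied

print(f"threads-max={threads_max}, pid_max={pid_max}")
```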
Error screenshots:
(pyspark error screenshot)
(mongodb error screenshot)
Is the physical memory insufficient? Or which part of the configuration did I get wrong?
Thanks.