Apache Spark standalone on an anonymous UID (no username)

I am starting Apache Spark worker nodes on the OpenShift platform. Internally, OpenShift launches the Docker image as an anonymous user (a user with no name, only a UID). I get the following exception:

    17/07/17 16:46:53 INFO SignalUtils: Registered signal handler for INT
    17/07/17 16:46:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Exception in thread "main" java.io.IOException: failure to login
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:824)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:761)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:634)
        at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
        at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2391)
        at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:221)
        at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:714)
        at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:696)
        at org.apache.spark.deploy.worker.Worker.main(Worker.scala)
    Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
        at com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
        at com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:133)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
        at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
        at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:799)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:761)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:634)
        at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
        at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2391)
        at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:221)
        at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:714)
        at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:696)
        at org.apache.spark.deploy.worker.Worker.main(Worker.scala)

        at javax.security.auth.login.LoginContext.invoke(LoginContext.java:856)
        at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
        at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:799)
        ... 10 more

I tried setting the following properties in spark-defaults.conf, but it still didn't work:

    spark.eventLog.enabled false
    spark.ui.enabled false
    spark.acls.enable false
    spark.admin.acls *
    spark.modify.acls *
    spark.modify.acls.groups *
    spark.ui.view.acls.groups *
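Those ACL properties don't touch the actual failure point: the JVM's `UnixLoginModule` throws because the container's UID has no `/etc/passwd` entry at all. A quick way to verify this outside Spark (a diagnostic sketch, not from the original post) is to check whether the current UID resolves to a username:

```shell
#!/bin/sh
# Check whether the UID this process runs as has a passwd entry.
# Hadoop's UserGroupInformation login fails when it does not, which is
# exactly the situation inside an OpenShift anonymous-UID container.
myuid=$(id -u)
if getent passwd "$myuid" > /dev/null 2>&1; then
    echo "UID $myuid has a passwd entry"
else
    echo "UID $myuid has no passwd entry"
fi
```

In a normal shell this prints the first message; inside an affected OpenShift pod it prints the second, matching the `invalid null input: name` in the stack trace.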

Can you help me resolve this issue?

Thanks,

Naveen

Here is an alternative approach that does not require nss_wrapper.

By default, OpenShift containers run with an anonymous user ID and group ID 0 (i.e. the "root" group). First, set up your image so that /etc/passwd is owned by group ID 0 and is group-writable, e.g. with this Dockerfile snippet:

 RUN chgrp root /etc/passwd && chmod ug+rw /etc/passwd 

Then you can add the following logic at container startup; for example, this script can be used as the ENTRYPOINT:

    #!/bin/bash
    myuid=$(id -u)
    mygid=$(id -g)
    uidentry=$(getent passwd $myuid)
    if [ -z "$uidentry" ] ; then
        # assumes /etc/passwd has root-group (gid 0) ownership
        echo "$myuid:x:$myuid:$mygid:anonymous uid:/tmp:/bin/false" >> /etc/passwd
    fi
    exec "$@"

This entrypoint script automatically provides a passwd entry for the anonymous UID, so tools that need one won't fail.
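To show how the pieces fit together, here is a sketch of a Dockerfile that wires the script in as the image's ENTRYPOINT. The base image, file paths, and master URL are illustrative assumptions, not from the original answer:

```dockerfile
# Hypothetical wiring of the entrypoint script above; adjust base image
# and paths to match your own Spark image.
FROM openjdk:8-jre

# Make /etc/passwd writable by the root group (gid 0), as described above.
RUN chgrp root /etc/passwd && chmod ug+rw /etc/passwd

# Install the entrypoint script that appends a passwd entry for the
# anonymous UID before exec'ing the real command.
COPY entrypoint.sh /opt/entrypoint.sh
RUN chmod +x /opt/entrypoint.sh

ENTRYPOINT ["/opt/entrypoint.sh"]
CMD ["spark-class", "org.apache.spark.deploy.worker.Worker", "spark://spark-master:7077"]
```

Because `exec "$@"` hands control to whatever CMD or `docker run` arguments are supplied, the same entrypoint works for masters, workers, and driver pods alike.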

There is a good blog post on this and related topics around anonymous users in OpenShift: https://blog.openshift.com/jupyter-on-openshift-part-6-running-as-an-assigned-user-id/

(I'm keeping this answer because nss_wrapper is useful to know about, but the other answer works without having to install or use nss_wrapper.)

Spark wants to be able to look up its UID in passwd. This integration kink can be worked around with nss_wrapper; a good example of this solution used in an image entrypoint can be found here:

https://github.com/radanalyticsio/openshift-spark/blob/master/scripts/spark/added/entrypoint

    # spark likes to be able to lookup a username for the running UID, if
    # no name is present fake it.
    cat /etc/passwd > /tmp/passwd
    echo "$(id -u):x:$(id -u):$(id -g):dynamic uid:$SPARK_HOME:/bin/false" >> /tmp/passwd
    export NSS_WRAPPER_PASSWD=/tmp/passwd
    # NSS_WRAPPER_GROUP must be set for NSS_WRAPPER_PASSWD to be used
    export NSS_WRAPPER_GROUP=/etc/group
    export LD_PRELOAD=libnss_wrapper.so
    exec "$@"
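For this to work, the nss_wrapper library itself must be present in the image so that `LD_PRELOAD=libnss_wrapper.so` resolves. On a Fedora/CentOS-style base it is typically available as a package; the exact package name and base image here are assumptions, so check your distribution:

```dockerfile
# Assumed package name "nss_wrapper" (available in Fedora and in EPEL
# for CentOS/RHEL); adapt the install command to your base image.
RUN yum install -y nss_wrapper && yum clean all
```

Unlike the /etc/passwd approach in the other answer, this keeps the real passwd file untouched: the fake entry lives only in /tmp/passwd and is visible only to processes started with the LD_PRELOAD set.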

If you are interested in pre-built Spark images that you can use on OpenShift, I suggest starting here:

https://github.com/radanalyticsio/openshift-spark

These images are produced as part of the Radanalytics.io community project, which has built a lot of tooling for easily creating Spark clusters on OpenShift. You can learn more about the project here:

https://radanalytics.io/get-started