ClassNotFoundException when submitting a YARN job to a remote cluster

I have a pseudo-distributed Hadoop cluster running as a Docker container:

docker run -d -p 50070:50070 -p 9000:9000 -p 8032:8032 -p 8088:8088 --name had00p sequenceiq/hadoop-docker:2.6.0 /etc/bootstrap.sh -d 

Its configuration is here: https://github.com/sequenceiq/docker-hadoop-ubuntu/

I can work with HDFS and reach the web UI just fine, but I'm stuck on submitting a job from Java — it fails with

ClassNotFoundException: Class com.github.mikhailerofeev.hadoop.Script$MyMapper not found
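(As an aside, the `$` in the class name is not garbling: the JVM names a nested class `Outer$Inner`, so the class the cluster cannot find is the `MyMapper` nested inside `Script`. A quick stdlib-only illustration, with a stand-in class since the real mapper isn't relevant to the naming:)

```java
public class Script {
    // Stand-in for the real mapper; only the naming behavior matters here.
    public static class MyMapper {}

    public static void main(String[] args) {
        // The binary name of a nested class joins outer and inner with '$'.
        System.out.println(MyMapper.class.getName()); // prints "Script$MyMapper"
    }
}
```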

Here is the sample code:

    @Override
    public Configuration getConf() {
        String host = BOOT_TO_DOCKER_IP;
        int nameNodeHdfsPort = 9000;
        int yarnPort = 8032;
        String yarnAddr = host + ":" + yarnPort;
        String hdfsAddr = "hdfs://" + host + ":" + nameNodeHdfsPort + "/";
        Configuration configuration = new Configuration();
        configuration.set("yarn.resourcemanager.address", yarnAddr);
        configuration.set("mapreduce.framework.name", "yarn");
        configuration.set("fs.default.name", hdfsAddr); // deprecated alias for fs.defaultFS
        return configuration;
    }

    private void simpleMr(String inputPath) throws IOException {
        JobConf conf = new JobConf(getConf(), Script.class);
        conf.setJobName("fun");
        conf.setJarByClass(MyMapper.class);
        conf.setMapperClass(MyMapper.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, inputPath);

        String tmpMRreturn = "/user/m-erofeev/map-test.data";
        Path returnPath = new Path(tmpMRreturn);
        FileOutputFormat.setOutputPath(conf, returnPath);

        // Remove any previous output so the job can write its result.
        AccessUtils.execAsRootUnsafe(() -> {
            FileSystem fs = FileSystem.get(getConf());
            if (fs.exists(returnPath)) {
                fs.delete(returnPath, true);
            }
        });

        AccessUtils.execAsRootUnsafe(() -> {
            RunningJob runningJob = JobClient.runJob(conf);
            runningJob.waitForCompletion();
        });
    }

Here `AccessUtils.execAsRootUnsafe` is a wrapper around `UserGroupInformation`; it works fine with HDFS.
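Since `AccessUtils.execAsRootUnsafe` isn't shown, here is a hypothetical sketch of what such a wrapper around `UserGroupInformation.doAs` might look like (the class, the method name, and the `ThrowingRunnable` interface are assumptions for illustration, not part of Hadoop):

```java
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.security.UserGroupInformation;

public final class AccessUtils {

    // Hypothetical functional interface so lambdas may throw checked exceptions.
    public interface ThrowingRunnable {
        void run() throws Exception;
    }

    // Runs the given action while impersonating the "root" user,
    // which the sequenceiq container uses for HDFS paths.
    public static void execAsRootUnsafe(ThrowingRunnable action) {
        try {
            UserGroupInformation ugi = UserGroupInformation.createRemoteUser("root");
            ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
                action.run();
                return null;
            });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private AccessUtils() {}
}
```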

Where am I going wrong?

upd: I realized it could be failing because Hadoop runs on Java 7 while I'm on Java 8, and planned to check that later. But in that case I would expect a different error message… upd2: switching to Java 7 made no difference.

My mistake: I was running the code from the IDE without packaging it into a jar, so the `setJarByClass()` call had nothing to point at.
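For anyone hitting the same wall: when the client runs straight from an IDE, the classes live in a build directory rather than a jar, so the client has nothing to ship to the cluster and the remote task JVMs can't load the mapper. A minimal sketch of one workaround, assuming the job classes have already been packaged (e.g. via `mvn package`) into a jar at a known location (the path below is a placeholder):

```java
// Point the job at an explicit jar instead of relying on setJarByClass(),
// which only works when the class itself was loaded from a jar.
JobConf conf = new JobConf(getConf(), Script.class);
conf.setJar("/path/to/my-job.jar"); // hypothetical path to the packaged job jar
conf.setMapperClass(MyMapper.class);
// ... rest of the job setup as before ...
```

Alternatively, submitting with `hadoop jar my-job.jar` keeps `setJarByClass()` working, because the driver class is then loaded from the jar itself.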