package learn.spark

import org.apache.spark.{SparkConf, SparkContext}

object MasterLocal2 {
  def main(args: Array[String]): Unit = {

    val conf = new SparkConf()

    conf.setAppName("spark-k8s")
    conf.setMaster("k8s://https://192.168.99.100:16443")

    conf.set("spark.driver.host", "192.168.99.1")
    conf.set("spark.executor.instances", "5")
    conf.set("spark.kubernetes.executor.request.cores", "0.1")
    conf.set("spark.kubernetes.container.image", "spark:latest")

    val sc = new SparkContext(conf)

    println(sc.parallelize(1 to 5).map(_ * 10).collect().mkString(", "))

    sc.stop()
  }
}

I am trying to speed up running my Spark program locally (the driver runs on my machine, the executors as Kubernetes pods), but I am getting exceptions. I don't know how to configure Spark so that my compiled JVM classes are shipped to the executors.

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 8, 10.1.1.217, executor 4): java.lang.ClassNotFoundException: learn.spark.MasterLocal2$$anonfun$main$1
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
Analysis and solution

The executors throw ClassNotFoundException because nothing ships the classes compiled by the IDE to the executor pods: learn.spark.MasterLocal2$$anonfun$main$1 is the anonymous function class the Scala compiler generates for map(_ * 10), and it exists only in the local build output directory. Mount the IntelliJ IDEA build output directory into the executor pods via a Kubernetes volume, then point spark.executor.extraClassPath at the mount path.

conf.set("spark.kubernetes.executor.volumes.hostPath.any1.options.path", "/path/to/your/project/out/production/examples")
conf.set("spark.kubernetes.executor.volumes.hostPath.any1.mount.path", "/classpath")
conf.set("spark.executor.extraClassPath", "/classpath")

Make sure the build output directory is actually reachable from the executor containers through the Kubernetes volume mount; this part depends on how the cluster is set up. A hostPath volume refers to the filesystem of the node that runs the pod, so on a single-node cluster running directly on your machine (e.g., microk8s) the path is visible as-is, whereas with a VM-based setup (e.g., a minikube VM) you may first need to share the directory into the node, for example with minikube mount.
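
Putting it together, here is a minimal sketch of the driver with the volume mount wired in (the master URL, driver host, and paths are the placeholders from above; point options.path at the directory your IDE actually compiles into):

package learn.spark

import org.apache.spark.{SparkConf, SparkContext}

object MasterLocal2 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("spark-k8s")
      .setMaster("k8s://https://192.168.99.100:16443")
      // Executors connect back to the driver running on the local machine
      .set("spark.driver.host", "192.168.99.1")
      .set("spark.executor.instances", "5")
      .set("spark.kubernetes.executor.request.cores", "0.1")
      .set("spark.kubernetes.container.image", "spark:latest")
      // Ship the locally compiled classes to the executors via a hostPath volume
      .set("spark.kubernetes.executor.volumes.hostPath.any1.options.path",
        "/path/to/your/project/out/production/examples")
      .set("spark.kubernetes.executor.volumes.hostPath.any1.mount.path", "/classpath")
      .set("spark.executor.extraClassPath", "/classpath")

    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 5).map(_ * 10).collect().mkString(", "))
    sc.stop()
  }
}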