I have been looking around the internet, and this same type of issue has popped up before: in PySpark, when SparkContext() is called, it throws this error:
FileNotFoundError                         Traceback (most recent call last)
 in 
----> 1 sc = SparkContext()
      2 sc

~\Anaconda3\lib\site-packages\pyspark\context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    131                           " note this option will be removed in Spark 3.0")
    132
--> 133         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    134         try:
    135             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

~\Anaconda3\lib\site-packages\pyspark\context.py in _ensure_initialized(cls, instance, gateway, conf)
    314         with SparkContext._lock:
    315             if not SparkContext._gateway:
--> 316                 SparkContext._gateway = gateway or launch_gateway(conf)
    317                 SparkContext._jvm = SparkContext._gateway.jvm
    318

~\Anaconda3\lib\site-packages\pyspark\java_gateway.py in launch_gateway(conf)
     44     :return: a JVM gateway
     45     """
---> 46     return _launch_gateway(conf)
     47
     48

~\Anaconda3\lib\site-packages\pyspark\java_gateway.py in _launch_gateway(conf, insecure)
     99         else:
    100             # preexec_fn not supported on Windows
--> 101             proc = Popen(command, stdin=PIPE, env=env)
    102
    103         # Wait for the file to appear, or for the process to exit, whichever happens first.

~\Anaconda3\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    773                                 c2pread, c2pwrite,
    774                                 errread, errwrite,
--> 775                                 restore_signals, start_new_session)
    776             except:
    777                 # Cleanup if the child failed starting.

~\Anaconda3\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
   1176                                          env,
   1177                                          os.fspath(cwd) if cwd is not None else None,
-> 1178                                          startupinfo)
   1179                     finally:
   1180                         # Child is launched. Close the parent's copy of those pipe

FileNotFoundError: [WinError 2] The system cannot find the file specified
My full Python code so far, with Python 3.7.3 and Java 11.0.2 2019-01-15 LTS:
import pyspark
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import SparkSession
sc = SparkContext()
The traceback points to the line in my code -> sc = SparkContext(), which suggests that either the function itself cannot be called or something it calls internally is failing.
I'm not quite sure if I am forgetting to download a package or using a wrong version.
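In case it is useful, here is a small check I can run. As far as I understand, [WinError 2] from Popen means the executable it tried to launch could not be found, and PySpark starts the JVM through a spark-submit script, so I'm assuming the problem is with JAVA_HOME/SPARK_HOME or PATH rather than the Python code itself:

import os
import shutil

# Print the environment variables the Spark launcher appears to depend on.
# shutil.which() returns None when the executable is not found on PATH.
print("JAVA_HOME   =", os.environ.get("JAVA_HOME"))
print("SPARK_HOME  =", os.environ.get("SPARK_HOME"))
print("java on PATH         :", shutil.which("java"))
print("spark-submit on PATH :", shutil.which("spark-submit"))

If any of these come back as None or empty, that would seem to fit the error.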
Why are you importing pyspark twice, e.g. import pyspark and from pyspark import SparkContext? This seems like either 1) you need all of pyspark with import pyspark (and therefore likely sc = pyspark.SparkContext()) or just SparkContext() with from pyspark import SparkContext (and therefore sc = SparkContext()). – Anaksunaman – 2019-06-05T00:19:17.750
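For reference, the two styles the comment contrasts would look like this (just an illustration; only one SparkContext can exist per process, so only one of the two should actually be used):

# Style 1: import the whole package and qualify the name
import pyspark
sc = pyspark.SparkContext()

# Style 2: import only the class that is needed
from pyspark import SparkContext
sc = SparkContext()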