sc = SparkContext() error in a Python Jupyter notebook


So I have been looking around the internet, and this same type of issue has popped up before in pyspark: when SparkContext() is called, it throws this error:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input> in <module>
----> 1 sc = SparkContext()
      2 sc

~\Anaconda3\lib\site-packages\pyspark\context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    131                     " note this option will be removed in Spark 3.0")
    132
--> 133         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    134         try:
    135             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

~\Anaconda3\lib\site-packages\pyspark\context.py in _ensure_initialized(cls, instance, gateway, conf)
    314         with SparkContext._lock:
    315             if not SparkContext._gateway:
--> 316                 SparkContext._gateway = gateway or launch_gateway(conf)
    317                 SparkContext._jvm = SparkContext._gateway.jvm
    318

~\Anaconda3\lib\site-packages\pyspark\java_gateway.py in launch_gateway(conf)
     44     :return: a JVM gateway
     45     """
---> 46     return _launch_gateway(conf)
     47
     48

~\Anaconda3\lib\site-packages\pyspark\java_gateway.py in _launch_gateway(conf, insecure)
     99         else:
    100             # preexec_fn not supported on Windows
--> 101             proc = Popen(command, stdin=PIPE, env=env)
    102
    103         # Wait for the file to appear, or for the process to exit, whichever happens first.

~\Anaconda3\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    773                                 c2pread, c2pwrite,
    774                                 errread, errwrite,
--> 775                                 restore_signals, start_new_session)
    776         except:
    777             # Cleanup if the child failed starting.

~\Anaconda3\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
   1176                                          env,
   1177                                          os.fspath(cwd) if cwd is not None else None,
-> 1178                                          startupinfo)
   1179             finally:
   1180                 # Child is launched. Close the parent's copy of those pipe

FileNotFoundError: [WinError 2] The system cannot find the file specified

Here is my full Python code so far, using Python 3.7.3 and Java 11.0.2 (2019-01-15 LTS):

import pyspark

from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import SparkSession

sc = SparkContext()

The traceback points to the line in my code, sc = SparkContext(), which suggests that either the function itself can't be called or something it calls internally can't be found.

I'm not quite sure whether I am forgetting to download a package or am using the wrong version of something.
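Since the traceback ends inside Popen, my current guess is that the command PySpark tries to launch (the spark-submit script under SPARK_HOME, which in turn needs java) can't be found. Below is a minimal check I can run first; the install paths are placeholders I made up for illustration, not my actual setup:

import os
import shutil

# Placeholder install locations; adjust to wherever Spark and the JDK actually live.
os.environ.setdefault("JAVA_HOME", r"C:\Program Files\Java\jdk-11.0.2")
os.environ.setdefault("SPARK_HOME", r"C:\spark\spark-2.4.3-bin-hadoop2.7")

# launch_gateway() builds the command it passes to Popen from SPARK_HOME,
# so these should all print True before SparkContext() can start the JVM gateway.
spark_submit = os.path.join(os.environ["SPARK_HOME"], "bin", "spark-submit.cmd")
print("SPARK_HOME exists:", os.path.isdir(os.environ["SPARK_HOME"]))
print("spark-submit.cmd exists:", os.path.isfile(spark_submit))
print("java found on PATH:", shutil.which("java") is not None)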

caitlyn carver

Posted 2019-06-04T13:59:11.290

Reputation: 1

Why are you importing pyspark twice, e.g. import pyspark and from pyspark import SparkContext? It seems like either 1) you want all of pyspark via import pyspark (and therefore likely sc = pyspark.SparkContext()), or 2) just SparkContext via from pyspark import SparkContext (and therefore sc = SparkContext()). See the sketch below this comment. – Anaksunaman – 2019-06-05T00:19:17.750
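As a sketch of the two styles the comment describes (either import on its own is enough; only one context can exist at a time, so the second option is shown commented out):

# Option 1: keep the package import and qualify the class
import pyspark
sc = pyspark.SparkContext()

# Option 2 (equivalent): import the class directly
# from pyspark import SparkContext
# sc = SparkContext()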

No answers