
I have been trying to get PySpark to work. I use the PyCharm IDE on a Windows 10 machine. For the setup I took these steps:

  • installed PySpark
  • installed Java 8u211
  • downloaded and copied winutils.exe into place
  • declared SPARK_HOME, JAVA_HOME and HADOOP_HOME and added them to Path
  • added the Spark folder and its zips to the Content Root
  • already tried: export SPARK_LOCAL_IP="127.0.0.1" in load-spark-env.sh and other hostname-related solutions

The error below occurs when starting PySpark from cmd (running it from inside PyCharm yields the same). How can I fix this?

error message:

Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32  
Type "help", "copyright", "credits" or "license" for more information.  
19/05/14 21:33:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable  
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties  
Setting default log level to "WARN".  
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).  
19/05/14 21:33:21 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.  
(the WARN line above is printed 16 times in total)
19/05/14 21:33:21 ERROR SparkContext: Error initializing SparkContext.  
java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.  
        at sun.nio.ch.Net.bind0(Native Method)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)  
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)  
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)  
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)  
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)  
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)  
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)  
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)  
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)  
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)  
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)  
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)  
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)  
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)  
        at java.lang.Thread.run(Unknown Source)  
19/05/14 21:33:21 WARN SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor).  This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:  
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)  
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)  
sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)  
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)  
java.lang.reflect.Constructor.newInstance(Unknown Source)  
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)  
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)  
py4j.Gateway.invoke(Gateway.java:238)  
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)  
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)  
py4j.GatewayConnection.run(GatewayConnection.java:238)  
java.lang.Thread.run(Unknown Source)  
19/05/14 21:33:21 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.  
(the WARN line above is printed 16 times in total)
(the same ERROR SparkContext message and stack trace are printed a second time for the retry)
W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\shell.py:45: UserWarning: Failed to initialize Spark session.  
  warnings.warn("Failed to initialize Spark session.")  
Traceback (most recent call last):  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\shell.py", line 41, in <module>  
    spark = SparkSession._create_shell_session()  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\sql\session.py", line 583, in _create_shell_session  
    return SparkSession.builder.getOrCreate()  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\sql\session.py", line 173, in getOrCreate  
    sc = SparkContext.getOrCreate(sparkConf)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\context.py", line 367, in getOrCreate  
    SparkContext(conf=conf or SparkConf())  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\context.py", line 136, in __init__  
    conf, jsc, profiler_cls)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\context.py", line 198, in _do_init  
    self._jsc = jsc or self._initialize_context(self._conf._jconf)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\context.py", line 306, in _initialize_context  
    return self._jvm.JavaSparkContext(jconf)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1525, in __call__  
    answer, self._gateway_client, None, self._fqn)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in get_return_value  
    format(target_id, ".", name), value)  
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.  
: java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.  
        at sun.nio.ch.Net.bind0(Native Method)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)  
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)  
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)  
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)  
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)  
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)  
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)  
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)  
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)  
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)  
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)  
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)  
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)  
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)  
        at java.lang.Thread.run(Unknown Source)  
Moritz
  • 111

3 Answers

8

In case anyone runs into the same problem:

import pyspark

conf = pyspark.SparkConf().set('spark.driver.host', '127.0.0.1')
sc = pyspark.SparkContext(master='local', appName='myAppName', conf=conf)

did the trick.
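If you would rather not hard-code the address in every script, setting the environment variable in the driver process before any SparkContext is created has the same effect. A minimal sketch -- the commented SparkSession form assumes Spark 2.x with pyspark installed:

```python
import os

# Binding hint read by Spark at startup; setting it here, before any
# SparkContext exists, is equivalent to editing load-spark-env.sh.
os.environ["SPARK_LOCAL_IP"] = "127.0.0.1"

# The same fix expressed through the SparkSession builder (Spark 2.x,
# requires pyspark to be installed):
# from pyspark.sql import SparkSession
# spark = (SparkSession.builder
#          .master("local")
#          .appName("myAppName")
#          .config("spark.driver.host", "127.0.0.1")
#          .getOrCreate())
```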

Moritz
  • 111
1

Not sure why Bishu's answer got a negative vote -- it is the right answer for Windows users. It worked for me.

Windows Steps

For folks not aware how to set system variables in Windows, here are the steps:

  1. In an open File Explorer window (with the left-hand folder navigation pane visible), locate "This PC"
  2. Right-click on "This PC" and choose "Properties"
  3. In the left nav menu, choose "Advanced system settings"
  4. In this new dialog, choose the bottom button "Environment Variables..."
  5. In the second (bottom) panel, choose "New..."
  6. For the "Variable" prompt, type: SPARK_LOCAL_IP
  7. For the "Value" prompt, type: localhost
  8. Note: it might be 127.0.0.1 or some other value on your system -- you should check what is listed in C:\Windows\System32\Drivers\etc\hosts
  9. Once done, close these dialogs
  10. Note: only NEW cmd prompts will load/recognize the new system variable -- open a fresh cmd prompt
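To see what step 8 refers to, you can read the hosts file with a few lines of Python. A hedged sketch; `hosts_entries` is a helper name invented here, not part of Spark or the standard library:

```python
from pathlib import Path

def hosts_entries(path):
    """Return the active (non-comment, non-blank) lines of a hosts file."""
    p = Path(path)
    if not p.exists():
        return []
    return [line.strip()
            for line in p.read_text().splitlines()
            if line.strip() and not line.strip().startswith("#")]

# On Windows the file mentioned in step 8 lives at:
# print(hosts_entries(r"C:\Windows\System32\Drivers\etc\hosts"))
```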

-- Hope that helps

0

On Windows

Create an environment variable as below:

Variable: SPARK_LOCAL_IP
Value: localhost

Bishu
  • 11