Connectivity issues in standalone Spark 4.0

I have installed standalone Spark 4.0 on an Azure VM. Python 3.11 and Jupyter are deployed on the same VM. In my notebook I ran the following program:

from pyspark.sql import SparkSession
spark = SparkSession.builder.remote("sc://192.168.2.5:15002").getOrCreate()
df = spark.range(10)
df.show()

Everything works fine. Now I'm trying to read sample data, submitting the following program:

UsersDF=spark.read.load("examples/src/main/resources/users.parquet","parquet")
UsersDF.show()

This program generates the following error message:

UnknownException: (java.net.ConnectException) Call From vm-name/192.168.2.5 to vm-name.internal.cloudapp.net:9001 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

I'd be grateful for any suggestions on how to fix it!

edited Nov 25, 2025 at 13:07 by DarkBee

asked Nov 24, 2025 at 16:16 by Ziggy

You are trying to connect to "sc://192.168.2.5:15002". Do you have a Spark cluster with the Connect service running?

– Frank, Nov 24, 2025 at 20:23

2 Answers

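The error message is misleading. The connection refused on port 9001 suggests that a path without a scheme is resolved against the configured default file system (apparently HDFS here) rather than the local disk. Writing the path with an explicit file:// scheme, as follows, works without any further problems: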

UsersDF=spark.read.load("file:///examples/src/main/resources/users.parquet","parquet")
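As a rough illustration of the fix above, the following sketch adds an explicit file:// scheme to an absolute local path and leaves already-qualified URIs alone. The helper name `with_file_scheme` and the `/opt/spark` install location are hypothetical and not from the original answer; only Python's standard library is used, so nothing here depends on Spark itself.

```python
from urllib.parse import quote, urlparse


def with_file_scheme(path: str) -> str:
    """Return `path` with an explicit file:// scheme if it has none yet."""
    if urlparse(path).scheme:
        # Already qualified (file://, hdfs://, s3a://, ...): leave untouched.
        return path
    if not path.startswith("/"):
        # A relative path cannot be turned into an unambiguous file URI.
        raise ValueError(f"expected an absolute path, got {path!r}")
    # Percent-encode special characters but keep the path separators.
    return "file://" + quote(path)


# Hypothetical install location, for illustration only.
print(with_file_scheme("/opt/spark/examples/src/main/resources/users.parquet"))
# prints: file:///opt/spark/examples/src/main/resources/users.parquet
```

The resulting string could then be passed to spark.read.load in place of the bare path. Rejecting relative paths is a deliberate choice in this sketch: the directory they would be resolved against depends on where the Spark server process runs, which the notebook cannot know.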

answered Nov 27, 2025 at 23:32 by Ziggy


For standalone Spark, see this example; do not connect to sc://192.168.2.5:15002, which is the Spark Connect port. If you want to use Spark Connect, make sure the service is running.

answered Nov 24, 2025 at 20:26 by Frank

2 Comments

Yes, Spark Connect is up and running.
– Ziggy, Nov 24, 2025 at 23:06

Then you need to open port 15002 on the VM. Use lsof -i -P -n | grep LISTEN to list the listening ports.
– Frank, Nov 24, 2025 at 23:16
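The port check suggested in the comment can also be done from the notebook itself. The sketch below is a minimal example using only Python's standard library; the function name `port_accepts_connections` is made up for illustration. It only tests whether a TCP connection can be established, not whether the service listening there is actually Spark Connect.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

With the address from the question, the call would be port_accepts_connections("192.168.2.5", 15002). The same check could be pointed at the host and port named in the error message (port 9001) to confirm that nothing is listening there.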
