
"Connection Refused" when querying value via Pandas on Spark

Rushil09
Level 2

Hi,

When I try to look up a value using pandas on Spark, I get a "Connection refused" error.

The dataset has 24 lakh (2.4 million) rows. The same lookup works fine with plain pandas and with core PySpark.

Can someone help?


Operating system used: Ubuntu

(Topic title edited by moderator to be more descriptive. Original title "Spark")

1 Reply
AlexT
Dataiker

Hi @Rushil09 ,

Can you please share a snippet of your actual code and tell us which Spark version you are using?

Are you trying to use the Pandas API on Spark?

If so, see the quickstart: https://spark.apache.org/docs/3.2.1/api/python/getting_started/quickstart_ps.html
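
For reference, a value lookup with the Pandas API on Spark might look roughly like the sketch below. This is not your code, only an illustration: the column names, sample data, and helper names (`find_rows`, `find_rows_on_spark`) are made up, and it assumes PySpark 3.2 or later with a working Spark session. The filter itself is written once and works on both a plain pandas DataFrame and a pandas-on-Spark DataFrame.

```python
import pandas as pd


def find_rows(df, column, value):
    """Return the rows where df[column] equals value.

    Boolean-mask filtering like this is supported by both plain pandas
    DataFrames and pandas-on-Spark (pyspark.pandas) DataFrames.
    """
    return df[df[column] == value]


def find_rows_on_spark(pdf, column, value):
    """Run the same lookup through the Pandas API on Spark.

    Assumption: PySpark 3.2+ is installed and a Spark session can be started.
    The import is kept inside the function so the rest of this sketch
    still works on a machine without Spark.
    """
    import pyspark.pandas as ps

    psdf = ps.from_pandas(pdf)               # distribute the pandas DataFrame
    matches = find_rows(psdf, column, value)  # same filter, executed by Spark
    return matches.to_pandas()                # collect the (small) result


# Hypothetical sample data, only for illustration
pdf = pd.DataFrame(
    {"id": [1, 2, 3, 4], "city": ["Pune", "Delhi", "Pune", "Mumbai"]}
)

# Plain pandas lookup
print(find_rows(pdf, "city", "Pune"))
```

With Spark available, `find_rows_on_spark(pdf, "city", "Pune")` should return the same rows. If the plain pandas call works and only the Spark call fails, that helps narrow the problem down to the Spark side.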

When working with PySpark in Dataiku, you should use the Dataiku API together with the Spark APIs:

https://doc.dataiku.com/dss/latest/code_recipes/pyspark.html#anatomy-of-a-basic-pyspark-recipe

Thanks
