
How to read the text file using pyspark in Dataiku

Solved!
dsahu
Level 1

I'm new to Dataiku and trying to read a text file using PySpark. I tried creating a DataFrame with spark.read.text() and also used the Spark context to create an RDD, but both methods throw errors. When I create a Spark context, it fails with "RuntimeError: Java gateway process exited before sending its port number". When I use a Spark session instead, I get a "no space left on device" error. Below is the code I'm using.

Spark context:

# Import Dataiku APIs, including the PySpark layer
import dataiku
from dataiku import spark as dkuspark

# Import Spark APIs, both the base SparkContext and higher level SQLContext
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlContext = SQLContext(sc)

dataset1 = dataiku.Dataset("Dataset")
df1 = dkuspark.get_dataframe(sqlContext, dataset1)

Spark session:

# Initialize a SparkSession (requires the import below)
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('test').getOrCreate()


Your assistance would really help me a lot. Thanks!

1 Solution
dsahu
Level 1
Author

Hi @AlexT, thanks for replying. It seems it was a temporary issue.

2 Replies
AlexT
Dataiker

Hi,

Could you please open a support ticket and attach the job diagnostics:
https://doc.dataiku.com/dss/latest/troubleshooting/problems/job-fails.html#getting-a-job-diagnosis

Thanks
