How to read a text file using PySpark in Dataiku

dsahu Registered Posts: 3

I'm new to Dataiku and am trying to read a text file using PySpark. I tried creating a DataFrame and also used a Spark context to create an RDD, but both methods throw errors. When I create a Spark context, it fails with "RuntimeError: Java gateway process exited before sending its port number". When I use a Spark session instead, it says that the "device has no space left". Below is the code I'm using.

Spark context:

# Import Dataiku APIs, including the PySpark layer
import dataiku
from dataiku import spark as dkuspark

# Import Spark APIs, both the base SparkContext and higher level SQLContext
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlContext = SQLContext(sc)

dataset1 = dataiku.Dataset("Dataset")
df1 = dkuspark.get_dataframe(sqlContext, dataset1)

Spark session:

# Import and initialize SparkSession
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('test').getOrCreate()

Your assistance would really help me a lot. Thanks!
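
Both errors can often be narrowed down before Spark starts at all. The "Java gateway" error commonly (though not always) means PySpark could not launch a JVM, and "no space left" usually points at a full disk, often the scratch directory Spark writes to. The snippet below is a minimal diagnostic sketch, not a fix: the helper name check_spark_prerequisites is hypothetical, and it assumes Spark uses the system temp directory for scratch data (the default when spark.local.dir is not set elsewhere).

```python
import os
import shutil
import tempfile


def check_spark_prerequisites(scratch_dir=None):
    """Report whether Java is discoverable and how much scratch space is free.

    Hypothetical helper for illustration; it only inspects the local
    environment and never starts Spark.
    """
    # Assumption: Spark writes scratch data to the system temp directory
    # unless spark.local.dir (or the cluster manager) points elsewhere.
    scratch_dir = scratch_dir or tempfile.gettempdir()
    usage = shutil.disk_usage(scratch_dir)
    return {
        # None here means JAVA_HOME is not set for this process.
        "java_home": os.environ.get("JAVA_HOME"),
        # None here means no `java` executable was found on PATH.
        "java_on_path": shutil.which("java"),
        "scratch_dir": scratch_dir,
        "free_gb": round(usage.free / 1024 ** 3, 2),
    }


if __name__ == "__main__":
    for key, value in check_spark_prerequisites().items():
        print(f"{key}: {value}")
```

If both Java entries come back as None, or the free space is close to zero, the environment is the more likely culprit than the PySpark code itself. Run it in the same notebook or recipe environment where the errors occur, since the result depends on that process's environment variables.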
