import dataiku
from dataiku import spark as dkuspark
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)

# Read recipe inputs
internal = dataiku.Dataset("internal22")  # internal22 is a Hive table
internal_df = dkuspark.get_dataframe(sqlContext, internal)
internal_df.count()  # returns 0, but the table actually has millions of records
Hi @sigma_loge ,
Could you try running the same (or similar) basic Spark code in a PySpark recipe and share the resulting job diagnostics with support?
https://doc.dataiku.com/dss/latest/code_recipes/pyspark.html#anatomy-of-a-basic-pyspark-recipe
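For reference, a basic PySpark recipe along the lines of that doc might look like the sketch below. It reads the same "internal22" dataset as in your snippet, prints the row count and schema that the Spark job actually sees, and writes to an output dataset. The output dataset name "internal22_out" is just a placeholder; this only runs inside a DSS PySpark recipe with the datasets declared as recipe input/output.

import dataiku
from dataiku import spark as dkuspark
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)

# Read the input dataset (same one as in the question)
internal = dataiku.Dataset("internal22")
internal_df = dkuspark.get_dataframe(sqlContext, internal)

# Trivial checks: what count and schema does the Spark job see?
print(internal_df.count())
internal_df.printSchema()

# Write to an output dataset ("internal22_out" is a placeholder name)
output = dataiku.Dataset("internal22_out")
dkuspark.write_with_schema(output, internal_df)

If the count printed here is also 0, the job diagnostics from this run are exactly what support will want to look at.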
Once it has run, please grab the job diagnostics:
https://doc.dataiku.com/dss/latest/troubleshooting/problems/job-fails.html
Then raise a ticket and share the diagnostics directly with support (not on the Community): https://doc.dataiku.com/dss/latest/troubleshooting/obtaining-support.html
Thanks