Added on February 27, 2023 1:31PM
import dataiku
from dataiku import spark as dkuspark
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)

# Read recipe inputs
internal = dataiku.Dataset("internal22")  # internal22 is a Hive table
internal_df = dkuspark.get_dataframe(sqlContext, internal)
internal_df.count()  # returns 0, but the table actually has millions of records
Hi @sigma_loge,
Could you try running the same or similar basic Spark code in a PySpark recipe and share the job diagnostics with support?
https://doc.dataiku.com/dss/latest/code_recipes/pyspark.html#anatomy-of-a-basic-pyspark-recipe
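For reference, a basic PySpark recipe along the lines of that doc page might look like the sketch below. This is only a rough diagnostic sketch, not a definitive fix: the dataset name "internal22" comes from the post, while the Hive database name "default" and the output dataset "internal22_copy" are assumptions you would replace with your own. Comparing the count obtained through the Dataiku dataset reader against a direct Spark SQL count on the Hive table can help narrow down where the rows are being lost.

```python
# Minimal PySpark recipe sketch for comparing row counts.
# Assumptions: the recipe has "internal22" as input and a hypothetical
# "internal22_copy" as output; the Hive table lives in the "default" database.
import dataiku
from dataiku import spark as dkuspark
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)

# 1) Read the input through the Dataiku-Spark integration
internal = dataiku.Dataset("internal22")
internal_df = dkuspark.get_dataframe(sqlContext, internal)
internal_df.printSchema()
print("Count via dkuspark:", internal_df.count())

# 2) If your Spark build has Hive support enabled, compare with a direct
#    Spark SQL count on the underlying Hive table (assumed location)
hive_count = sqlContext.sql("SELECT COUNT(*) AS c FROM default.internal22")
hive_count.show()

# 3) Write to the output dataset so the run produces full job diagnostics
output = dataiku.Dataset("internal22_copy")
dkuspark.write_with_schema(output, internal_df)
```

If the direct Spark SQL count is correct but the `dkuspark` count is 0, the issue is likely in how the DSS dataset maps to the table (connection, schema, or partitioning settings) rather than in Spark itself.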
Once you've run it, please grab the job diagnostics:
https://doc.dataiku.com/dss/latest/troubleshooting/problems/job-fails.html
Then raise a ticket and share this with support directly (not on the Community): https://doc.dataiku.com/dss/latest/troubleshooting/obtaining-support.html
Thanks