Hive/Dremio table to pyspark Dataframe

Registered Posts: 1

import dataiku
from dataiku import spark as dkuspark
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import SparkSession
sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)

# Read recipe inputs
internal= dataiku.Dataset("internal22") #internal22 is a hive table
internal_df= dkuspark.get_dataframe(sqlContext, internal)

internal_df.count()# it return as 0 but actual it has million records

Answers

Welcome!

It looks like you're new here. Sign in or register to get started.

Welcome!

It looks like you're new here. Sign in or register to get started.