Spark DF float value appears as null values in notebook

JBB_LBP
Level 1

Hi,

I uploaded a CSV file into a DSS dataset and I am trying to convert it into a Spark DataFrame.

"mahout_dict" is the DSS dataset, with 2 columns: lb_word (string) and id_word (float):

[Screenshot: Capture.PNG]

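import dataiku
import dataiku.spark as dkuspark

# sqlContext is assumed to have been created earlier in the notebook
mahout_dict = dataiku.Dataset("mahout_dict")
mahout_dict_df = dkuspark.get_dataframe(sqlContext, mahout_dict)
mahout_dict_df.show()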

The result is:

+-------+-------+
|lb_word|id_word|
+-------+-------+
|    000|   null|
|     06|   null|
|     08|   null|
|     09|   null|
|      1|   null|
|     10|   null|
|     11|   null|
|     14|   null|
|     18|   null|
|      2|   null|
|   2000|   null|
|   2001|   null|
|   2003|   null|
|   2004|   null|
|   2005|   null|
|   2006|   null|
|   2007|   null|
|   2008|   null|
|   2009|   null|
|   2010|   null|
+-------+-------+
only showing top 20 rows

Can you help me?

1 Reply
fchataigner2
Dataiker

Hi,

It might be that the id_word column has extra whitespace to the left or right of the float value, which could keep the values from being parsed as floats. Can you share the CSV file (or part of it)?
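
In the meantime, one way to test the whitespace hypothesis is to scan the raw CSV text outside of Spark. The sketch below is only an illustration: the helper name find_suspect_values, the comma delimiter and the sample rows are assumptions, not taken from the actual file. It flags id_word values that carry leading or trailing whitespace, or that do not parse as floats at all.

```python
import csv
import io


def find_suspect_values(csv_text, column="id_word", delimiter=","):
    """Return (line_number, raw_value, reason) tuples for suspicious values.

    Hypothetical helper: the column name and delimiter are assumptions
    and should be adjusted to match the real file.
    """
    suspects = []
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    for row in reader:
        line_number = reader.line_num  # line of the row just read
        raw = row.get(column)
        if raw is None:
            # The row has fewer fields than the header.
            suspects.append((line_number, raw, "missing value"))
        elif raw != raw.strip():
            # Leading or trailing whitespace around the value.
            suspects.append((line_number, raw, "surrounding whitespace"))
        else:
            try:
                float(raw)
            except ValueError:
                suspects.append((line_number, raw, "not a float"))
    return suspects


# Made-up sample rows, for illustration only.
sample = "lb_word,id_word\n000, 1.0\n06,2.0\n08,3.0 \n09,abc\n"
for suspect in find_suspect_values(sample):
    print(suspect)
```

If this reports surrounding whitespace on the real file, trimming the column before it is typed as float (for example in a preparation step) would be the natural next thing to try.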
