Spark DF float value appears as null values in notebook

JBB_LBP
Level 1
Hi,

I uploaded a CSV file into a DSS dataset and I am trying to convert it into a Spark DataFrame.

"mahout_dict" is the DSS dataset, with 2 columns: lb_word (string) and id_word (float):

Capture.PNG

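import dataiku
import dataiku.spark as dkuspark
# sqlContext is assumed to already exist in the notebook session
mahout_dict = dataiku.Dataset("mahout_dict")
mahout_dict_df = dkuspark.get_dataframe(sqlContext, mahout_dict)
mahout_dict_df.show()
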
The result is :

+-------+-------+
|lb_word|id_word|
+-------+-------+
|    000|   null|
|     06|   null|
|     08|   null|
|     09|   null|
|      1|   null|
|     10|   null|
|     11|   null|
|     14|   null|
|     18|   null|
|      2|   null|
|   2000|   null|
|   2001|   null|
|   2003|   null|
|   2004|   null|
|   2005|   null|
|   2006|   null|
|   2007|   null|
|   2008|   null|
|   2009|   null|
|   2010|   null|
+-------+-------+
only showing top 20 rows

Can you help me?

fchataigner2
Dataiker

Hi,

It might be that the values in the id_word column have extra whitespace to the left or right of the float value. Can you share the CSV file (or part of it)?
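
If it helps to check the whitespace hypothesis before sharing the file, here is a minimal sketch in plain Python (standard library only, no DSS or Spark API involved). It scans the raw CSV text and reports values in the id_word column that carry surrounding whitespace or do not parse as a float. The function name `find_suspect_values`, the comma delimiter, and the sample data are assumptions made for illustration; adjust them to match the actual file.

```python
import csv
import io


def find_suspect_values(csv_text, column="id_word", delimiter=","):
    """Return (line_number, raw_value, reason) for values that may fail float parsing.

    Hypothetical helper: it only inspects the raw text and does not
    reproduce how DSS or Spark actually parse the column.
    """
    suspects = []
    reader = csv.DictReader(io.StringIO(csv_text, newline=""), delimiter=delimiter)
    for row in reader:
        raw = row.get(column)
        if raw is None:
            # The row is shorter than the header, or the column does not exist.
            suspects.append((reader.line_num, raw, "missing value"))
        elif raw != raw.strip():
            # Leading or trailing whitespace around the value.
            suspects.append((reader.line_num, raw, "surrounding whitespace"))
        else:
            try:
                float(raw)
            except ValueError:
                suspects.append((reader.line_num, raw, "not a float"))
    return suspects


if __name__ == "__main__":
    # Made-up sample; replace with the content of the uploaded CSV file.
    sample = "lb_word,id_word\n000, 1.0\n06,2.0\n08,n/a\n"
    for line_num, raw, reason in find_suspect_values(sample):
        print(line_num, repr(raw), reason)
```

Line numbers count the header as line 1. If the report is empty, whitespace is probably not the cause and the dataset's format settings or schema would be the next thing to look at.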
