Spark DF float value appears as null values in notebook

JBB_LBP · April 2021

Hi,

I uploaded a CSV file into a DSS dataset and I am trying to convert it into a Spark DF:

import dataiku
import dataiku.spark as dkuspark

# sqlContext is assumed to have been created earlier in the notebook
mahout_dict = dataiku.Dataset("mahout_dict")
mahout_dict_df = dkuspark.get_dataframe(sqlContext, mahout_dict)
mahout_dict_df.show()

"mahout_dict" is the DSS dataset with 2 columns: lb_word (string) and id_word (float).

The result is:

+-------+-------+
|lb_word|id_word|
+-------+-------+
|    000|   null|
|     06|   null|
|     08|   null|
|     09|   null|
|      1|   null|
|     10|   null|
|     11|   null|
|     14|   null|
|     18|   null|
|      2|   null|
|   2000|   null|
|   2001|   null|
|   2003|   null|
|   2004|   null|
|   2005|   null|
|   2006|   null|
|   2007|   null|
|   2008|   null|
|   2009|   null|
|   2010|   null|
+-------+-------+
only showing top 20 rows

Can you help me?

fchataigner2 · April 2021

Hi,

It might be that the id_word column has extra whitespace to the left or right of the float value. Can you share the CSV file (or part of it)?
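
One way to check the whitespace hypothesis locally before sharing the file is to scan the raw CSV for padded values. The following is only a minimal sketch using Python's standard `csv` module. It assumes the file is comma-separated with a header row and that the column is named `id_word`. The helper `find_padded_values` and the sample data are made up for illustration and are not part of DSS or Spark.

```python
import csv
import io


def find_padded_values(csv_text, column="id_word", delimiter=","):
    """Return (line_number, raw_value) pairs whose value has
    leading or trailing whitespace in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    padded = []
    for row in reader:
        raw = row[column]
        # raw is None when a row has fewer fields than the header
        if raw is not None and raw != raw.strip():
            # line_num counts lines read so far; the header is line 1
            padded.append((reader.line_num, raw))
    return padded


# Made-up sample, NOT the data from the original post
sample = "lb_word,id_word\n000,1.0\n06, 2.0\n08,3.0 \n"

for line_no, raw in find_padded_values(sample):
    # repr() makes the surrounding spaces visible
    print(line_no, repr(raw))
```

If padded values do show up, one thing to try on the Spark side would be reading the column as a string, applying `pyspark.sql.functions.trim`, and then casting it to float. Whether that resolves the nulls here depends on what the file actually contains.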
