Spark DF float value appears as null values in notebook
JBB_LBP
Hi,
I uploaded a CSV file into a DSS dataset and I am trying to convert it into a Spark DataFrame:
mahout_dict = dataiku.Dataset("mahout_dict")
mahout_dict_df = dkuspark.get_dataframe(sqlContext, mahout_dict)
mahout_dict_df.show()
"mahout_dict" is the DSS dataset, with 2 columns: lb_word (string) and id_word (float).
The result is:
+-------+-------+
|lb_word|id_word|
+-------+-------+
|    000|   null|
|     06|   null|
|     08|   null|
|     09|   null|
|      1|   null|
|     10|   null|
|     11|   null|
|     14|   null|
|     18|   null|
|      2|   null|
|   2000|   null|
|   2001|   null|
|   2003|   null|
|   2004|   null|
|   2005|   null|
|   2006|   null|
|   2007|   null|
|   2008|   null|
|   2009|   null|
|   2010|   null|
+-------+-------+
only showing top 20 rows
Can you help me?
Answers
Hi,
it might be that the id_word column has extra whitespace to the left or right of the float value, which would prevent the value from being parsed as a float. Can you share the CSV file (or part of it)?
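In the meantime, a quick way to test the whitespace hypothesis is to scan the raw CSV outside of Spark. The snippet below is only a rough sketch in plain Python: the helper name `find_suspect_values`, the comma delimiter, and the sample rows are assumptions for illustration, not taken from the actual file. It lists every id_word value that is not a clean decimal number, so stray spaces show up in the quoted output.

```python
import csv
import io
import re

# A plain decimal number such as 12, -3.5 or 1e-4, with no surrounding whitespace.
STRICT_FLOAT = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def find_suspect_values(csv_text, column="id_word", delimiter=","):
    """Return (row_position, raw_value) pairs whose value is not a clean float.

    Hypothetical helper: row_position counts the header as row 1.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    suspects = []
    for row_position, row in enumerate(reader, start=2):
        raw = row[column]
        # A missing field comes back as None; anything else must match fully.
        if raw is None or not STRICT_FLOAT.fullmatch(raw):
            suspects.append((row_position, raw))
    return suspects


# Made-up sample, NOT the poster's data: two values carry stray spaces.
sample = "lb_word,id_word\n000, 1.0\n06,2.0 \n08,3.0\n"

for row_position, raw in find_suspect_values(sample):
    # repr() makes leading and trailing whitespace visible.
    print(row_position, repr(raw))
```

To run it on the real file, read the file contents into a string and pass them in, adjusting `delimiter` if the CSV uses something other than a comma. If nothing is reported, whitespace is probably not the cause and the dataset schema or format settings would be the next thing to check.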