
Read dataframe with datatype

davidmakovoz

I have a dataset with a column 'Serial Number' whose data type is string, Text (see attached).

When I read it in a notebook:

import dataiku

mydataset = dataiku.Dataset(dataset_name)
df_f3 = mydataset.get_dataframe()
df_f3['Serial Number'].dtypes

I get dtype('int64').

By then it is too late to convert it to a string, because the original values have leading zeros, which are lost when the values are read as integers.

How can I force it to read the column as a string? I tried

df_f3 = mydataset.get_dataframe(infer_with_pandas=False)

but this failed for an unrelated reason, in a different column:

ValueError: Integer column has NA values in column 47

I'm using DSS Version 9.0.7

CatalinaS
Dataiker

Hi @davidmakovoz ,

If you want to keep the leading zeros of the original values, you should indeed use

df_f3 = mydataset.get_dataframe(infer_with_pandas=False) 

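In your case it is most likely failing because there are empty cells in the other column. pandas cannot store empty values in a plain integer column: when it infers the types itself, it converts such a column to double and uses NaN for the empty values, but with infer_with_pandas=False the integer type from the dataset schema is enforced, so the read fails.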
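The behaviour described above can be reproduced outside DSS with plain pandas. The following is only a sketch: the two-row CSV sample and the 'Count' column are made up for illustration, and pd.read_csv stands in for whatever get_dataframe does internally.

```python
import io
import pandas as pd

# Hypothetical two-row sample: serial numbers with leading zeros,
# and an integer column ('Count') with one empty cell.
csv_text = "Serial Number,Count\n00123,5\n00456,\n"

# Default inference: serial numbers become int64 (leading zeros are lost)
# and 'Count' becomes float64 with NaN for the empty cell.
inferred = pd.read_csv(io.StringIO(csv_text))

# Forcing the string type keeps the leading zeros.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"Serial Number": str})

# Forcing a plain integer type on a column with an empty cell raises ValueError.
try:
    pd.read_csv(
        io.StringIO(csv_text),
        dtype={"Serial Number": str, "Count": "int64"},
    )
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Here inferred shows the original problem, as_text shows the desired result for the serial numbers, and error_message holds the same kind of error as the one reported for column 47.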

You should check whether the other column (column 47 in the error message) contains empty values and, if it does, replace them with an integer such as 0.
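As a rough illustration of the fill-then-cast idea, here is the same replacement on a plain pandas frame. The frame and the 'Count' column are hypothetical stand-ins; in DSS the empty values would need to be replaced in the dataset itself, before the notebook reads it with infer_with_pandas=False.

```python
import pandas as pd

# Hypothetical stand-in for the dataset: a string column with leading zeros
# and a numeric column that contains one empty value.
df = pd.DataFrame(
    {
        "Serial Number": ["00123", "00456"],
        "Count": [5.0, None],
    }
)

# Count the empty values first, to confirm this column is the culprit.
n_missing = int(df["Count"].isna().sum())

# Replace empty values with 0; the column can then be held as a plain integer.
df["Count"] = df["Count"].fillna(0).astype("int64")
```

Whether 0 is a safe placeholder depends on the data: if 0 is also a legitimate value in that column, the filled rows can no longer be told apart from real zeros.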
