NumberFormatException: For input string: Data type wrongly interpreted on import from Excel
I created a dataset by importing an Excel file. The data type of one column was incorrectly identified as integer because the first few records happened to contain data that could be read as integers. When I change the data type to string, DSS returns the error "NumberFormatException: For input string:" when I try to explore the data.
I have tried deleting the data and reloading the Excel file, but this doesn't work.
The Excel file is small, i.e. fewer than 1,000 rows and 22 columns.
How do I correct this?
Can I force DSS to re-evaluate the data?
Answers
-
Turribeach
When you say "When I change the data type to string", can you please clarify where exactly you changed it (which screen, tab, field, etc.)? In general you should only change data types in a Prepare recipe or in a Code recipe, where you have control over how to deal with exceptions.
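As a rough illustration of what "control over how to deal with exceptions" can look like in a code recipe, here is a minimal sketch in plain Python. It does not use any Dataiku API; the function name `to_int_or_none` and the sample values are hypothetical. Values that cannot be parsed as integers are kept aside instead of raising an error.

```python
def to_int_or_none(value):
    """Return value as an int, or None if it cannot be parsed as one."""
    try:
        return int(str(value).strip())
    except ValueError:
        # Text such as "A-104" or an empty cell ends up here.
        return None


# Hypothetical column: mostly numeric, with one text value and one empty cell.
raw_column = ["101", "102", " 103 ", "A-104", ""]

converted = [to_int_or_none(v) for v in raw_column]
unparsed = [v for v in raw_column if to_int_or_none(v) is None]

print(converted)
print(unparsed)
```

Listing the unparsed values first makes it easier to decide whether the column should really be an integer or should stay a string.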
-
I think I have found the cause of the error. I followed these steps:
1. + Dataset
2. Upload a file
3. Select the Excel file. The first 5 rows of the column have data that could be integers, but row 6 has string data. DSS nevertheless appears to identify the column as integer.
4. DSS auto-detect identified the data as integer, and I incorrectly saved and proceeded to build recipes and so on.
5. This is a beginner's mistake. What I should have done is go to the schema tab and change the data type.
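The failure mode in the steps above can be sketched in plain Python. This is only an illustration of type guessing from a sample of rows, not a description of how DSS actually detects types; the function `infer_type` and the sample column are hypothetical.

```python
def infer_type(values):
    """Return "integer" if every non-empty value parses as an int, else "string"."""
    for v in values:
        text = str(v).strip()
        if text == "":
            continue  # ignore empty cells
        try:
            int(text)
        except ValueError:
            return "string"
    return "integer"


# Hypothetical column: the first 5 rows look like integers, row 6 does not.
column = ["1", "2", "3", "4", "5", "N/A-6", "7"]

sample_guess = infer_type(column[:5])  # guess based on the first 5 rows only
full_guess = infer_type(column)        # guess based on every row

print(sample_guess, full_guess)
```

A guess based on the first few rows says integer, while the full column can only be stored safely as a string, which is consistent with a number parsing error appearing later when the remaining rows are read.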
How do I go about correcting this?
I am unable to overwrite the dataset, and changing the data type in the columns view of the dataset doesn't have any effect.
-
Hello,
Just to be sure I understand: when you say DSS identifies the data as integer, do you mean (1) the storage type or (2) the meaning? (In the Explore view, under the name of the column, the first type is the storage type and the second one is the meaning.)
Best,
Mayeul