Force substring to integer
Here is the sample table:
ID | Column A |
1 | AA2001 |
2 | BB2002 |
I want to add a Column B, in which the values are forced to be integer.
I know I can do Column B = substring (Column A, -4), and Dataiku will automatically convert the values to integer, but the conversion process is a black box to me, and I don't know what's the conversion criteria/logic and when it might fail.
I thought I could add "numval" in front of the substring to force the conversion but it didn't work and returned blank.
Is there a logic I could apply to achieve this? Basically something like:
pd.to_numeric(df['Column A'].str[:4], errors='coerce')
Operating system used: win 11
Best Answer
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,090 Neuron
In a Prepare recipe use a Formula processor and add this:
if(isError(toNumber(substring(val("Your Column Name"), -4))), 0, toNumber(substring(val("Your Column Name"), -4)))
In my sample if the convertion fails it returns a zero. You can enhance it by replacing all possible characters with nothing to remove any possible failures.
Formulas documentation: https://doc.dataiku.com/dss/12/formula/index.html