Force substring to integer

yesitsmeoffical Registered Posts: 15
edited July 16 in Using Dataiku

Here is the sample table:

IDColumn A

I want to add a Column B, in which the values are forced to be integer.

I know I can do Column B = substring (Column A, -4), and Dataiku will automatically convert the values to integer, but the conversion process is a black box to me, and I don't know what's the conversion criteria/logic and when it might fail.

I thought I could add "numval" in front of the substring to force the conversion but it didn't work and returned blank.

Is there a logic I could apply to achieve this? Basically something like:

pd.to_numeric(df['Column A'].str[:4], errors='coerce')

Operating system used: win 11


Best Answer

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,740 Neuron
    edited July 17 Answer ✓

    In a Prepare recipe use a Formula processor and add this:

    if(isError(toNumber(substring(val("Your Column Name"), -4))), 0, toNumber(substring(val("Your Column Name"), -4)))

    In my sample if the convertion fails it returns a zero. You can enhance it by replacing all possible characters with nothing to remove any possible failures.

    Formulas documentation:

Setup Info
      Help me…