Better Parsing of Numbers from Text Files.

tgb417

User Story:

As a data analyst who gets data from all sorts of places, I'd like Dataiku DSS to do more of the heavy lifting when parsing number columns, so that I don't have to spend a lot of time working out how to do the parsing on my own.

Example:

  • One accounting-oriented dataset I was looking at, loaded from a CSV file, used the accounting style for negative numbers, with parentheses before and after the number, e.g. (100) for -100. See here for a more complete discussion.
  • I suspect that others have other examples. Please jump in with more examples.
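
To make the accounting case concrete, here is a minimal Python sketch of the kind of parsing being asked for. It is not the workaround from the linked discussion and not anything DSS does today; the function name `parse_accounting_number` is hypothetical, and it assumes "," as the thousands separator, "." as the decimal mark, and an optional leading "$".

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional

# Matches a value fully wrapped in parentheses, e.g. "(100)".
_PARENTHESIZED = re.compile(r"^\((.*)\)$")


def parse_accounting_number(text: str) -> Optional[Decimal]:
    """Parse a number that may use accounting-style negatives.

    Hypothetical helper: assumes "," thousands separators, "." as the
    decimal mark, and an optional leading "$". Returns None when the
    text cannot be read as a number.
    """
    cleaned = text.strip()
    negative = False

    match = _PARENTHESIZED.match(cleaned)
    if match:
        # "(100)" means -100 in accounting notation.
        negative = True
        cleaned = match.group(1).strip()

    # Drop the assumed currency symbol and thousands separators.
    cleaned = cleaned.lstrip("$").replace(",", "").strip()

    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    return -value if negative else value
```

Something like this could be mapped over a text column in a Python step, with the unparsed (None) values reviewed separately.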

Possible solutions:

  • It would be nice if the import parser dealt with these common-ish cases.
  • Or it would be nice if the Extract Number visual recipe step could handle more cases.
2 votes


Comments

  • tgb417

    Thanks to @Ignacio_Toledo, there is a viable workaround for the accounting habit of using (100) to mean -100. See here for details.

    That said, better parsing would always be welcome.
