How to remove scientific notation in a column

Dataiku
Dataiku Administrator, Dataiker, Alpha Tester Posts: 88 Administrator
edited July 16 in Knowledge Base

Formatting numbers can often be a tedious data cleaning task.

It can be made easier with the format() function of the DSS Formula language. This function takes a "printf format string" and applies it to any value.

Format strings are immensely powerful, as they allow you to truncate strings, change precision, switch between numerical notations, left-pad strings, pad numbers with zeros, etc. More specifically, DSS formulas use the Java variant of format strings.

For example, you may have a column of very small numbers represented in scientific notation. If instead you wanted to convert this column to 5 decimal places, you can use:

format("%.5f", my_column_name)

What’s next?

For more information, see:

Comments

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron

    Can one use this approach to change the time zone of a parsed date time to the local time zone. In the Western Hemisphere always showing parsed date time in GMT cause challenges for analysts an those consuming the insights created by analysts.

  • jfyuen
    jfyuen Dataiker, Registered Posts: 12 Dataiker

    As a rule of thumb, you never want to manipulate date using string formatting but date functions. You will run into daylight saving time issues, calendar problems, ...

    To avoid such problems, DSS only displays dates in UTC.

    For your problem, you can use the format date processor and select a static timezone (or a timezone column) to format the date.

    However values will be converted to strings and won't be dates anymore. Be careful when using them afterwards other than for display.

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron

    Can one use these time zone corrected strings in charts as dates with the real date features of the charting system?

  • jfyuen
    jfyuen Dataiker, Registered Posts: 12 Dataiker

    Unfortunately no, it will only be treated as string.

    You can however "cheat" by reconverting the parsed date to a new date with a standard format date to the UTC timezone. It will be displayed with the "good" time while staying UTC under the hood.

    However, only use that column for display and not computation as it will differ from your real data.

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron

    @jfyuen

    I'm going to open a new thread on the date-time part of this discussion. Rather than further cluttering up this thread about Scientific Notation with Date Time Things.

    https://community.dataiku.com/t5/Using-Dataiku-DSS/Confusions-about-how-DSS-handles-Dates-and-Times/m-p/11928#M5380

Setup Info
    Tags
      Help me…