The Dataiku Frontrunner Awards have just launched to recognize your achievements! Submit Your Entry

How to remove scientific notation in a column

Formatting numbers can often be a tedious data cleaning task.

It can be made easier with the format() function of the DSS Formula language. This function takes a "printf format string" and applies it to any value. 

Format strings are immensely powerful, as they allow you to truncate strings, change precision, switch between numerical notations, left-pad strings, pad numbers with zeros, etc. More specifically, DSS formulas use the Java variant of format strings.

For example, you may have a column of very small numbers represented in scientific notation. If instead you wanted to convert this column to 5 decimal places, you can use:

format("%.5f", my_column_name)

 

What’s next?

For more information, see:

 

Comments
tgb417
Neuron
Neuron

Can one use this approach to change the time zone of a parsed date time to the local time zone. In the Western Hemisphere always showing parsed date time in GMT cause challenges for analysts an those consuming the insights created by analysts.  

jfyuen
Dataiker
Dataiker

As a rule of thumb, you never want to manipulate date using string formatting but date functions. You will run into daylight saving time issues, calendar problems, ...

To avoid such problems, DSS only displays dates in UTC.

For your problem, you can use the format date processor and select a static timezone (or a timezone column) to format the date.

However values will be converted to strings and won't be dates anymore. Be careful when using them afterwards other than for display.

tgb417
Neuron
Neuron

Can one use these time zone corrected strings in charts as dates with the real date features of the charting system? 

jfyuen
Dataiker
Dataiker

Unfortunately no, it will only be treated as string.

You can however "cheat" by reconverting the parsed date to a new date with a standard format date to the UTC timezone. It will be displayed with the "good" time while staying UTC under the hood.

However, only use that column for display and not computation as it will differ from your real data.

tgb417
Neuron
Neuron

@jfyuen 

I'm going to open a new thread on the date-time part of this discussion.  Rather than further cluttering up this thread about Scientific Notation with Date Time Things.

https://community.dataiku.com/t5/Using-Dataiku-DSS/Confusions-about-how-DSS-handles-Dates-and-Times/...

Labels (3)
Version history
Last update:
‎05-13-2020 02:59 PM
Updated by:
Contributors