Trailing zeros in Metrics
I am using the Dashboard facility to show metrics that have been rounded to 1 d.p. within a prepare recipe. The resulting metrics sometimes show several trailing zeros, after the 1st decimal place. Is there a way for me to resolve this?
I have checked the underlying dataset (which has just 1 row of data) and the all numbers correctly show just 1 d.p. The extra zeros seem to be generated when I turn the numbers into a metric.
I attach screenshot as an example:
-The number is the top-left, shows as 1165, when I would like to see 1165.0
-The numbers in the top-right and bottom-left show 5 extra trailing zeros
-The number in the bottom-right correctly show the number the correct format.
Operating system used: MacOS Sonoma 14.2.1
Best Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,982 Neuron
So I did some digging and this has nothing to do with dashboards but it's to do with the value of your metrics. If you want to have a single decimal point in your numbers you need to use the round() function to truncate the decimals of the value in your datset, a format function is merely doing for display purposes but not behind the scenes.
I couldn't find a way to do this in a prepare recipe, there doesn't seem to be a built-in formula language function that can round a number to specific decimals. The existing round() function only allows to round a number to the nearest integer. So I created a Custom Python Function which does the rounding:
# Modify the process function to fit your needs import pandas as pd def process(row): number = float(row['Double_Number']) number_rounded = round(number, 1) return number_rounded
This is not going to be very fast as it will run this Python code on each individual row but since you said this is a single row dataset it should be fine.
Another way to do it is to format the number (which returns a string) and then convert it back to a number using this formula:
toNumber(format("%.1f", Double_Number))
This will probably be a lot faster. Alternatively if you are using custom Python metrics probes you can use Python's round() function there. Now will this solve all your number format issues? Well no, turns out if the rounded number ends with trailing zeros Dataiku will remove them. So in my result set I did two custom Python probes and showed the min (12334.6) and the max (4325468.0) of Double_Number_Rounded and this is what you get:
You could argue Dataiku is doing the right thing, a trailing zero is superfluous and has value. However you could also argue that if you want to see numbers all with the same precision Dataiku shouldn't be messing with them. In this case I sit more close to the latter than the former since I tend to prefer flexibility of deciding how I want things rather than programs deciding for me.
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
As a possible work around, that should work fully in the formula language.
To for example get 1 decimal place
- What about multiplying the number you have in [my_collumn] by 10,
- then rounding,
- then divide by 10
(round([my_column]*10))/10
To get two decimal places multiply and divide by 100. Etc,etc,etc and so forth.
That said it might be nice to put in a product idea to extend the round function to take a number of decimal places to round.
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,982 Neuron
Hi Tom, I like your solution much better and it's likely to be faster as no data type change. I went ahead and raised the Product Idea for the improvement on the existing round() function:
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
I've upvoted the idea you have posted.
@Rupesh_Sethi
I'd invite you to up vote the product idea that @Turribeach
has submitted to make this less of a challenge for the next person.
You can also mark one of the posts here as a solution if we have been able to assist. This apparently may make it easier for others to find their answers. -
Many thanks for your responses @tgb417
@Turribeach
. They are all workable solutions. -
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
If you are so, inclined, let us know which one you actually end up choosing.
-
I tried 2 solutions:
toNumber(format("%.1f", Double_Number))
(round([my_column]*10))/10I also tried these 2:
toString(format("%.1f",[my_column]))
concat(toString(format("%.1f",[my_column])," ")All 4 solutions rounded the numbers as expected. However I still encountered the same trailing zeros issue when I converted the number to a metric in the resulting dataset.
BTW: my dataset only has one row, the process to create the metric seems to treat the data as a number even if the data type is Text.
As a compromise solution, I added some characters as I’m reporting in millions of dollars:
concat("$",toString(format("%.1f",[my_column])),"m")
The compromise solution worked.
Thanks for your help.