Trailing zeros in Metrics

Solved!
Rupesh_Sethi
Level 2
Trailing zeros in Metrics

I am using the Dashboard facility to show metrics that have been rounded to 1 d.p. within a prepare recipe. The resulting metrics sometimes show several trailing zeros, after the 1st decimal place. Is there a way for me to resolve this?
I have checked the underlying dataset (which has just 1 row of data) and the all numbers correctly show just 1 d.p. The extra zeros seem to be generated when I turn the numbers into a metric.
I attach screenshot as an example:
-The number is the top-left, shows as 1165, when I would like to see 1165.0
-The numbers in the top-right and bottom-left show 5 extra trailing zeros
-The number in the bottom-right correctly show the number the correct format.


Operating system used: MacOS Sonoma 14.2.1

0 Kudos
2 Solutions
Turribeach

So I did some digging and this has nothing to do with dashboards but it's to do with the value of your metrics. If you want to have a single decimal point in your numbers you need to use the round() function to truncate the decimals of the value in your datset, a format function is merely doing for display purposes but not behind the scenes. 

I couldn't find a way to do this in a prepare recipe, there doesn't seem to be a built-in formula language function that can round a number to specific decimals. The existing round() function only allows to round a number to the nearest integer. So I created a Custom Python Function which does the rounding:

 

 

# Modify the process function to fit your needs
import pandas as pd
def process(row):

    number = float(row['Double_Number'])
    number_rounded = round(number, 1)
    
    return number_rounded

 

 

Screenshot 2024-02-18 at 11.47.00.png

This is not going to be very fast as it will run this Python code on each individual row but since you said this is a single row dataset it should be fine.

Another way to do it is to format the number (which returns a string) and then convert it back to a number using this formula:

toNumber(format("%.1f", Double_Number))

This will probably be a lot faster. Alternatively if you are using custom Python metrics probes you can use Python's round() function there. Now will this solve all your number format issues? Well no, turns out if the rounded number ends with trailing zeros Dataiku will remove them. So in my result set I did two custom Python probes and showed the min (12334.6) and the max (4325468.0) of Double_Number_Rounded and this is what you get:

Screenshot 2024-02-18 at 11.52.43.png

You could argue Dataiku is doing the right thing, a trailing zero is superfluous and has value. However you could also argue that if you want to see numbers all with the same precision Dataiku shouldn't be messing with them. In this case I sit more close to the latter than the former since I tend to prefer flexibility of deciding how I want things rather than programs deciding for me.

 

View solution in original post

0 Kudos

As a possible work around, that should work fully in the formula language.  

To for example get 1 decimal place 

  • What about multiplying the number you have in [my_collumn] by 10,
  • then rounding,
  • then divide by 10
(round([my_column]*10))/10

To get two decimal places multiply and divide by 100. Etc,etc,etc and so forth.

That said it might be nice to put in a product idea to extend the round function to take a number of decimal places to round.

 

--Tom

View solution in original post

7 Replies
Turribeach

So I did some digging and this has nothing to do with dashboards but it's to do with the value of your metrics. If you want to have a single decimal point in your numbers you need to use the round() function to truncate the decimals of the value in your datset, a format function is merely doing for display purposes but not behind the scenes. 

I couldn't find a way to do this in a prepare recipe, there doesn't seem to be a built-in formula language function that can round a number to specific decimals. The existing round() function only allows to round a number to the nearest integer. So I created a Custom Python Function which does the rounding:

 

 

# Modify the process function to fit your needs
import pandas as pd
def process(row):

    number = float(row['Double_Number'])
    number_rounded = round(number, 1)
    
    return number_rounded

 

 

Screenshot 2024-02-18 at 11.47.00.png

This is not going to be very fast as it will run this Python code on each individual row but since you said this is a single row dataset it should be fine.

Another way to do it is to format the number (which returns a string) and then convert it back to a number using this formula:

toNumber(format("%.1f", Double_Number))

This will probably be a lot faster. Alternatively if you are using custom Python metrics probes you can use Python's round() function there. Now will this solve all your number format issues? Well no, turns out if the rounded number ends with trailing zeros Dataiku will remove them. So in my result set I did two custom Python probes and showed the min (12334.6) and the max (4325468.0) of Double_Number_Rounded and this is what you get:

Screenshot 2024-02-18 at 11.52.43.png

You could argue Dataiku is doing the right thing, a trailing zero is superfluous and has value. However you could also argue that if you want to see numbers all with the same precision Dataiku shouldn't be messing with them. In this case I sit more close to the latter than the former since I tend to prefer flexibility of deciding how I want things rather than programs deciding for me.

 

0 Kudos

As a possible work around, that should work fully in the formula language.  

To for example get 1 decimal place 

  • What about multiplying the number you have in [my_collumn] by 10,
  • then rounding,
  • then divide by 10
(round([my_column]*10))/10

To get two decimal places multiply and divide by 100. Etc,etc,etc and so forth.

That said it might be nice to put in a product idea to extend the round function to take a number of decimal places to round.

 

--Tom

Hi Tom, I like your solution much better and it's likely to be faster as no data type change. I went ahead and raised the Product Idea for the improvement on the existing round() function:

https://community.dataiku.com/t5/Product-Ideas/Enable-round-function-to-round-to-specific-number-of-...

 

0 Kudos

@Turribeach 

I've upvoted the idea you have posted.

@Rupesh_Sethi 

I'd invite you to up vote the product idea that @Turribeach has submitted to make this less of a challenge for the next person.

You can also mark one of the posts here as a solution if we have been able to assist.  This apparently may make it easier for others to find their answers.

--Tom
Rupesh_Sethi
Level 2
Author

Many thanks for your responses @tgb417 @Turribeach. They are all workable solutions. 

@Rupesh_Sethi 

If you are so, inclined,  let us know which one you actually end up choosing.

--Tom
0 Kudos
Rupesh_Sethi
Level 2
Author

I tried 2 solutions:

     toNumber(format("%.1f", Double_Number))
    (round([my_column]*10))/10

I also tried these 2:

     toString(format("%.1f",[my_column]))
     concat(toString(format("%.1f",[my_column])," ")

All 4 solutions rounded the numbers as expected. However I still encountered the same trailing zeros issue when I converted the number to a metric in the resulting dataset. 

BTW: my dataset only has one row, the process to create the metric seems to treat the data as a number even if the data type is Text.

As a compromise solution, I added some characters as I’m reporting in millions of dollars:
concat("$",toString(format("%.1f",[my_column])),"m")

The compromise solution worked.
Thanks for your help.