Type coercion in formulas
KjellK
Registered Posts: 15 ✭✭✭✭
I am merging two datasets, and in the result I have 3 columns of bigint/integer A, B, C, which I am using to compute a "usage percent", i.e. p= (A*B)/C and 100%>p<0%.<BR />
The script correctly deduces that the output should be double/decimal, but there are in fact no decimals; all numbers end in .0. My first reaction was that in an operation with 3 integers, the result would be (implicitly) coerced to integer and then, when put into a decimal column, all decimals would be 0. However, even if I force the columns in the schemas of the input datasets to be "decimal", this still happens.
A temporary solution is to add a factor of 100 to get the percentage, but that seems crude and is likely to cause problems further on (and i still lose any decimals in the percentages, i.e. I always get, say, 18.0, 34.0, etc) What am I missing?
The script correctly deduces that the output should be double/decimal, but there are in fact no decimals; all numbers end in .0. My first reaction was that in an operation with 3 integers, the result would be (implicitly) coerced to integer and then, when put into a decimal column, all decimals would be 0. However, even if I force the columns in the schemas of the input datasets to be "decimal", this still happens.
A temporary solution is to add a factor of 100 to get the percentage, but that seems crude and is likely to cause problems further on (and i still lose any decimals in the percentages, i.e. I always get, say, 18.0, 34.0, etc) What am I missing?
Best Answer
-
DataIku, hanks to your support, I now understand that formulas in a merge recipe are converted to SQL queries, which coerces in this way. By following the advice of writing the formula as p= (1.0*A*B)/C, the issue was resolved. Thanks!