Bug: Encoding hash values with Base64, SHA1, SHA256, SHA512

berndito
Level 2

Hi all,

I'm not sure whether this is a bug, but it looks like one to me:

The scenario: I tried to hash strings with SHA1 in a preparation recipe and found that several outputs got the same hash, even though the input strings differed.

I eventually narrowed down the cause: in some cases the input value had a notation like "3151E19012".

Encoding such a value with toBase64 and decoding it again with fromBase64 returned the value "Infinity".

I guess that Dataiku interprets the value as a number rather than a string, so 3151E19012 is read as a number in exponential notation that is out of range, and the value Infinity is passed to the function.
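To illustrate the suspected effect outside Dataiku, here is a minimal Python sketch. It is not Dataiku's implementation: it only assumes that each value is parsed as a floating-point number before hashing. The second and third reference values are made up for the example, and Python prints "inf" where Dataiku showed "Infinity".

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA1 hex digest of a string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Only the first value comes from the thread; the other two are hypothetical.
references = ["3151E19012", "3151E19013", "42E99999"]

# Read as exponential notation, every value is far beyond the float range
# and overflows to infinity.
as_numbers = [float(r) for r in references]

# Hashing the number's text form: all inputs collapse to the same digest.
hashes_after_autotyping = {sha1_hex(str(n)) for n in as_numbers}

# Hashing the original text: each input keeps its own digest.
hashes_as_strings = {sha1_hex(r) for r in references}

print(len(hashes_after_autotyping))  # 1
print(len(hashes_as_strings))        # 3
```

If this is what happens, it would explain why different inputs ended up with the same hash.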

Is this a known issue? 

Clément_Stenac

Hi,

You would need to force the storage type of your output column as string to avoid that.

berndito
Level 2
Author

Hi @Clément_Stenac,

thanks for the quick response. 😊

I enforced that by converting the input with toString() first.

Here's how the input looks:

fromBase64(toBase64(toString(CY_MANUFACTURER_REFERENCE)))

The output looks as follows:

[Image: berndito_0-1582887409834.png]

This should be reproducible.

For the moment I have a workaround using a regular expression that splits the string. But in general the behaviour shouldn't be like that.

 

Clément_Stenac

Hi,

OK, I hadn't understood that you wanted to use it in a formula.

You will need to use strval("CY_MANUFACTURER_REFERENCE") instead of just CY_MANUFACTURER_REFERENCE.

This is explained here: https://doc.dataiku.com/dss/latest/advanced/formula.html#variables-typing-and-autotyping

This behavior will not be changed:

  • When the formula is evaluated, the type of the column has not yet been computed, so each value must be considered independently and is autotyped unless you use strval()
  • Changing this would be an unacceptable break in backwards compatibility
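As a rough illustration of the two points above, the following Python sketch mimics per-value autotyping. The autotype() helper is a hypothetical simplification, not Dataiku's actual logic. It also suggests why wrapping the column in toString() did not help: by then the value has presumably already been turned into a number. Python prints "inf" where Dataiku shows "Infinity".

```python
import base64


def autotype(raw: str):
    """Guess a type for one value in isolation (assumed simplification)."""
    try:
        return float(raw)
    except ValueError:
        return raw


def base64_round_trip(value) -> str:
    """Encode the text form of a value to Base64 and decode it again."""
    encoded = base64.b64encode(str(value).encode("utf-8"))
    return base64.b64decode(encoded).decode("utf-8")


raw = "3151E19012"

# Bare column reference: the value is autotyped first, so converting it to a
# string afterwards (as toString() would) only yields the text of infinity.
autotyped_result = base64_round_trip(str(autotype(raw)))

# strval-like access: the raw text is used and survives the round trip.
string_result = base64_round_trip(raw)

print(autotyped_result)  # inf
print(string_result)     # 3151E19012
```

Values that do not look like numbers, such as a hypothetical "ABC-123", pass through autotype() unchanged, which is why only some rows were affected.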
berndito
Level 2
Author
Thanks, that solution works fine!