Update the Email Address data "Meaning" to accept RFC 6531 addresses, which allow some UTF-8

tgb417

User Story:

As a data analyst who works with people from around the world, I find it challenging that the email address meaning in data views does not currently take into account local parts (the part before the @) that include characters beyond ASCII. The use of UTF-8 strings in email addresses has been defined since at least 2012, and providers like Gmail allow such strings to appear in the local part of an email address. Fixing this will allow more accurate evaluation of email addresses in a dataset and less confusion about data quality.
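
To illustrate the kind of check being requested, here is a minimal Python sketch. It is not how DSS implements its email meaning, and it is not a full RFC 6531 validator: it only accepts the unquoted dot-separated form of the local part, built from the ASCII "atext" characters of RFC 5322 plus any non-ASCII code point. The names `EMAIL_RE` and `looks_like_email` are hypothetical, and allowing non-ASCII characters in the domain labels is my own assumption.

```python
import re

# ASCII "atext" characters from RFC 5322, plus any non-ASCII code point.
# Treating every code point >= U+0080 as valid is a simplification.
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~\-\u0080-\U0010FFFF]"

# Local part: one or more atext runs separated by single dots.
_LOCAL = rf"{_ATEXT}+(?:\.{_ATEXT}+)*"

# Domain label: letters, digits, hyphens (not at either end).
# Accepting non-ASCII here is an assumption made for this sketch.
_EDGE = r"[A-Za-z0-9\u0080-\U0010FFFF]"
_INNER = r"[A-Za-z0-9\-\u0080-\U0010FFFF]"
_LABEL = rf"{_EDGE}(?:{_INNER}*{_EDGE})?"

# Domain: at least two labels separated by dots.
_DOMAIN = rf"{_LABEL}(?:\.{_LABEL})+"

EMAIL_RE = re.compile(rf"{_LOCAL}@{_DOMAIN}")


def looks_like_email(value: str) -> bool:
    """Return True if value has the shape of an (internationalized) address."""
    return EMAIL_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["plain@example.com", "josé.garcía@example.com",
                      "用户@例子.广告", "a..b@example.com", "no-at-sign"]:
        print(candidate, looks_like_email(candidate))
```

An ASCII-only pattern would reject the second and third addresses above, which is the behavior I would like the built-in meaning to avoid.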

Notes:


Comments

  • tgb417

    P.S. I know that I could create a local definition for email address as an interim workaround. This is a request for the "standard" meaning defined in DSS to reflect these later standards.
