Update the Email Address "Meaning" to accept RFC 6531 addresses, which allow some UTF-8

Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,618 Neuron


User Story:

As a data analyst who works with people from around the world, I find it challenging that the email address meaning in data views does not currently account for local parts of email addresses (the part before the @) that include characters beyond ASCII. The use of UTF-8 strings in email addresses has been defined since at least 2012, and providers like Gmail allow such strings to appear in the local part of an email address. Fixing this will allow more accurate evaluation of email addresses in a dataset, and less confusion about data quality.
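To illustrate the gap, here is a minimal Python sketch that contrasts an ASCII-only check with one that also accepts non-ASCII characters in the local part, in the spirit of RFC 6531. It is not the validation logic DSS actually uses; the function names and both patterns are assumptions made for illustration, and the domain check is deliberately simplified.

```python
import re

# Characters allowed in an unquoted ASCII local part (simplified dot-atom).
_ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~\-"
# Any code point above the ASCII range (assumption: all of them are accepted).
_NON_ASCII = r"\u0080-\U0010FFFF"
# Simplified domain: at least two dot-separated labels, no spaces or "@".
_DOMAIN = r"[^\s@.]+(?:\.[^\s@.]+)+"


def _build(local_chars):
    """Compile a pattern whose local part is dot-separated runs of local_chars."""
    atom = f"[{local_chars}]+"
    return re.compile(rf"{atom}(?:\.{atom})*@{_DOMAIN}")


_ASCII_ONLY = _build(_ATEXT)
_INTERNATIONAL = _build(_ATEXT + _NON_ASCII)


def is_ascii_email(value):
    """Hypothetical strict check: local part must be ASCII."""
    return _ASCII_ONLY.fullmatch(value) is not None


def is_intl_email(value):
    """Hypothetical relaxed check: local part may also contain non-ASCII."""
    return _INTERNATIONAL.fullmatch(value) is not None


if __name__ == "__main__":
    for address in ["alice@example.com", "josé@example.com", "用户@example.com"]:
        print(address, is_ascii_email(address), is_intl_email(address))
```

With the ASCII-only pattern, the second and third addresses would be flagged as invalid even though they may be perfectly deliverable, which is the kind of false data-quality signal described above.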

Notes:

Comments

    P.S. I know that I could create a local definition for email address as an interim workaround. This is a request for the "standard" defined meaning in DSS to reflect these later standards.
