Best way to clean data for country meaning

JLudwig

I have a dataset that includes ISO 3166-1 alpha-3 country codes, which are detected properly, and full country names, of which about 14% are flagged as failing validation.

Is there somewhere I can get the canonical list of country names that Dataiku uses for this meaning? And if I can get that list, what's the most time-efficient way to patch up my dataset?


Operating system used: Ubuntu 18.04

Answers

  • tgb417

    This thread is old, but I'm having the same issue.

  • Turribeach

    Why not get the list from Wikipedia and then add a user-defined meaning to Dataiku? That way you are in full control and don't depend on Dataiku updating anything.
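
For the "patch up my dataset" part of the question, one possible approach is to normalize the failing names against whatever reference list you settle on, for example in a Python recipe or notebook. The sketch below is only an illustration: `CANONICAL`, `ALIASES`, `normalize_country`, and the 0.85 cutoff are hypothetical names and values, not anything Dataiku ships, and the short list stands in for the full one you would load yourself. It tries an exact match, then a hand-maintained alias, then a fuzzy match with the standard-library `difflib`.

```python
import difflib

# Hypothetical reference list; in practice, load the full list you chose
# (for example the one behind your user-defined meaning).
CANONICAL = ["United States", "United Kingdom", "Russia", "South Korea", "Ivory Coast"]

# Hypothetical hand-maintained aliases for spellings that fuzzy matching
# is unlikely to resolve (abbreviations, alternative official names).
ALIASES = {
    "usa": "United States",
    "uk": "United Kingdom",
    "republic of korea": "South Korea",
}


def normalize_country(name, canonical=CANONICAL, aliases=ALIASES, cutoff=0.85):
    """Return the canonical country name for `name`, or None if nothing matches."""
    key = name.strip().casefold()

    # 1. Known alias.
    if key in aliases:
        return aliases[key]

    # 2. Exact match, ignoring case and surrounding whitespace.
    lookup = {c.casefold(): c for c in canonical}
    if key in lookup:
        return lookup[key]

    # 3. Fuzzy match for typos; a high cutoff keeps false positives down.
    matches = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None
```

Values that come back as `None` are the ones worth reviewing by hand and, where appropriate, adding to the alias table. Once the names are normalized, the same list can back the user-defined meaning, so validation and cleanup agree with each other.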
