Setting Postgres Database to UTF8

RanjithJose Dataiku DSS Core Concepts, Registered Posts: 13 ✭✭✭✭

Hi,

I have a scenario where I need to set the Postgres database client encoding to UTF8 before the dataset is created. Is it possible to change this setting just for my dataset?

SET client_encoding = 'UTF8'
Thanks,
Ranjith.

Answers

  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker

    Hi

    Settings that affect communication with the database have to be set in the DSS connection settings, so you can't restrict them to one dataset. They will apply to all datasets on the connection.

  • RanjithJose
    RanjithJose Dataiku DSS Core Concepts, Registered Posts: 13 ✭✭✭✭

    Is there a way to remove the UTF8 character that is causing an exception when data is written to the Postgres database?

    ERROR: character with byte sequence 0xef 0xbf 0xbd in encoding "UTF8" has no equivalent in encoding "LATIN1"
  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker

    If the database doesn't accept it, you'll have to clean it up beforehand, for example in a Prepare recipe with a Find and replace step that removes �. But other UTF8 characters may cause the same issue, so it would be much simpler if the DB were already working in UTF8.

  • RanjithJose
    RanjithJose Dataiku DSS Core Concepts, Registered Posts: 13 ✭✭✭✭

    thanks fchataigner2!

    I would rather not change the Postgres database setting, since the change would apply to every Postgres database on the server and affect everyone who uses it.

    Is there a way to identify all these UTF8 characters with a regular expression in a Find and replace step?

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron

    How would one set up a connection to better handle UTF8 characters?
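
On the question above about identifying the problem characters with a regular expression: a minimal sketch in Python, not an answer from the thread. It assumes that PostgreSQL's LATIN1 corresponds to ISO 8859-1, so that every character above U+00FF has no equivalent, and that the cleanup runs before the data is written, for example in a Python step. The names `NON_LATIN1`, `find_non_latin1` and `strip_non_latin1` are hypothetical and exist only for this illustration.

```python
import re

# Assumption: the target encoding LATIN1 is ISO 8859-1, which covers
# U+0000 to U+00FF. Anything outside that range cannot be converted.
NON_LATIN1 = re.compile(r"[^\u0000-\u00ff]")


def find_non_latin1(text):
    """Return the distinct characters in text that fall outside LATIN1."""
    return sorted(set(NON_LATIN1.findall(text)))


def strip_non_latin1(text, replacement=""):
    """Replace every character outside LATIN1 with the given replacement."""
    return NON_LATIN1.sub(replacement, text)


# U+FFFD is the replacement character; in UTF8 it is the byte sequence
# 0xef 0xbf 0xbd reported in the error message. U+20AC is the euro sign.
sample = "caf\u00e9 \ufffd na\u00efve \u20ac5"

offending = find_non_latin1(sample)
cleaned = strip_non_latin1(sample)

# Raises UnicodeEncodeError if anything unrepresentable is left.
cleaned.encode("latin-1")
```

The same character class, `[^\u0000-\u00FF]`, may also be usable as the pattern in a regex-based Find and replace step, but that is untested here. Removing characters loses data, so it may be worth listing the offending characters first and deciding per case whether to drop or substitute them.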
