
Setting Postgres Database to UTF8

RanjithJose
Level 1

Hi,

I have a scenario where I need to set the Postgres database client encoding to UTF8 before the dataset is created. Is it possible to change this setting just for my dataset?

SET client_encoding = 'UTF8'
 
Thanks,
Ranjith.
fchataigner2
Dataiker

Hi

Settings that affect communication with the database have to go in the DSS connection settings, so you can't restrict them to one dataset. They will apply to all datasets on the connection.

RanjithJose
Level 1
Author

Is there a way to remove the UTF8 character that causes an exception when the data is written to the Postgres database?

ERROR: character with byte sequence 0xef 0xbf 0xbd in encoding "UTF8" has no equivalent in encoding "LATIN1"
fchataigner2
Dataiker

If the database doesn't accept it, you'll have to clean the data up beforehand, for example in a Prepare recipe with a Find and Replace step that removes �. Other UTF8 characters may cause the same problem, though, so it would be much simpler if the database already worked in UTF8.
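
For illustration, here is a minimal Python sketch of that cleanup, for instance as something to try outside the recipe or in a Python step. The helper names `unencodable_chars` and `strip_unencodable` are hypothetical, and the sketch assumes that Python's `latin-1` codec is a close enough stand-in for the database's LATIN1 encoding. It also shows that the byte sequence 0xef 0xbf 0xbd from the error message is the UTF8 encoding of U+FFFD, the � replacement character.

```python
# Hypothetical helpers, standard library only.
# Assumption: Python's "latin-1" codec approximates Postgres LATIN1.


def unencodable_chars(text, encoding="latin-1"):
    """Return the distinct characters the target encoding cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


def strip_unencodable(text, encoding="latin-1"):
    """Drop every character the target encoding cannot represent."""
    return text.encode(encoding, errors="ignore").decode(encoding)


# The byte sequence from the error message decodes to U+FFFD.
replacement = bytes([0xEF, 0xBF, 0xBD]).decode("utf-8")

# Sample text: an accented letter (fine in Latin-1), U+FFFD and a euro sign.
sample = "caf\u00e9 " + replacement + " \u20ac5"

print([hex(ord(c)) for c in unencodable_chars(sample)])
print(strip_unencodable(sample))
```

Dropping characters silently loses data, so it may be worth listing the offending characters first and deciding case by case whether to remove or replace them.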

RanjithJose
Level 1
Author

Thanks, fchataigner2!

I would rather not change the Postgres database setting, since the change would apply to every Postgres database on the server and affect everyone who uses it.

 

Is there a way to identify all of these UTF8 characters with a regular expression in a Find and Replace step?
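
One possible approach, sketched in Python: a character class that matches anything outside the range U+0000 to U+00FF, which is the range ISO 8859-1 covers. This assumes the target encoding is LATIN1 as in the error message. The names `NON_LATIN1`, `find_non_latin1` and `remove_non_latin1` are hypothetical, and whether the same pattern can be used unchanged in a Find and Replace step has not been verified here.

```python
import re

# Matches any single character outside U+0000..U+00FF.
# Assumption: the target encoding is LATIN1 (ISO 8859-1).
NON_LATIN1 = re.compile(r"[^\u0000-\u00FF]")


def find_non_latin1(text):
    """Return the distinct offending characters, sorted by code point."""
    return sorted(set(NON_LATIN1.findall(text)))


def remove_non_latin1(text, replacement=""):
    """Replace every offending character (by default, delete it)."""
    return NON_LATIN1.sub(replacement, text)


# Sample text: U+00EF is fine in Latin-1; U+FFFD and curly quotes are not.
sample_text = "na\u00efve \ufffd \u201cq\u201d"

print([hex(ord(c)) for c in find_non_latin1(sample_text)])
print(remove_non_latin1(sample_text))
```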
