Is there a maximum length for the dynamic SQL generated for moving data into a SQL based table?
I am writing a python recipe to join two csv datasets and write to a postgres table.
Here is the first error:
An invalid argument has been encountered : Column name not matching between query and schema (Total Randomized or Receiving Treatment-United States of Americ, expected Total Randomized or Receiving Treatment-United States of America) SELECT * FROM "CLINICALTRIALCOSTPROJECTION_sitespatientsvisitsbyregion_t" LIMIT 10000. Please check dataset schema.
The length of the column ("Total Randomized or Receiving Treatment-United States of America") is 64
Here is another error after the script runs when I try to explore the data (I get a similar error to the one above) and then go to created tabe settings:
Connection OK. Table exists, but schema does not match : Name mismatch for column 39 : '1st Patient Recruited _FPI_' in dataset, '1st Patient Recruited (FPI)' in table
DSS did write his schema with the label you provided when creating the output dataset, while Postgres truncated it, that is why after the recipe completes and DSS tries to synchronize his schema (with the accurate label) with the Postgres schema (with the truncated label), it gives an error.
As per Postgres documentation you can raise this limit in the Postgres config
The system uses no more than NAMEDATALEN-1 bytes of an identifier; longer names can be written in commands, but they will be truncated. By default, NAMEDATALEN is 64 so the maximum identifier length is 63 bytes. If this limit is problematic, it can be raised by changing the NAMEDATALEN constant in src/include/pg_config_manual.h.
Hi @Alex. I had two uploaded csv files and I was using a python script recipe to join them and write them to a postgres table. I used pgAdmin to check the structure of the table in postgres and it was fine. I was not able to import the table as a new dataset.
Hi Pouya, try using dropAndCreate=True as parameter of https://doc.dataiku.com/dss/latest/api/python/datasets.html?highlight=write_with_schema#dataiku.Dataset.write_with_schema Note that doing a join in-memory using Python is not very efficient. Can you use a Join recipe instead to leverage in-database computation?