
How to rename all the columns of a dataset with names coming from another one?

Dataiker
I have a first dataset with no column names (they appear as col_0, col_1, and so on) and a text file, "dictionary", that contains the names of the columns.

Is it possible to apply all these column names through the DSS UI? If not, what is the simplest way?

Thanks,
3 Replies
Level 5

Since the latest version (2.1), there is an API to do many things programmatically, including changing the schema:

http://doc.dataiku.com/dss/latest/api/public/index.html

It is a REST API with simple authentication, and a Python client is provided.



Specifically, the methods for changing the schema are documented here:

https://doc.dataiku.com/dss/api/2.1/rest/#datasets-dataset-schema



Thus, one option would be to write a small Python script that reads the current schema and the dictionary, then uploads the new schema.
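A minimal sketch of such a script, under several assumptions: the dictionary file holds one column name per line, in the same order as the dataset's columns; the schema is a dict with a "columns" list of {"name", "type"} entries; and the Python client is used through `dataikuapi.DSSClient`, `get_schema()` and `set_schema()`. The helper names, host, API key, project key and dataset name are all hypothetical placeholders.

```python
import copy


def read_names(path):
    """Read one column name per line, skipping blank lines (assumed file layout)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def rename_schema_columns(schema, new_names):
    """Return a copy of a schema dict with its columns renamed by position."""
    columns = schema["columns"]
    if len(columns) != len(new_names):
        raise ValueError(
            f"schema has {len(columns)} columns but {len(new_names)} names were given"
        )
    renamed = copy.deepcopy(schema)  # leave the caller's schema untouched
    for column, name in zip(renamed["columns"], new_names):
        column["name"] = name
    return renamed


def upload_renamed_schema(host, api_key, project_key, dataset_name, names_path):
    """Fetch the schema through the public API, rename it, and push it back.

    All arguments are placeholders to adapt to your own instance.
    """
    import dataikuapi  # Python client for the public API, imported lazily

    client = dataikuapi.DSSClient(host, api_key)
    dataset = client.get_project(project_key).get_dataset(dataset_name)
    schema = dataset.get_schema()
    dataset.set_schema(rename_schema_columns(schema, read_names(names_path)))
```

Only the schema changes here; the underlying data is not rewritten. The length check is there so that a dictionary that does not match the dataset fails loudly instead of silently mislabeling columns.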



 



Hope this helps,



Jean-Baptiste Rouquier

Dataiker
Or, in a Python recipe, rename the columns with pandas (and write the result to a new dataset). That could be faster.
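As a rough sketch of that recipe, assuming the dictionary file lists one column name per line in column order: the function names, dataset names and file path below are hypothetical, and the `dataiku` package is only importable inside DSS, so it is imported lazily.

```python
import pandas as pd


def rename_columns(df, new_names):
    """Return a copy of df whose columns are renamed by position."""
    if len(new_names) != len(df.columns):
        raise ValueError(
            f"dataframe has {len(df.columns)} columns but {len(new_names)} names were given"
        )
    renamed = df.copy()
    renamed.columns = list(new_names)
    return renamed


def run_recipe(input_name, output_name, names_path):
    """Recipe body: read the input dataset, rename, write to the output dataset."""
    import dataiku  # only available when running inside DSS

    df = dataiku.Dataset(input_name).get_dataframe()
    with open(names_path, encoding="utf-8") as handle:
        names = [line.strip() for line in handle if line.strip()]
    # write_with_schema sets the output schema from the dataframe's columns
    dataiku.Dataset(output_name).write_with_schema(rename_columns(df, names))
```

Inside a recipe you would call something like `run_recipe("raw_data", "raw_data_named", "/path/to/dictionary.txt")`, with names adapted to your project. Note that `get_dataframe()` loads the whole dataset into memory, so this suits datasets that fit in RAM.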
Jeremy, Product Manager at Dataiku
Level 5
It is often best to have the column names in the source file itself. If you can open it in a spreadsheet, just insert a new row at the top and paste the column names there 😉