How to rename all the columns of a dataset with names coming from another one?
UserBird
I have a first dataset with no column names (they appear as col_0, col_1, and so on) and a text file "dictionary" that contains the names of the columns.
Is it possible to apply all these column names through the DSS UI? If not, what is the simplest way?
Thanks,
Answers
-
Since the latest version (2.1), there is an API for doing many things programmatically, including changing the schema:
http://doc.dataiku.com/dss/latest/api/public/index.html
It is a REST API with simple authentication, and a Python client is provided.
Specifically, to change the schema, the methods are documented here:
https://doc.dataiku.com/dss/api/2.1/rest/#datasets-dataset-schema
Thus, one option would be to write a small Python script that reads the current schema and the dictionary, then uploads the new schema.
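As a rough sketch of such a script (not a tested recipe for any particular DSS version): it assumes the dictionary file holds one column name per line, in the same order as the dataset's columns, and that the schema is a dict with a "columns" list of {"name", "type"} entries. The host, API key, project key, dataset name and file path are placeholders, and the helper names are made up for illustration. The `dataikuapi` calls are those of the public Python client as I know it, so check them against the documentation for your version.

```python
import copy


def read_names(path):
    """Read one column name per line, skipping blank lines (assumed file format)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def rename_schema_columns(schema, new_names):
    """Return a copy of `schema` whose columns carry `new_names`, in order."""
    columns = schema["columns"]
    if len(columns) != len(new_names):
        raise ValueError(
            "expected %d names, got %d" % (len(columns), len(new_names)))
    renamed = copy.deepcopy(schema)  # leave the schema we read untouched
    for column, name in zip(renamed["columns"], new_names):
        column["name"] = name  # column types and other fields are kept as-is
    return renamed


def apply_to_dss(host, api_key, project_key, dataset_name, dictionary_path):
    """Read the current schema, rename its columns, and upload the result."""
    # Imported here so the helpers above work without the client installed.
    import dataikuapi

    client = dataikuapi.DSSClient(host, api_key)
    dataset = client.get_project(project_key).get_dataset(dataset_name)
    schema = dataset.get_schema()
    new_schema = rename_schema_columns(schema, read_names(dictionary_path))
    dataset.set_schema(new_schema)
```

The length check is there so that a dictionary with a missing or extra line fails loudly instead of silently shifting names onto the wrong columns.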
Hope this helps,
Jean-Baptiste Rouquier
-
Or, in a Python recipe, rename the columns with pandas (and write the result to a new dataset). That could be faster.
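A minimal sketch of that recipe, assuming the new names are already available as a Python list in column order (how you load them from the dictionary file depends on where it lives). The dataset names passed to `run_recipe` are placeholders and both function names are hypothetical; `dataiku.Dataset`, `get_dataframe` and `write_with_schema` are the calls available inside a DSS Python recipe.

```python
def build_rename_map(current_names, new_names):
    """Pair each existing column name with its replacement, by position."""
    if len(current_names) != len(new_names):
        raise ValueError(
            "expected %d names, got %d" % (len(current_names), len(new_names)))
    return dict(zip(current_names, new_names))


def run_recipe(input_name, output_name, new_names):
    """Load the input dataset, rename its columns, write to the output dataset."""
    # The dataiku package is only importable inside DSS, hence the local import.
    import dataiku

    df = dataiku.Dataset(input_name).get_dataframe()
    df = df.rename(columns=build_rename_map(list(df.columns), new_names))
    # write_with_schema also sets the output schema from the dataframe columns.
    dataiku.Dataset(output_name).write_with_schema(df)
```

For example, `build_rename_map(["col_0", "col_1"], ["id", "amount"])` gives `{"col_0": "id", "col_1": "amount"}`, which is the mapping pandas' `rename(columns=...)` expects.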
-
The best option is often to have the column names in the source file itself. If you can open it in a spreadsheet, just insert a new row at the top and paste the column names there ;-)