Import csv or txt, first line as column name failed
Hello,
When I import a csv or txt file, the first line is not recognized as column name, though in Python code it works perfectly.
In DSS, after the import, I have :
- In first line, I have col1, col2, etc
- In second line, the true name of the column as they are in the original csv/txt file
- Starting from third line, I have the rows
How can I import a csv/txt file with the appropriate column names? I could not figure it out.
Thanks and kindest regards
Best Answers
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi Laurent,
You should be able to adjust which line the true column names are on, using the "Skip First Lines" option. You can set that to 1 and check "Parse Next line as column headers".
If my .csv file is
column1,column2
actual_name,other_name
some_value,some_value
This is what I would see:
Let me know if that works for you.
Thanks,
Alex
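To illustrate what those two settings do to the sample file above, here is a minimal plain-Python sketch. It is not DSS code and does not use any Dataiku API; the function name `read_with_header_offset` and its `skip_first_lines` parameter are hypothetical, chosen only to mirror the option names.

```python
import csv
import io

# Sample file contents, copied from the example in this answer.
RAW = "column1,column2\nactual_name,other_name\nsome_value,some_value\n"


def read_with_header_offset(text, skip_first_lines=0):
    """Skip `skip_first_lines` lines, then treat the next line as headers.

    Hypothetical helper: it only mimics the behaviour described above.
    """
    lines = io.StringIO(text)
    for _ in range(skip_first_lines):
        next(lines, None)  # discard one leading line, if there is one
    reader = csv.DictReader(lines)  # first remaining line becomes the header
    return reader.fieldnames, list(reader)


headers, rows = read_with_header_offset(RAW, skip_first_lines=1)
print(headers)  # ['actual_name', 'other_name']
print(rows)     # [{'actual_name': 'some_value', 'other_name': 'some_value'}]
```

With `skip_first_lines=0` the same helper would return `column1` and `column2` as headers and keep the true names as the first data row.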
-
LaurentS Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 21 ✭✭✭✭
I could figure out where to see this option (preview after loading the file)
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
If you already have a dataset, you can go to its Settings and open the Format/Preview tab to get these options.
Additionally, you can reach this page when you create a dataset from a file: once you import the file, there will be a Preview button.
Let me know if you still have issues finding these options.
Answers
-
LaurentS Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 21 ✭✭✭✭
Hi and thanks a lot. I could figure out where to access this view (preview after loading the file). Kindest regards
-
After importing, it is working. Can we do the same in the middle of the Flow, for any of the datasets?