Writing dataset out to txt file
I'd like to take my dataset and output it to a pipe-delimited .txt file. I poked around and I'm thinking a Python recipe would be the best method. Has anyone successfully done this?
Operating system used: Windows
Best Answer
-
Miguel Angel, Dataiker
Hi,
One of the most commonly used dataset formats is CSV (comma-separated values). A CSV file is plain text, readable in any text editor, and usually has a '.csv' extension. If the Type of the dataset is "Separated values", you can control the separator by going into the dataset's Settings > Preview and changing the separator. Then rebuild the dataset.
If you specifically want a file with a '.txt' extension, you can use the pandas 'to_csv' method and pass the separator you want. For example:
import dataiku

# Load the "customers" dataset into a pandas DataFrame
mydataset = dataiku.Dataset("customers")
mydataset_df = mydataset.get_dataframe()

# Write it out as pipe-delimited text
mydataset_df.to_csv("/home/exampleuserhome/macpandas.txt", sep='|')
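As a rough sketch of how the output path could be built rather than hard-coded, the example below uses pathlib together with the same 'to_csv' call. The helper name write_pipe_delimited, the sample columns, and the temporary output directory are all assumptions made for illustration. In a recipe you would swap in a directory that exists and is writable on the machine where the code runs (assuming the recipe runs on the DSS server, that means a path on the server, not on your own PC). Note that 'to_csv' also writes the DataFrame's row index by default, so the sketch passes index=False to leave it out.

```python
from pathlib import Path
import tempfile

import pandas as pd


def write_pipe_delimited(df: pd.DataFrame, out_dir: Path, filename: str) -> Path:
    """Hypothetical helper: write df to out_dir/filename as pipe-delimited text."""
    # Create the output directory if it does not exist yet.
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    # index=False omits the pandas row index, which to_csv writes by default.
    df.to_csv(out_path, sep="|", index=False)
    return out_path


# Stand-in for mydataset.get_dataframe(); these columns are made up.
df = pd.DataFrame({"customer_id": [1, 2], "name": ["Ann", "Bob"]})

# A temporary directory keeps the sketch runnable anywhere. Replace it with
# your own directory, e.g. Path(r"C:\exports") on Windows (hypothetical path).
out_dir = Path(tempfile.mkdtemp())
out_path = write_pipe_delimited(df, out_dir, "customers.txt")
print(out_path.read_text())
```

pathlib joins the directory and file name with the correct separator for the operating system, so the same code should work on Windows and Linux as long as the directory is one the process is allowed to write to.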
Answers
-
Thanks. How do I set up the file path, similar to the one you're pointing to in the last line of code?
"/home/exampleuserhome/macpandas.txt"
-
Hi there, just wanted to follow up on this one. Can you provide a little more background on how to set up the file path in the last line of the Python code, where the .txt file will be written?