Writing out a tsv

Solved!
Scobbyy2k3
Level 3

I am writing out a TSV file in Python.

I have written the code below but still get an error. Please help me correct it.

 

# Compute recipe outputs
# TODO: Write here your actual code that computes the outputs
# NB: DSS supports several kinds of APIs for reading and writing data. Please see doc.

files = glob.glob(os.path.join(combined_file.get_path(), 'combined_file_*.tsv'))

# Find the most recently modified file
latest_file_df = max(files , key = os.path.getmtime)

 

# Write recipe outputs
latest_file = dataiku.Dataset("latest_file")
latest_file.write_schema_from_dataframe(latest_file_df)


Operating system used: Windows

1 Solution
AlexT
Dataiker

Hi @Scobbyy2k3 ,

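latest_file_df needs to be a pandas DataFrame, but max(files, key=os.path.getmtime) returns the path of the latest file as a string. Read that file into a DataFrame first, with something like: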

latest_file_actual_df = pd.read_csv(latest_file_df, sep='\t')

latest_file.write_schema_from_dataframe(latest_file_actual_df)
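
To make the point about the path string concrete, here is a minimal sketch of the file lookup on its own, using only the standard library. The helper name find_latest_tsv and its pattern parameter are hypothetical and not part of the Dataiku API; the default pattern is taken from the question.

```python
import glob
import os


def find_latest_tsv(folder_path, pattern="combined_file_*.tsv"):
    """Return the path (a str) of the most recently modified matching file."""
    files = glob.glob(os.path.join(folder_path, pattern))
    if not files:
        # max() raises ValueError on an empty list, so fail with a clearer message
        raise FileNotFoundError(
            f"No files matching {pattern!r} in {folder_path!r}"
        )
    # The result is a file path, not a DataFrame; it still has to be read
    return max(files, key=os.path.getmtime)
```

The returned value is what the question calls latest_file_df, which is why it has to go through pd.read_csv before it can be handed to the dataset.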

Let me know if that helps. If not, please share the exact error you are seeing.

Thanks,

 

Scobbyy2k3
Level 3
Author

Thank you Alex, it did work.

But I made a little tweak:

 

latest_file = dataiku.Dataset("latest_file")
latest_file.write_with_schema(latest_file_actual_df)
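
For anyone landing here later, the tweak matters because write_schema_from_dataframe only sets the dataset's schema, while write_with_schema sets the schema and also writes the rows. Below is a rough sketch of the read-and-write step. The function name write_tsv_to_dataset is hypothetical, and the folder handle "combined_file" in the usage comment is an assumption, since the thread does not show how combined_file was created.

```python
import pandas as pd


def write_tsv_to_dataset(tsv_path, output_dataset):
    """Read a TSV file and write its schema and rows to output_dataset.

    output_dataset is expected to expose write_with_schema(df), as
    dataiku.Dataset does.
    """
    df = pd.read_csv(tsv_path, sep="\t")
    output_dataset.write_with_schema(df)
    return df


# Inside a DSS Python recipe (not runnable outside DSS):
# import dataiku
# combined_file = dataiku.Folder("combined_file")  # assumed folder handle
# latest_path = ...  # path of the newest TSV file under combined_file.get_path()
# write_tsv_to_dataset(latest_path, dataiku.Dataset("latest_file"))
```

Passing the dataset in as a parameter keeps the function easy to try outside DSS with a stand-in object.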
