Compressing a CSV and uploading to SharePoint

Danny78 Registered Posts: 8

Hi everyone,

I am trying to compress a CSV file and upload it to SharePoint. However, the CSV file is not actually compressed. Also, when I look at the preview of the SharePoint folder in Dataiku, it shows the following error.

Screen Shot 2022-11-29 at 9.54.38 AM.png


I used code similar to what was shared in the post and solution. Any suggestions are welcome.

I am on the enterprise version, Dataiku 9.0.3. Is there a limit on the size of a file that can be uploaded to SharePoint through Dataiku?

Best Answer

  • AlexB
    AlexB Dataiker, Registered Posts: 68 Dataiker
    edited July 17 Answer ✓

    Double posting the answer here for future reference:

    import gzip
    import io
    
    import dataiku
    
    # Read the input dataset into a pandas DataFrame
    input_data = dataiku.Dataset("input_data")
    input_data_df = input_data.get_dataframe()
    
    # Serialize the DataFrame to CSV text, then encode it as UTF-8 bytes
    content = input_data_df.to_csv(index=False).encode("utf-8")
    
    # Compress the CSV bytes in memory
    compressed_content = gzip.compress(content)
    
    # Upload the compressed bytes to the managed folder (here, the SharePoint folder),
    # wrapped in a file-like object for upload_stream
    output_folder = dataiku.Folder("output_data")
    output_folder.upload_stream("filename.csv.gz", io.BytesIO(compressed_content))
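
If you want to confirm that the bytes really are compressed before uploading, a quick local check along these lines might help. This is only a sketch and does not touch Dataiku or SharePoint: `gzip_csv_text`, `looks_gzipped`, and the sample data are made-up names for illustration, not part of any Dataiku API. It relies on the fact that every gzip stream starts with the two bytes `0x1f 0x8b`.

```python
import gzip

# Every gzip stream begins with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def gzip_csv_text(csv_text: str) -> bytes:
    """Encode CSV text as UTF-8 and gzip it (hypothetical helper)."""
    return gzip.compress(csv_text.encode("utf-8"))


def looks_gzipped(data: bytes) -> bool:
    """Return True if the data starts with the gzip magic bytes."""
    return data[:2] == GZIP_MAGIC


# Made-up sample data; in a recipe this would be the output of df.to_csv(index=False).
sample_csv = "id,name\n" + "".join(f"{i},row{i}\n" for i in range(1000))
compressed = gzip_csv_text(sample_csv)

print("gzip header present:", looks_gzipped(compressed))
print(len(sample_csv.encode("utf-8")), "bytes before,", len(compressed), "bytes after")

# Round trip: decompressing must give back the original CSV text.
assert gzip.decompress(compressed).decode("utf-8") == sample_csv
```

If the header check fails on what ends up in the folder, the file was most likely written as plain CSV under a `.gz` name, which could also explain a preview error.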
