CSV export and modification

EdBerth
EdBerth Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 15 ✭✭
edited July 16 in Using Dataiku

I have a Python recipe that exports a dataset to a CSV file.

But I need to modify the first row of this CSV and its extension.

I'm trying to do this (read the CSV and write a new first line) after exporting the dataset, but I can't read it with the csv library.

Could you help me?

# Recipe outputs
managed_folder_id = "pourImport"
output_folder = dataiku.Folder(managed_folder_id)

output_folder.upload_data(filename, analyses_df.to_csv(index=False, header=True, sep=";").encode("utf-8"))

with output_folder.get_download_stream(filename) as stream:
    with open(stream, 'r', newline='') as csvfile:
        # read csv
        reader = csv.reader('/'+csvfile, delimiter=';')
        lignes = list(reader)

    # first row
    new_first_row = ['2','toto']
    lignes[0] = new_first_row

    # write csv
    with open(stream, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=';')
    
        for ligne in lignes:
            writer.writerow(ligne)


TypeError: expected str, bytes or os.PathLike object, not HTTPResponse
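
On the TypeError above: `open()` expects a filesystem path, whereas `get_download_stream` yields an object that is already an open binary stream (the `HTTPResponse` named in the message). A minimal sketch of reading such a stream with the csv module follows. The helper name `read_csv_rows` is hypothetical, and `io.BytesIO` stands in for the real stream so the snippet runs outside DSS.

```python
import csv
import io


def read_csv_rows(stream, delimiter=";", encoding="utf-8"):
    """Read every row from an already-open binary stream (hypothetical helper)."""
    text = stream.read().decode(encoding)
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# io.BytesIO is a stand-in for the object yielded by get_download_stream.
fake_stream = io.BytesIO("c1;c2;c3\n1;a;\n".encode("utf-8"))
lignes = read_csv_rows(fake_stream)
lignes[0] = ["2", "toto"]  # replace the first row, as in the question
print(lignes)
```

In the recipe, the call would presumably sit inside `with output_folder.get_download_stream(filename) as stream:`. The modified rows would then go back to the folder through `upload_data`, since a download stream cannot be reopened for writing.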

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,160 Neuron

    This makes no sense to me. Why would you write the file in the first place if you are going to modify it later? Why not write it correctly from the start? Exactly what are you trying to achieve? What is the problem with the file that requires you to write it twice? Thanks

  • EdBerth
    EdBerth Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 15 ✭✭

    The new first row has only 2 features while the following rows have 30 features.

    The pandas .to_csv function pads the first row with extra separators.

    I need a first row like this

    "2";"tot"

    and not like this

    "2";"tot";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,160 Neuron

    Again, you have not explained what your requirement is, only how you think it should be done. I can't think of a reason why you would want a custom-format CSV file. It is normal for empty columns to show up like that; otherwise, how would you know which column holds which piece of data? In your sample the first two columns are filled and the rest are empty, but what if the empty columns were in the middle of the row?

  • EdBerth
    EdBerth Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 15 ✭✭

    It is a specific format intended to be imported into other software.

    I don't know why the format is like this.

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,160 Neuron
    edited July 17

    Then scrap what you have and write the file correctly in the first place. Here is how you can iterate over the rows of a dataframe:

    for index, row in df.iterrows():
        print(row['c1'], row['c2'])

  • EdBerth
    EdBerth Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 15 ✭✭

    Thank you for your help,

    I don't know how to write a CSV directly to the recipe's output folder without using df.to_csv().

    Can you help me again?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,160 Neuron

    What exactly is the format of the file you want? What should be in the first row, second row, third row, etc.?
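
Pending an answer on the exact format, here is a hedged sketch of writing the file once, with the short first row already in place. It assumes the custom row replaces the column header (as the question's code does with `lignes[0]`) and that every other row keeps all its fields. The helper name `build_csv_bytes` and the sample data are hypothetical.

```python
import csv
import io


def build_csv_bytes(first_row, rows, delimiter=";", quoting=csv.QUOTE_MINIMAL):
    """Return CSV content as UTF-8 bytes, starting with a short custom row."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=delimiter, quoting=quoting,
                        lineterminator="\n")
    writer.writerow(first_row)  # written as-is, so no padding separators
    writer.writerows(rows)      # the data rows, each with all its fields
    return buffer.getvalue().encode("utf-8")


# Stand-in data. In the recipe the rows could come from the dataframe, e.g.
#   rows = analyses_df.itertuples(index=False, name=None)
# (missing values may need handling first, since NaN is written as "nan").
rows = [(1, "a", ""), (2, "b", "x")]
payload = build_csv_bytes(["2", "toto"], rows)
print(payload.decode("utf-8"))

# In the recipe, upload once with the call already used in the question:
#   output_folder.upload_data(filename, payload)
```

Passing `quoting=csv.QUOTE_ALL` would wrap every field in double quotes if the target software needs them. An alternative that keeps pandas' own formatting is to build the first line by hand and append `analyses_df.to_csv(index=False, header=False, sep=";")` to it before encoding and uploading.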
