Hi,
I would suggest reading the input dataset in as a Pandas dataframe, handling the append in the dataframe itself, and then writing the resulting dataframe (in overwrite mode) into your output dataset.
For example, something like:
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
inter = dataiku.Dataset("inter")
input_df = inter.get_dataframe()
# Create dataframe containing row you want to append
append_row = {'my_column': ['foobar']}
append_df = pd.DataFrame(data=append_row)
# Append row to input dataframe (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
output_df = pd.concat([input_df, append_df], ignore_index=True)
# Write recipe outputs
inter_temp = dataiku.Dataset("inter_temp")
inter_temp.write_with_schema(output_df)
I hope that this helps! I would also suggest checking out the Pandas documentation for more examples and details. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0; on recent versions, pandas.concat does the same job:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html
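To try the append step on its own, outside of a recipe, here is a minimal pandas-only sketch. The helper name `append_row` and the sample values are made up for illustration; in a recipe, `input_df` would come from `get_dataframe()` as above.

```python
import pandas as pd


def append_row(df: pd.DataFrame, row: dict) -> pd.DataFrame:
    """Return a new dataframe with `row` added as the last row.

    `row` maps column names to scalar values (hypothetical helper,
    not part of pandas or dataiku).
    """
    # Wrap the dict in a list so pandas builds a one-row dataframe
    row_df = pd.DataFrame([row])
    # ignore_index=True renumbers the rows 0..n-1 in the result
    return pd.concat([df, row_df], ignore_index=True)


# Stand-in for the dataframe read from the input dataset
input_df = pd.DataFrame({"my_column": ["a", "b"]})
output_df = append_row(input_df, {"my_column": "foobar"})
print(output_df)
```

The original dataframe is left untouched, so the result has to be assigned to a new variable before it is written out.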
Best,
Andrew
Hi ATsao,
Thanks for your reply; it could help me in the future. I found a way to do it without a dataframe. I'm posting it here for anyone who wants an example of write_row_dict, as I did yesterday.
# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku.core.sql import SQLExecutor2
from dataiku import pandasutils as pdu
# Read the schema of the input dataset and copy it to the temporary dataset
input_dataset = dataiku.Dataset("dataset")
schema = input_dataset.read_schema()
output_dataset = dataiku.Dataset("dataset_temp")
output_dataset.write_schema(schema)
## Next, open a writer to add rows
writer = output_dataset.get_writer()
try:
    foobar = "foobar"
    values = {
        "colonne1": foobar,
        "colonne2": foobar,
        "colonne3": 1,
        "colonne4": foobar
    }
    ## write_row_dict only accepts a dictionary
    writer.write_row_dict(values)
finally:
    ## Always close the writer, even if the write fails
    writer.close()
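Before passing a dictionary to write_row_dict, it can help to check that its keys match the columns of the schema. The sketch below is plain Python and only an illustration: `check_row` is a hypothetical helper, and it assumes the schema is a list of dictionaries that each have a "name" key, which is how I understand the result of read_schema() to look.

```python
def check_row(schema_columns, row):
    """Return `row` reordered to follow the schema, or raise if keys do not match.

    `schema_columns` is assumed to be a list of dicts with a "name" key.
    """
    names = [column["name"] for column in schema_columns]
    missing = [name for name in names if name not in row]
    unknown = [key for key in row if key not in names]
    if missing or unknown:
        raise ValueError(
            "missing columns: %s, unknown columns: %s" % (missing, unknown)
        )
    # Rebuild the dict in schema order
    return {name: row[name] for name in names}


# Hand-written stand-in for a schema with two columns
schema_columns = [
    {"name": "colonne1", "type": "string"},
    {"name": "colonne3", "type": "int"},
]
checked = check_row(schema_columns, {"colonne3": 1, "colonne1": "foobar"})
print(checked)
```

The checked dictionary can then be handed to the writer in place of the raw one.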
Hi,
This is one of the only topics I found, and I have the same problem as @UserBird.
I would like to add one row to an existing dataset with a python recipe.
I have been looking for examples online and can't find any. This is what I would like to do:
input_dataset = dataiku.Dataset("inter")
output_dataset = dataiku.Dataset("inter_temp")
foobar="foobar"
output_dataset.iter_rows(columns='my_column', values=foobar)
##Or something else but it should be very easy and I can't find a way...
If anyone has an answer, it would be greatly appreciated!
Have a good day.