Hi,
I would suggest reading the input dataset in as a Pandas dataframe, handling the append in the dataframe itself, and then writing the resulting dataframe (in overwrite mode) into your output dataset.
For example, something like:
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
inter = dataiku.Dataset("inter")
input_df = inter.get_dataframe()
# Create dataframe containing row you want to append
append_row = {'my_column': ['foobar']}
append_df = pd.DataFrame(data=append_row)
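# Append row to input dataframe
# (DataFrame.append was removed in pandas 2.0; pd.concat is the replacement)
output_df = pd.concat([input_df, append_df], ignore_index=True)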
# Write recipe outputs
inter_temp = dataiku.Dataset("inter_temp")
inter_temp.write_with_schema(output_df)
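The append step can also be tried outside of DSS with plain pandas. This is a minimal sketch, assuming a single text column: `append_row` is a hypothetical helper name, and the sample data stands in for what `get_dataframe()` would return. It uses `pd.concat`, which also works on pandas 2.0 and later, where `DataFrame.append` is no longer available.

```python
import pandas as pd


def append_row(df: pd.DataFrame, row: dict) -> pd.DataFrame:
    """Return a new dataframe with `row` added as the last row (hypothetical helper)."""
    row_df = pd.DataFrame([row])
    # ignore_index=True renumbers the index so the new row does not reuse label 0
    return pd.concat([df, row_df], ignore_index=True)


# Made-up data standing in for inter.get_dataframe()
input_df = pd.DataFrame({"my_column": ["a", "b"]})
output_df = append_row(input_df, {"my_column": "foobar"})
print(output_df)
```

The input dataframe is left unchanged; only the returned dataframe contains the extra row, so it is the one to pass to `write_with_schema`.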
I hope that this helps! I would also suggest checking out the following Pandas documentation, which provides more examples and details about how to use DataFrame.append:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.append.html
Best,
Andrew
Hi ATsao,
Thanks for your reply, it could help me in the future. I found a way without a dataframe. I'm posting it here for those who want an example of write_row_dict, as I did yesterday.
# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku.core.sql import SQLExecutor2
from dataiku import pandasutils as pdu
# Read the schema of the input dataset and copy it to the temporary dataset
input_dataset = dataiku.Dataset("dataset")
schema = input_dataset.read_schema()
output_dataset = dataiku.Dataset("dataset_temp")
output_dataset.write_schema(schema)
# Then open a writer to add rows
writer = output_dataset.get_writer()
try:
    foobar = "foobar"
    values = {
        "colonne1": foobar,
        "colonne2": foobar,
        "colonne3": 1,
        "colonne4": foobar
    }
    # write_row_dict only accepts a dictionary
    writer.write_row_dict(values)
finally:
    # Always close the writer, even if the write fails
    writer.close()
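One point worth keeping in mind with a writer is that it should be closed whether or not a write fails. The sketch below illustrates that pattern outside of DSS. `FakeWriter` and `write_rows` are hypothetical names invented for this example; `FakeWriter` only mimics the two methods used in this thread (`write_row_dict` and `close`) and is not the Dataiku class.

```python
class FakeWriter:
    """Hypothetical stand-in for the object returned by Dataset.get_writer()."""

    def __init__(self):
        self.rows = []
        self.closed = False

    def write_row_dict(self, row):
        # Mimic the "dictionary only" behaviour mentioned in the thread
        if not isinstance(row, dict):
            raise TypeError("write_row_dict expects a dict")
        self.rows.append(row)

    def close(self):
        self.closed = True


def write_rows(writer, rows):
    """Write every row, then close the writer whether or not a write failed."""
    try:
        for row in rows:
            writer.write_row_dict(row)
    finally:
        writer.close()


writer = FakeWriter()
write_rows(writer, [{"colonne1": "foobar", "colonne3": 1}])
print(writer.rows, writer.closed)
```

In a recipe, the same `write_rows` function could presumably be given the real writer from `output_dataset.get_writer()` in place of the stand-in.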
Hi,
This is one of the only topics I found, and I have the same problem as @UserBird.
I would like to add one row to an existing dataset with a Python recipe.
I've been looking for examples on the Internet and can't find any. This is what I would like to do:
input_dataset = dataiku.Dataset("inter")
output_dataset = dataiku.Dataset("inter_temp")
foobar="foobar"
output_dataset.iter_rows(columns='my_column', values=foobar)
##Or something else but it should be very easy and I can't find a way...
If anyone has an answer, it would be greatly appreciated!
Have a good day.
Sure, the marked answer is correct.
But the R language is also used for data science.
When it comes to appending data frames in R, the rbind() and cbind() functions come to mind: rbind() concatenates data frames vertically (by rows) and cbind() concatenates them horizontally (by columns).
To append rows to a data frame in R, use rbind(). It is a built-in R function that combines several vectors, matrices, and/or data frames by rows.
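For readers staying in Python, `pd.concat` covers roughly the same ground as the two R functions. This is a small sketch with made-up data frames; the comparison to `rbind()` and `cbind()` is approximate, since `pd.concat` aligns on column names (row-wise) or on index labels (column-wise) rather than purely on position.

```python
import pandas as pd

# Made-up data for illustration
top = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
bottom = pd.DataFrame({"id": [3], "name": ["c"]})
extra = pd.DataFrame({"score": [0.5, 0.7]})

# Row-wise concatenation, comparable to rbind(top, bottom) in R
stacked = pd.concat([top, bottom], axis=0, ignore_index=True)

# Column-wise concatenation, comparable to cbind(top, extra) in R
side_by_side = pd.concat([top, extra], axis=1)

print(stacked)
print(side_by_side)
```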