Create a SQL output dataset with auto increment id column
Hi,
I am trying to create a SQL output dataset with an auto-increment ID column (primary key), and then use a Python recipe that writes to this output dataset without supplying the ID column. I could not find any example of how to modify the manual schema to create a primary-key ID column and write to the dataset. Is there a way to achieve this in DSS?
TIA.
Answers
Hi esakkiraj,
If the ask is to add an "index" column to the output dataset, then you can certainly handle this through Python. What you would need to do is make sure that this index column is added as a column in the pandas dataframe itself (after reading the input dataset into a dataframe) and then write to your output dataset using the standard write_with_schema (which will overwrite the schema). For example, you could do something like:
    import dataiku
    import pandas as pd, numpy as np
    from dataiku import pandasutils as pdu

    # Read recipe inputs
    input = dataiku.Dataset("input_dataset")
    input_df = input.get_dataframe()

    # Add index column to dataframe and copy to output
    input_df.reset_index(level=0, inplace=True)
    output_df = input_df

    # Write recipe outputs
    output = dataiku.Dataset("output_dataset")
    output.write_with_schema(output_df)
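To see what the index step produces without needing DSS, here is a minimal pandas-only sketch. The helper name `add_id_column`, the column name `id`, the start value of 1, and the sample data are all assumptions made for illustration; they are not part of the Dataiku API. Note that the values are generated in pandas before the write, so this is not a database-side auto-increment.

```python
import pandas as pd


def add_id_column(df, id_name="id", start=1):
    """Return a copy of df with a sequential integer column placed first.

    Hypothetical helper for illustration only.
    """
    # reset_index(drop=True) returns a new dataframe with a clean 0..n-1 index,
    # so the caller's dataframe is left unchanged
    out = df.reset_index(drop=True)
    # Insert the sequential ids as the first column
    out.insert(0, id_name, list(range(start, start + len(out))))
    return out


# Made-up sample data standing in for the recipe's input dataframe
sample_df = pd.DataFrame({"name": ["a", "b", "c"], "value": [10, 20, 30]})
output_df = add_id_column(sample_df)
print(output_df)
```

In a recipe, the resulting dataframe would then be passed to write_with_schema as in the example above.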
Let me know if that helps!
Thanks,
Andrew