dtype in dataiku.Dataset().get_dataframe()

UserBird Dataiker, Alpha Tester Posts: 535 Dataiker
edited July 16 in Using Dataiku

Is there a way to pass an equivalent of the dtype argument (from pd.read_table()) to dataiku.Dataset() or dataiku.Dataset.get_dataframe()?


import pandas as pd

# Read both fields as strings
my_file = pd.read_table("input_file"
, dtype={
'field1':str
,'field2':str})

I tried the following two variants, but both raise an "unexpected keyword argument" error:


import dataiku

# Attempt 1: dtype passed to the Dataset constructor (raises TypeError)
mydataset = dataiku.Dataset("input_file"
, dtype={
'field1':str
,'field2':str})
my_file = mydataset.get_dataframe()


# Attempt 2: dtype passed to get_dataframe() (raises TypeError)
mydataset = dataiku.Dataset("input_file")
my_file = mydataset.get_dataframe(dtype={
'field1':str
,'field2':str})

Thanks

Answers

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    At the moment (DSS 4.0), it's not possible to force dtypes. This is something we're considering adding.

    You can, however, pass "infer_with_pandas=False" to get_dataframe(), which will force the dtypes as specified by the dataset schema.
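
As a rough illustration of the suggestion above, here is a minimal sketch. The helper name load_dataframe and its dtypes argument are hypothetical and not part of the dataiku API. Only get_dataframe(infer_with_pandas=False) comes from the answer. The optional cast afterwards uses the standard pandas DataFrame.astype and assumes the columns can be converted to the requested types.

```python
def load_dataframe(dataset, dtypes=None):
    """Load a DSS dataset, taking column types from the dataset schema.

    `dataset` is expected to be a dataiku.Dataset (hypothetical helper).
    `dtypes` is an optional {column: type} mapping that is applied
    after loading with DataFrame.astype.
    """
    # infer_with_pandas=False: use the dataset schema instead of letting
    # pandas guess the column types.
    df = dataset.get_dataframe(infer_with_pandas=False)
    if dtypes:
        # Optional explicit cast for columns the schema does not cover.
        df = df.astype(dtypes)
    return df


# Usage inside DSS (requires the dataiku package):
# import dataiku
# my_file = load_dataframe(dataiku.Dataset("input_file"),
#                          {"field1": str, "field2": str})
```

If field1 and field2 are already declared as string in the dataset schema, the dtypes argument should not be needed. Note that astype(str) turns missing values into text as well, so setting the types in the schema is probably the cleaner option.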