NULL instead no value

Options
marbys
marbys Registered Posts: 7

Hi,

I'm trying to write dataframe into Snowflake table with function write_with_schema. Data comes from excel file, where we have some columns with no value and we want to keep it that way. I used parameters with read_excel and also fillna('') and dataset looks ok (no nulls, only no values/empty), but after using wrIte_with_schema we are getting nulls in those columns. Is there a way to insert data with no value/string lenght 0 instead of nulls?

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,728 Neuron
    Options

    Can you check the data type of the column in the data frame and the column in the Snowflake table? Are they the same type?

  • marbys
    marbys Registered Posts: 7
    Options

    Object type for dataframe and varchar in Snowflake

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,728 Neuron
    Options

    What version of pandas are you using?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,728 Neuron
    Options

    Are you sure you are applying fillna() correctly?

    https://stackoverflow.com/a/67356585/10491951

    Can you you do a count and group by that column to see all possible values after the fillna()?

  • marbys
    marbys Registered Posts: 7
    Options

    Yes, dataframe looks ok. I just tried different option to have no value instead on null in dataframe, but it seems, that write_with_schema still converts that to null.

    429105254_1567162310702382_3338356576566462941_n.png

    429041625_1171443807573497_5522633410630892444_n.png

    429118700_910385297448119_1749607961959726017_n.png

    Even if there is '' as a value, it goes as null into Snowflake.

  • marbys
    marbys Registered Posts: 7
    Options

    429302783_1048536849586892_5340554447217456188_n.png

    Raw is my expected output

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,728 Neuron
    edited July 17
    Options

    Your Python code doesn't look right to me. This is how you usually write to a Dataset from a Python recipe:

    # Write recipe outputs
    customers_out = dataiku.Dataset("some_poutput_dataset")
    customers_out.write_with_schema(df)

    I don't see where you got OUTPUT from in your code.

  • marbys
    marbys Registered Posts: 7
    Options

    I have OUTPUT declared in a different cell.

    OUTPUT = dataiku.Dataset("OUTPUT_DATAIKU")

    Everything works, except no values(''), that somehow are converted to null when using write_with_schema/write_from_dataframe

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,728 Neuron
    Options

    Can I suggest you restart your kernel and run from top to bottom again? Or run the code in a recipe to make sure there are now left overs from you playing in the Notebook.

  • marbys
    marbys Registered Posts: 7
    Options

    Done that, I even checked e.g. fillna('5') and it goes to Snowflake as 5. I don't think it's an issue with python/pandas/dataframe, but specifically with writing to Snowflake and using built-in (write_with_schema/writre_from_dataframe) function to do that. Maybe there is a parameter or different function to use?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,728 Neuron
    Options

    Try write_with_schema(df, drop_and_create=True) and with write_dataframe(df, infer_schema=False)

  • marbys
    marbys Registered Posts: 7
    Options

    That's not a solution, everything goes to Snowflake as it should, except '' or "" or ' '- still as null

Setup Info
    Tags
      Help me…