Exception: An error occurred during dataset write (VqRYUkpmtr): IllegalArgumentException: Connection

Rickh008
Rickh008 Dataiku DSS Core Designer, Registered Posts: 15 ✭✭✭✭

I get the following error when attempting to write data from a dataframe within a Python recipe. The error occurs with any of "write_from_dataframe" (deprecated), "write_dataframe", or "write_with_schema".

> Exception: An error occurred during dataset write (VqRYUkpmtr): IllegalArgumentException: Connection

What should I try?

Full error below:

[Screenshots: Capture.PNG, Capture2.PNG]

Best Answer

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker
    Answer ✓

    This error can occur when your Snowflake connection has automatic fast-write enabled, but you haven't configured it.

    Please look at your Snowflake connection and check your automatic fast-write settings. If automatic fast-write is checked, you must also fill in the "Auto. fast-write connection" and "Path in connection" fields.

    [Screenshot: 51B87D3F-B3AA-4D48-9BD5-1E2F532C4C42_1_201_a.jpeg]

    If you don't want to use automatic fast-write, you can also fix the issue by unchecking it.

    For more information about automatic fast-write, refer to our documentation.
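
The checks suggested in this thread can be collected into a small triage helper. This is only a sketch: `hints_for_write_error` is a hypothetical name, not part of the Dataiku API, and it does nothing more than match the error text quoted in the question.

```python
def hints_for_write_error(message):
    """Return troubleshooting hints for a dataset write error message.

    Hypothetical helper (not a Dataiku API): it only inspects the text
    of the exception and maps it to the checks suggested in this thread.
    """
    hints = []
    if "IllegalArgumentException: Connection" in message:
        hints.append(
            "Snowflake connection: if automatic fast-write is checked, fill in "
            "'Auto. fast-write connection' and 'Path in connection', or uncheck it."
        )
        hints.append(
            "Output dataset: confirm it has a connection under Settings > Connection."
        )
    return hints


error = (
    "An error occurred during dataset write (VqRYUkpmtr): "
    "IllegalArgumentException: Connection"
)
for hint in hints_for_write_error(error):
    print("-", hint)
```

Inside a recipe, the same function could be called on `str(exc)` in an `except Exception as exc:` block around the write call, before re-raising.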

Answers

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker

    Hi @Rickh008,

    It looks like the dataset that you're trying to write data to doesn't have a connection configured. Is this an existing dataset, or are you creating it programmatically with Python?

    If it's an existing dataset, please make sure that it has a connection configured. On the dataset's page, go to Settings > Connection. Note that the settings that are available will differ depending on your connection type.

    [Screenshot: C95502FC-E77E-40EF-96A2-19AEC8D68848_1_201_a.jpeg]

    If you're creating the dataset, please make sure that you're specifying a connection on creation. The following example will create a managed dataset called my_dataset with the filesystem_managed connection:

    import dataiku

    client = dataiku.api_client()
    project = client.get_default_project()

    # Create the managed dataset on the "filesystem_managed" connection
    builder = project.new_managed_dataset("my_dataset")
    builder.with_store_into("filesystem_managed")
    dss_dataset = builder.create()

    # Get a dataiku.Dataset handle that supports dataframe writes
    dataset = dss_dataset.get_as_core_dataset()
    # Write a dataframe to the new dataset
    # (my_dataframe is assumed to be an existing pandas DataFrame)
    dataset.write_with_schema(my_dataframe)

    For more information about how to create datasets with Python, see Datasets (introduction) and Programmatic creation and setup (managed datasets).
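
If the recipe may run more than once, the creation step can be guarded so that the dataset is only created when it is missing. The following is a minimal sketch under assumptions: `get_or_create_managed_dataset` is a hypothetical helper name, `project` is expected to be the object returned by `client.get_default_project()`, and the installed DSS version is assumed to provide `exists()` on the dataset handle.

```python
def get_or_create_managed_dataset(project, name, connection):
    """Return a writable dataset handle, creating the dataset if needed.

    Hypothetical helper. `project` is expected to be a DSSProject, for
    example dataiku.api_client().get_default_project().
    """
    dss_dataset = project.get_dataset(name)
    # Assumption: this DSS version exposes exists() on the dataset handle
    if not dss_dataset.exists():
        # Always name a connection explicitly when creating the dataset
        builder = project.new_managed_dataset(name)
        builder.with_store_into(connection)
        dss_dataset = builder.create()
    # Convert to a dataiku.Dataset, which supports write_with_schema()
    return dss_dataset.get_as_core_dataset()
```

In a recipe this might be used as `get_or_create_managed_dataset(project, "my_dataset", "filesystem_managed").write_with_schema(my_dataframe)`.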

    If this doesn't solve your issue, please post your full code so that I can assist you better.

    Thanks,

    Zach

  • Rickh008
    Rickh008 Dataiku DSS Core Designer, Registered Posts: 15 ✭✭✭✭

    Thanks for your response @ZachM

    I am creating a new dataset in a Snowflake database and have ensured that the proper connection settings are entered. Picture of connection settings below. I reran my code after pre-creating the table using "create table now" in the connection settings and received the same error. My code is posted below.

    import dataiku
    import pandas as pd, numpy as np
    from dataiku import pandasutils as pdu

    # Read recipe inputs
    ohs_Features_by_MGRID = dataiku.Dataset("OHS_Features_by_MGRID")
    ohs_Features_by_MGRID_df = ohs_Features_by_MGRID.get_dataframe()

    df = ohs_Features_by_MGRID_df
    df.loc[df['TEST_COMPLETION_COUNT'] < 5, 'OHS_LEAVING_avg':'WELLNESS_LAG_DIFF_avg'] = 0
    ohs_FEATURES_BY_MGR_CLEAN_df = df

    # Write recipe outputs
    ohs_FEATURES_BY_MGR_CLEAN = dataiku.Dataset("OHS_FEATURES_BY_MGR_CLEAN")

    # Set the schema of 'myoutputdataset' to match the columns of the dataframe
    ohs_FEATURES_BY_MGR_CLEAN.write_schema_from_dataframe(ohs_FEATURES_BY_MGR_CLEAN_df)

    # Write the dataframe without touching the schema
    ohs_FEATURES_BY_MGR_CLEAN.write_dataframe(ohs_FEATURES_BY_MGR_CLEAN_df)

    ohs_FEATURES_BY_MGR_CLEAN.write_with_schema(ohs_FEATURES_BY_MGR_CLEAN_df)
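
As a side note on the output step: `write_with_schema` sets the output schema from the dataframe and writes the rows in one call, so the three write calls above overlap. A minimal sketch of the single-call form follows, wrapped in a hypothetical helper `write_output` so it can be exercised outside DSS. It tidies the recipe and is not claimed to address the error itself.

```python
def write_output(dataset, df):
    """Write df to a dataiku.Dataset in a single call.

    Hypothetical helper: `dataset` is expected to be a dataiku.Dataset
    and `df` a pandas DataFrame.
    """
    # Sets the schema from the dataframe and writes the rows, so separate
    # write_schema_from_dataframe() and write_dataframe() calls are not needed
    dataset.write_with_schema(df)
```

In the recipe above, this would be called as `write_output(ohs_FEATURES_BY_MGR_CLEAN, ohs_FEATURES_BY_MGR_CLEAN_df)`.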

  • Rickh008
    Rickh008 Dataiku DSS Core Designer, Registered Posts: 15 ✭✭✭✭

    I am unable to view my "Connections" settings so I have made an internal IT ticket to look at this, per your suggestion. Thanks

    Is there anything else I should look at?

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker

    Other than automatic fast-write, the only other thing I can think of is that there's something set up weirdly on the dataset.

    If automatic fast-write isn't the issue, I would recommend creating a new managed dataset via the recipe configuration page, and replacing the old output dataset with it. This would verify that the dataset is configured correctly.

    [Screenshot: 1FB94E06-06C6-4EEA-BACE-AF2B623EACBA_1_105_c.jpeg]

  • Rickh008
    Rickh008 Dataiku DSS Core Designer, Registered Posts: 15 ✭✭✭✭

    Strangely enough, I was able to temporarily get around this issue by storing to a different Snowflake database and schema under a different connection selection (Snowflake role, in this case).

    Thanks for your help!

  • Rickh008
    Rickh008 Dataiku DSS Core Designer, Registered Posts: 15 ✭✭✭✭

    @ZachM
    My IT support was able to disable "Auto. fast-write connection" and this solved my connection issue. We are now addressing the S3 connection so that we can use this feature in the future.

    Thanks again.
