Exception: An error occurred during dataset write (VqRYUkpmtr): IllegalArgumentException: Connection
I get the following error when attempting to write data from a dataframe within a Python recipe. The error occurs with any of "write_from_dataframe" (deprecated), "write_dataframe", or "write_with_schema".
> Exception: An error occurred during dataset write (VqRYUkpmtr): IllegalArgumentException: Connection
What should I try?
Full error below:
Best Answer
-
This error can occur when your Snowflake connection has automatic fast-write enabled, but you haven't configured it.
Please look at your Snowflake connection and check your automatic fast-write settings. If automatic fast-write is checked, you must also fill in the "Auto. fast-write connection" and "Path in connection" fields.
If you don't want to use automatic fast-write, you can also fix the issue by unchecking it.
For more information about automatic fast-write, refer to our documentation.
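To make the rule above concrete, here is a minimal sketch in plain Python. It does not talk to DSS at all: the dictionary keys (`auto_fast_write`, `fast_write_connection`, `fast_write_path`) are hypothetical names invented for this illustration, not the field names DSS stores internally. It only encodes the check described above: if automatic fast-write is enabled, both companion fields must be filled in.

```python
def fast_write_problems(settings):
    """Return a list of problems found in a fast-write configuration.

    `settings` is a plain dict using hypothetical key names (an assumption
    made for this example, not the real DSS connection schema).
    """
    problems = []
    # If automatic fast-write is unchecked, there is nothing to validate.
    if not settings.get("auto_fast_write", False):
        return problems
    # Both companion fields must be non-empty when fast-write is enabled.
    if not str(settings.get("fast_write_connection") or "").strip():
        problems.append('"Auto. fast-write connection" is empty')
    if not str(settings.get("fast_write_path") or "").strip():
        problems.append('"Path in connection" is empty')
    return problems


# Example: fast-write is checked but neither field has been filled in.
for problem in fast_write_problems({"auto_fast_write": True}):
    print(problem)
```

An empty list means the configuration is consistent, either because fast-write is off or because both fields are set.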
Answers
-
Hi @Rickh008,
It looks like the dataset that you're trying to write data to doesn't have a connection configured. Is this an existing dataset, or are you creating it programmatically with Python?
If it's an existing dataset, please make sure that it has a connection configured. On the dataset's page, go to Settings > Connection. Note that the settings that are available will differ depending on your connection type.
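If you'd rather check this from code, the following is a rough sketch. The helper itself is plain Python and assumes the dataset's raw settings keep the connection name under `params` > `connection`, which is how SQL and managed datasets are usually laid out; please verify this against your own dataset. The dataset name, connection name, and table name in the sample dictionary are made up for the example.

```python
def dataset_connection(raw_settings):
    """Return the connection name found in a dataset's raw settings, or None.

    Assumes the name lives at raw_settings["params"]["connection"]; this
    layout is an assumption and may differ for some dataset types.
    """
    params = raw_settings.get("params") or {}
    return params.get("connection") or None


# Inside DSS, the raw settings could be fetched like this (not run here):
#   import dataiku
#   client = dataiku.api_client()
#   project = client.get_default_project()
#   raw_settings = project.get_dataset("my_dataset").get_settings().get_raw()

# Stand-in settings so the sketch runs outside DSS (hypothetical values).
raw_settings = {
    "type": "Snowflake",
    "params": {"connection": "my_snowflake", "table": "MY_TABLE"},
}
print(dataset_connection(raw_settings))
```

If the helper returns None, the dataset has no connection set, which matches the situation described above.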
If you're creating the dataset, please make sure that you're specifying a connection on creation. The following example will create a managed dataset called my_dataset with the filesystem_managed connection:
import dataiku
client = dataiku.api_client()
project = client.get_default_project()
# Create the managed dataset
builder = project.new_managed_dataset("my_dataset")
builder.with_store_into("filesystem_managed")
dss_dataset = builder.create()
dataset = dss_dataset.get_as_core_dataset()
# Write a dataframe to the new dataset
dataset.write_with_schema(my_dataframe)
For more information about how to create datasets with Python, see Datasets (introduction) and Programmatic creation and setup (managed datasets).
If this doesn't solve your issue, please post your full code so that I can assist you better.
Thanks,
Zach
-
Thanks for your response @ZachM
I am creating a new dataset in a Snowflake database and have ensured that I have the proper connection settings entered. Picture of connection settings below. I attempted to rerun my code after pre-creating the table using "create table now" in the connection settings, and I received the same error. My code is posted below.
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
ohs_Features_by_MGRID = dataiku.Dataset("OHS_Features_by_MGRID")
ohs_Features_by_MGRID_df = ohs_Features_by_MGRID.get_dataframe()
df = ohs_Features_by_MGRID_df
df.loc[df['TEST_COMPLETION_COUNT'] < 5, 'OHS_LEAVING_avg':'WELLNESS_LAG_DIFF_avg'] = 0
ohs_FEATURES_BY_MGR_CLEAN_df = df
# Write recipe outputs
ohs_FEATURES_BY_MGR_CLEAN = dataiku.Dataset("OHS_FEATURES_BY_MGR_CLEAN")
# Set the schema of ‘myoutputdataset’ to match the columns of the dataframe
ohs_FEATURES_BY_MGR_CLEAN.write_schema_from_dataframe(ohs_FEATURES_BY_MGR_CLEAN_df)
# Write the dataframe without touching the schema
ohs_FEATURES_BY_MGR_CLEAN.write_dataframe(ohs_FEATURES_BY_MGR_CLEAN_df)
ohs_FEATURES_BY_MGR_CLEAN.write_with_schema(ohs_FEATURES_BY_MGR_CLEAN_df)
-
I am unable to view my "Connections" settings, so I have opened an internal IT ticket to look at this, per your suggestion. Thanks
Is there anything else I should look at?
-
Other than automatic fast-write, the only other thing I can think of is that there's something set up weirdly on the dataset.
If automatic fast-write isn't the issue, I would recommend creating a new managed dataset via the recipe configuration page, and replacing the old output dataset with it. This would verify that the dataset is configured correctly.
-
Strangely enough, I was able to temporarily get around this issue by storing to a different Snowflake database and schema under a different connection selection (a different Snowflake role, in this case).
Thanks for your help!
-
@ZachM
My IT support was able to disable "Auto. fast-write connection" and this solved my connection issue. We are now addressing the S3 connection so that we can use this function in the future. Thanks again.