I am trying to load a Snowflake table via a Python dataframe.
test_data - my Snowflake table
Python code:
import dataiku

test_data = dataiku.Dataset("test_data")
test_data.write_with_schema(test_df)
I am able to load the table without a partition.
**********************************************************************
I enabled partitioning by activating it in the table definition (in the table explorer).
When I enable partitioning and execute the Python recipe, I get the error below:
ERROR:dataiku.core.dataset_write:Exception caught while writing
Traceback (most recent call last):
  File "/data/dataiku/dataiku-dss-10.0.3/python/dataiku/core/dataset_write.py", line 218, in run
    self.streaming_api.wait_write_session(self.session_id)
  File "/data/dataiku/dataiku-dss-10.0.3/python/dataiku/core/dataset_write.py", line 181, in wait_write_session
    raise Exception(u'An error occurred during dataset write (%s): %s' % (id, decoded_resp["message"]))
Exception: An error occurred during dataset write (a2YqezyvBf): RuntimeException: A partition ID must be provided, because the dataset TESTEXPLORER.test_explorer_data is partitioned
ERROR:dataiku.core.dataset_write:RemoteStreamWriter thread failed
Traceback (most recent call last):
  File "/data/dataiku/data_dir/code-envs/python/python-36-dev-mk/lib/python3.6/site-packages/requests/adapters.py", line 474, in send
    low_conn.send(b'\r\n')
  File "/usr/lib64/python3.6/http/client.py", line 1000, in send
    self.sock.sendall(data)
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/dataiku/dataiku-dss-10.0.3/python/dataiku/core/dataset_write.py", line 147, in run
    self.streaming_api.push_data(self.id,self._generate())
  File "/data/dataiku/dataiku-dss-10.0.3/python/dataiku/core/dataset_write.py", line 184, in push_data
    jek_or_backend_void_call("datasets/push-data/", params={"id": id}, data=generator, err_msg="Streaming: push-data call failed")
  File "/data/dataiku/dataiku-dss-10.0.3/python/dataiku/core/intercom.py", line 440, in jek_or_backend_void_call
    return backend_void_call(path, data, err_msg, **kwargs)
  File "/data/dataiku/dataiku-dss-10.0.3/python/dataiku/core/intercom.py", line 431, in backend_void_call
    return _handle_void_resp(backend_api_post_call(path, data, **kwargs), err_msg = err_msg)
  File "/data/dataiku/dataiku-dss-10.0.3/python/dataiku/core/intercom.py", line 361, in backend_api_post_call
    **kwargs)
  File "/data/dataiku/data_dir/code-envs/python/python-36-dev-mk/lib/python3.6/site-packages/requests/sessions.py", line 577, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/data/dataiku/data_dir/code-envs/python/python-36-dev-mk/lib/python3.6/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/data/dataiku/data_dir/code-envs/python/python-36-dev-mk/lib/python3.6/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/data/dataiku/data_dir/code-envs/python/python-36-dev-mk/lib/python3.6/site-packages/requests/adapters.py", line 501, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: [Errno 32] Broken pipe
Hi @muthu11,
Thanks for posting.
It looks like the error of interest in this case is:
Exception: An error occurred during dataset write (a2YqezyvBf): RuntimeException: A partition ID must be provided, because the dataset TESTEXPLORER.test_explorer_data is partitioned
It’s curious that you’re encountering the `A partition ID must be provided` error when running your code from a Python recipe (https://doc.dataiku.com/dss/latest/code_recipes/python.html), because I would expect a Python recipe to handle partition dependencies automatically.
For example, the following code runs successfully when executed from a Python recipe in my lab environment:
[...]
partitioned_output = dataiku.Dataset("partitioned_output2")
partitioned_output.write_with_schema(partitioned_output_df)
Note: the partition to write to is specified next to the run button in the Python recipe (the UI will not allow you to run the recipe unless a partition is specified):
Note: `partitioned_output2` is a DSS partitioned dataset being written to Snowflake
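As a side note, inside a partitioned Python recipe the destination partition selected next to the run button can also be read back through flow variables, which DSS names `DKU_DST_<dimension name>` (the dimension name `year` below is an assumption for illustration). Outside DSS that dictionary does not exist, so this sketch simulates it; in a real recipe you would read `dataiku.dku_flow_variables` instead:

```python
# In a partitioned Python recipe, DSS exposes the destination partition
# through flow variables named DKU_DST_<dimension name>.
# Simulated here so the sketch is self-contained; in DSS, use:
#   flow_variables = dataiku.dku_flow_variables
flow_variables = {"DKU_DST_year": "2014"}

target_year = flow_variables.get("DKU_DST_year")
print(target_year)  # prints: 2014
```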
The same code fails when run from a Python notebook, with the following error:
ERROR:dataiku.core.dataset_write:Exception caught while writing
Traceback (most recent call last):
File "/Users/mgallegos/Library/DataScienceStudio/kits/dataiku-dss-10.0.5-osx/python/dataiku/core/dataset_write.py", line 229, in run
self.streaming_api.wait_write_session(self.session_id)
File "/Users/mgallegos/Library/DataScienceStudio/kits/dataiku-dss-10.0.5-osx/python/dataiku/core/dataset_write.py", line 192, in wait_write_session
raise Exception(u'An error occurred during dataset write (%s): %s' % (id, decoded_resp["message"]))
Exception: An error occurred during dataset write (Yvssn6hyqi): RuntimeException: A partition ID must be provided, because the dataset COMMUNITY_23249.partitioned_output2 is partitioned
To allow the code to run in a Python notebook I must specify a partition to write to, for example using `set_write_partition`:
[...]
partitioned_output = dataiku.Dataset("partitioned_output")
partitioned_output.set_write_partition(spec='2014') # specify a partition to write to when inside a Python _notebook_
partitioned_output.write_with_schema(partitioned_output_df)
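For reference, when a dataset is partitioned on several dimensions, the partition identifier passed to `set_write_partition` joins one value per dimension with `|` (for example `2014|US`). A tiny helper for assembling such a spec might look like the sketch below; `build_partition_spec` is a hypothetical name, not part of the dataiku API:

```python
def build_partition_spec(*dimension_values):
    """Join one value per partitioning dimension into a DSS-style
    partition identifier, e.g. ('2014', 'US') -> '2014|US'."""
    return "|".join(str(v) for v in dimension_values)

# Single time dimension:
print(build_partition_spec("2014"))        # prints: 2014
# Time dimension plus a discrete 'country' dimension:
print(build_partition_spec("2014", "US"))  # prints: 2014|US
```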
---
Can you provide the following:
Thank you,
Mike