Upload files to dataset using python api - MultipartFile parameter 'file' is not present
bcalcutt
Registered Posts: 3 ✭✭✭
Hello, I'm trying to upload an Excel file as a dataset using the Python API. I'm receiving an error that I don't understand.
For reference, I am following the example in the API docs here: https://doc.dataiku.com/dss/latest/python-api/datasets-other.html#uploaded-datasets-programmatic-creation-and-upload
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dssclient.py in _perform_http(self, method, path, params, body, stream, files, raw_body)
   1020                     stream = stream)
-> 1021             http_res.raise_for_status()
   1022             return http_res

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\requests\models.py in raise_for_status(self)
    939         if http_error_msg:
--> 940             raise HTTPError(http_error_msg, response=self)
    941

HTTPError: 400 Client Error: Bad Request for url: https://dataiku.ae.ge.com//dip/publicapi/projects/ADDITIVESPARES/datasets/M2_G02_CSPL/uploaded/files

During handling of the above exception, another exception occurred:

DataikuException                          Traceback (most recent call last)
<ipython-input-27-13a19fcf17d4> in <module>
      2
      3 with open(upload_file_path, "rb") as f:
----> 4     dataset.uploaded_add_file(f, upload_file_path)
      5
      6 # At this point, the dataset object has been initialized, but the format is still unknown, and the

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dss\dataset.py in uploaded_add_file(self, fp, filename)
    348         :param str filename: The filename for the file to upload
    349         """
--> 350         self.client._perform_empty("POST", "/projects/%s/datasets/%s/uploaded/files" % (self.project_key, self.dataset_name),
    351          files={"file":(filename, fp)})
    352

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dssclient.py in _perform_empty(self, method, path, params, body, files, raw_body)
   1029
   1030     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):
-> 1031         self._perform_http(method, path, params=params, body=body, files=files, stream=False, raw_body=raw_body)
   1032
   1033     def _perform_text(self, method, path, params=None, body=None,files=None, raw_body=None):

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dssclient.py in _perform_http(self, method, path, params, body, stream, files, raw_body)
   1026             except ValueError:
   1027                 ex = {"message": http_res.text}
-> 1028             raise DataikuException("%s: %s" % (ex.get("errorType", "Unknown error"), ex.get("message", "No message")))
   1029
   1030     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):

DataikuException: org.springframework.web.bind.MissingServletRequestParameterException: Required MultipartFile parameter 'file' is not present
dataset = proj.create_upload_dataset("M2_G02_CSPL")  # you can add connection= for the target connection

with open(upload_file_path, "rb") as f:
    dataset.uploaded_add_file(f, upload_file_path)

# At this point, the dataset object has been initialized, but the format is still unknown, and the
# schema is empty, so the dataset is not yet usable

# We run autodetection
settings = dataset.autodetect_settings()
# settings is now an object containing the "suggested" new dataset settings, including the detected format
# and completed schema
# We can just save the new settings in order to "accept the suggestion"
settings.save()
Answers
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,212 Dataiker
Hi,
Could you please share your full code sample and your DSS version? I've tested the sample code below and it worked fine on DSS 9.0.4. Based on the error, it appears the client is unable to read the file you defined at upload_file_path.
import dataiku
from dataiku import pandasutils as pdu
import pandas as pd

client = dataiku.api_client()
proj = client.get_project(dataiku.get_custom_variables()["projectKey"])

upload_file_path = "/Users/myuser/Downloads/SAMPLE.xlsx"

dataset = proj.create_upload_dataset("SAMPLE_OTHER")  # you can add connection= for the target connection

with open(upload_file_path, "rb") as f:
    dataset.uploaded_add_file(f, upload_file_path)

# At this point, the dataset object has been initialized, but the format is still unknown, and the
# schema is empty, so the dataset is not yet usable

# We run autodetection
settings = dataset.autodetect_settings()
# settings is now an object containing the "suggested" new dataset settings, including the detected format
# and completed schema
# We can just save the new settings in order to "accept the suggestion"
settings.save()
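To check whether the file at upload_file_path is actually readable before calling uploaded_add_file, a small pre-flight check along these lines might help. This is only a sketch: check_upload_file is a hypothetical helper of my own, not part of the dataiku or dataikuapi packages, and passing just the base name as the filename is an assumption about what may be worth trying, not a confirmed fix for this error.

```python
import os


def check_upload_file(path):
    """Validate a local file before upload; return (basename, size_in_bytes).

    Hypothetical helper for illustration, not part of dataiku/dataikuapi.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError("No such file: %s" % path)
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("File is empty: %s" % path)
    # Confirm that this process can actually open and read the file.
    with open(path, "rb") as f:
        f.read(1)
    return os.path.basename(path), size


# Usage with the objects from the sample above (left commented out so the
# helper can be run without a DSS instance):
# name, size = check_upload_file(upload_file_path)
# with open(upload_file_path, "rb") as f:
#     dataset.uploaded_add_file(f, name)  # assumption: base name instead of full path
```

If the helper raises, the problem is on the local side (wrong path, empty file, or permissions). If it passes and the upload still fails, the full code sample and DSS version would be the next thing to look at.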