Upload files to dataset using python api - MultipartFile parameter 'file' is not present

bcalcutt Registered Posts: 3 ✭✭✭
edited July 16 in Using Dataiku

Hello, I'm trying to upload an Excel file as a dataset using the Python API. I'm receiving an error that I don't understand.

For reference, I am following the example in the API docs here: https://doc.dataiku.com/dss/latest/python-api/datasets-other.html#uploaded-datasets-programmatic-creation-and-upload

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dssclient.py in _perform_http(self, method, path, params, body, stream, files, raw_body)
   1020                     stream = stream)
-> 1021             http_res.raise_for_status()
   1022             return http_res

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\requests\models.py in raise_for_status(self)
    939         if http_error_msg:
--> 940             raise HTTPError(http_error_msg, response=self)
    941 

HTTPError: 400 Client Error: Bad Request for url: https://dataiku.ae.ge.com//dip/publicapi/projects/ADDITIVESPARES/datasets/M2_G02_CSPL/uploaded/files

During handling of the above exception, another exception occurred:

DataikuException                          Traceback (most recent call last)
<ipython-input-27-13a19fcf17d4> in <module>
      2 
      3 with open(upload_file_path, "rb") as f:
----> 4         dataset.uploaded_add_file(f, upload_file_path)
      5 
      6 # At this point, the dataset object has been initialized, but the format is still unknown, and the

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dss\dataset.py in uploaded_add_file(self, fp, filename)
    348         :param str filename: The filename for the file to upload
    349         """
--> 350         self.client._perform_empty("POST", "/projects/%s/datasets/%s/uploaded/files" % (self.project_key, self.dataset_name),
    351          files={"file":(filename, fp)})
    352 

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dssclient.py in _perform_empty(self, method, path, params, body, files, raw_body)
   1029 
   1030     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):
-> 1031         self._perform_http(method, path, params=params, body=body, files=files, stream=False, raw_body=raw_body)
   1032 
   1033     def _perform_text(self, method, path, params=None, body=None,files=None, raw_body=None):

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dssclient.py in _perform_http(self, method, path, params, body, stream, files, raw_body)
   1026             except ValueError:
   1027                 ex = {"message": http_res.text}
-> 1028             raise DataikuException("%s: %s" % (ex.get("errorType", "Unknown error"), ex.get("message", "No message")))
   1029 
   1030     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):

DataikuException: org.springframework.web.bind.MissingServletRequestParameterException: Required MultipartFile parameter 'file' is not present

dataset = proj.create_upload_dataset("M2_G02_CSPL") # you can add connection= for the target connection

with open(upload_file_path, "rb") as f:
    dataset.uploaded_add_file(f, upload_file_path)

# At this point, the dataset object has been initialized, but the format is still unknown, and the
# schema is empty, so the dataset is not yet usable

# We run autodetection
settings = dataset.autodetect_settings()
# settings is now an object containing the "suggested" new dataset settings, including the detected format
# and completed schema
# We can just save the new settings in order to "accept the suggestion"
settings.save()

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
    edited July 17

    Hi,

    Could you please share your full code sample and DSS version?

    I tested the sample code below and it worked fine on DSS 9.0.4. Based on the error, it appears the client is unable to read the file you defined at upload_file_path.

    import dataiku
    from dataiku import pandasutils as pdu
    import pandas as pd
    
    client = dataiku.api_client()
    proj = client.get_project(dataiku.get_custom_variables()["projectKey"])
    upload_file_path= "/Users/myuser/Downloads/SAMPLE.xlsx"
    
    dataset = proj.create_upload_dataset("SAMPLE_OTHER") # you can add connection= for the target connection
    
    with open(upload_file_path, "rb") as f:
        dataset.uploaded_add_file(f, upload_file_path)
    
    # At this point, the dataset object has been initialized, but the format is still unknown, and the
    # schema is empty, so the dataset is not yet usable
    
    # We run autodetection
    settings = dataset.autodetect_settings()
    # settings is now an object containing the "suggested" new dataset settings, including the detected format
    # and completed schema
    # We can just save the new settings in order to "accept the suggestion"
    settings.save()
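
Since the error suggests the file at upload_file_path may not be readable, one cheap thing to rule out before calling uploaded_add_file is that the path does not point at a real, non-empty file. The following is only a sketch using the Python standard library: check_upload_file is a hypothetical helper name, not part of the Dataiku API, and it does not by itself confirm the cause of the 400 error.

```python
import os


def check_upload_file(path):
    """Return (filename, size_in_bytes) for a readable, non-empty file.

    Hypothetical pre-flight helper, not part of dataikuapi.
    Raises FileNotFoundError or ValueError when the file is unusable.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError("No regular file at %r" % path)

    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("File at %r is empty" % path)

    # Opening and reading one byte surfaces permission errors early.
    with open(path, "rb") as f:
        f.read(1)

    # The bare file name, without any directory component.
    return os.path.basename(path), size
```

If the check passes, the returned file name could be passed as the second argument to uploaded_add_file instead of the full local path. Whether that makes a difference here is an assumption that has not been verified against this DSS instance.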
