Upload files to dataset using python api - MultipartFile parameter 'file' is not present

Registered Posts: 3 ✭✭✭
edited July 2024 in Using Dataiku

Hello, I'm trying to upload an Excel file as a dataset using the Python API, and I'm receiving an error that I don't understand.

For reference, I am following the example in the API docs here: https://doc.dataiku.com/dss/latest/python-api/datasets-other.html#uploaded-datasets-programmatic-creation-and-upload

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dssclient.py in _perform_http(self, method, path, params, body, stream, files, raw_body)
   1020                     stream = stream)
-> 1021             http_res.raise_for_status()
   1022             return http_res

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\requests\models.py in raise_for_status(self)
    939         if http_error_msg:
--> 940             raise HTTPError(http_error_msg, response=self)
    941 

HTTPError: 400 Client Error: Bad Request for url: https://dataiku.ae.ge.com//dip/publicapi/projects/ADDITIVESPARES/datasets/M2_G02_CSPL/uploaded/files

During handling of the above exception, another exception occurred:

DataikuException                          Traceback (most recent call last)
<ipython-input-27-13a19fcf17d4> in <module>
      2 
      3 with open(upload_file_path, "rb") as f:
----> 4         dataset.uploaded_add_file(f, upload_file_path)
      5 
      6 # At this point, the dataset object has been initialized, but the format is still unknown, and the

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dss\dataset.py in uploaded_add_file(self, fp, filename)
    348         :param str filename: The filename for the file to upload
    349         """
--> 350         self.client._perform_empty("POST", "/projects/%s/datasets/%s/uploaded/files" % (self.project_key, self.dataset_name),
    351          files={"file":(filename, fp)})
    352 

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dssclient.py in _perform_empty(self, method, path, params, body, files, raw_body)
   1029 
   1030     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):
-> 1031         self._perform_http(method, path, params=params, body=body, files=files, stream=False, raw_body=raw_body)
   1032 
   1033     def _perform_text(self, method, path, params=None, body=None,files=None, raw_body=None):

c:\users\212408256\appdata\local\programs\python\python39\lib\site-packages\dataikuapi\dssclient.py in _perform_http(self, method, path, params, body, stream, files, raw_body)
   1026             except ValueError:
   1027                 ex = {"message": http_res.text}
-> 1028             raise DataikuException("%s: %s" % (ex.get("errorType", "Unknown error"), ex.get("message", "No message")))
   1029 
   1030     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):

DataikuException: org.springframework.web.bind.MissingServletRequestParameterException: Required MultipartFile parameter 'file' is not present

dataset = proj.create_upload_dataset("M2_G02_CSPL") # you can add connection= for the target connection

with open(upload_file_path, "rb") as f:
    dataset.uploaded_add_file(f, upload_file_path)

# At this point, the dataset object has been initialized, but the format is still unknown, and the
# schema is empty, so the dataset is not yet usable

# We run autodetection
settings = dataset.autodetect_settings()
# settings is now an object containing the "suggested" new dataset settings, including the detected format
# and completed schema
# We can just save the new settings in order to "accept the suggestion"
settings.save()

Answers

  • Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,270 Dataiker
    edited July 2024

    Hi,

    Could you please share your full code sample and your DSS version?

    I tested the sample code below and it worked fine on DSS 9.0.4. Based on the error, it appears that the file defined at upload_file_path could not be read.

    import dataiku
    from dataiku import pandasutils as pdu
    import pandas as pd
    
    client = dataiku.api_client()
    proj = client.get_project(dataiku.get_custom_variables()["projectKey"])
    upload_file_path= "/Users/myuser/Downloads/SAMPLE.xlsx"
    
    dataset = proj.create_upload_dataset("SAMPLE_OTHER") # you can add connection= for the target connection
    
    with open(upload_file_path, "rb") as f:
        dataset.uploaded_add_file(f, upload_file_path)
    
    # At this point, the dataset object has been initialized, but the format is still unknown, and the
    # schema is empty, so the dataset is not yet usable
    
    # We run autodetection
    settings = dataset.autodetect_settings()
    # settings is now an object containing the "suggested" new dataset settings, including the detected format
    # and completed schema
    # We can just save the new settings in order to "accept the suggestion"
    settings.save()
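
To follow up on the point about the file at upload_file_path: one way to rule out a local file problem before calling uploaded_add_file is to check the path first. The snippet below is only a sketch. check_upload_file is a hypothetical helper, not part of dataikuapi, and passing the base filename instead of the full path is an assumption that may or may not matter for this error.

```python
import os


def check_upload_file(path):
    """Hypothetical helper (not part of dataikuapi).

    Returns the base filename if `path` points to a readable, non-empty file,
    and raises an exception otherwise.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError("No such file: %r" % path)
    if os.path.getsize(path) == 0:
        raise ValueError("File is empty: %r" % path)
    # Opening in binary mode and reading one byte confirms read permission.
    with open(path, "rb") as f:
        f.read(1)
    return os.path.basename(path)


# Intended use with the code above (assumes `dataset` and `upload_file_path`
# are defined as in the sample):
#
#   filename = check_upload_file(upload_file_path)
#   with open(upload_file_path, "rb") as f:
#       dataset.uploaded_add_file(f, filename)
```

If this check passes and the upload still fails with the same 400 error, the local file is probably not the cause, and the DSS version and the full client setup become the more useful things to share.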
