Multiple S3 file read

sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron

Hi Team,

How can I read multiple S3 files from a single connection by specifying their path?

Thanks in Advance


  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker


    The default for S3 datasets in DSS is to point to an S3 "folder" (i.e. a prefix for blob object paths), and DSS considers all the blobs in that folder to belong to the dataset. If you want to restrict the dataset to a few paths, you can use "show advanced options" in the dataset's Settings > Connection tab and define rules for fine-grained control over which blobs constitute the dataset.

  • sj0071992


    I selected the S3 connection and gave the path where all my files are stored, but I am not able to read the files. I am not even able to list the files under that path.

    Below is the error:

    Did not find any non-empty file

  • sj0071992

    Hi Team,

    Could you please help here.

  • fchataigner2


    The error message implies that there is no blob in your S3 bucket with the prefix you gave. You should browse to the path to make sure the value you enter for the path points to some files.
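One way to double-check this outside DSS is to list the objects under the prefix directly. The snippet below is only a minimal sketch, assuming the boto3 library is available; `non_empty_keys` is a hypothetical helper, and the bucket and prefix in the usage comment are placeholders. It skips zero-byte objects (for example "folder" marker objects), on the assumption that those would not count as non-empty files.

```python
def non_empty_keys(s3_client, bucket, prefix):
    """Return the keys under `prefix` whose size is greater than zero."""
    keys = []
    # list_objects_v2 returns at most 1000 objects per call, so paginate.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from the response when nothing matches.
        for obj in page.get("Contents", []):
            if obj["Size"] > 0:
                keys.append(obj["Key"])
    return keys


# Usage (placeholder names, requires AWS credentials):
#   import boto3
#   print(non_empty_keys(boto3.client("s3"), "my-bucket", "path/to/files/"))
```

If this returns an empty list, the prefix (including any leading or trailing slash) probably does not match the object keys.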

  • sj0071992


    The path I provided shows me the files when I browse it, but when I try "List Files" it shows me the error:

    Did not find any non-empty file

    Do we have to change any setting in the connection?

  • fchataigner2

    If the path points to a place with files, then it's the inclusion/exclusion rules that are incorrect. Can you share a screenshot of the setup in that screen, and an example of a full blob path you want selected?

  • sj0071992


    Below you can see that I am able to browse objects but not able to read or list files, even though there is no rule in the advanced section.


  • fchataigner2

    That behavior is indeed unexpected. If you see files when browsing, hitting "list files" should show them.

    You should generate an instance diagnostic in Administration > Maintenance > Diagnostic tool, and open a ticket with the zip (sent separately if it is too big, i.e. over 15MB).

  • sj0071992


    I am now able to create a managed folder and see all my files.

    But how can I now read all the files in my folder using a Python or other recipe?

    Please note:

    1. I have subfolders in my managed folder

    2. The file format is gz

    Thanks in Advance
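
For the last question, one possible approach in a Python recipe or notebook is sketched below. It assumes the managed folder API of the `dataiku` package (`dataiku.Folder`, `list_paths_in_partition`, `get_download_stream`); `read_gz_files` and the folder name `my_folder` are hypothetical. `list_paths_in_partition()` should return paths from subfolders as well, so no explicit recursion is written here. Each file is decompressed fully in memory, which may not suit very large files.

```python
import gzip


def read_gz_files(folder):
    """Yield (path, decompressed_bytes) for every .gz file in a managed folder."""
    for path in folder.list_paths_in_partition():
        if not path.endswith(".gz"):
            continue  # ignore anything that is not gzip-compressed
        with folder.get_download_stream(path) as stream:
            compressed = stream.read()
        yield path, gzip.decompress(compressed)


# Usage inside DSS ("my_folder" is a placeholder name):
#   import dataiku
#   folder = dataiku.Folder("my_folder")
#   for path, data in read_gz_files(folder):
#       print(path, len(data))
```

If the files are gzipped CSVs, the decompressed bytes could then be wrapped in `io.BytesIO` and passed to a CSV reader.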
