Hi Team,
How can I read multiple S3 files from a single connection by specifying the path?
Thanks in advance.
Hi,
The default for S3 datasets in DSS is to point to an S3 "folder" (i.e. a prefix for blob object paths), and DSS will consider all the blobs in that folder to belong to the dataset. If you want to restrict the dataset to a few paths, you can use "show advanced options" in the dataset's Settings > Connection tab and define rules for fine-grained control over which blobs constitute the dataset.
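To illustrate the idea of "a prefix plus inclusion/exclusion rules", here is a conceptual sketch in plain Python. It is not DSS's actual implementation: the function name `select_blobs`, the glob-style rules, and the sample keys are all made up for the example.

```python
from fnmatch import fnmatchcase


def select_blobs(keys, prefix, include=("*",), exclude=()):
    """Return the keys under `prefix` that match an include glob and no exclude glob.

    Hypothetical helper for illustration only; globs are applied to the part
    of the key that follows the prefix.
    """
    selected = []
    for key in keys:
        # An S3 "folder" is just a key prefix: skip blobs outside of it.
        if not key.startswith(prefix):
            continue
        relative = key[len(prefix):]
        # Keep the blob only if at least one inclusion rule matches...
        if not any(fnmatchcase(relative, pattern) for pattern in include):
            continue
        # ...and no exclusion rule matches.
        if any(fnmatchcase(relative, pattern) for pattern in exclude):
            continue
        selected.append(key)
    return selected


# Made-up object keys, as they might appear in a bucket.
keys = [
    "landing/2023/a.csv.gz",
    "landing/2023/b.csv.gz",
    "landing/2023/_SUCCESS",
    "archive/2022/c.csv.gz",
]

print(select_blobs(keys, "landing/", include=("*.gz",)))
```

With only the prefix and the default rules, every blob under `landing/` would be selected. The `*.gz` inclusion rule narrows that down to the two compressed files.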
Hi,
I selected the S3 connection and entered the path where all my files are stored, but I am not able to read the files. I am not even able to list the files under that path.
Below is the error:
Did not find any non-empty file
Hi Team,
Could you please help here?
Hi,
The error message implies that there is no blob in your S3 bucket with the prefix you gave. You should browse to the path to make sure the value you enter points to some files.
Hi,
The path I provided shows me the files when I browse it, but when I try "List Files" I get the error:
Did not find any non-empty file
Do we have to change any setting in the connection?
If the path points to a place with files, then it's the inclusion/exclusion rules that are incorrect. Can you post a screenshot of the setup in that screen, and an example of a full blob path you want selected?
Hi,
As you can see below, I am able to browse objects but not able to read or list files, even though there is no rule in the advanced section.
That behavior is indeed unexpected. If you see files when browsing, hitting "list files" should show them.
You should generate an instance diagnostic in Administration > Maintenance > Diagnostic tool, and open a ticket on support.dataiku.com with the zip (send it via dl.dataiku.com if it is too big, i.e. over 15 MB).
Hi,
I am now able to create a managed folder and see all my files.
But how can I now read all the files in my folder using Python or another recipe?
Please note:
1. I have subfolders in my managed folder
2. The file format is gz
Thanks in advance
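Hi,
one possible approach in a Python recipe is sketched below. It is a minimal example under several assumptions: the helper name `read_gz_files` and the folder name "my_folder" are made up, the files are assumed to be gzip-compressed UTF-8 text, and each file is loaded fully into memory. It relies on `list_paths_in_partition()` and `get_download_stream()` from the `dataiku.Folder` API; the listing returns paths relative to the folder root, and should include files in subfolders.

```python
import gzip


def read_gz_files(folder, suffix=".gz"):
    """Yield (path, text) for every gzip file found in a managed folder.

    `folder` is expected to behave like a dataiku.Folder: it must provide
    list_paths_in_partition() and get_download_stream(path).
    Hypothetical helper, assuming UTF-8 text content.
    """
    for path in folder.list_paths_in_partition():
        # Skip anything that is not a gzip file (markers, logs, etc.).
        if not path.endswith(suffix):
            continue
        # Download the raw bytes, then decompress them in memory.
        with folder.get_download_stream(path) as stream:
            raw = stream.read()
        yield path, gzip.decompress(raw).decode("utf-8")


# Usage inside DSS (assumption: the managed folder is named "my_folder"):
#
#   import dataiku
#   folder = dataiku.Folder("my_folder")
#   for path, text in read_gz_files(folder):
#       print(path, len(text))
```

For large files it would be better to wrap the stream in `gzip.GzipFile(fileobj=stream)` and read it incrementally rather than calling `stream.read()` on the whole object.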