Importing a hierarchical file system from an EC2 connection

UserBird
Dataiker
I have an EC2 connection to a hierarchical file system that contains many sublevels and more than 100 bottom-level files. I would like to import all of these files into a DSS project. My preferences, in order: keep the hierarchy; if that's not possible, name each file by its complete path (e.g. root/folder/file1 becomes root_folder_file1); failing that, just mass-import all of the bottom-level files (file1, file2, file3...). Functionally, I want to click on the root folder and hit "import all".
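To make the path-based naming concrete, here is a rough Python sketch of the mapping I have in mind. The function name `dataset_name_for` and the sample paths are hypothetical, and the sketch only computes names; it does not call any DSS API or create datasets.

```python
import re
from pathlib import PurePosixPath


def dataset_name_for(path: str) -> str:
    """Flatten a slash-separated file path into a single underscore-joined name."""
    p = PurePosixPath(path)
    # Keep every folder component, and use the file name without its extension.
    parts = [part for part in p.parts[:-1] if part != "/"] + [p.stem]
    name = "_".join(parts)
    # Replace anything that is not a letter, digit, or underscore
    # (assumption: the target only accepts such characters in names).
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


# Hypothetical sample paths, standing in for the real hierarchy.
paths = ["root/folder/file1", "root/folder/sub/file2.csv", "root/file3"]
names = {p: dataset_name_for(p) for p in paths}
```

Here `names` maps each original path to its flattened name, so root/folder/file1 maps to root_folder_file1 as in the example above.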

I do not see a "mass dataset creation" option in the Connections page of the Administrator panel. When I try to import a folder, it appears to stack every file that matches the schema of the first file and ignore the rest. For example, if a folder contains file1 and file2 with the same column headers, they are combined into a single stacked dataset (file1+2); if file1 and file2 have different schemas, only file1 is imported and file2 is ignored. Whatever the actual behavior is, it definitely doesn't import everything and keep the hierarchy.

Is it possible to do this?

Thanks so much!