Load multiple datasets at the same time
 
            
                
                    ele_f                
                
            
                        
            
Hi,
I need to load multiple datasets from my local PC into a DSS project.
Is there a way to load all the files together and assign each of them a different dataset name?
At the moment, if multiple files are loaded into DSS (and have the same schema), they are unioned into a single dataset.
This functionality would really streamline the data import process, if available in DSS.
Thanks
            Answers
- 
I agree this would be a useful feature! Most people solve it with Python. You only need a few lines of code: essentially a for loop that reads the files one by one into a pd.DataFrame, adds a column with the file name, and appends that DataFrame to the output DataFrame, which you then save back to Dataiku.

    import pandas as pd

    # columns, useful_station and path are defined elsewhere
    out = pd.DataFrame(columns=columns)
    for i, data in enumerate(useful_station):
        print(i, end=" ")
        try:
            with open(path + data) as f:
                lines = f.readlines()
            # split each whitespace-delimited line into its fields
            a = [l.replace("\n", "").split() for l in lines]
            d = pd.DataFrame(a, columns=columns)
            d[["Month", "Day", "Hour"]] = d[["Month", "Day", "Hour"]].astype(int).astype(str)
            d["key"] = data + "-" + d["Month"] + "-" + d["Day"] + "-" + d["Hour"]  # new column with file name!
            if i:
                out = pd.concat((out, d))
            else:
                out = d
        except IOError:
            # the file could not be read: flag it and continue with the next one
            print()
            print("********")

The code sample above was built for something similar; you can use it as inspiration, but you cannot run it as is.
- 
            @matthias.funke, after creating the DataFrame with the file names in a new column, how would one create multiple datasets from it?
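
One possible way to approach the splitting step, sketched with plain pandas: group the combined DataFrame by the file-name column and keep one DataFrame per file. The helper `split_by_source`, the column name `source_file`, and the sample data are hypothetical and only serve the illustration. The thread does not say how each piece should be saved back to DSS, so the `dataiku.Dataset(...).write_with_schema(...)` call appears only as a comment and assumes the target datasets already exist in the project.

```python
import pandas as pd


def split_by_source(combined, source_col="source_file"):
    """Return {source name: DataFrame}, one entry per distinct value of source_col."""
    return {
        str(name): group.drop(columns=source_col).reset_index(drop=True)
        for name, group in combined.groupby(source_col, sort=False)
    }


# Hypothetical stand-in for the combined DataFrame built by the loop in the answer.
combined = pd.DataFrame({
    "source_file": ["a.csv", "a.csv", "b.csv"],
    "value": [1, 2, 3],
})

frames = split_by_source(combined)
for name, frame in frames.items():
    print(name, len(frame))
    # Assumption: inside DSS, each frame could then be written to its own,
    # already existing dataset, for example:
    # dataiku.Dataset(name).write_with_schema(frame)
```

Keeping the pieces in a dictionary keyed by file name makes it straightforward to derive a dataset name from each key before writing.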

