I agree this would be a useful feature! Most people solve it using Python. You only need a few lines of code: essentially a for loop that reads the files one by one into a pd.DataFrame(), adds a column with the file name, and appends that DataFrame to the output DataFrame, which you then save back to Dataiku.
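import pandas as pd

# columns, useful_station and path come from the original project and are not defined here
out = pd.DataFrame(columns=columns)
for i, data in enumerate(useful_station):
    print(i, end=" ")
    try:
        with open(path + data) as f:
            lines = f.readlines()
        # split each whitespace-separated line into a list of fields
        a = [l.replace("\n", "").split() for l in lines]
        d = pd.DataFrame(a, columns=columns)
        d[["Month", "Day", "Hour"]] = d[["Month", "Day", "Hour"]].astype(int).astype(str)
        d["key"] = data + "-" + d["Month"] + "-" + d["Day"] + "-" + d["Hour"]  # new column with file name!
        if i:
            out = pd.concat((out, d))
        else:
            out = d
    except IOError:
        print("********")  # the file could not be opened; skip it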
The code sample above was built for something similar, so you can use it as inspiration, but it won't run as-is.
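For a more generic starting point, here is a minimal sketch of the same idea: loop over the files, read each into a DataFrame, add a column with the file name, and stack everything at the end. The function name `combine_files`, the `source_file` column, and the assumption that the files are whitespace-delimited with no header row are mine, not from the code above, so adjust them to your data.

```python
import os

import pandas as pd


def combine_files(folder, file_names, columns):
    """Stack several whitespace-delimited files, tagging each row with its source file.

    Hypothetical helper for illustration; the files are assumed to have no header row.
    """
    frames = []
    for name in file_names:
        file_path = os.path.join(folder, name)
        try:
            # sep=r"\s+" splits on any run of whitespace
            d = pd.read_csv(file_path, sep=r"\s+", header=None, names=columns)
        except OSError:
            # missing or unreadable file: report it and move on
            print("could not read", file_path)
            continue
        d["source_file"] = name  # new column with the file name
        frames.append(d)
    if not frames:
        # nothing could be read: return an empty frame with the expected columns
        return pd.DataFrame(columns=list(columns) + ["source_file"])
    # one concat at the end avoids re-copying the output on every iteration
    return pd.concat(frames, ignore_index=True)
```

You would call it with something like `combine_files("/path/to/files", ["a.txt", "b.txt"], ["Month", "Day", "Hour"])`, where the folder, file names, and column names are placeholders. In a Dataiku Python recipe the result can then be written to the output dataset, for example with `dataiku.Dataset("my_output").write_with_schema(out)`, where `my_output` stands for your own dataset name.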