Is there a way in Python to loop over and read all the files in a zip file?
I'm trying to create a loop in Dataiku, using a loop function, to read all the files in the same zip. Is there a way to read the files without writing their names statically?
How would that work? Maybe by assigning each file to a different variable? But would I have to write out the file names that way, or will the loop assign each file it reads to a variable?
Thank you, any help or comment is greatly appreciated.
Answers
tgb417
Welcome to the Dataiku community.
There are some built-in ways to deal with zipped files as if they were unzipped data sources. This is a really cool feature that might help in your case. This talks about the subject.
In a Python recipe, Dataiku can give you all of the files in a managed folder or directory, so you do not have to hard-code their names. You might find these conversations useful for what you are trying to do.
Here is some documentation on working with managed folders:
https://doc.dataiku.com/dss/latest/python-api/managed_folders.html?highlight=managed%20folder#module-dataikuapi.dss.managedfolder
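To make the "no static names" part concrete, here is a minimal sketch that uses only Python's standard `zipfile` module. The helper name `read_all_members` and the sample file names are made up for illustration, and the demo builds a small zip in memory rather than using a real Dataiku folder. Instead of one variable per file, the loop fills a dictionary keyed by whatever names it finds in the archive.

```python
import io
import zipfile


def read_all_members(zip_source):
    """Return {member_name: raw_bytes} for every file in a zip archive.

    zip_source may be a file path or a binary file-like object.
    (Hypothetical helper, not part of any Dataiku API.)
    """
    contents = {}
    with zipfile.ZipFile(zip_source) as archive:
        # infolist() discovers the members, so no name is hard-coded.
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries
            contents[info.filename] = archive.read(info.filename)
    return contents


# Demo only: build a small zip in memory with two made-up files.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.csv", "id,value\n1,10\n")
    archive.writestr("b.csv", "id,value\n2,20\n")
buffer.seek(0)

files = read_all_members(buffer)
for name, data in files.items():
    print(name, len(data), "bytes")
```

A dictionary is usually easier to work with than separate variables here, because the number of files and their names are only known once the archive is opened.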
Let us know if this is of help and how you are getting along with your project.
So the first step would be to import the file as a zip (simple import), and the second step would be to unzip it in Python and assign a variable to each file. Would that work?
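A rough sketch of that two-step idea, under some assumptions: the zip sits in a managed folder, its members are UTF-8 CSV files, and the function name `tables_from_zip_stream`, the folder name, and the archive path are all hypothetical. The function only needs a binary stream, so it can be tried outside Dataiku; the commented lines show how the stream might be obtained inside a Python recipe with the managed folder API.

```python
import csv
import io
import zipfile


def tables_from_zip_stream(stream):
    """Parse every .csv member of a zip read from a binary stream.

    Returns {member_name: list_of_row_dicts}. Hypothetical helper.
    """
    # Download streams may not be seekable, so buffer the bytes first.
    raw = io.BytesIO(stream.read())
    tables = {}
    with zipfile.ZipFile(raw) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # ignore non-CSV members and directory entries
            with archive.open(name) as member:
                # Assumes UTF-8 text; adjust the encoding if needed.
                text = io.TextIOWrapper(member, encoding="utf-8", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


# Inside a Dataiku Python recipe, the stream could come from a managed
# folder (folder name and path below are placeholders, not real values):
#
#   import dataiku
#   folder = dataiku.Folder("my_zip_folder")
#   with folder.get_download_stream("/archive.zip") as stream:
#       tables = tables_from_zip_stream(stream)
```

Each entry in `tables` can then be processed in a second loop, for example `for name, rows in tables.items(): ...`, so no file name ever has to be typed by hand.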