Hello, I have a partitioned folder and I need a Python recipe to create the associated partitioned dataset. In the notebook, I can use the following code:
import json

import dataiku
import pandas as pd

files = dataiku.Folder("xxxx")
files_info = files.get_info()

# file paths
paths = files.list_paths_in_partition()

df_measure = pd.DataFrame()
for itemName in paths:
    with files.get_download_stream(itemName) as j:
        contents = j.read()
        parsed_json = json.loads(contents)

# Write recipe outputs
measurement = dataiku.Dataset("Measurement")
measurement.write_with_schema(df_measure)
But back in the recipe, the partitions are managed by DSS, so I need to remove files.list_paths_in_partition() and the "for itemName in paths" loop.
How can I load the right file with files.get_download_stream(itemName)?
Thanks a lot
In the actions of your partitioned folder, you can pick the "create dataset" one. This will create a dataset which is merely a view of the files in the folder. You can then activate partitioning on this dataset.
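If the files still have to be parsed in a Python recipe, one option is to keep the loop but restrict it to the partition being built. The following is only a minimal sketch under assumptions, not a confirmed DSS recipe: load_partition_json is a hypothetical helper, and FakeFolder is a hypothetical in-memory stand-in for dataiku.Folder so the example runs outside DSS. The only folder methods it relies on are the two already used in the question, list_paths_in_partition and get_download_stream.

```python
import io
import json


def load_partition_json(folder, partition_id):
    """Parse every JSON file of one partition of a folder-like object.

    `folder` only needs the two methods used in the question:
    list_paths_in_partition(partition) and get_download_stream(path).
    """
    records = []
    for path in folder.list_paths_in_partition(partition_id):
        with folder.get_download_stream(path) as stream:
            records.append(json.loads(stream.read()))
    return records


class FakeFolder:
    """Hypothetical stand-in for dataiku.Folder (assumption, for local testing only)."""

    def __init__(self, files_by_partition):
        # Maps partition id -> {path: raw file bytes}
        self._files = files_by_partition

    def list_paths_in_partition(self, partition=""):
        return sorted(self._files.get(partition, {}))

    def get_download_stream(self, path):
        for files in self._files.values():
            if path in files:
                return io.BytesIO(files[path])
        raise FileNotFoundError(path)


demo = FakeFolder({
    "2020-01-01": {"/2020-01-01/a.json": b'{"value": 1}'},
    "2020-01-02": {"/2020-01-02/b.json": b'{"value": 2}'},
})
print(load_partition_json(demo, "2020-01-01"))
```

In a real recipe you would pass dataiku.Folder("xxxx") instead of FakeFolder, and take the partition identifier from the flow variables that DSS provides to the recipe. I believe these are exposed through dataiku.dku_flow_variables, with a key built from the partitioning dimension name, but please check the exact key for your flow. The returned records can then be turned into a DataFrame with pd.DataFrame(records) and written with write_with_schema as in the question.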