Hi all,
I have the following case:
3 different files in SharePoint whose filenames consist of a title and a timestamp. I have created the connection with SharePoint and created the datasets, but I want to ask:
1) Is there a way for DSS to ignore the timestamp and create the dataset based only on the title?
2) Can I trigger a scenario to update the dataset in DSS when the file on the SharePoint site gets updated?
Thank you!
Hi @e_pap ,
I am not sure I understand what you mean in (1). Could you perhaps provide a screenshot of what you want to include?
2) You can use the file modified trigger https://doc.dataiku.com/dss/latest/scenarios/triggers.html#dataset-modification-triggers on SharePoint Files datasets. The file modified trigger is not supported on SharePoint Lists datasets.
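If the built-in trigger ever fails to pick up changes, a custom Python trigger can poll the folder listing instead. Below is a minimal sketch, assuming a managed folder named `sharepoint_files` and a project variable used to remember the last listing (both names are hypothetical):

```python
# Custom Python trigger: fire when the SharePoint folder listing changes.
# Sketch only -- the folder name "sharepoint_files" is an assumption.
import hashlib

import dataiku
from dataiku.scenario import Trigger

folder = dataiku.Folder("sharepoint_files")

# Fingerprint the current folder contents. Note: this hashes paths only,
# so an in-place overwrite that keeps the same filename won't change it.
paths = sorted(folder.list_paths_in_partition())
fingerprint = hashlib.sha256("\n".join(paths).encode("utf-8")).hexdigest()

# Remember the last fingerprint in a project variable between polls.
project = dataiku.api_client().get_default_project()
variables = project.get_variables()
last_fingerprint = variables["standard"].get("sharepoint_folder_fingerprint")

if fingerprint != last_fingerprint:
    variables["standard"]["sharepoint_folder_fingerprint"] = fingerprint
    project.set_variables(variables)
    Trigger().fire()
```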
Hi @AlexT ,
Thank you for your response!
To clarify my first question: I have filenames where the "Sample" part is always the same but the timestamp changes. Is there a way to connect to the files in SharePoint with a regex or something similar?
Regarding the second question, I created a scenario with the trigger you suggested,
but what I observed is that I had to go to the dataset I created from the SharePoint file and update the path in the Settings tab to trigger the scenario. When I just dropped an updated file on the SharePoint site, the scenario didn't seem to recognise the change. Maybe my configuration is not correct, or I missed a step.
1) You can add a Folder pointing to SharePoint where the files will be present.
Use a Files in Folder dataset, where you can select the files with a glob/regex pattern and ignore the timestamp.
If this doesn't suffice, you could use a Python recipe to read from the folder (see the sketch below).
Another option is to use a partitioned folder/dataset and sync the latest available partition, if you just need the last file.
https://doc.dataiku.com/dss/latest/partitions/index.html
2) If the dataset points to a fixed path, newly added files won't be detected. Use a folder instead of a dataset; that should detect any changes in the folder's path.
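For the Python recipe option, here is a minimal sketch, assuming a managed folder named `sharepoint_files`, CSV files named `Sample_<timestamp>.csv`, and an output dataset named `sample_latest` (all three names are assumptions; adjust to your setup):

```python
# Python recipe: read the latest "Sample_<timestamp>.csv" from a managed
# folder, ignoring the timestamp part of the filename.
import re

import dataiku
import pandas as pd

folder = dataiku.Folder("sharepoint_files")

# Keep only files whose name matches the title, whatever the timestamp.
pattern = re.compile(r"Sample_\d+.*\.csv$")  # e.g. Sample_20240131.csv
paths = [p for p in folder.list_paths_in_partition() if pattern.search(p)]

# Filenames sort chronologically because the timestamp is in the name,
# so the last one is the most recent file.
latest = sorted(paths)[-1]

with folder.get_download_stream(latest) as stream:
    df = pd.read_csv(stream)

# Write the result to the recipe's output dataset.
dataiku.Dataset("sample_latest").write_with_schema(df)
```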
Thanks
It worked!
Thank you @AlexT