Re: Can you please help with the documentation on spark vs dss engine
There is no magic logic to decide which compute engine will be better for each recipe. Generally speaking, Spark should be better for larger datasets, but it will depend on many other factors. If using…
Re: How to setup Athena connection using s3 connection
There are no screenshots attached to your post. Have you reviewed the Dataiku Athena documentation? https://doc.dataiku.com/dss/latest/connecting/sql/athena.html
Re: How to know which data are used in a flow
In v12 there isn't much you can do. But v13 has a new Column-level Data Lineage feature, which is another good reason to upgrade: https://doc.dataiku.com/dss/latest/data-catalog/data-lineage/index.html
Re: Seeking Optimization Tips for DSS Flow and Spark Configuration
You have given us your requirement (to reduce the flow execution time from 20 minutes to 5 minutes) but you haven’t given us any additional information to go on. Please post a picture of your flow, give detail…
Re: jupyter-run data directory
https://community.dataiku.com/discussion/comment/45475#Comment_45475 They can't. Only admins can. In v13.3.0 there is a new API to clear Jupyter outputs, likely the most common cause of huge not…