Hi, we need to understand what happens when an instance is built from an application. In our case, building an instance takes a long time, and during the build it shows "Importing/exporting dataset" while reading many GB of data.
Our second question, which may be related to the above, is when we need to add a dataset to the Included content of an Application. We have a pipeline that is the base for our application. On some of the datasets in this pipeline we have created charts and published them to a dashboard, and now we want to see this dashboard in the application. Do we need to add these datasets to Included content before we expose the dashboard? Basically, what is the significance of adding a dataset to Included content? Is instance creation taking a long time because the included datasets hold many GB of data?
Lastly, is there any option in the application to add a section where the user can click a link and see the underlying dataset of the Dataiku pipeline? I can find a download dataset option but cannot find "view Dataset". There is one option that says "select dataset file". Is this what we should use, and with what tile behavior for our purpose?
Hi @mayur_garg, while you wait for a more detailed and complete response, I wanted to point out a few resources available to learn more about Dataiku Applications:
I hope this helps!
Applications allow the same data pipeline to be executed in parallel with different inputs, but this means that the flow is duplicated for every instance. I suspect that in your case there is a lot of data to instantiate, so each instance takes a long time to build and uses a lot of storage space.
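To make the cost concrete, here is a rough back-of-the-envelope sketch in Python. It is not Dataiku API code: the dataset names and sizes are made up, and it assumes that only the datasets listed in Included content have their data copied into each new instance.

```python
# Hypothetical dataset sizes in MB for the template project's flow.
# These names and numbers are illustrative only.
dataset_sizes_mb = {
    "raw_events": 40000,
    "customers": 2500,
    "report_summary": 100,
}


def per_instance_copy_mb(sizes, included):
    """MB copied when one instance is created, assuming only the
    datasets named in `included` have their data copied."""
    return sum(sizes[name] for name in included)


def total_copy_mb(sizes, included, n_instances):
    """MB copied across `n_instances` instances of the same application."""
    return per_instance_copy_mb(sizes, included) * n_instances


# Including every dataset versus only the small one the dashboard needs.
everything = list(dataset_sizes_mb)
minimal = ["report_summary"]

print(total_copy_mb(dataset_sizes_mb, everything, 5))
print(total_copy_mb(dataset_sizes_mb, minimal, 5))
```

Under that assumption the copy volume grows linearly with the number of instances, so trimming what is included pays off on every instance you create.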
You can design your application to minimise this risk. See the attached example of an application pattern:
Concerning publishing charts to the dashboard, you can probably get away with sharing the underlying dataset with the application template project and defining the charts on that shared object.
Concerning showing the underlying dataset, you can publish a dataset as a dashboard insight, which allows the user to review the data without accessing the flow per se.
I hope this helps.