Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Sorry for the delay on these.
1. To perform the same set of feature processing steps on your model as you would the rest of your flow you when you built it, you would either need to
* consolidate those steps all into the "Steps" component of the AutoML capability - see the scoring.png
* or Pre-process these and use an Enrichment (dataset lookup), as part of your API.
* or Create a set of custom code and deploy this as part of an arbitrary Python or R code.
2. DSS has its own lineage tooling in the form of the Catalog. You have the potential to read out that lineage by reading it out of the REST API . You can list the datasets, recipes, models, and plugins and rebuild the flow lineage. Our most complete external catalog integration to date is with Alation. Any other integrations would involve Dataiku services or your own internal development.
3. This would require an in-depth discussion. DSS has some capabilities in this area, but for any industry-based standard (PCI, HIPPA, etc.), we would likely look to include a partner tool for assistance.
4. DSS can deploy the AutoML prediction models, custom Python and R models, and arbitrary Python or R code with any combinations of libraries. Please look through our documentation on API Endpoints to get a better sense of what we can accomplish. Happy to take follow-ups. https://doc.dataiku.com/dss/latest/apinode/endpoints.html
5. Yes. One of the course tenets of DSS is being an end-to-end solution for data science and self-service analytics. Data ingestion and transformations are things we excel at inside DSS. The caveat I would make is we are focused more on analysts and data science communities and their particular problems. While we have teams that use DSS for more EDW loading scenarios, DSS can make use of the same Spark compute clusters used by these tools), it is not a core function of DSS and customers would likely be better served with the likes of an Informatica, Data Stage, and others.