Added on September 7, 2022 6:00PM
Likes: 0
Replies: 3
Problem Statement: Right now, DSS has great no-code visual recipes for a select set of common machine learning algorithms and data wrangling tools. In other cases, code recipes are required, and users repeatedly have to both code the algorithms and add the same lines of wrapper code in DSS code recipes to map to the train, test, output, and other datasets. There is a middle ground of algorithms that are repeatedly created in code by various users and would be ripe for modularization into recipes that do not exist yet. For example, there are many important algorithms in modules like Python's sklearn or R's tidymodels/caret that do not have visual recipes in DSS. However, there are so many that maintaining all those recipes would be challenging for Dataiku.
Solution: What if... Dataiku could parse the requirements for each algorithm in popular packages and auto-generate a form for entry of parameters using a single visual recipe module? The form fields would be auto-generated by mapping and exposing the module's syntax and variables within a single recipe template. It is a simple trick we use in custom applications now. This would provide a many-to-one mapping of algorithms to one recipe and avoid a large software maintenance effort.
Scenarios/UseCases: See attached. Using a module requiring code (sklearn AdaBoost) as an example, a user might select a new option for auto-generated recipes in the model algorithm selector screen and type in "sklearn.ensemble.AdaBoostRegressor". Dataiku would then import the algorithm, assemble the proper input and output datasets, read the documentation, and present a screen with default parameters (see attached).
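To make the idea concrete, here is a minimal sketch of the introspection step in plain Python. It is not how DSS works today; `load_object` and `form_fields` are hypothetical helper names, and the sketch assumes the algorithm exposes its parameters as named constructor arguments. It uses the standard-library class `json.JSONEncoder` as a stand-in so it runs without extra packages.

```python
import importlib
import inspect


def load_object(dotted_path):
    """Import 'package.module.ClassName' and return the named object."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def form_fields(cls):
    """Return one form-field description per named constructor parameter."""
    fields = []
    for name, param in inspect.signature(cls).parameters.items():
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue  # *args / **kwargs cannot be rendered as a single field
        required = param.default is inspect.Parameter.empty
        fields.append({
            "name": name,
            "default": None if required else param.default,
            "required": required,
        })
    return fields


# Stand-in target so the sketch runs anywhere; a recipe would take the
# dotted path typed by the user instead.
fields = form_fields(load_object("json.JSONEncoder"))
for field in fields:
    print(field["name"], "=", repr(field["default"]))
```

With scikit-learn installed, passing "sklearn.ensemble.AdaBoostRegressor" to `load_object` should work the same way, since its estimators declare their hyperparameters as keyword arguments with defaults. Mapping the datasets and reading the documentation for field help text would still need to be handled separately.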
To take this maybe one step further: what about using something like GitHub Copilot?
Interesting idea, @tgb417. That is a very similar concept for sure, though it would need to be within the visual recipe framework. Let's see what Dataiku says.
Any update on this? Would very much like to integrate GitHub Copilot.