Currently, calling the Dataiku API from a Python processor in a shaker script (prepare recipe) fails with "No DSS URL or API key found in any location" as soon as the code tries to access a dataset. Ideally, users could use the Python API from within a prepare recipe to pull in data from other datasets as needed.
For example, say I want to read a metric from an upstream dataset inside my prepare recipe, such as its last-run timestamp, and use it to fill a column. Currently, this doesn't seem to be possible. Instead, I need to run a separate Python recipe before the prepare recipe to create that column. In this simple case that's not too inconvenient. In more complex cases, where computation is needed or many values have to be checked, forcing any Python code that touches the Dataiku API to live outside the prepare recipe quickly complicates a project and restricts what the Python processor can do. I think this capability would be really valuable for complex prepare recipes, since shaker scripts speed up the implementation of complex data transformations and are preferable to writing a Python-only transformation.
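For reference, the workaround described above looks roughly like the sketch below: a separate Python recipe that reads a metric from the upstream dataset and writes it into a new column before the prepare recipe runs. The dataset names, the column name, and the metric ID passed in are placeholders I made up for illustration, and the exact metric ID depends on which metrics are computed on your dataset, so treat this as a sketch rather than a definitive implementation.

```python
def add_constant_column(table, column, value):
    """Set `column` to `value` for every row.

    Works on a pandas DataFrame; any mapping of column name to values
    behaves the same way for the purposes of this sketch.
    """
    table[column] = value
    return table


def build_with_last_run(metric_source, input_name, output_name, metric_id,
                        column="last_run"):
    """Python-recipe workaround: copy a metric from one dataset into a column.

    All arguments are hypothetical placeholders; nothing here comes from a
    real project.
    """
    # Imported here because the package is only available inside DSS.
    import dataiku

    # Read the most recently computed value of the metric.
    metrics = dataiku.Dataset(metric_source).get_last_metric_values()
    value = metrics.get_global_value(metric_id)

    # Load the input, add the column, and write the result for the
    # downstream prepare recipe to consume.
    df = dataiku.Dataset(input_name).get_dataframe()
    df = add_constant_column(df, column, value)
    dataiku.Dataset(output_name).write_with_schema(df)
```

Inside the Python recipe this would be called once, for example as `build_with_last_run("prior_dataset", "my_input", "my_input_stamped", "reporting:BUILD_START_DATE")`, where every name (including the metric ID) is an assumption. The point of the request is that these few lines should be able to live in the prepare recipe's Python processor instead of requiring an extra recipe and an extra intermediate dataset.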