Hi all,
we need to pass a parameter to a scenario, read its value from within the scenario
in a Python code step, and then pass that value to a recipe.
We've found the following post
https://community.dataiku.com/t5/Using-Dataiku-DSS/Scenario-with-parameters/m-p/3687#M2697
but unfortunately we cannot use project variables; we've also found another post,
but we actually need to read the variable from within the Python code and find a way to pass
its value to the recipe (again, without using project variables).
What we are trying to do is run the same recipe in parallel several times (so we don't wait
for one execution of the recipe to finish before starting the next); if we use project variables, the value that should be
read by one execution of the recipe may be overwritten by the update intended for the next execution.
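To illustrate the race and one possible workaround, here is a minimal plain-Python sketch (no Dataiku API is used: a dict stands in for project variables, and a temp directory stands in for a managed folder; all function names are hypothetical):

```python
import json
import os
import tempfile

# Stand-in for project variables: one shared mutable store.
project_variables = {}

def start_run_shared(param):
    # Each scenario run overwrites the single shared key.
    project_variables["recipe_param"] = param

def recipe_shared():
    # The recipe reads whatever value is currently in the shared store.
    return project_variables["recipe_param"]

# Run A sets its value, then run B sets its value before A's recipe reads:
start_run_shared("value-A")
start_run_shared("value-B")
print(recipe_shared())  # run A's recipe now sees "value-B" -- the race described above

# Workaround sketch: key each value by a unique run id, e.g. one small
# file per run in a managed folder (here a temp directory).
params_dir = tempfile.mkdtemp()

def start_run_keyed(run_id, param):
    with open(os.path.join(params_dir, f"{run_id}.json"), "w") as f:
        json.dump({"recipe_param": param}, f)

def recipe_keyed(run_id):
    with open(os.path.join(params_dir, f"{run_id}.json")) as f:
        return json.load(f)["recipe_param"]

start_run_keyed("run-A", "value-A")
start_run_keyed("run-B", "value-B")
print(recipe_keyed("run-A"))  # "value-A" -- unaffected by run B
```

The remaining question is how each recipe instance learns its own run id without a project variable, which is exactly the difficulty raised here.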
Any suggestion is warmly welcome. Best Regards.
Giuseppe
Hi Giuseppe,
What kind of recipe receives this input variable? If it's a visual recipe, there is unfortunately no simple alternative to project variables. If it's a code recipe, there may be a workaround, but more details will be needed.
I'm not familiar with your use case, but it may be a good fit for Dataiku Applications, if there is a need to instantiate multiple parametrised scenario runs at once.
Best,
Harizo
Hi Harizo,
it is a Python recipe. I didn't think about Applications (we're working on a POC and there isn't
enough time to develop one), but it might be worth mentioning to the prospect. I'll take
a look. Thanks. Rgds.
Giuseppe
Hi,
the recipe is a Python code recipe. Apart from finding a way to pass parameters from a scenario to a recipe, there seem to be some other roadblocks to parallel execution of a recipe:
a) suppose you have two running instances of the recipe; they cannot write simultaneously to the same output (no matter whether it is a dataset, a folder, or anything else); this also applies if you have multiple outputs (i.e. it's enough for one output to have a build running, and writes to all the other outputs are blocked);
b) mirroring this, two different recipes cannot have the same output (this is to avoid having the same job id for the write).
Applications could help, but we would need to run one instance for each value of the loop.
So the only way to have parallel execution of a flow seems to be to copy the same piece of the flow as many times as needed.
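The copy-the-flow pattern can be sketched in plain Python (a hypothetical stand-in: threads play the role of parallel recipe instances, and one file per run id plays the role of each copy's distinct output; no Dataiku API is involved):

```python
import os
import tempfile
import threading

# Temp directory standing in for the flow's output storage.
out_dir = tempfile.mkdtemp()

# One parameter value per parallel run of the copied flow branch.
params = {"run-1": 10, "run-2": 20, "run-3": 30}

def recipe_instance(run_id, value):
    # Hypothetical recipe body: each copy writes only to its own
    # distinct output, so no two writers ever collide.
    with open(os.path.join(out_dir, f"output_{run_id}.txt"), "w") as f:
        f.write(str(value * 2))

threads = [threading.Thread(target=recipe_instance, args=(rid, v))
           for rid, v in params.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()

# A single downstream step merges the per-run outputs; only this step
# touches the combined result.
merged = {}
for name in sorted(os.listdir(out_dir)):
    run_id = name[len("output_"):-len(".txt")]
    with open(os.path.join(out_dir, name)) as f:
        merged[run_id] = int(f.read())
print(merged)
```

This mirrors the constraint in b): every parallel branch gets its own output, and the merge step is the only writer to the final result.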
Giuseppe