Hybrid Python/R code environment

fmonari Registered Posts: 18 ✭✭✭✭

I need to create a custom R model in DSS because I can't find a corresponding algorithm in Python. This can be done with code recipes, but tracking all the experiments needed during model development that way is inconvenient. So I was thinking of using a placeholder Python model that calls R in the background through a package like rpy2.
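A minimal sketch of that idea, assuming rpy2 and R are both available in the code environment. The class name `RBackedRegressor` is hypothetical, R's `lm()` only stands in for the R-only algorithm, and the single-feature interface is a simplification rather than what a DSS custom model necessarily expects:

```python
from typing import List, Sequence


class RBackedRegressor:
    """scikit-learn-style wrapper whose fit/predict delegate to R via rpy2.

    Hypothetical example: lm() is a placeholder for the R-only algorithm.
    """

    def __init__(self) -> None:
        self._r_model = None  # fitted R object, set by fit()

    def fit(self, x: Sequence[float], y: Sequence[float]) -> "RBackedRegressor":
        # Imported lazily so the class can be defined without R installed.
        import rpy2.robjects as ro
        from rpy2.robjects.packages import importr

        stats = importr("stats")
        data = ro.DataFrame({"x": ro.FloatVector(x), "y": ro.FloatVector(y)})
        self._r_model = stats.lm(ro.Formula("y ~ x"), data=data)
        return self

    def predict(self, x: Sequence[float]) -> List[float]:
        # Check the fitted state first, before touching rpy2 at all.
        if self._r_model is None:
            raise RuntimeError("call fit() before predict()")
        import rpy2.robjects as ro
        from rpy2.robjects.packages import importr

        stats = importr("stats")
        new_data = ro.DataFrame({"x": ro.FloatVector(x)})
        return list(stats.predict(self._r_model, newdata=new_data))
```

This only works if the Python environment can reach an R installation, which is exactly the gap a hybrid code environment would close.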

For cases like this, I think it would be great to be able to build hybrid Python/R code environments, and it should not be too difficult to modify the build process of the Docker containers accordingly.

Perhaps an advanced option could allow code environments to be specified as Dockerfiles or images inheriting from dku-spark-base-teunrfbekcqtcymgo9jmomdg:dss-9.0.4.




  • ClaudiusH
    ClaudiusH Alpha Tester, Dataiker Alumni, Registered Posts: 106 ✭✭✭✭✭✭

    For reference, the earlier discussion "Hybrid code environment" gives more background on where this request comes from.

  • natejgardner
    natejgardner Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 151 Neuron

    A cool approach might be a streamlined system for calling other recipes directly from code. R functions could be wrapped in a recipe, then called from Python using the Dataiku API. If Dataiku's overhead were very low, this could resemble serverless and microservice architectures, with Dataiku acting as the glue logic that blends processes from all sorts of languages and frameworks.

    In this case, instead of taking a dataset as input and producing a dataset as output, the recipe would behave as a function: callable with input parameters or an observable data stream, and returning a value or a processed data stream to its caller. These could be exposed generically, regardless of the underlying implementation, through a wrapping Dataiku API.
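To make the "recipe as a function" idea concrete, here is a rough sketch that does not use any Dataiku API. It wraps an external process, which could be an R script, as a Python callable by exchanging JSON over stdin and stdout. The helper name `make_process_function` and the script `score.R` are hypothetical:

```python
import json
import subprocess
from typing import Any, Callable, List


def make_process_function(command: List[str], timeout: float = 60.0) -> Callable[..., Any]:
    """Wrap an external command as a Python function.

    Keyword arguments are sent to the process as one JSON object on stdin;
    the process is expected to print one JSON value on stdout.
    """

    def call(**params: Any) -> Any:
        completed = subprocess.run(
            command,
            input=json.dumps(params),
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
        return json.loads(completed.stdout)

    return call


# Hypothetical usage: score.R is an assumed script that reads JSON from stdin
# (for example with jsonlite) and writes a JSON result to stdout.
# score = make_process_function(["Rscript", "score.R"])
# result = score(alpha=0.1, values=[1.0, 2.0, 3.0])
```

Starting a new process per call is costly, so this only approximates the low-overhead system described above; a long-running worker would be closer to it.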
