Managing Configuration Files Across Dataiku Instances for User-Accessible Settings

Tanguy

We're seeking a solution to efficiently configure settings across Dataiku instances, with a specific focus on settings accessible to users, rather than those at the admin level. Our primary aim is to facilitate changes to these configuration settings without disrupting the production environment or necessitating deployment, in contrast to the use of "global variables" within a project or the use of a config file inside a project library.

For example, within our organization, we have several Dataiku projects that regularly send reports via mailing lists. These mailing lists frequently evolve in response to organizational shifts. Presently, we store these mailing lists in the "Global variables" section, but any adjustments require deployment to take effect in the production environment.

Is there a centralized solution where these configuration changes can be established once and consistently applied across all instances (especially between a Design environment and the "attached" Automation Node(s))?


Operating system used: linux

Best Answer

  • Turribeach

    Regarding email lists, I think I will quote Steve Jobs: "you are holding it wrong".

    There should be no need to change email lists in code. Create an email distribution list, give the business or the relevant people permission to update it, and use its SMTP address in Dataiku. In Exchange you can even change the Display Name of the distribution list and keep the same SMTP email address, or move the SMTP email address to a new distribution list. There should be no need to hardcode email distribution lists in Dataiku.

    Having said that, I am sure your question applies to other config items, not just email distribution lists. We are using Global Variables at the moment, but my view is that we should move away from them. On one hand, you have the hassle of redeploying them to all environments every time you make a change. On the other hand, you can't keep secrets in them, since Globals are visible to all users, even clickers. On top of that, we have a lot of variables, which means they clutter the user space when users load variables via get_custom_variables()/get_variables(), and they could even clash with users' code, as users have no way of knowing what these variables are called.
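
    To illustrate the clash problem, here is a minimal sketch of one possible mitigation while Global Variables are still in use: keeping shared config under a key prefix and filtering on it. The function name select_config, the "cfg." prefix and the sample values are hypothetical. In DSS the input dict would come from dataiku.get_custom_variables(); a plain dict stands in for it here so the snippet runs anywhere.

```python
from typing import Any, Dict


def select_config(variables: Dict[str, Any], prefix: str) -> Dict[str, Any]:
    """Keep only the keys that start with `prefix`, with the prefix stripped."""
    return {
        key[len(prefix):]: value
        for key, value in variables.items()
        if key.startswith(prefix)
    }


# Stand-in for the dict returned by dataiku.get_custom_variables() (assumption:
# shared config keys are stored with a hypothetical "cfg." prefix).
all_vars = {"cfg.report_recipients": "team@example.com", "threshold": "0.5"}

# Only the prefixed entries survive, so a user's own "threshold" cannot collide.
config = select_config(all_vars, "cfg.")
```

    This does not address the redeployment or secrets issues described above; it only keeps shared names out of the way of user code.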

    So where to store these, then? In my view it would be great to use a secrets store to hold configuration values as well. I say this because I have secrets to store too (cloud keys, etc.), so I might as well put everything in a single place and let the code retrieve the values at run time. This obviously adds a dependency on the secrets store, which means you should probably use a very robust, highly available, serverless service, such as the secrets store offerings from the cloud vendors. If you are worried about not being able to retrieve the config/secrets during an outage, you could implement some sort of caching in a local file, but then you will need to encrypt this file, as it may contain secrets.
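
    The run-time retrieval with a local cache fallback could look roughly like the sketch below. Everything named here is hypothetical: load_config, the fetch callable (which would wrap whatever client your secrets store provides) and cache_path. Note that this sketch writes the cache as plain JSON, so as written it is only suitable for non-secret config; encrypting the file, as mentioned above, is left out.

```python
import json
import os
import tempfile
from typing import Callable, Dict


def load_config(fetch: Callable[[], Dict[str, str]], cache_path: str) -> Dict[str, str]:
    """Return config from the remote store, falling back to the local cache.

    `fetch` is a hypothetical callable wrapping the secrets-store client.
    If it raises, the last successfully cached copy is returned instead.
    """
    try:
        config = fetch()
    except Exception:
        # Store unreachable: use the last cached copy.
        # Raises FileNotFoundError if nothing was ever cached.
        with open(cache_path, "r", encoding="utf-8") as cache_file:
            return json.load(cache_file)

    # Refresh the cache atomically so a crash never leaves a half-written file.
    # WARNING: plain text on disk; encrypt before storing real secrets here.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as tmp_file:
        json.dump(config, tmp_file)
    os.replace(tmp_path, cache_path)
    return config
```

    A recipe or scenario step would call load_config at start-up, so a change made in the store takes effect on the next run without redeploying the project.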

    Hope it helps.

Answers

  • Tanguy

    I notice that this request is closely related to the proposal of this product idea.
