How to get metrics (JMX, Prometheus, etc.) for Dataiku DSS?

ecerulm
ecerulm Registered Posts: 47 ✭✭✭✭✭

From what I can tell from the documentation at https://doc.dataiku.com/dss/latest/operations/monitoring.html, DSS does not export metrics via Prometheus or JMX. The only thing it can do is export metrics to a Graphite/Carbon server.

The documentation does not mention which metrics are actually exported either, so it's hard for me to tell whether it's even worth going to the trouble of setting up something that speaks the Graphite protocol and forwards the data to my existing monitoring solution on ELK.

Does anybody know which metrics are actually exported besides CPU and VM usage (which I suppose are there)?


Operating system used: Amazon Linux 2

Answers

  • rubelagu
    rubelagu Registered Posts: 2

    In my DSS instance there are 8252 distinct metrics, so I can't list all of them here, but here are the ones I found most interesting to track in my monitoring:

    dss.UUID.server.dku.objectsCounts.global.datasets 2324 1731601973
    dss.UUID.server.dku.objectsCounts.global.recipes 1994 1731601973
    dss.UUID.server.dku.scenarios.activeTriggers.global 4 1731601973
    dss.UUID.server.dku.collaboration.activeUsers 2 1731601973
    dss.UUID.server.dku.collaboration.connectedUsers 2 1731601973
    dss.UUID.server.dku.collaboration.activeTrackingSessions 3 1731601973
    dss.UUID.server.dku.collaboration.trackingSessions 3 1731601973
    dss.UUID.server.dku.jobs.activities.possiblyRunning 0 1731601973
    dss.UUID.server.dku.jobs.activities.queued 0 1731601973
    dss.UUID.server.dku.jobs.activities.waitingForPermit 0 1731601973
    dss.UUID.server.jvm.memory.heap.committed 1251999744 1731601973
    dss.UUID.server.jvm.memory.heap.init 1042284544 1731601973
    dss.UUID.server.jvm.memory.heap.max 8589934592 1731601973
    dss.UUID.server.jvm.memory.heap.usage 0.11 1731601973
    dss.UUID.server.jvm.memory.heap.used 906292560 1731601973
    dss.UUID.server.jvm.memory.non-heap.committed 256507904 1731601973
    dss.UUID.server.jvm.memory.non-heap.init 7667712 1731601973
    dss.UUID.server.jvm.memory.non-heap.max -1 1731601973
    dss.UUID.server.jvm.memory.non-heap.usage -245757056.00 1731601973
    dss.UUID.server.jvm.memory.non-heap.used 245757056 1731601973
    dss.UUID.server.dku.notifications.websocket.send.activeThreads 0 1731601973
    dss.UUID.server.dku.notifications.websocket.send.pooledThreads 2 1731601973
    dss.UUID.server.jvm.threads.blocked.count 0 1731601973
    dss.UUID.server.jvm.threads.count 111 1731601973
    dss.UUID.server.jvm.threads.daemon.count 38 1731601973
    dss.UUID.server.jvm.threads.deadlock.count 0 1731601973
    dss.UUID.server.jvm.threads.new.count 0 1731601973
    dss.UUID.server.jvm.threads.runnable.count 19 1731601973
    dss.UUID.server.jvm.threads.terminated.count 0 1731601973
    dss.UUID.server.jvm.threads.timed_waiting.count 64 1731601973
    dss.UUID.server.jvm.threads.waiting.count 28 1731601973
    dss.UUID.server.dku.scenarios.triggerRuns.global.max 5.68 1731601973
    dss.UUID.server.dku.scenarios.triggerRuns.global.mean 1.54 1731601973
    dss.UUID.server.dku.scenarios.triggerRuns.global.min 1.35 1731601973
    dss.UUID.server.dku.scenarios.triggerRuns.global.stddev 0.15 1731601973
    

    So far, I think I can configure the Metricbeat Graphite module (https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-module-graphite.html) to receive these metrics and send them to Elasticsearch.
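Each line in the listing above has the shape of the Graphite plaintext protocol: a metric path, a value, and a Unix timestamp, separated by spaces. A minimal Python sketch of parsing one such line follows; `GraphiteSample` and `parse_graphite_line` are hypothetical names chosen for illustration and are not part of DSS or Metricbeat.

```python
from typing import NamedTuple


class GraphiteSample(NamedTuple):
    path: str        # e.g. dss.UUID.server.jvm.memory.heap.used
    value: float     # metric value (may be negative, see non-heap.usage)
    timestamp: int   # Unix time in seconds


def parse_graphite_line(line: str) -> GraphiteSample:
    """Parse one '<path> <value> <timestamp>' line."""
    # A line with more or fewer than three fields raises ValueError.
    path, value, timestamp = line.split()
    return GraphiteSample(path, float(value), int(timestamp))


sample = parse_graphite_line(
    "dss.UUID.server.jvm.memory.heap.used 906292560 1731601973"
)
print(sample.path, sample.value, sample.timestamp)
```

Values are parsed as floats because the listing mixes integers (thread counts) with decimals (heap usage ratio).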

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron

    What port is this running on? Do you need a login for JMX?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron

    OK, I got something working using Jolokia. As their GitHub page says, Jolokia is a fresh way to access JMX MBeans remotely using REST. I downloaded the JVM agent from the downloads page and then used the second option of the JVM agent to attach it to the com.dataiku.dip.DSSBackendMain process ID (9015 in my case) using OpenJDK 20 for macOS (I used openjdk-20.0.2_macos-aarch64_bin.tar.gz as I am on Apple silicon):

    […]/jdk-20.0.2.jdk/Contents/Home/bin/java -jar ./jolokia-agent-jvm-2.1.1-javaagent.jar start --port 7778 9015
    

    After doing that I could see the Jolokia HTTP server up on http://127.0.0.1:7778/jolokia/

    I then used this sample curl query to pull one of the metrics:

    curl -s http://127.0.0.1:7778/jolokia/read/metrics:name=dku.jobs.running/Count | jq .
    

    And here is the response:

    {
      "request": {
        "mbean": "metrics:name=dku.jobs.running",
        "attribute": "Count",
        "type": "read"
      },
      "value": 1,
      "status": 200,
      "timestamp": 1731710330
    }
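The same read can be scripted instead of using curl and jq. The sketch below assumes the agent is reachable at the base URL from the post and that the MBean name contains no characters needing Jolokia's path escaping (such as `/`); the function names are hypothetical helpers, not part of Jolokia.

```python
import json
from urllib.request import urlopen


def jolokia_read_url(base_url: str, mbean: str, attribute: str) -> str:
    """Build the GET URL for a Jolokia 'read' of one MBean attribute."""
    # Assumption: mbean and attribute need no escaping (no '/' or '!').
    return f"{base_url.rstrip('/')}/read/{mbean}/{attribute}"


def extract_value(response: dict):
    """Return the 'value' field, raising if the status is not 200."""
    if response.get("status") != 200:
        raise RuntimeError(f"Jolokia error: {response.get('error', response)}")
    return response["value"]


def read_attribute(base_url: str, mbean: str, attribute: str, timeout: float = 5.0):
    """Fetch one attribute from a running Jolokia agent."""
    url = jolokia_read_url(base_url, mbean, attribute)
    with urlopen(url, timeout=timeout) as resp:
        return extract_value(json.load(resp))


# Usage (requires the agent from the post to be running):
# read_attribute("http://127.0.0.1:7778/jolokia",
#                "metrics:name=dku.jobs.running", "Count")
```

Checking the `status` field matters because Jolokia reports request errors inside the JSON body, so a successful HTTP exchange alone does not guarantee a usable value.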
    

    To see what's available you can launch […]/jdk-20.0.2.jdk/Contents/Home/bin/jconsole. Then you attach it to the com.dataiku.dip.DSSBackendMain process:

    image.png

    Then select the MBeans tab, expand the metrics node, and you should see everything that's available. Some of the metrics don't show up initially, so you may need to generate some activity first for them to appear:

    image.png
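As an alternative to browsing in jconsole, Jolokia's `search` operation can list MBean names matching a pattern. The sketch below assumes the DSS metrics live in the `metrics` JMX domain with a `name` key, as in the `metrics:name=dku.jobs.running` example above; the helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def jolokia_search_url(base_url: str, pattern: str = "metrics:*") -> str:
    """Build the GET URL for a Jolokia 'search' over MBean names."""
    return f"{base_url.rstrip('/')}/search/{pattern}"


def metric_names(response: dict) -> list:
    """Extract the 'name' key from each ObjectName in a search response."""
    names = []
    for object_name in response.get("value", []):
        # ObjectName layout: <domain>:<key>=<value>[,<key>=<value>...]
        _, _, keys = object_name.partition(":")
        for pair in keys.split(","):
            key, _, val = pair.partition("=")
            if key == "name":
                names.append(val)
    return sorted(names)


def list_metrics(base_url: str, timeout: float = 5.0) -> list:
    """Query a running Jolokia agent and return the metric names."""
    with urlopen(jolokia_search_url(base_url), timeout=timeout) as resp:
        return metric_names(json.load(resp))


# Usage (requires the agent to be running):
# for name in list_metrics("http://127.0.0.1:7778/jolokia"):
#     print(name)
```

As with jconsole, metrics that have not been registered yet will not appear in the result.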
  • rubelagu
    rubelagu Registered Posts: 2

    What I did was use Administration > Other > Misc > Graphite Reporting with these settings:

    • Server: localhost:9109
    • Frequency: 1

    If you start nc -l -p 9109, you will see that Dataiku sends the metrics via TCP to that port.

    nc -l -p 9109
    dss.xxxxx.server.dku.objectsCounts.global.datasets 2324 1731601973
    dss.xxxxx.server.dku.objectsCounts.global.recipes 1994 1731601973
    dss.xxxxx.server.dku.scenarios.activeTriggers.global 4 1731601973
    …
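If nc is not available, a small TCP listener gives the same view of what DSS sends. This is a minimal sketch using Python's standard `socketserver` module, assuming the Graphite Reporting server is set to localhost:9109 as above; `GraphiteServer` and `GraphiteHandler` are hypothetical names.

```python
import socketserver


class GraphiteHandler(socketserver.StreamRequestHandler):
    """Print and record every plaintext line a client sends."""

    def handle(self):
        # rfile yields one line at a time until the client disconnects.
        for raw in self.rfile:
            line = raw.decode("utf-8", errors="replace").strip()
            if line:
                self.server.received.append(line)
                print(line)


class GraphiteServer(socketserver.TCPServer):
    allow_reuse_address = True  # allow quick restarts on the same port

    def __init__(self, address):
        super().__init__(address, GraphiteHandler)
        self.received = []  # lines seen so far, kept for inspection


# Usage: listen on the port configured in Graphite Reporting.
# GraphiteServer(("127.0.0.1", 9109)).serve_forever()
```

The server handles one connection at a time, which is enough for inspecting the output of a single DSS instance.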

    Then I used the Metricbeat Graphite module to receive the metrics, filter them (I'm only interested in 20 of the 8000+ metrics available), and send those to Elasticsearch.
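The filtering step can be illustrated outside Metricbeat as a simple allowlist over metric paths. The sketch below is not the Metricbeat configuration used above; it is a standalone Python illustration of the same idea, and the patterns in `KEEP_PATTERNS` are hypothetical examples rather than the 20 metrics actually kept.

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist; '*' stands in for the instance UUID and leaf names.
KEEP_PATTERNS = [
    "dss.*.server.jvm.memory.heap.*",
    "dss.*.server.dku.jobs.activities.*",
]


def keep(path: str, patterns=KEEP_PATTERNS) -> bool:
    """Return True if the metric path matches any allowlist pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def filter_lines(lines, patterns=KEEP_PATTERNS) -> list:
    """Keep only the plaintext lines whose metric path is allowlisted."""
    return [
        line for line in lines
        if line.strip() and keep(line.split()[0], patterns)
    ]


lines = [
    "dss.UUID.server.jvm.memory.heap.used 906292560 1731601973",
    "dss.UUID.server.jvm.threads.count 111 1731601973",
]
for line in filter_lines(lines):
    print(line)
```

An allowlist keeps the number of indexed series small and predictable even if DSS starts emitting new metrics later.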
