Monitoring CPU, RAM and Disk usage of DSS instance
Hi,
I need to monitor the resource usage of our DSS instance VM (Linux, v8.0.1) after 2 crashes this month.
I have followed the dkumonitor installation documentation, and the connection works fine. However, I can't find in the documentation, or in the dkumonitor repository, which metrics to look at for CPU / RAM / disk usage (and any others that are relevant to monitor).
I have more or less randomly found:
- dss.{instance}.collectd.memory.used -> RAM?
- dss.{instance}.backend.server.jvm.memory.heap.used -> not sure what this is, or whether it is really useful to monitor
- dss.{instance}.collectd.cpu.*.percent.user (and the same with .system) -> CPU usage?
- For storage I couldn't find anything yet.
Am I using the correct paths to monitor these parameters?
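One way to explore which paths exist is to ask the metrics backend directly. The sketch below assumes the Graphite-compatible HTTP API behind dkumonitor is reachable and answers the standard `/metrics/find` endpoint. The base URL, port and the instance name `myinstance` are hypothetical placeholders, not values from this setup.

```python
from urllib.parse import urlencode

# Hypothetical address of the Graphite-compatible API; adjust to your install.
GRAPHITE_URL = "http://localhost:8080"


def find_url(pattern, base=GRAPHITE_URL):
    """Build a /metrics/find URL listing the nodes that match a path pattern."""
    return f"{base}/metrics/find?{urlencode({'query': pattern})}"


def leaf_ids(nodes):
    """From a decoded treejson response, keep the ids of actual metrics (leaves)."""
    return [node["id"] for node in nodes if node.get("leaf")]


if __name__ == "__main__":
    # Print the URL to open (or fetch) in order to browse one level of the tree.
    # "myinstance" stands in for the {instance} part of the paths above.
    print(find_url("dss.myinstance.collectd.*"))
```

Walking the tree one level at a time (`dss.myinstance.*`, then `dss.myinstance.collectd.*`, and so on) shows whether any disk-related node is being collected at all, without guessing its name.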
Thank you for your support
(Screenshots of the dkumonitor + Grafana results are attached.
VM config: 32 CPUs, ~60 GB RAM, disk size unknown)
Answers
-
AntonB
Hello Axel, have you been able to figure it out?
-
Hi Anton
Yes, without feedback we kept these sources in Grafana for monitoring.
However, for other users who might be affected by regular instance crashes, I must mention that we were misled by this CPU and RAM monitoring.
For almost a year, we ran v8.0.2 on Linux SLES. Once we migrated to v8.0.4, we were informed that the crashes were due to thread issues known in the earlier version.
We have since moved from HortonWorks Datalake to Cloudera and can still monitor these parameters with almost the same syntax. Our current version is 10.0.2 and we no longer see these crashes / reboots, even with almost 50% more users.
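For anyone who wants to read these parameters outside Grafana, here is a minimal sketch. It assumes the metrics backend exposes the standard Graphite `/render` endpoint with JSON output. The base URL and the instance name `myinstance` are hypothetical, and the metric paths are the ones quoted in the question.

```python
from urllib.parse import urlencode

# Hypothetical address of the Graphite-compatible API; adjust to your install.
GRAPHITE_URL = "http://localhost:8080"


def render_url(targets, base=GRAPHITE_URL, since="-1h"):
    """Build a /render URL returning the given targets as JSON."""
    params = [("target", t) for t in targets]
    params += [("from", since), ("format", "json")]
    return f"{base}/render?{urlencode(params)}"


def latest_value(series):
    """Return the most recent non-null value of one series from a render response.

    Each series looks like {"target": ..., "datapoints": [[value, timestamp], ...]}.
    """
    for value, _timestamp in reversed(series["datapoints"]):
        if value is not None:
            return value
    return None


if __name__ == "__main__":
    # "myinstance" stands in for the {instance} part of the metric paths.
    print(render_url([
        "dss.myinstance.collectd.memory.used",
        "averageSeries(dss.myinstance.collectd.cpu.*.percent.user)",
    ]))
```

`averageSeries` is a standard Graphite function that collapses the per-core series into one line, which is easier to alert on than 32 separate curves.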