DSS install in NAS

Hello,
We are currently running Dataiku version 13.1.4 on RHEL 8.4 and 8.10.
To enable high availability (HA), we are planning to use a NAS-based setup, targeting either an active-active or active-standby architecture.
Currently, we operate a dual-node environment where, in the event of a failure on the active node, the standby node is brought up using rsync.
If anyone has experience implementing HA using NAS in a Dataiku environment, we would appreciate it if you could share your setup or lessons learned.
We are particularly interested in any issues or limitations you may have encountered when using NAS in this kind of setup.
We are aware that Dataiku does not recommend installing on NFS or EFS, but from our understanding, that limitation refers to using those file systems directly. Since NAS is more about connected storage rather than the underlying file system, we believe this setup might still be viable — and would like to confirm.
Reference:
Thank you in advance!
Operating system used: rhel 8.4
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron
It goes beyond a recommendation. The page you linked clearly says you should NOT do it. Therefore, DO NOT install on a NAS.
Each Dataiku node has a different strategy for resilience. For the Designer node you can achieve some resilience using disk snapshots from your cloud vendor. But do note that those snapshots need to match a backup of the Dataiku internal runtime database (which for a larger Designer node will need to be moved to an external PGSQL instance) and a backup of all your data sources AT THE SAME time. Furthermore, to get a consistent snapshot you should also stop Dataiku. In practice this means that it’s almost impossible to get a fully consistent snapshot without shutting down Dataiku and waiting a few hours for backups to run on all your data layers, some of which you may not have control of. So the best you could do is to take live snapshots of the Dataiku DATA_DIR disk and accept you will have some inconsistencies if you decide to restore Dataiku from a snapshot.
For the Automation node it’s simpler to have HA, as you could have a complete replica of your Automation node and deploy to it in parallel, just disabling scenarios instance-wide on the replica to prevent dual running.
Finally the API node, if you end up using it, supports deploying API services in HA mode in Kubernetes.
-
I’m posting this question after reviewing the official Dataiku documentation and the following community thread:
From what I saw, the limitations mentioned are specifically about NFS and EFS; I couldn’t find any clear restriction regarding the use of NAS itself.
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron
What Dataiku needs for its DATA_DIR is high-performance local storage (aka local SSDs). That performance cannot be achieved with a NAS, no matter how good your NAS is. So you should really drop this NAS idea.
-
Hello,
I have a question regarding NAS storage.
Is NAS required to use HDDs only, or is it possible to use SSDs as well?
As I understand it, NAS refers to network-attached storage, so it should be independent of the type of storage media inside. I would appreciate it if someone could clarify whether SSD-based NAS is common or recommended for better performance.
Thank you in advance for your insights!
-
If NAS can deliver the required performance and fully meet Dataiku’s requirements, I believe configuring with NAS shouldn’t be a problem.
I’m reaching out to gather others’ opinions and experiences on this matter.
Thank you in advance for your insights!
Disks
It is highly recommended to run DSS on SSD drives.
While legacy rotational hard drives can be used, performance will be severely impacted, especially for larger instances, with many users. In these instances, rotational hard drives may lead to a non-workable experience.
Filesystem
We strongly recommend only using XFS or ext4 as the filesystem on which DSS is installed.
The filesystem on which DSS is installed must be POSIX compliant, case-sensitive, support POSIX file locks, POSIX ACLs and symbolic links.
Warning
Do NOT install Dataiku DSS on a NFS filesystem (v3 or v4). This is known not to work, and will cause failures, hangs, and possible corruptions. This includes Amazon EFS.
GlusterFS is known to cause instabilities and is not supported as the filesystem for installing DSS.
Dataiku makes no particular recommendation as to the underlying block device. In particular, Dataiku does not have experience working with DRBD as the underlying block device and cannot provide recommendations about it.
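As a rough illustration of the filesystem requirements quoted above, here is a minimal Python sketch that probes a directory for case sensitivity, symbolic links, and POSIX advisory locks. The function name `check_filesystem_features` is hypothetical and this is not a Dataiku tool. It does not check POSIX ACLs, and passing these probes does not make a filesystem supported: the warning against NFS applies regardless of what a probe like this reports.

```python
import fcntl
import os
import tempfile


def check_filesystem_features(directory):
    """Probe a directory for a few features named in the quoted requirements.

    Returns a dict of feature name -> bool. Illustrative only; POSIX ACLs
    are not checked here.
    """
    results = {}
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        # Case sensitivity: "Probe" and "probe" must be distinct names.
        upper = os.path.join(tmp, "Probe")
        lower = os.path.join(tmp, "probe")
        with open(upper, "w") as f:
            f.write("x")
        results["case_sensitive"] = not os.path.exists(lower)

        # Symbolic links: create one and confirm it is reported as a link.
        link = os.path.join(tmp, "link")
        try:
            os.symlink(upper, link)
            results["symlinks"] = os.path.islink(link)
        except OSError:
            results["symlinks"] = False

        # POSIX advisory locks: take and release a non-blocking exclusive lock.
        try:
            with open(upper, "r+") as f:
                fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
                fcntl.lockf(f, fcntl.LOCK_UN)
            results["posix_locks"] = True
        except OSError:
            results["posix_locks"] = False
    return results
```

Calling `check_filesystem_features("/path/to/candidate/dir")` on the directory you intend to use returns one boolean per probed feature.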
-
Our on-premises environment uses NAS configured with SAN storage.
It doesn’t seem to fall under the mentioned limitations, so I’m wondering if there might be other reasons why it wouldn’t be supported or recommended.
Could anyone please share insights or experiences regarding this?
Thank you!
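One way to see whether a given mount falls under the NFS warning is to look at the filesystem type the operating system reports for it, since a NAS share is commonly mounted over NFS or SMB. The sketch below is an assumption-laden illustration: `filesystem_type` and `NETWORK_FS_TYPES` are hypothetical names, the list of network filesystem types is not exhaustive, and the function expects text in the `/proc/mounts` format (device, mount point, type, options).

```python
import os

# Illustrative, non-exhaustive set of network filesystem types.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smb3", "fuse.glusterfs"}


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format. The longest matching
    mount point wins; on a tie, the later entry wins because it shadows
    the earlier one.
    """
    path = os.path.realpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # Spaces in mount points are escaped as \040 in /proc/mounts.
        mount_point = fields[1].replace("\\040", " ")
        fs_type = fields[2]
        prefix = mount_point.rstrip("/") + "/"
        contains = path == mount_point or path.startswith(prefix)
        if contains and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type
```

On a Linux host, something like `filesystem_type(data_dir, open("/proc/mounts").read()) in NETWORK_FS_TYPES` would indicate that the directory sits on a network filesystem rather than on a local XFS or ext4 volume.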
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron
No matter what your NAS/SAN vendor tells you, there is no NAS on the planet that can match local storage performance. Sure, newer NAS systems have gotten faster, but there is a fundamental difference you need to understand. Storage speed is usually measured in two aspects: bandwidth and latency. While your NAS solution may have decent bandwidth, it will never be able to match the latency of local storage. When I said you should use SSDs I meant locally attached SSDs, not SSDs in your SAN exposed via a NAS. A network round trip is a million years away from the latency offered by locally attached SSDs.
Dataiku uses a very archaic form of metadata database based on lots of JSON files and other directories and files. It also produces a lot of logging and may even read and write datasets to your DATA_DIR if permitted. All of that means you really really really need good LOCAL storage to have good performance.
Do not use a NAS.
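To make the bandwidth versus latency point concrete, here is a minimal Python sketch that times the creation, write, and fsync of many small JSON-like files in a directory, which loosely resembles the many-small-files pattern described above. It is an assumption, not a Dataiku benchmark: the function names, sample count, and payload are all hypothetical, and the numbers are only meaningful when you compare the same run on a local disk and on the NAS mount.

```python
import os
import statistics
import tempfile
import time


def small_file_write_latency(directory, samples=200, payload=b'{"k": "v"}\n'):
    """Time create + write + fsync of small files; return latencies in ms."""
    latencies = []
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        for i in range(samples):
            path = os.path.join(tmp, f"probe_{i}.json")
            start = time.perf_counter()
            with open(path, "wb") as f:
                f.write(payload)
                f.flush()
                # Force the data to storage so caching does not hide latency.
                os.fsync(f.fileno())
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Return median, approximate 99th percentile, and max latency in ms."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "median_ms": statistics.median(ordered),
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }
```

Running `summarize(small_file_write_latency(path))` once against a local SSD directory and once against the NAS mount gives a per-operation comparison; the tail values (p99, max) tend to be the more telling ones for interactive workloads.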
-
I agree with your point.
Since local storage connects via PCIe and enables direct communication between the kernel and the disk, it's understandable that network-based storage cannot surpass this level of speed.
However, my thought was: if we can achieve performance close to that level, would it still be a viable setup?
If IOPS is the only limiting factor for using NAS, and all other requirements are met (specifically the disk and filesystem requirements described in the official documentation), then I was wondering if setting up an HA (High Availability) configuration using NAS could be a possible approach.
This idea is based on the assumption that, if we can match the performance requirements, NAS might still be a feasible option despite the general recommendations.