-
Time Series Forecasting - Installing additional packages
We are trying to use the Time Series Forecasting plugin in one of our projects, but when we try to train models we get this error: "Initial analysis completed with a severe warning: You don't have access to a code-env with required packages to run time series forecasting models, please ask your…
-
API node installation
Hi! I'm doing the MLOps Practitioner training, and while working on the real-time API part I tried to install an API node and create it for the infrastructure. However, the API node fails when I try to launch it. It gives me an error out of nowhere; you can find a screenshot of the execution log. Thank you. Operating system used:…
-
Exception in thread "Thread-30" java.lang.OutOfMemoryError: Direct buffer memory
I am getting this error when one of our data scientists runs a job. I currently have backend.xmx set to 8G and am planning to increase it to 20 GB for larger data processing. Is this the solution? Operating system used: CentOS
-
DSS can't read Teradata BLOB column
I'm having an issue connecting to a table on Teradata Vantage (17) where a data column is stored in BLOB format (the column contains image files in base64 encoding). Dataiku autodetects the column as a "string" datatype, and all the column's values are read in as NaN. Changing the datatype to Object in the schema definition…
-
Queued Activities and Job Prioritization
Is there a way to:
* query the number of activities waiting due to insufficient slots (i) at any given time and (ii) historically?
* differentiate an activity in waiting status because it's awaiting an upstream dependent activity vs. a job slot vs. a global slot?
* prioritize jobs over others, e.g. with a priority param per job?
* If not,…
-
Scaling Horizontally & Resource management with Kubernetes
Given that activities within the same job run as threads within the same JEK process, how do you size the Kubernetes pods accordingly? A job may contain sections where 20 activities can run concurrently, or it may be completely sequential and use only 1 core. a) Given…
-
Installation of DSS failed on Ubuntu
I tried to install DSS v11.0.2 on Ubuntu (20.04), but got an error:
[+] Creating data directory: DATA_DIR
[+] Saving installation log to /home/ubuntu/DATA_DIR/run/install.log
[+] Using Java at /usr/bin/java : openjdk version "11.0.16" 2022-07-19
[+] Checking required dependencies
+ Detected OS distribution : ubuntu 20.04
+…
-
Import project. Connection remapping for dataiku-managed-storage
Hi! I'm trying to export a Dataiku project from the Dataiku online service into a local instance. The export goes well, but on import the following issues appear: Issues were encountered * ERROR Missing connection Missing connection: Connection missing for dataset baseline_fixed (not remapped): dataiku-managed-storage (EC2) I…