-
[Simba][AthenaJDBC](100071) An error has been thrown from the AWS Athena client. TABLE_NOT_FOUND
I am reading data from an S3 location, but I am not able to use it unless I sync it first. Syncing this huge dataset takes a lot of time. What is the best way to avoid this issue without syncing it?
-
Read trailing 14 days data from a partitioned S3 location as /load_date=YYYY-MM-DD/load_hour=HH
I want to read the trailing 14 days of data from S3. I have already set up my S3 connection and want to read data for the last 14 load dates and only the 24th load_hour. How can I apply the filter while reading the S3 location using the S3 connection? Since I need to do it for multiple data sources reading it individually…
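One way to approach the filter is to build the list of partition prefixes first and read only those. Below is a minimal sketch in plain Python, assuming the `/load_date=YYYY-MM-DD/load_hour=HH` layout from the post. The function name `trailing_partition_paths`, the bucket prefix, and the reading of "the 24th load_hour" as hour `23` are assumptions for illustration; nothing here is a Dataiku or S3 API.

```python
from datetime import date, timedelta


def trailing_partition_paths(base, end_date, days=14, load_hour=23):
    """Build partition prefixes for the `days` days ending at end_date (inclusive).

    `base` is a hypothetical prefix; `load_hour` is zero-padded to two digits.
    """
    paths = []
    for offset in range(days - 1, -1, -1):
        day = end_date - timedelta(days=offset)
        paths.append(f"{base}/load_date={day.isoformat()}/load_hour={load_hour:02d}")
    return paths


# Hypothetical bucket and end date, oldest partition first.
paths = trailing_partition_paths("s3://my-bucket/source_a", date(2024, 12, 18))
for p in paths:
    print(p)
```

Because the base prefix is a parameter, the same helper could be reused for each data source instead of writing the filter by hand every time.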
-
Label task
Hello, I have created a label task for the evaluation of a query. The input is a sample of rows of a bigger dataset. With the label task you can then assign one of five categories. Using the label task has led to changes to the query that feeds the label task. Now I want to reset the task / remove the data associated…
-
I cannot install TextPreparation Plugin
Hi Friends, I am unable to install the TextPreparation Plugin to DSS Home. After several attempts, no success. All other NLP plugins were installed correctly. I am attaching the screenshot of the problem. Please try to help solve my problem. Operating system used: Windows 10
-
Logging in dataiku notebook / recipe ...
Hello Team, I am working on pyspark recipes. I use a notebook to build the logic and change it back into a recipe. The dataiku and spark operations (e.g. df.count()) emit a lot of log statements to the console and make the notebook very difficult to use. Is there a way for me to suppress logging from the dataiku and spark APIs?…
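For the Python side of the noise, the standard `logging` module lets you raise the level of individual loggers. The sketch below is only that: `py4j` is the logger name used by PySpark's Java gateway, while the helper `quiet_loggers` and any further names you add are assumptions, since the post does not say which loggers produce the output.

```python
import logging

# "py4j" is the logger used by PySpark's gateway; add other names only after
# checking which loggers actually produce the noise in your notebook.
NOISY_LOGGERS = ["py4j"]


def quiet_loggers(names, level=logging.WARNING):
    """Raise the threshold of the named loggers so lower-level records are dropped."""
    for name in names:
        logging.getLogger(name).setLevel(level)


quiet_loggers(NOISY_LOGGERS)
```

This does not touch JVM-side Spark output. That is usually controlled separately, for example with `sparkContext.setLogLevel("WARN")` on an existing Spark context; whether that is sufficient inside a Dataiku notebook is not confirmed by the post.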
-
Defining a global variable in the base name of the output file for a dataset
Hello I am working on a flow that has a python recipe that sets global variables. In the output dataset of the recipe a couple of these variables are being used to set the path and filename of the dataset which is stored in Azure. From researching on how to define the filename it states to set the "Force single output…
-
Make "Table of Contents" visible in Wiki Edit mode
When editing wiki pages, the Table of Contents tree is not visible, making navigation of the wiki exceedingly difficult. I'm often toggling between View and Edit modes in order to get around. Visibility of the Table of Contents tree / section headers while in edit mode would make editing wiki pages much easier.
-
Practical Use of Code Studios
I am starting this thread to learn about how others are using code studios (such as VSCode, JupyterLab, and Streamlit) and for what purposes. In our organization, we were initially excited about the feature introduced in DSS v11. However, our enthusiasm was quickly dampened by the fact that users cannot select the…
-
I am seeing a strange error while accessing my dataflows
Flows were working previously, but now this error window prevents me from using any of my projects.
-
Extend the "Rebuild Code Studio templates" option to non-admins when updating a code environment
I was pleasantly surprised to discover the "Rebuild Code Studio templates" option in the "Containerized Execution" settings of a code environment. This feature enables the rebuilding of Code Studio templates that rely on a given code environment, effectively killing two birds with one stone. However, after investigating…
-
Usage of tags of github in version control
Hi, I've connected my project version control to a remote repository hosted on github. Usually when developing with github I use tags to check releases of my code to move it to production, to be aware of which version of code I've deployed. I would like to use it in Dataiku in the same way, creating a bundle from the…
-
I have problems related to sending emails using the SMTP protocol which I run using the Dataiku scen
Operating system used: almalinux (8.8)
-
Clarification on handling streams through the API?
In this video: At around the timestamp 3:23, the last line of code shows folder.upload_stream("name_of_file_in_folder", f) Which appears to be incorrect since "folder" is undefined. Should this have been handle.upload_stream(… instead? (Also, please add version 13 as a version option in the ask a question form?) Operating…
-
Network / Firewall issue when deploying pods ?
Hello team, I'm looking for network / firewall configurations to resolve this error: [2024-12-18 17:09:19,841] [1/MainThread] [ERROR] [root] Could not reach DSS: HTTPConnectionPool(host='almalinux-8', port=10001): Max retries exceeded with url: /dip/api/tintercom/containers/get-execution (Caused by…
-
Reorder columns in a dataset
Hello, I would like to reorder the columns of my dataset without using a Prepare Recipe. Is it possible ? Thank you !
-
Questions on quick modeling prediction
I have questions on the quick modeling part of dataiku. Now I am completing an assignment, but I find that the column of data in my label data used to calculate the cost does not appear in the unlabeled data. This problem caused me to be unable to predict unlabeled data with the model I trained. I would like to ask how to…
-
Capture time stamp and duration for each individual activity in a job.
How can we retrieve the start and end timestamps for each individual activity that runs in parallel in a job?
-
Remote kernel for notebook/recipe
Similar to the Jupyter notebook capability. The reason is that some libraries/capabilities already exist on a remote server/terminal. The code would execute on the remote kernel and only the result would be returned.
-
Document conversion to source RAG
Support libraries like docling (https://ds4sd.github.io/docling/) and markitdown (https://github.com/microsoft/markitdown)
-
K-Modes supported
Hi, I noticed that Dataiku supports K-means clustering but couldn't find support for k-modes. Am I missing some documentation. If not, are there any plans to support k-modes clustering? thx Operating system used: Windows 10
-
Overview of all running jobs
Hello Community, Is it possible to get an overview of all running jobs? Is it possible by checking the internal metrics dss_jobs and then filtering on state empty? This seems to provide an overview of all jobs running. However, for example, if the job is partitioned, is it also possible to see all the sub-jobs running?…
-
How to set recipe container through dataikuapi
Hi, I would like to dynamically change the container for the execution of a recipe through the dataikuapi-external-client. I could not find any reference in the documentation. How can I achieve that (if possible)? Regards, Filippo
-
A user is trying to execute a Python recipe and gets the attached message. How do we fix that?
He can log in to the Dataiku web UI and is in the developer group. He also has rights to the Python code env. Thanks. Visanu
-
Extract ID out of a column
Hello, I'm trying to extract the numbers after the / (but it's not always the first /) and it's not the end of the URL... For example, I only want the bold number /33988-329055. I thought an extract would work but it doesn't... Does anyone know why? Thanks
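A regular expression that anchors on the slash and on the `digits-digits` shape, rather than on the position in the URL, may be what is needed here. The sketch below uses Python's `re` module; the sample URL and the helper name `extract_id` are made up for illustration, and the pattern assumes the ID always looks like two digit groups joined by a hyphen, as in `33988-329055`.

```python
import re

# A slash, then digits-hyphen-digits, followed by another slash or the end of
# the string. The capture group returns the ID without the leading slash.
ID_PATTERN = re.compile(r"/(\d+-\d+)(?=/|$)")


def extract_id(url):
    """Return the first digits-digits path segment, or None if there is none."""
    match = ID_PATTERN.search(url)
    return match.group(1) if match else None


# Hypothetical URL: the ID is neither the first segment nor the last one.
print(extract_id("https://example.com/shop/category/33988-329055/details"))
```

The same pattern (`/(\d+-\d+)(?=/|$)`) could be tried in a regex-based extraction step, provided the tool's regex flavor supports lookahead.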
-
Use Case : sync data
For example, table A has the most up-to-date data and table B has not been updated, so there are data inconsistencies. Suppose table A has 25 rows and table B only has 15: 1. Table A rows 1 - 10 have the same id as in table B, but the table B data has not been updated even though the id is the same. 2. Table A 11 - 25 the…
-
I am getting this error on my Windows10
{Cannot run program "conda": CreateProcess error=2, The system cannot find the file specified, caused by: IOException: CreateProcess error=2, The system cannot find the file specified} I am using Anaconda server. I updated my PATH file…
-
When I save from notebook to insights, the diagram does not show up on the insights screen.
Hi, community. I would like to save the static insights and add them to the dashboard. However, when I run the following command to save it to an insight, it does not show up when I go to the Insights screen. Can someone please help me with this? Below is the python code. --------- import dataiku from dataiku import…
-
Failed to attach GKE Cluster
Hello team, I'm getting this error when trying to attach a GKE cluster: Failed to start cluster : <class 'googleapiclient.errors.HttpError'> : {'vary': 'Origin, X-Origin, Referer', 'content-type': 'application/json; charset=UTF-8', 'date': 'Sat, 14 Dec 2024 17:04:59 GMT', 'server': 'ESF', 'cache-control': 'private',…
-
Add Auto Syncing Mode for Code Studios
As an end user of DSS I want to be able to auto sync my changes in the code studio to DSS so that I don't lose my work if the code studio crashes or automatically shuts down. Auto syncing would keep me from losing any work if the code studio gets turned off or I forget to sync my changes back…
-
Dataiku Answers get error
Hi team, I am trying to use Dataiku Answers to build a QA chatbot based on a knowledge bank that I built. After deployment, I can see the chat UI is OK, but when I ask a question, there is an error getting the response. The question I tried was successful in Prompt Studio. I attached the backend log here. Can someone help me…
-
Scope Error when trying to attach gke cluster
Hello Dataikers, I'm trying to attach a GKE cluster with a service account. On my DSS server the gcloud command works fine and I'm able to deploy a K8s cluster. In the DSS GUI, I'm facing some kind of rights issue with the API (I think). I'm getting this error: Failed to start cluster : <class 'googleapiclient.errors.HttpError'> :…
-
Any way to overlay "events" on a chart in my dashboard?
Hello, I am wondering if there is any way to overlay some custom text "events" on my charts. Effectively something like the below: or this: I don't need anything too fancy, my use case is that I am plotting some attributes, let's say # new users and I want to manually mark some events, eg "marketing campaign x" etc thank…
-
Add users and grant permissions via dataikuapi for Projects
hi Team, Do we have any programmatic way to add and grant permissions to users for projects via dataikuapi? (i.e) project.add_users via dataikuapi? project.get_permissions()['permissions'] Thanks, Thiagu
-
Remove columns by pattern
Let's say I want to remove all the columns that contain the word "Spot" in them. How would I do that? I cannot figure out the syntax for Remove columns matching.
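The matching logic itself can be illustrated outside Dataiku. The sketch below filters a list of column names with Python's `re` module; the column names and the helper `drop_matching` are hypothetical. It uses a partial match (`search`), so the bare pattern `Spot` is enough. If a tool requires the pattern to match the whole column name, the equivalent would presumably be `.*Spot.*`, though the post does not say which behavior the processor uses.

```python
import re


def drop_matching(columns, pattern):
    """Return the column names that do NOT contain a match for `pattern`."""
    regex = re.compile(pattern)
    return [name for name in columns if not regex.search(name)]


# Hypothetical column names for illustration.
columns = ["Date", "Spot Price", "Forward", "FX Spot", "Volume"]
kept = drop_matching(columns, r"Spot")
print(kept)
```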
-
Alteryx to DSS migration
I am planning to migrate Alteryx workflows to Dataiku, and I learned that Dataiku has an accelerator for this purpose. Can anyone help me with the URL to install the project for the Alteryx to DSS conversion partner accelerator tool?
-
Unable to modify SSH Private Key in Clusters/Configuration
Hello team, I'm facing an issue when trying to configure a GKE cluster. It looks like the SSH Private Key is in a config file somewhere on my DSS server, but where :) ? I have an error that I'm trying to resolve in my cluster creation. Do you have any idea where I can change the entry for the SSH private key, and which value is required…
-
No module named 'yaml'
Hello Dataikers, I'm trying to start or stop a GKE cluster and I'm facing an issue with the error: Failed to stop cluster : <class 'ModuleNotFoundError'> : No module named 'yaml' On my DSS server side: python3 -c "import yaml; print(yaml.__version__)" → 5.4.1 (the command returns the version). Am I missing something during configuration…
-
How to import a plugin after a dataiku instance change
Hello, I initially had a plugin that I developed available in a specific Dataiku instance. We had to change the Dataiku instance. Is there an easy way to reload or import this previous plugin into the new instance? Thank you
-
On opening/loading, python notebook gives Out of Memory error
Hi, When I try to open a python notebook, screen freezes after couple of seconds and finally throws error saying "Out of Memory". Any suggestions to resolve this issue is appreciated. Thanks. Regards, AJ
-
ELT best practices? (workspace, intermediate data sets, views...)
We are preparing a SAS to Databricks migration and we are considering Dataiku as a low-code ETL for non-technical users. Dataiku feels very close to an awesome experience but there are a few issues that make me worry about the sustainability of such an approach. Do you have recommended best practices to mitigate those…
-
Elasticsearch index with custom settings?
Hi community! I am wondering how I can create an Elasticsearch index with custom settings for the analyzer, filter and tokenizer. The documentation (doc.dataiku.com/dss/latest/connecting/elasticsearch.html) mentions "you can use an index template before building the managed dataset for the first time", however, it does not…
-
Enhance Dataiku - Snowflake interoperability
I have encountered several challenges involving column name handling and data type management while integrating with Snowflake. I'd pointed out a few things during a mission to integrate the platform on SF. It was no mean feat, especially when it came to managing schemas and types. I noticed that the problem is becoming…
-
Enhance Code Studios Templates APIs to support automated administration
Hi, The current Code Studios Templates APIs (see links below) don't support certain capabilities that we need. We would like to have Python APIs to: Obtain the full list of build IDs related to a Code Studios Template as shown in the Code Studios Template ⇒ Build History ⇒ Show Build drop down. This is needed to be able to…
-
Admin Academy
Hi, I want to ask about becoming an admin. I saw that my friend has an admin learning path.
-
SqlExecutor2 does not handle Snowflake ARRAY data type * Bug?
Hello, I am using the SQLExecutor2 to read a temporary table and write to a Snowflake dataset in a Python recipe. Here is the column data type: {"type":"ARRAY","length":16777216,"byteLength":16777216,"nullable":true,"fixed":false} Here is my Python code: Expression type does not match column data type, expecting…
-
How can we achieve data lineage in Dataiku
How can we track table and column dependencies in recipes?
-
Dataiku Input recipe using Encoding
Please add an option for character encoding when specifying input files; even if I want to specify UTF-8, I can't do so on Dataiku and have to use another tool to convert the character encoding before it can be imported into the Dataiku flow.
-
End to End Possibility for Dev-Ops Implementation with Best Practices
Right now Dataiku possesses a lot of unique abilities to develop scalable ML / Deep Learning / GenAI algorithms. In addition to that, Dataiku has facilitated collaborative development using flow zones etc. There are even a lot of data quality checks and metrics to facilitate operational efficiency and drift detection. But all…
-
Using SQLExecutor2 inside shared library
Hi, I would like to execute some raw sql queries like insert the rows directly into the oracle database. Based on the various community discussions, I chose to use SQLExecutor2. My code is as below: from dataiku import SQLExecutor2 import dataiku def test(): # get the needed data to prepare the query # for example, load…
-
How to use LLM Mesh work with LiteLLM
Hi, I'm working on an Agentic Gen AI project using the crewai package, which uses LiteLLM as the engine to connect to various Gen AI models. I would like to use Dataiku LLM Mesh, but it seems that it's not compatible with the LiteLLM. I tried to use the DKULLM and DKUChatLLM, but both of them are not working. I'm on…
-
Dataiku's Python environment is using a version of the GNU C Library (glibc) that is too old.
I can't import the packages pyg-lib and torch-sparse for pytorch-geometric, because Dataiku's Python environment is using a version of the GNU C Library (glibc) that is too old. Both pyg-lib and torch-sparse depend on glibc version 2.29 or newer, but the current system has an older version. I got an error while importing:…
-
Use knowledge bank in API Designer
Hello, Dataiku Team I am trying to deploy an endpoint to API node where I can use knowledge bank object from my project flow. I am using this code example but when I write this line: vector_store = kb.as_langchain_vectorstore() I get the next error: Failed: Failed to run function : <class…
-
Marimo Notebooks Integration in DSS
I'd like to propose the integration of Marimo notebooks alongside the existing Jupyter notebooks in DSS. Marimo is an innovative notebook environment that addresses several limitations of traditional Jupyter notebooks while maintaining compatibility. Here are some key advantages of Marimo notebooks: Code quality : Marimo…
-
Visual ML - model with multiple features
Hello, is it possible for the AutoML Prediction Visual ML recipe to have multiple targets? Currently, I can only create a prediction model on a single feature/target but I plan on prediction for two features as I need both predicted values for the custom metric scoring that I will be using for the said model. Thank you.…
-
How to get metrics (jmx, prometheus, etc) for Dataiku DSS ?
From what I can get from the documentation, DSS does not export metrics via Prometheus or JMX. The only thing it can do is export metrics to a Graphite/Carbon server. The documentation does not mention what metrics are actually exported either, so for me it's hard to tell if it's even worth going to all the trouble…
-
Move objects and Zones on the Flow
I think it would be very useful to be able to move objects and flow zones around in the flow display. It appears Dataiku determines where each recipe, dataset, etc go in the flow and I cannot edit that. I have used Alteryx in the past and it had that ability, which I liked. It allows me to organize the flow however I see…
-
Vertical Scrolling for Datasets
It would boost my productivity significantly if I could use "Shift" + "Scrollwheel" to vertically scroll. Instead of finding the small scrollbar in the bottom of the dataset each time.
-
How to retrieve the test dataset used in the trained model With python?
Hello everyone, I am working on Dataiku, primarily using their API. I have trained my model and would like to retrieve the dataset that was used for testing via the API methods. Despite trying several methods, including get_train_info(), I am unable to obtain the test dataset. I don't want to export it; I just want to…
-
Fixed copy of Python-based scenario that did not copy the script
How can we effectively replicate this issue in order to conduct thorough testing during the version upgrade process? Specifically, what steps should we follow to ensure that the issue is accurately reproduced, and what testing methodologies can we apply to assess its impact in the new version? Operating system used:…
-
Multi level variable in plugin field
I have the following variable "my_var" : { "value": "aaaa" } How can I access the value of my_var.value in a plugin field? I get the following error doing ${my_var.value} "Unknown DSS variable: my_var.value"
-
<class 'json.decoder.JSONDecodeError'> when evaluating a deployed Random Forest model
How to replicate: Using windows10, download the latest Dataiku DSS on-premise version (13.2.3). Create a New project, upload any dataset with a "target" column having binary value. Click the dataset - Lab - AutoML Prediction - Quick Prototype - Train a Random Forest model on "target", using default settings. Deploy the…
-
How to resolve labelling dashboard displaying data already labelled?
For a new project I have setup a Webapp for labelling of tabular data. I added the Webapp as a tile in a dashboard. For me both the Webapp and dashboard function properly. Yet, when I share the dashboard with a team member it displays the dataset is already labelled and does not allow for further labelling of the data. Do…
-
Is it possible to disable attachment function in scenario [Send message] [Reporters] setting?
as subject, this is to control security for data delivery. Thanks.
-
Automated alerts from Unified Monitoring on bundle or API endpoint failure
We find the Unified Monitoring (UM) feature extremely useful as it allows us to see the health of our bundle and real-time prediction APIs. However, there is no way to be alerted if a deployment fails or if an API endpoint is down. We currently have some Python scripts that scrape the data from UM and then identify any…
-
Updating project long description using Python
I believe the long description is stored in a long-description.md file. If that is true, how can I update this file using Python? Context here is we are duplicating a known-good workflow and writing user-specific information into the long description. thx Operating system used: Windows 10
-
How do I see the computed column in Dataiku?
Hi, I'm new to Dataiku. I'm using Dataiku 13.2.2. Currently I'm going through the Core Designer Certificate from Dataiku Academy. I have generated 3 calculated fields using Prepare recipes, but the calculated fields are not visible in my prepared dataset. Can you please let me know the steps to view those fields? Operating…
-
How to load a pre-trained model into a codenv (Resources Directory) in a no-internet-access instance
Hi. I am looking to use a pretrained model (for example an embedding model) within my project. The DSS instance I am working on cannot access the Internet. Still, I was able to retrieve the models at some point…and now I want to re-use them. I was also able to upload the model in a managed folder and use it in a code recipe…
-
Conditionally attaching a file to an email reporter
Hi, We attach the final dataset to a scenario email reporter when it is complete. The issue is that when the scenario fails, we want to send the email with no attachment. When the scenario passes, we want to attach the Excel file. How can this be done? thx Operating system used: Windows 10
-
Where can we modify the configuration for audit.log files
Hi everyone, As stated in the title of this post I am trying to find out how to modify the initial configuration of the audit.log files. By default, we keep 20 files of 100Mb and I wanted to know if it was possible to modify for example the number of files to increase it to 40 (for instance)? I have not found any…
-
403 Forbidden on Jupyter notebooks after updating from 13.0.0 to 13.0.3
Hello, I'm using a custom installed dataiku, on debian 11, free license (with advanced features trial). I'm getting "403 Forbidden" when opening jupyter notebooks after updating from 13.0.0 to 13.0.3. I've noticed that I don't get them when connecting to the Dataiku instance directly from my home network, only when doing…
-
About using python code from Global Shared Code
Hello, I have added a file named test_flow_global.py at Global Shared Code at path lib-python (main folder)--> python (sub-folder), the file contain def function that I want to just import and test in a notebook, how can I do that ? I am just a beginner in Python so looking for any help. saved file name - test_global.py…
-
I'm being redirected to another site instead of my Dataiku instance.
I'm on a student license. Until yesterday, I was able to open my instance, but now I'm being redirected to revizto . com. Please solve my problem. Thanks. Operating system used: Mac OS 15.1
-
Error with "Optical Character Recognition (OCR)"
Hello, When I try to use the recipe "Optical Character Recognition (OCR)" on a folder containing grayscaled pictures (obtained with "Greyscale" recipe), it fails. The error type is: "Error in Python process: At line 22: <class 'ImportError'>: libGL.so.1: cannot open shared object file: No such file or directory". Can you…
-
Dark Mode
Every developer needs a dark mode A dark theme for the flow, datasets, and recipe configs would go a long way toward making Dataiku fit into workflows that involve many other dark mode tools. Dataiku is definitely very bright when swapping from other tools which operate in dark mode. Extensions like Dark Reader do a pretty…
-
googlesheets plugin feature: Ignore top n rows on import
Reading a google sheet with the plugin currently requires that header columns are in row 1. In the wild, a lot of users don't build sheets like that and the data begins some rows down the sheet. I suggest to add a feature of ignoring a number of top rows to correctly set the header row and table data.
-
Show all data points in Charts even when "Automatic" date range and Zoom are enabled
As per the title, the Chart seems to truncate at the 3rd or 2nd last data point when the Automatic date range and Zoom are enabled (see first screenshot). If I instead select the actual granularity of my data (e.g. Day from the X axis Date Range drop-down menu) then the last data points appear on the chart, BUT I lose the…
-
How to manage file format restrictions like the MP3 requirement in Speakatoo’s API when creating a w
I am developing a widget and need to handle file format restrictions specified by Speakatoo’s API, such as the requirement for MP3 files. What’s the best way to manage this? Are there any recommended practices or examples for ensuring compliance with the API’s file format rules? Any help would be appreciated!
-
automatically remove obsolete versions of code envs on Automation Nodes
This product idea addresses the issue discussed here: Remove old versioned environments and kernels after importing a bundle. Recently, we faced an issue with one of our automation nodes. New deployments were failing because there was no space left on disk. Upon investigation, we discovered that a code environment was…
-
Retrieval Augmented Generations Tutorial Question
In the tutorial Tutorial | Use the Retrieval Augmented Generation (RAG) approach for question-answering There is prerequisite section that says the following: A compatible code environment for retrieval augmented models. This environment must be created beforehand by an administrator in the Administration panel > Settings…
-
Allow configure CORS on API Deployed Settings
When an API is deployed with Dataiku, the CORS security layer does not allow the service to be consumed, since it is hosted on a different server, so web browsers throw a CORS error. Please add an entry in Deployment Settings to allow us to disable or configure CORS according to our needs. Settings references for Apiman:…
-
Code Studio, Streamlit app needs to write back to data set.
Hello Community, I have a Streamlit app on Code Studio which reads data from the datasets. I am trying to write back comments from Streamlit to a Dataiku dataset. When I try to use the below code df_write = dataiku.Dataset('Comments db') df_write=write_with_schema(df_user_comments,drop_and_create=True) it doesn't work. giving…
-
PYSPARK_PYTHON environment variable issue in PySpark
Hi I am facing below issue for PySpark recipe. Exception: Python in worker has different version 2.7 than that in driver 3.6, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. I have set the environment variables using…
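The error message asks for `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` to agree. A minimal sketch of one way to do that from Python is shown below, assuming the variables are set before any Spark context is created and that the chosen interpreter path also exists on the workers (true in local mode, not guaranteed on a cluster). The helper name `align_pyspark_python` is hypothetical, and how this interacts with Dataiku's own recipe configuration is not covered by the post.

```python
import os
import sys


def align_pyspark_python(interpreter=sys.executable):
    """Point the PySpark driver and workers at the same Python interpreter.

    Must run before a SparkContext/SparkSession is created; the path must be
    valid on every node that runs Python workers (assumption: shared path).
    """
    os.environ["PYSPARK_PYTHON"] = interpreter
    os.environ["PYSPARK_DRIVER_PYTHON"] = interpreter
    return interpreter


chosen = align_pyspark_python()
print("PySpark will use:", chosen)
```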
-
This problem is occurring on the Dataiku platform. What can I do to solve it?
-
Saving Vector Store as KB
I was wondering if there was any way of saving a FAISS vector store I create in a python notebook as a knowledge bank I can use later on? I created a vector store (see code below) which has summaries as the embedded objects, and the parent documents as the retrieved documents. I did this based on LangChain's…
-
Reverse Proxy Configuration
We are trying to set up a reverse proxy. The code below is useful, but we are not sure where to put it. In which configuration file should this code be kept? As DSS comes with nginx embedded in it, I am not able to find the directory in which the nginx related files lie. # nginx SSL reverse proxy configuration for…
-
delete vs drop
Hi, I am currently working on a project in which I have to collect data that should be new and have the same schema as the previous data, but without keeping the past data. How can I do this? At first I thought the delete option would work for me, but in the end it didn't do what I was looking for. Thanks in advance Operating…
-
How Can I Integrate Speakatoo’s Text-to-Speech API with Dataiku for Audio Data Insights?
I’m looking to enhance my Dataiku workflows by integrating Speakatoo’s text-to-speech (TTS) API to turn data insights or alerts into audio. Has anyone tried using a TTS service like Speakatoo within Dataiku for this purpose? I think it could help make data monitoring or reporting more accessible. What challenges should I…
-
Removing project tags in dataiku via API
For updating project tags I've tried the below mentioned options. awb_eng_project_metadata = awb_eng_project.get_metadata() awb_eng_project_metadata['tags'] = [] awb_eng_project.set_metadata(awb_eng_project_metadata) awb_eng_project_metadata['tags'] = ['foo'] awb_eng_project.set_metadata(awb_eng_project_metadata) While…
-
Group Recipe - Empty dataset
Hello, I have an issue regarding a group recipe. Does it work on an empty dataset ? Secondly, if it doesn't work, is it possible to stop a scenario if this dataset is empty ? Thank you for your answer !
-
Create bundle with release notes through Python API
Hello everyone! Through the Web UI, when creating bundles, we are allowed to enter a text for release notes. I wonder if there is a way to do the same through the Python API, which we use to automatically create bundles as part of our CI/CD pipeline. The pipeline we created was inspired by Dataiku's template here which…
-
how to apply basic table tool option in Dataiku . reference picture added in below
-
OpenMP runtime is not installed
Hi, I received the error below. How can I resolve it ? Failed to train : <class 'xgboost.core.XGBoostError'> : XGBoost Library (libxgboost.dylib) could not be loaded. Likely causes: * OpenMP runtime is not installed - vcomp140.dll or libgomp-1.dll for Windows - libomp.dylib for Mac OSX - libgomp.so for Linux and other…
-
Import txt file in dataset
Hello, I have a fixed-width txt file. When I import the file into a dataset, Dataiku does not respect the position of the strings, so when I extract the data from the selected row in col_0 the position changes. For example, if in the original file the amount is the substring at position 9 for 9 characters, after I have imported the file this position…
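For reference, fixed-width parsing comes down to slicing each raw line at known offsets, which only works if the line is kept intact (no trimming or collapsing of spaces). The sketch below shows this in plain Python. Only the "position 9 for 9 characters" amount field comes from the post; the other fields, the sample line, and the names `FIELDS` and `parse_fixed_width` are invented for illustration.

```python
# (name, 1-based start position, length). Only "amount" (position 9, length 9)
# comes from the post; the other two fields are hypothetical.
FIELDS = [("record_id", 1, 8), ("amount", 9, 9), ("label", 18, 10)]


def parse_fixed_width(line, fields):
    """Slice one raw line into named fields without stripping any whitespace."""
    row = {}
    for name, start, length in fields:
        begin = start - 1  # convert the 1-based position to a 0-based index
        row[name] = line[begin:begin + length]
    return row


# Hypothetical sample line: 8 + 9 + 10 characters.
line = "ID000001" + "000123.45" + "INVOICE   "
row = parse_fixed_width(line, FIELDS)
print(row)
```

If positions shift after import, one thing to check is whether leading or repeated spaces were altered before the substring step, since every offset depends on them.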
-
Calculate a single metric/check via the Dataiku API
Hi, It is currently not possible to calculate a single metric or check via the Dataiku API while this is possible via the GUI. The following APIs exist: dataset.compute_metrics() dataset.run_checks() but they will calculate all enabled metrics/checks which may take a lot of time. So this idea is to provide an API to allow…
-
Copying / saving Checks and metrics
Hello, Is there any way to copy checks and metrics from one data set to another? Is it possible to save a custom code check as a plugin ? Operating system used: Centos 7
-
Edit default metrics and checks as a project-wide setting
When creating a new dataset, I practically always edit the default metrics and checks to run row counts after build. Ideally, I could define this from the project settings so that every new dataset created automatically has my desired metrics and checks configured. Of course, this doesn't apply to column-specific values,…
-
Enable/disable scenario step
Hi, While debugging a scenario you may disable some prior steps in order to avoid some long SQL query. If you disable the previous steps, it removes the conditions set too. Could you disable this behavior ? Greetings, Steven
-
Visibility condition in plugin parameter
I have a Python plugin with a parameter A of type "MULTISELECT" and another parameter B which I'd like to hide based on A. For instance, if A can take on the values (1, 2, 3), what should be the "visibilityCondition" for parameter B when A meets one of these conditions? * Hide B if A has no values selected * Hide B if A =…
-
Add Seldon to deployment options
One of the deployment options in our company is Seldon (Seldon, MLOps for the Enterprise.). It would be great if Dataiku had the option to deploy directly to Seldon, the way deployment to K8, AWS, Databricks or Azure is now possible. Seldon in general deploys MLflow artefacts.
-
add an incremental column in dataset
The requirement is to add an incremental column to a dataset. It should not be an identity column; however, the data in it must be unique.
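As a plain illustration of the idea, the sketch below assigns a unique, increasing number to each row in Python, independent of any database identity column. The rows, the column name `row_id`, and the helper `add_row_number` are hypothetical; in a real flow the same logic would be applied to whatever in-memory structure the rows are loaded into.

```python
from itertools import count


def add_row_number(rows, column="row_id", start=1):
    """Return copies of the rows with a unique, increasing integer column added."""
    counter = count(start)
    return [{**row, column: next(counter)} for row in rows]


# Hypothetical rows for illustration.
rows = [{"name": "a"}, {"name": "b"}, {"name": "c"}]
numbered = add_row_number(rows)
print(numbered)
```

Note that the numbers are unique only within one run. If the dataset is appended to later, `start` would need to continue from the previous maximum.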