Using Dataiku

Dataiku Online users: what databases do you use?
We are considering a migration to Dataiku Online. To all Dataiku Online users: what database do you use with your Dataiku Online instance? How much data are you loading into that database? How do you find the setup? Operating system used: macOS Sonoma 14.4.1
Using the chart functionality for a dataset from data warehouse
Hi, I have a table that resides in a data warehouse. Once I create a connection to that table in Dataiku, I would like to explore the data directly without any additional processing, so I was hoping to use the "Chart" tab available when I double-click on the dataset. I had initially selected the option "In-database"…
XML Parsing => Python recipe
Hello, I would like a Python recipe that makes it easy to load my XML file into Dataiku. If anyone has this magic recipe, I would appreciate it. Many thanks
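One possible starting point, as a minimal sketch rather than a ready-made recipe: the standard-library `xml.etree.ElementTree` module can flatten repeated XML elements into one dictionary per record. The sample document, the `order` tag, and the helper name `xml_to_rows` are all hypothetical, since the question does not show the actual file layout.

```python
import xml.etree.ElementTree as ET


def xml_to_rows(xml_text, record_tag):
    """Flatten each <record_tag> element into a dict.

    Attributes and direct child elements become columns. Nested
    structures deeper than one level are not handled in this sketch.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter(record_tag):
        row = dict(record.attrib)  # start with the element's attributes
        for child in record:
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows


# Hypothetical sample; the real file structure is not given in the post.
SAMPLE_XML = """<orders>
  <order id="1"><customer>Alice</customer><amount>12.50</amount></order>
  <order id="2"><customer>Bob</customer><amount>7.00</amount></order>
</orders>"""

rows = xml_to_rows(SAMPLE_XML, "order")
for row in rows:
    print(row)
```

All values come back as strings, so numeric columns would still need converting. In a Python recipe, the resulting rows could then be loaded into a pandas DataFrame and written to the output dataset; that step is left out here because it depends on the project setup.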
How to export a saved model, as a zip file, in a managed folder?
It would be great to have some help from the community: "How to export a saved model, as a zip file, in a managed folder?" It seems that I need to do it in two steps: # Step 1: save the model as a zip file at the instance where the current project is running # Step 2: upload the zip file from the instance location to the…
SQL Statement in Dataiku, not recognizing function
I am trying to return the YEAR and MONTH of a date in SQL using the YEAR and MONTH functions in a SQL recipe. For some reason YEAR is recognized, but DSS does not accept MONTH. Can someone help me troubleshoot why? Thank you
Replicating Statistics Tab analyses on a different dataset
Hi, I am working on a correlation analysis. I prepared my data and started using the Statistics tab in the dataset to generate a correlation matrix and other statistical summaries. As I was looking at the scatter plot results, I realized that there were outliers in the data that I wanted to clip. So, I decided to run a…
Difference between rows
I have data with a years column and a sales column. I want to create another column, difference, which calculates the difference between the years. How can I do that, and what formula should I use?
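A minimal sketch of one way to read this question, assuming "difference between the years" means the change in sales from one year to the next: sort by year, then subtract the previous row's sales from the current row's. The column names `year` and `sales`, the sample figures, and the helper name are hypothetical.

```python
def add_year_over_year_difference(rows, year_key="year", value_key="sales"):
    """Return rows sorted by year with a 'difference' column added.

    The difference is the current value minus the previous year's value;
    the first row has no previous year, so its difference is None.
    """
    ordered = sorted(rows, key=lambda r: r[year_key])
    result = []
    previous = None
    for row in ordered:
        diff = None if previous is None else row[value_key] - previous
        result.append({**row, "difference": diff})
        previous = row[value_key]
    return result


# Hypothetical sample data, deliberately out of order.
sample = [
    {"year": 2021, "sales": 120},
    {"year": 2020, "sales": 100},
    {"year": 2022, "sales": 90},
]

with_diff = add_year_over_year_difference(sample)
for row in with_diff:
    print(row)
```

With pandas, the same idea can be written as `df.sort_values("year")["sales"].diff()`. Sorting first matters in both versions, because the difference is taken against whichever row comes immediately before.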
Ideas for using partitioned and non-partitioned datasets in parallel
Hi! So I have this case in which I have a dataset that receives files every 30 minutes. I have to process the files as soon as they arrive, so I had to partition them by hour (to be able to process only the latest hour). But I also need to have an option to run a "reprocess" on the whole dataset. Unfortunately…
Window recipe not producing expected results when using DSS engine
Hi there, The issue I am having is that the DSS engine is producing a completely different result than when I use the SQL engine. Has anyone faced a similar issue? I would appreciate some insight on this. Basically, all I want to do is produce a column with the MAX() value inferred from another column. No partitions, no…
Jenkins - Project and API
Hello, I'm seeking documentation/guidance on implementing a Jenkins pipeline for our specific use case, but I haven't found any helpful resources. Our API service consists of two endpoints (Python Function Endpoint and Model Predict Endpoint), and the Project Library containing business logic sits between them. The trained…

Trending Discussions

Docs for "pandasutils"?
Hello, My apologies if this is a remedial question, but at the start of every Python recipe the boilerplate code includes an import of: from dataiku import pandasutils as pdu Is there documentation for pandasutils? Is it a package that can be used in Python recipes? I've tried looking through the Dataiku Developer Guide,…
Run a Time Series Forecasting Model
I get the following error message: Failed to train : <class 'ImportError'> : libcuda.so.1: cannot open shared object file: No such file or directory Operating system used: 13.1.4
Identifying the Node Type in a DSS Notebook using Python
In Python, in a DSS notebook, I want to know if the code is running in the design node or the automation node. How can I do that?
