I am using Dataiku installed on Mac OS. When I train a model, I get the error below. Do I need to install something, as the error suggests? Failed to train : <class 'xgboost.core.XGBoostError'> : XGBoost Library (libxgboost.dylib) could not be loaded. Likely causes: * OpenMP runtime is not installed - vcomp140.dll or libgomp-1.dll for Windows - libomp.dylib for Mac OSX -…
I am trying to build models through Dataiku's Python API. I want to deploy the model as an API endpoint. I want to add some additional feature creation steps in the visual analysis to pass raw data to the endpoint, as given below in the Dataiku Documentation. I want to know if it's possible to create preprocessing steps…
Suppose I have a pandas DataFrame and I want to create a new table in a SQL Server connection with all the data in the DataFrame. For Snowflake, I use the DkuSnowpark module and write_with_schema, but I couldn't find something similar for SQL Server. I tried using SQLAlchemy, but I got a driver error and couldn't find another way.
I want to create a scenario to refresh a workbook once a day, but as I understand it, Dataiku is typically used in cloud or server environments where direct interaction with local applications like Microsoft Excel through COM (Component Object Model) objects is not supported. How can I run this Python script in Dataiku?
I am trying to upload datasets from my location to Dataiku, but it only allows me to upload data smaller than 1 GB in size. I have tried several types, but none worked because an error is generated when loading the data. I don't know whether this is due to the instance or to the license I have.
Hi, I am trying to save a pandas styler object as an HTML ( ref: pandas.io.formats.style.Styler.to_html — pandas 2.2.2 documentation (pydata.org) ) in Dataiku Managed folder (HDFS not local). Can you help me with that? My dataframe styler name: df folder handler: folder Code to save to a managed folder: region_monitor_path…
Hello, I would like to increment a row counter, grouped by some variables, only when a condition between a date and its lag is true. The idea is the following: if it is the first time we encounter an id, then var = 1; else if id = id_lag and dat - dat_lag > 30, then var = var + 1. I tried to do this with a window recipe but…
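The rule described in the post above can be sketched in plain Python, for example inside a Python recipe. This is only a minimal sketch under assumptions: the post is truncated, so the function name `gap_counter`, the `(id, dat)` tuple layout, and the choice to leave `var` unchanged when the gap is 30 days or less are all hypothetical.

```python
from datetime import date

def gap_counter(rows, max_gap_days=30):
    """Assign a per-id counter that increases when the gap to the previous date exceeds max_gap_days.

    rows: iterable of (id, dat) tuples, where dat is a datetime.date.
    Returns a list of (id, dat, var) tuples sorted by id, then dat.
    """
    out = []
    prev_id, prev_dat, var = None, None, 0
    for rec_id, dat in sorted(rows):
        if rec_id != prev_id:
            # First time this id is encountered: restart the counter.
            var = 1
        elif (dat - prev_dat).days > max_gap_days:
            # Same id as the previous row and the gap is large enough: increment.
            var += 1
        # Otherwise var keeps its previous value (assumption, not stated in the post).
        out.append((rec_id, dat, var))
        prev_id, prev_dat = rec_id, dat
    return out
```

Called with two dates for the same id that are 14 days apart followed by one 46 days later, the counter would stay at 1 for the first two rows and move to 2 for the third.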
Hello! I would like to automatically add a timestamp to my output dataset names (and then export them to folders). Does anyone know how to do that? For example, on September 10th, my dataset would be named "Dataset_100924".
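The naming pattern in the post above looks like day, month, two-digit year (DDMMYY). A minimal sketch of building such a name in Python follows; the helper name `timestamped_name` is hypothetical, the year 2024 is inferred from the "24" suffix, and how the name is then applied to a dataset or exported file in Dataiku is not covered here.

```python
from datetime import date

def timestamped_name(base, day=None):
    """Return base + '_' + DDMMYY for the given day (defaults to today)."""
    day = day or date.today()
    return f"{base}_{day.strftime('%d%m%y')}"

# Example with the date from the post (year assumed to be 2024).
print(timestamped_name("Dataset", date(2024, 9, 10)))
```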
Hello! I would like to export several datasets from my project to the same folder (one folder for each required date). Does anyone know how to do that? Thanks!
I am trying to combine multiple rows into a single nested JSON object. I know how to do the opposite (i.e. flatten), but cannot find the right tool to go the opposite direction. As an example, I start with this data:
Class, Student, Grade
1, Sally, A
1, Matt, A
1, Phil, C
What I want as an output is a single record: Class,…
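One way to go from flat rows to a nested record is to group by the key column and serialize the remaining columns as a JSON array. The sketch below uses only the standard library; since the desired output in the post is truncated, the output shape, the function name `nest_rows`, and the `Students` column name are assumptions.

```python
import json
from collections import defaultdict

def nest_rows(rows, key="Class", child_key="Students"):
    """Group flat row dicts by `key` and pack the other columns into a JSON array string."""
    grouped = defaultdict(list)
    for row in rows:
        # Keep every column except the grouping key in the nested object.
        children = {k: v for k, v in row.items() if k != key}
        grouped[row[key]].append(children)
    # One output record per distinct key value, in first-seen order.
    return [{key: k, child_key: json.dumps(v)} for k, v in grouped.items()]

rows = [
    {"Class": 1, "Student": "Sally", "Grade": "A"},
    {"Class": 1, "Student": "Matt", "Grade": "A"},
    {"Class": 1, "Student": "Phil", "Grade": "C"},
]
print(nest_rows(rows))
```

With the sample data from the post this yields a single record for Class 1 whose `Students` field holds the three student/grade pairs as a JSON array.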