I'm trying to use a Python recipe to export a partitioned dataset. When I partition by a specific column (a date column, LDD for example), that column is removed from the dataset. How can I export the partitioned dataset into monthly files based on LDD, the column it was partitioned by?
Hi folks, I want to update specific cells in an Excel sheet using openpyxl.load_workbook in Python. When I run the code, I don’t encounter any errors, but nothing gets updated in the file. How can I solve this problem? Thanks in advance.
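One common cause of this symptom, offered here as a guess since the original code is not shown, is that the workbook is modified in memory but never written back with save(). A minimal sketch, where update_cells, path, and updates are hypothetical names used only for illustration:

```python
from openpyxl import load_workbook

def update_cells(path, updates, sheet_name=None):
    """Write values into specific cells and persist them to disk.

    `updates` maps cell references (e.g. "A1") to new values.
    The function name and its parameters are illustrative assumptions.
    """
    wb = load_workbook(path)
    # Use the named sheet if given, otherwise the active one
    ws = wb[sheet_name] if sheet_name else wb.active
    for cell_ref, value in updates.items():
        ws[cell_ref] = value
    # Changes only exist in memory until save() is called;
    # skipping this step raises no error but leaves the file unchanged
    wb.save(path)
```

If save() is already being called, it may be worth checking that the path passed to it is the same file being inspected afterwards, and that the file is not open in Excel at the time.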
I have a dataset that I want to randomly split into train and test sets with an 80/20 ratio. I aim to repeat this random splitting and bootstrapping of the training data 1,000 times. For each iteration, I'll train an XGBoost model and then export the SHAP values, Gini index for each feature, F1 score, and ROC AUC for the…
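The resampling loop described here could be sketched roughly as below. This only generates the row indices for each iteration; the model training and metric export are left as a placeholder comment, and resampling_plan and its parameters are hypothetical names, not part of any library.

```python
import random

def resampling_plan(n_rows, n_iterations=1000, test_fraction=0.2, seed=0):
    """Yield (iteration, bootstrap_train_idx, test_idx) for repeated random splits."""
    rng = random.Random(seed)
    n_test = round(n_rows * test_fraction)
    for i in range(n_iterations):
        # Fresh random permutation of the row positions for every iteration
        perm = rng.sample(range(n_rows), n_rows)
        test_idx = perm[:n_test]
        train_idx = perm[n_test:]
        # Bootstrap: sample the training rows with replacement, same size
        boot_idx = rng.choices(train_idx, k=len(train_idx))
        yield i, boot_idx, test_idx

for i, boot_idx, test_idx in resampling_plan(n_rows=100, n_iterations=3):
    # Placeholder: fit the model on rows boot_idx, evaluate on rows test_idx,
    # then append that iteration's metrics to a results list
    pass
```

Because the bootstrap sample is drawn only from the training indices, no test row can leak into training within an iteration. The index lists can be passed to DataFrame.iloc if the data is held in pandas.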
I am working on a scorecard in Dataiku and I would like to calculate the percentage of completion across a set number of columns. Basically, I would like to replicate this Excel formula: =SUM(COUNTIF(ColumnX:ColumnXX,"*")/Total Number of Columns) and am having issues. The columns are a mix of strings, integers, and text,…
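In a Python recipe, one possible way to get a per-row completion percentage is sketched below with pandas. It assumes a cell counts as "filled" when it is neither null nor an empty or whitespace-only string; note that, unlike COUNTIF with "*", this also counts numeric cells. The function name completion_pct and the column names in the usage lines are made up for illustration.

```python
import pandas as pd

def completion_pct(df, columns):
    """Return, per row, the percentage of `columns` that contain a value."""
    subset = df[columns]
    # Not filled if the cell is null ...
    not_null = subset.notna()
    # ... or if its text form is empty after trimming whitespace
    not_blank = subset.astype(str).apply(lambda s: s.str.strip() != "")
    filled = not_null & not_blank
    return filled.sum(axis=1) / len(columns) * 100

# Hypothetical data mixing strings and numbers
df = pd.DataFrame({
    "a": ["x", None, ""],
    "b": [1, 2, None],
    "c": ["y", "z", " "],
})
df["pct_complete"] = completion_pct(df, ["a", "b", "c"])
```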
Starting from an original dataset, I ran a Time Series Lab and deployed a model. I then train the model > predict > score based on the original dataset, which gives me the forecast data. The forecast data contains date, values, forecast, and percentile columns. I assume the values column holds the original data. However, when I compare it to my original data it…
Hi, I am attempting to use the saved model object to predict a dataset using the score recipe. However, I am encountering the following error message: "Invalid argument - An invalid argument has been encountered: Unknown DSS variable: dip.projectKey." Can you help me resolve this issue? Operating system used: Windows
Is there a mechanism to retrieve the list of projects in our Dataiku instance that have specific tags? Thanks!
Hi, I have a project 'test' which has 6 datasets: student_name and s_1, s_2, s_3, ..., s_5. The student_name dataset has 2 columns: name and content. The content column shows the names of the datasets pertaining to each student (s_1 etc.). I am creating a dashboard where each student will have a tab to show the contents of the related s_n dataset.…
Hi, I am trying to follow the MLOps best practices to deploy a model in production. I am trying to register a model after adding it to experiment tracking. The model has been added to experiment tracking successfully as an 'OTHER' model using the syntax below: mlflow.pyfunc.log_model( artifact_path=f"model_age}",…