-
get_dataset loading strings as floats
I have a dataset with US Zip Codes in it, which are obviously very similar to integers. I need to do some processing in Python on them, and have built a notebook to do so. However, when I call: my_dataset = dataiku.Dataset("my_dataset") and then my_dataset_df = my_dataset.get_dataframe(), I find that sometimes my Zip Codes get…
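Going by the title, the values arrive as floats rather than strings. A minimal sketch of repairing them afterwards, assuming 5-digit ZIP codes whose leading zeros were lost in the conversion. The helper name restore_zip is hypothetical and not part of Dataiku or pandas:

```python
import math


def restore_zip(value):
    """Turn a ZIP code that was parsed as a number back into a 5-character string.

    Assumes plain 5-digit US ZIP codes; missing values are returned as None.
    """
    if value is None:
        return None
    if isinstance(value, float):
        if math.isnan(value):
            return None
        value = int(value)  # e.g. 2134.0 becomes 2134
    # Pad with leading zeros that were dropped when the value became a number.
    return str(value).strip().zfill(5)


print(restore_zip(2134.0))
print(restore_zip("02134"))
```

On a DataFrame, a function like this could be applied to the column with pandas' Series.map. Preventing the type inference at load time is usually cleaner than repairing values after the fact, if the loading call offers such an option.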
-
Convert/Transform 'Money Value'
Hello, I am importing a column into a project from Excel. The column is coming in as 'Money Value', i.e. $12,000.00. I'm trying to convert or transform the values to a number (decimal, etc.). I've researched but haven't been able to find a method. I'd like to include this in my Dataiku workflow and not have to reformat in…
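One way to do this conversion in a Python step, sketched under the assumption that the values use US formatting (dollar sign, comma as thousands separator, dot as decimal separator). The function name parse_money is hypothetical:

```python
import re
from decimal import Decimal


def parse_money(text):
    """Convert a US-formatted money string such as '$12,000.00' to a Decimal."""
    # Drop everything except digits, the decimal point, and a minus sign.
    cleaned = re.sub(r"[^\d.\-]", "", text)
    return Decimal(cleaned)


amount = parse_money("$12,000.00")
print(amount)
```

Decimal is used here rather than float so that cents are not subject to binary rounding. Other locales or accounting-style negatives in parentheses would need extra handling.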
-
Dummy/One-Hot Encode an Array/Set of Columns?
In my data I have two different types of data that I basically want to treat the same way. In one I have a column with array data, like:
ColumnA
[A,B]
[A]
[B,C]
I want to dummy encode these to make something like:
ColumnA_A  ColumnA_B  ColumnA_C
1          1          0
1          0          0
0          1          1
And then in another case I have a set of columns like:…
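For the array-column case, the encoding described in the post can be sketched in plain Python. This assumes each cell has already been parsed into a list of labels; dummy_encode is a hypothetical helper name:

```python
def dummy_encode(rows, prefix):
    """One-hot encode a column whose cells are lists of labels.

    Returns one dict per input row, with a 0/1 flag for every label
    that appears anywhere in the column.
    """
    # Collect every distinct label across all rows, in a stable order.
    categories = sorted({item for row in rows for item in row})
    return [
        {f"{prefix}_{cat}": int(cat in row) for cat in categories}
        for row in rows
    ]


encoded = dummy_encode([["A", "B"], ["A"], ["B", "C"]], "ColumnA")
for row in encoded:
    print(row)
```

The list of dicts can be handed to a DataFrame constructor if a tabular result is needed.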
-
How to delete a Dataiku account?
Hi everyone, I recently created this Dataiku account (mostly for Dataiku Academy), so I'd like to delete my old account, which was created with a different email address that I don't use anymore. Could you help me with this? Thanks
-
Set model as an output in python recipe.
Hi, I am new to Dataiku and I would like to know how to save a regressor as an output in a Python recipe. I have seen an option to load a model in "Interaction with saved models" (Dataiku DSS 11 documentation). How can I set as an output the model created, for example, in this function: def train_model(X_train: pd.DataFrame,…
-
Authentication token has expired. The user must authenticate again.
When I build tables from huge datasets, the computation usually takes a long time. When a single recipe runs for more than 5 hours, I get this token expired error and the job fails. Is there a way to extend the allowed run duration?
-
Limit the size of a dataset with appending behavior
Hi, I want to limit the number of rows in a Dataiku dataset. It should only keep the latest 90 rows and delete the oldest. The dataset is built by appending one row at a time to it. I tried, instead of appending directly to the dataset, to create another dataset with just the newest row and a Python recipe to implement the…
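The trimming logic itself is small. A minimal sketch, assuming the existing rows are held in a list ordered from oldest to newest; append_with_limit is a hypothetical helper, and reading from and writing back to the dataset is left out:

```python
def append_with_limit(rows, new_row, max_rows=90):
    """Append new_row and keep only the most recent max_rows entries.

    Assumes rows is ordered oldest first, so the oldest entries are
    the ones dropped from the front.
    """
    return (rows + [new_row])[-max_rows:]


history = list(range(90))          # 90 existing rows, oldest first
history = append_with_limit(history, 90)
print(len(history), history[0], history[-1])
```

The same slice-from-the-end idea carries over to a DataFrame sorted by insertion time, for example with pandas' DataFrame.tail.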
-
Assistance Needed: Error Encountered in Running Scored Recipe with "Output Explanations" Option
Hi, we are currently facing an issue while attempting to run a scored recipe with the "Output Explanations" checkbox selected. The error message is captured in the attached screenshot. We have examined the input dataset but have been unable to pinpoint the root cause of the issue. Interestingly, the job completes successfully…
-
Two recipes that output to the same dataset
Hi there, I understand that a dataset that is the output of one recipe cannot also be used as the output of another recipe. Is there a workaround for this? My situation is as follows: one of the recipes has a condition under which, if necessary, it would need to update an output of another recipe with the datasets. The datasets of the other recipes…
-
Pandas version 2 timeline
Hi, is there any timeline for when Dataiku will be supporting Pandas v2? Thanks