How to run a Dataiku flow in parallel for multiple different parameters
I've created a flow that takes an input file from S3 based on scenario trigger parameters, runs, and saves the processed data back to S3, under a different path depending on the parameters. When I trigger this flow from different scenarios to build ALLOC_STEP9_copy with…
File format conversion
Hey Dataiku users, I just wanted to know how I can convert a very large binary data file to a human-readable format such as XML or CSV, or anything else that lets me see the decoded data. Thank you! Operating system used: Windows
Trigger a Dataiku project and pass arguments to it from an external Python script
In my case, I want to trigger a Dataiku project from an external Python script that runs on an Airflow cluster. How can I trigger the entire project workflow in Dataiku while passing some arguments from the Python script to the Dataiku project? And how do I access those arguments in Dataiku?
How do I take a dataset and publish to the Tableau Server with Python code?
Hi there, our team is getting started with exporting our model output into Tableau. We can do this through the Hyper export plugin, but we would also like to do it programmatically from a Python code recipe. Goal: publish a dataset to Tableau Server using Python code and not the plugin.
get_dataset loading strings as floats
I have a dataset with US ZIP Codes in it, which obviously look very much like integers. I need to do some processing on them in Python, and have built a notebook to do so. However, when I call: my_dataset = dataiku.Dataset("my_dataset") my_dataset_df = my_dataset.get_dataframe() I find that sometimes my ZIP Codes get…
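For anyone hitting the same thing: the cleaner fix is usually to have the column read as a string at load time. I believe get_dataframe accepts infer_with_pandas=False to use the dataset schema instead of pandas type inference, but treat that as an assumption to check against your DSS version. If the values have already come through as floats, a rough repair sketch in plain Python could look like this (normalize_zip is a made-up helper name, and it assumes 5-digit US ZIP Codes):

```python
import math


def normalize_zip(value):
    """Return a zero-padded 5-character ZIP string, or None for missing values.

    Hypothetical helper: assumes 5-digit US ZIP Codes that may have been
    read as floats (2134.0) or as strings with a trailing ".0".
    """
    if value is None:
        return None
    if isinstance(value, float):
        if math.isnan(value):
            return None
        value = int(value)  # 2134.0 -> 2134
    text = str(value).strip()
    if text.endswith(".0"):
        text = text[:-2]  # "90210.0" -> "90210"
    return text.zfill(5)  # restore the leading zeros lost in numeric parsing
```

With a pandas dataframe this could be applied column-wise, for example with .map(normalize_zip) on the ZIP column.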
Convert/Transform 'Money Value'
Hello, I am importing a column into a project from Excel. The column is coming in as 'Money Value', i.e. $12,000.00. I'm trying to convert or transform the values to a number (decimal, etc.). I've researched but haven't been able to find a method. I'd like to include this in my Dataiku workflow and not have to reformat in…
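One possible approach, if a Python step in the flow is acceptable, is to strip the currency formatting and parse what is left. This is only a sketch: parse_money is a made-up helper name, and it assumes US-style formatting (comma as thousands separator, dot as decimal separator, parentheses for negatives).

```python
import re
from decimal import Decimal, InvalidOperation


def parse_money(text):
    """Parse a string such as "$12,000.00" into a Decimal, or None if unparseable.

    Hypothetical helper: assumes US-style number formatting.
    """
    if text is None:
        return None
    cleaned = text.strip()
    # Accounting style: "($5.50)" means a negative amount.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    # Keep only digits, the decimal point, and a minus sign.
    cleaned = re.sub(r"[^\d.\-]", "", cleaned)
    if not cleaned:
        return None
    try:
        amount = Decimal(cleaned)
    except InvalidOperation:
        return None
    return -amount if negative else amount
```

Decimal is used rather than float so that amounts like 12000.00 keep their exact cents; convert with float() afterwards if the downstream column should be a plain decimal number.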
Dummy/One-Hot Encode an Array/Set of Columns?
In my data I have two different types of data that I basically want to treat the same way. In one I have a column with array data, like: ColumnA = [A,B], [A], [B,C]. I want to dummy encode these to make something like: ColumnA_A, ColumnA_B, ColumnA_C = (1, 1, 0), (1, 0, 0), (0, 1, 1). And then in another case I have a set of columns like:…
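For the array-column case, the encoding described above can be sketched in plain Python as follows. dummy_encode is a made-up helper name, and the sketch assumes each row has already been parsed into a Python list of category labels:

```python
def dummy_encode(rows, prefix):
    """One-hot encode a column whose cells are lists of category labels.

    Hypothetical helper. Returns (header, encoded), where header holds the
    new column names and encoded holds one 0/1 list per input row.
    """
    # Collect every distinct label across all rows, in a stable sorted order.
    categories = sorted({item for row in rows for item in row})
    header = [f"{prefix}_{c}" for c in categories]
    encoded = [[1 if c in row else 0 for c in categories] for row in rows]
    return header, encoded


header, encoded = dummy_encode([["A", "B"], ["A"], ["B", "C"]], "ColumnA")
```

Here header holds the three new column names and encoded holds one row of 0/1 flags per input row, matching the example in the question.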
How to delete a Dataiku account?
Hi everyone, I recently created this Dataiku account (mostly for Dataiku Academy), and so I'd like to delete my old account, created with a different email address, which I no longer use. Could you help me with this? Thanks
Set a model as an output in a Python recipe
Hi, I am new to Dataiku and I would like to know how to save a regressor as an output in a Python recipe. I have seen an option to load a model (Interaction with saved models — Dataiku DSS 11 documentation). How can I set as an output the model created, for example, in this function: def train_model(X_train: pd.DataFrame,…
Authentication token has expired. The user must authenticate again.
When I build tables from huge datasets, the computation usually takes a long time. When a single recipe runs for more than 5 hours, I get this token expired error and the job fails. Is there a way to extend the allowed run duration?