Is it possible to use a Code Studio to take advantage of autocomplete, plugins, etc., in order to create an R Shiny app and then deploy it as a webapp? Specifically, I would like to use RStudio. If so, how would I do this?
Hello everyone, I believe that Python (pandas) lets you call groupby() without applying any aggregation; however, in Dataiku, the Group recipe forces us to aggregate. In other words, I would like to group by a specific column and keep all other columns without aggregating. Is that possible in any way? Note that I…
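For reference, the pandas pattern the question alludes to is usually `transform()`, which computes a per-group aggregate while keeping every original row (so no columns are lost). A minimal sketch with made-up column names:

```python
import pandas as pd

df = pd.DataFrame({
    "customer": ["a", "a", "b", "b", "b"],
    "amount":   [10, 20, 5, 5, 15],
})

# transform() broadcasts the per-group aggregate back onto every row,
# so all other columns survive untouched.
df["group_total"] = df.groupby("customer")["amount"].transform("sum")
```

Inside Dataiku, the closest visual equivalent is typically a Window recipe (aggregate per partition while retrieving all rows) rather than the Group recipe.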
Hello, I'm trying to find the earliest date between two date fields. I was thinking of using a min formula, but there may be missing values in these fields, and the formula doesn't seem to work in that case. Is there another solution besides an "if then" formula?
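If a Python recipe is an option, pandas handles this case directly: `min(axis=1)` skips missing values by default, so a row with one missing date still returns the other date. A sketch with assumed column names:

```python
import pandas as pd

df = pd.DataFrame({
    "date_a": pd.to_datetime(["2024-01-05", None, "2024-03-01"]),
    "date_b": pd.to_datetime(["2024-02-01", "2024-01-10", None]),
})

# min(axis=1) ignores NaT by default, so missing values in either
# field do not break the row-wise minimum.
df["first_date"] = df[["date_a", "date_b"]].min(axis=1)
```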
Good day! In the API Designer, we can define connections to use with SQL Query Endpoints. How do we remap these connections based on deployments to different API nodes? (i.e., use a different connection for deployments to a production API node vs. deployments to an acceptance API node.) I don't see any option in the Deployer UI…
Hi, I have gone through a few of the posts on removing duplicates, but none of them gives a clear answer. Can you please show how I can use a column with a condition so that, when a value repeats, the entire row containing that repeated value is excluded from the output? Kind regards, Kalpesh
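Reading the question as "keep only the first row per key and drop every later row whose key repeats", the pandas idiom is `drop_duplicates(subset=...)`. A minimal sketch, with a hypothetical `id` column as the key:

```python
import pandas as pd

df = pd.DataFrame({
    "id":    [1, 1, 2, 3, 3],
    "value": ["x", "y", "z", "p", "q"],
})

# Keep only the first row for each id; any later row with a repeated
# id is dropped along with its entire row.
deduped = df.drop_duplicates(subset="id", keep="first")
```

In Dataiku, the Distinct recipe (keys on the chosen columns) covers the same need without code.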
Hi Dataiku users, I want to know how to resolve the situation in the subject. I use a Filter recipe only to process exception data, and stack it with the main data afterwards. Conceptually, having no filtered records in the output dataset should be fine, but in Dataiku, if not all input datasets of the Stack recipe are available, it returns an error and stops…
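One workaround, if a Python recipe can replace the Stack recipe, is to union only the non-empty inputs so an empty exception set never breaks the stack. A sketch with plain pandas (the dataset names are placeholders):

```python
import pandas as pd

main = pd.DataFrame({"k": [1, 2]})
exceptions = pd.DataFrame({"k": []})  # may be empty when nothing was filtered

# Drop empty inputs before stacking, so the union succeeds even when
# the filter produced no exception records.
frames = [f for f in (main, exceptions) if not f.empty]
stacked = pd.concat(frames, ignore_index=True)
```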
Hi all! I'm trying to use streaming Python with the example given in the documentation: https://doc.dataiku.com/dss/latest/streaming/cpython.html#writing-to-datasets If I try to follow it, it doesn't work exactly: 1) .get_continuous_writer() expects a source-id as one of its arguments; 2) if I give something like…
I get random errors on my Kafka due to GCS bucket failures and BigQuery size limits. I'm working with my teams to resolve them, but I want to know if there is an easy way to restart a continuous process in the event of a failure. I thought about setting a scenario to start the process every 30 minutes or so, but I'm sure…
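Absent a built-in auto-restart, the periodic-scenario idea generalizes to a retry wrapper with backoff. A generic sketch — the `task` callable stands in for whatever starts the continuous process (e.g. an API call), which is an assumption, not a confirmed Dataiku API:

```python
import time

def run_with_restart(task, max_restarts=5, backoff_seconds=30):
    """Re-run `task` whenever it raises, up to max_restarts times.

    `task` is a placeholder for the call that (re)starts the
    continuous process; the restart logic is the point here.
    """
    for attempt in range(max_restarts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_restarts:
                raise  # give up after the last allowed restart
            time.sleep(backoff_seconds)
```

A scenario step could call this so transient GCS/BigQuery hiccups are retried instead of killing the process outright.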
I am trying to create an ML application that can display the various transformations that happen in a dataset, like the count of certain rows, their min, max, etc. This app will take data from the Flow and display whatever information is needed for the particular run. I need a way to pass the information to this…
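One common pattern is to compute the per-run summary in a recipe and hand it to the app as JSON (written, for example, to a managed folder or project variable the app reads). A sketch with an assumed `amount` column and transport:

```python
import json
import pandas as pd

df = pd.DataFrame({"amount": [10, 20, 5]})

# Compute the per-run summary once; the app only needs to parse
# this JSON payload, not recompute anything from the Flow.
summary = {
    "rows": int(len(df)),
    "amount_min": float(df["amount"].min()),
    "amount_max": float(df["amount"].max()),
}
payload = json.dumps(summary)
```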
I have a similar question to the one posted a few years ago: https://community.dataiku.com/t5/General-Discussion/Python-script-to-export-any-kind-of-recipes-into-SQL/m-p/21298 I have a Flow with tons of recipes. I want to convert that into "a" code: Python, SQL, PySpark… I do not care which. The solution in the link works only…
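The export loop itself is straightforward once recipe metadata is in hand; the hard part is that visual recipes carry no code payload. A sketch over hypothetical recipe records (the `type` and `payload` keys mimic what a project listing might return, but are assumptions to verify against the dataikuapi package):

```python
# Hypothetical recipe metadata; 'type' and 'payload' keys are assumed
# shapes, not a confirmed Dataiku API response.
recipes = [
    {"name": "compute_a", "type": "python", "payload": "df = df.head()"},
    {"name": "compute_b", "type": "sql_query", "payload": "SELECT * FROM t"},
    {"name": "prep_c", "type": "shaker", "payload": None},  # visual recipe, no code
]

def export_code(recipes, code_types=("python", "sql_query", "pyspark")):
    """Concatenate the code payload of every code recipe into one script."""
    chunks = []
    for r in recipes:
        if r["type"] in code_types and r.get("payload"):
            chunks.append(f"-- recipe: {r['name']}\n{r['payload']}")
    return "\n\n".join(chunks)
```

Visual recipes (prepare, group, join…) would still need per-type translation to SQL, which is where the linked solution stops.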