-
Support user agent string with download recipe
I am trying to download data from some US government websites. Most allow the download recipe to run without hassle, but a few, like the Bureau of Labor Statistics (BLS), will sometimes return a 403 Forbidden error when trying to download data. I can easily get around this by using curl or anything else that…
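As a rough illustration of the workaround described above (done outside the download recipe), the sketch below sets an explicit User-Agent header using Python's standard `urllib`. The URL and user agent string are hypothetical placeholders, and this does not reflect how the download recipe works internally.

```python
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url: str, user_agent: str, timeout: float = 30.0) -> bytes:
    """Fetch the URL with the given User-Agent and return the raw body."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical values, for illustration only; no network call is made here.
    req = build_request("https://example.com/data.csv", "my-downloader/1.0")
    print(req.get_header("User-agent"))
```

Whether a given site accepts the request still depends on that site's own rules for user agents.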
-
Perform quick SQL query on SQL dataset from UI
For my workflow, it would be very helpful to have the option to run a quick SQL query on a (SQL) dataset in the Flow from the UI, for example by right-clicking, for things like counting the distinct values of a specific column. Right now, I go to my separate SQL client to perform these quick checks, but that requires tool…
-
Add Ability to add updated/inserted time to UPSERT recipe
The new upsert recipe is great and has a lot of potential. It would be awesome to have the ability to add audit columns to this recipe: updated_at if the row was updated, or created_at if it's a new row. If it's a new row, updated_at would be left blank and created_at would be now(); if it's an update, updated_at would be now(),…
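The requested behaviour can be sketched outside the recipe with SQLite's `ON CONFLICT` upsert from Python. This is only an illustration: the `items` table, its columns, and the helper names are hypothetical, and it assumes created_at is left untouched on update, which the truncated post does not confirm.

```python
import sqlite3
from datetime import datetime, timezone


def create_items_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical target table with two audit columns."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, value TEXT, created_at TEXT, updated_at TEXT)"
    )


def upsert_with_audit(conn, key, value, now=None):
    """Insert a new row or update an existing one, stamping audit columns."""
    now = now or datetime.now(timezone.utc).isoformat()
    conn.execute(
        """
        INSERT INTO items (id, value, created_at, updated_at)
        VALUES (:id, :value, :now, NULL)   -- new row: updated_at stays blank
        ON CONFLICT(id) DO UPDATE SET
            value = excluded.value,
            updated_at = :now              -- existing row: only updated_at changes
        """,
        {"id": key, "value": value, "now": now},
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_items_table(conn)
    upsert_with_audit(conn, 1, "first")
    upsert_with_audit(conn, 1, "second")
    print(conn.execute("SELECT * FROM items").fetchall())
```

The `ON CONFLICT` clause requires SQLite 3.24 or later; other databases express the same idea with their own upsert or MERGE syntax.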
-
Add description to each IP set in IP Allowlist extension to identify them
Add a description to each IP set in the IP Allowlist extension to identify them. This would make it possible to know for which organization, and for whom, an IP or IP range was added to the allowlist.
-
Provide ability to export Insights to images in Scenario Steps and the Python API
Currently, only Dashboards can be exported to images in Scenario Steps (Export Dashboard step). While the GUI has an option to export Insights to images, this is not possible via Scenario Steps or the Python API, so please add support for this. And also extend the Python API to allow Dashboard exports…
-
List managed folders from project
Currently, the only way to view which managed folders are associated with a project is to check the flow. However, on large projects, the flow is too large to load. (On my project of just 7,000 datasets, the flow crashes the browser tab). Datasets and recipes can be listed in the datasets and recipes pages, but managed…
-
Comments in Formula
User Story: As a creator of formulas in Dataiku, I would like to be able to add comments in formulas. This would allow me to leave information in formulas about why they are configured the way they are, increasing trust and communication, and it would allow the ability to "comment out" chunks of code while…
-
Allow datasets to automatically reload schema when jobs run.
Currently, if columns in a dataset source are added or removed, jobs and scenarios that read from that dataset will fail until you reload the schema from the table, even if nothing downstream depends on the changed columns. We would like to see a setting to allow datasets to always reload schema when…
-
Ability to choose input data set for copied and pasted subflows
I often have to copy a portion of a flow to use in a different section. Having the ability to define my input data would make things more efficient and eliminate some human error. In the use case I have, I want to copy the portion circled in red and paste it to where the green circle is, but I don't want it to branch off…
-
Managed-datasets Metadata Synchronization Across Multiple DSS Instances
Use Case: As an organization, we use three distinct DSS instances to manage our data analytics and ML workflows: * Self-Service and Data Products Consumption Instance: For end users to consume data products and work independently with access to curated data. * Design and Development Instance: For designing and…