-
Using the "Files in folder" dataset
If you load lots of files of the same type into Dataiku, you should look at using the Files in folder dataset. It's a great built-in feature for automating the ingestion of files that share a format. You can create a new Files in folder dataset by going to: Dataset => New Dataset => All dataset types => DSS => Files in…
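The same kind of dataset can also be created programmatically. Below is a minimal sketch using the public dataikuapi client; the host, API key, project key, dataset name, and especially the type string and params keys ("FilesInFolder", "folderSmartId") are assumptions and may differ by DSS version.

```python
import dataikuapi

# Hypothetical host and API key
client = dataikuapi.DSSClient("https://dss.example.com:11200", "YOUR_API_KEY")
project = client.get_project("MY_PROJECT")  # hypothetical project key

# Create a "Files in folder" dataset pointing at an existing managed folder
dataset = project.create_dataset(
    "all_csv_files",                      # hypothetical dataset name
    type="FilesInFolder",                 # assumed type identifier for this dataset kind
    params={"folderSmartId": "abc123"},   # assumed parameter: the managed folder's id
    formatType="csv",
    formatParams={"separator": ",", "parseHeaderRow": True},  # assumed CSV format params
)
```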
-
Import multiple files in a "managed folder" and create an "original dataset" column containing the file names
Hello, It would be cool to have the possibility to import multiple files into a "managed folder" and create a vertically-stacked dataset, with the option of adding an "original dataset" column containing the imported file names. Best regards
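In the meantime, something close to this can be done with a Python recipe that reads every file in the managed folder and tags each row with its source file. A rough sketch, assuming the files are CSVs with identical schemas; the folder and dataset names are hypothetical:

```python
import dataiku
import pandas as pd

folder = dataiku.Folder("input_files")            # hypothetical managed folder name
output = dataiku.Dataset("stacked_with_source")   # hypothetical output dataset name

chunks = []
for path in folder.list_paths_in_partition():
    # Read each file from the folder into a DataFrame
    with folder.get_download_stream(path) as stream:
        df = pd.read_csv(stream)
    # Keep track of which file each row came from
    df["original_dataset"] = path
    chunks.append(df)

# Write the vertically-stacked result to the output dataset
output.write_with_schema(pd.concat(chunks, ignore_index=True))
```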
-
How to Clear metrics and checks history?
I am wondering how we can clear the existing metrics and checks history. I tried clearing the dataset, but that does not affect the metrics and checks history. Thanks in advance for your guidance.
-
Tips for working with SQL Temporal tables?
I've searched around a bit and have found no information about Dataiku handling MSSQL temporal tables. Am I using the wrong search terms? Is this an area where I can only do this manually? (Temporal tables store a history with two datetime columns that bracket when the row was effective.) MSSQL has some special query…
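If a manual approach turns out to be the only option, one way is to push the T-SQL temporal clause down yourself, either in an SQL query recipe or from Python with SQLExecutor2. A minimal sketch, assuming an MSSQL connection named "mssql_conn" and a system-versioned table dbo.Customers (both names hypothetical):

```python
from dataiku import SQLExecutor2

executor = SQLExecutor2(connection="mssql_conn")  # hypothetical MSSQL connection name

# T-SQL temporal syntax: ask for the rows as they were at a given point in time
query = """
    SELECT *
    FROM dbo.Customers
    FOR SYSTEM_TIME AS OF '2023-01-01T00:00:00'
"""
df = executor.query_to_df(query)
print(df.head())
```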
-
Share a dataset using python api
Team, I want to know if there is any Python API to share a dataset from one project to another. For example, Project X has dataset D. Run a Python API call to share dataset D with Projects Y and Z. Thanks, Skanda
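One possible approach is to edit the exposed objects of the source project through the dataikuapi client. A sketch, assuming a DSS version where DSSProjectSettings exposes add_exposed_object; the host, key, and project keys are hypothetical, and on older versions the exposed-objects list may need to be edited via the raw settings instead:

```python
import dataikuapi

client = dataikuapi.DSSClient("https://dss.example.com:11200", "YOUR_API_KEY")
source = client.get_project("PROJECT_X")   # project that owns dataset D

settings = source.get_settings()
# Expose dataset D to projects Y and Z (assumed helper on the settings object)
settings.add_exposed_object("DATASET", "D", "PROJECT_Y")
settings.add_exposed_object("DATASET", "D", "PROJECT_Z")
settings.save()
```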
-
Bug? Axis labels and prefixes not working in line/bar chart
This seems to be a bug. Editing the axis labels and adding a prefix doesn't seem to do anything in the line and bar charts in the Charts tab of a dataset (see screenshot). Operating system used: Windows
-
How to fully automate model retraining on the most up-to-date training data?
We are trying to build an automated pipeline (via a Scenario) that, among other things, involves retraining our main classification model each time the Scenario is run. Ideally, this retraining should happen on freshly-updated training data (the training dataset is refreshed/recalculated earlier in the same Scenario).…
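One common pattern is to chain a "Build" step for the training dataset with a retrain of the saved model, either as visual scenario steps or in a custom Python step. A minimal sketch of the Python-step version, assuming the training dataset name and saved model id below (both hypothetical):

```python
from dataiku.scenario import Scenario

scenario = Scenario()

# Rebuild the training dataset first so the model sees fresh data
scenario.build_dataset("training_data")          # hypothetical dataset name

# Then retrain the saved (deployed) model on the rebuilt dataset
scenario.train_model("sm_classification_model")  # hypothetical saved model id
```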
-
Dataset virtualization
Hi All, I am trying to understand how virtualization in DSS works. In the following example, SQL pipelines are enabled and virtualization is allowed for 'split_1' and 'split_2'. When building 'stacked' with smart reconstruction, 'split_1' and 'split_2' remain unbuilt (virtualized) as expected. However, in the next example,…
-
Invalid argument An invalid argument has been encountered : Invalid loc: empty name
This problem exists in a single project on my instance. Every time I attempt to delete ANY dataset (regardless of connection or type), I get the error message shown in the title. This also happens when I try to create new datasets using recipes. Thanks in advance!
-
Operationalizing connection to REST APIs
Wondering if anyone out there has had some success in operationalizing connections to REST APIs as a source of data for DSS Projects. I would love to have a conversation with folks who are working on this type of challenge. In my case: Dataiku DSS has allowed us to automate the gathering of data from a CRM system not…
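For anyone exploring the same challenge, a rough sketch of the basic building block is a Python recipe that pulls records from a REST endpoint and writes them to a DSS dataset; the URL, token variable, and dataset name below are hypothetical:

```python
import dataiku
import pandas as pd
import requests

API_URL = "https://crm.example.com/api/v1/contacts"  # hypothetical CRM endpoint
# Token kept out of the code, e.g. stored as a project variable (hypothetical name)
API_TOKEN = dataiku.get_custom_variables().get("crm_api_token", "")

response = requests.get(
    API_URL,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=30,
)
response.raise_for_status()

# Assumes the endpoint returns a JSON array of flat records
df = pd.DataFrame(response.json())
dataiku.Dataset("crm_contacts").write_with_schema(df)  # hypothetical output dataset
```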