-
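Dynamic filtering on metadata columns when retrieving data from a knowledge bank created by embedding a dataset
I ran an Embedding recipe on a dataset that has metadata columns such as age and gender plus a free-text column, and created a knowledge bank. I configured the age and gender columns as metadata columns. I am now trying to retrieve data from that knowledge bank from a visual agent using the Knowledge Bank Search tool, and I would like to filter dynamically on the metadata columns. Is that possible? I assumed that checking the Allow dynamic filtering option on the Knowledge Bank Search tool screen would make this work, but it does not. The meaning of the target columns to Bag of…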
-
" Parse next line as column headers" option not working for csv files
When uploading a CSV file and ticking the "Parse next line as column headers" option, the created dataset doesn't have the column names contained in the first row of the file.
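For reference, the behavior expected from such an option can be sketched with Python's standard `csv` module. This is only an illustration of "first row becomes the column names", not how DSS parses files; the helper name `read_with_header` and the sample data are made up for the example.

```python
import csv
import io


def read_with_header(text):
    """Treat the first row as column names and the remaining rows as records.

    Hypothetical helper for illustration only; not part of any Dataiku API.
    """
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    # fieldnames is populated from the first row once reading has started
    return reader.fieldnames, rows


# Made-up sample file content
sample = "unit_id,status\nA1,ok\nB2,failed\n"
columns, rows = read_with_header(sample)
print(columns)
print(rows[0])
```

If the dataset instead shows generic column names and the header values appear as the first data row, the header line was not consumed as column names.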
-
Getting "Connection xxx not found" when exporting a project
Thanks in advance for your time. I am currently exporting a project with only the 4 default options checked. However, it failed with a warning "An invalid argument has been encountered : Connection 'SF_VAW_PROD_ATP_MED' does not exist". Then I tried to use the Python API to find this connection, but failed again: import dataiku…
-
Change Auto-Typing to an off or on option with default “Off”
I would like Auto-Typing to be an option that can be turned on or off, with the default being “Off”. This feature is changing my unit serial numbers (230836735F) to a Float (2.30836735E8), which causes me to lose records when joining on the unit serial numbers field in a following step. This will cause my…
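A rough sketch of why this kind of conversion can lose records on a join. It assumes, purely for illustration, that the type inference treats a trailing `F` or `D` as a numeric suffix; `infer_value` is a hypothetical stand-in and not DSS's actual auto-typing logic, and the serial numbers other than the one quoted above are invented.

```python
def infer_value(raw):
    """Hypothetical auto-typing: drop a trailing F/D suffix, then try float.

    Assumption for illustration only; this is not Dataiku's implementation.
    """
    text = raw.strip()
    candidate = text[:-1] if text[-1:] in ("F", "f", "D", "d") else text
    try:
        return float(candidate)
    except ValueError:
        # Not numeric after all: keep the original string
        return raw


# Serial numbers as they appear in the source data (the last two are made up)
serials = ["230836735F", "230836735D", "AB-1001"]

typed = [infer_value(s) for s in serials]  # what an inferred column would hold
distinct_typed = len(set(typed))           # distinct join keys after typing
distinct_text = len(set(serials))          # distinct join keys kept as text

print(typed)
print(distinct_typed, distinct_text)
```

Under that assumption, two different serial numbers collapse into one numeric key, and the typed values no longer equal the original strings in the other dataset, so keeping identifier columns as strings preserves the join.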
-
Renaming a dataset using Python API
Dear Community, I am trying to rename a dataset in a project with the Python API, using the rename method of the dataikuapi.dss.dataset.DSSDataset class (https://developer.dataiku.com/latest/api-reference/python/datasets.html#dataikuapi.dss.dataset.DSSDataset.rename), but I get an AttributeError: 'DSSDataset' object has…
-
Longer Connection text box on New Snowflake dataset page as needed
Request: make the Connection text box on the New Snowflake dataset page longer so that it fits the full connection text when that text exceeds the current box length. Our organization has a standard prefix for connections based on division/team/project, so I have multiple connections with the same prefix…
-
Exception: Unable to fetch schema for PROJECT.dataset: b'Ticket not given or unrecognized
Hi there, I am suddenly unable to load datasets into a Jupyter Notebook. Changing the environment/kernel doesn't help. A system reboot doesn't help. Force reloading doesn't help either. Nothing was changed in the code. The Flow still runs, so it runs as a recipe but not when trying to work in the…
-
Perform quick SQL query on SQL dataset from UI
For my workflow it would be very helpful to have the option to perform a quick SQL query on a (SQL) dataset in the Flow from the UI. For example by right clicking. Things like count distinct values of a specific column, etc. Right now, I go to my separate SQL client to perform these quick checks, but that requires tool…
-
Setting up Stages in Snowflake to work with Dataiku
In Dataiku DSS when working with Snowflake there is an option to use a stage. This apparently speeds up performance by increasing the number of different types of processes one can do inside Snowflake without having to ship data back to the DSS server for processing. Are folks using this feature? What has your experience…
-
Refresh partitions in DSS via API
Hi, using the Python API we added a new dataset to the project and pointed it to an existing location in HDFS where partition folders are stored. (This location is managed by another DSS instance.) This kind of "import" of a read-only dataset works, but I did not find a way to "refresh" the list of partitions, i.e.…