-
Crawl budget prediction for enhanced SEO with OnCrawl plugin
We’re pleased to share that Dataiku has published an OnCrawl plugin. At OnCrawl, we are convinced that data science, like technical SEO, is essential to strategic decision-making in forward-looking companies today. The complexity of today's markets, the sheer volume of data available affecting SEO, the growing opacity of…
-
Allow Scenario Trigger on dataset change for Google Sheets
The idea of a "Trigger on dataset change" is excellent, but it doesn't support all dataset types. It would help us a lot if it could trigger on dataset changes in Google Sheets.
-
Can't set up "Containerized execution", "Build image for containerized visual recipes" not working
I have a GCP / Ubuntu installation of DSS. I'm trying to set up GKE to run recipes on. I've used the GKE plugin to create a cluster, I can see it running (both from DSS and from the Google Cloud Console). I think the documentation is outdated and refers to an older version of the plugin and still relies on the Google…
-
Extract tables from PDF
Hello community, to perform RAG, I want to extract tables from PDFs. I would like to do this using Dataiku plugins, but the quality is not what I expect. Do you know of other methods to do this? Thanks!
-
googlesheets plugin feature: Ignore top n rows on import
Reading a Google Sheet with the plugin currently requires the column headers to be in row 1. In the wild, many users don't build sheets that way, and the data begins some rows further down. I suggest adding an option to ignore a given number of top rows, so that the header row and table data are picked up correctly.
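To make the request concrete, here is a minimal sketch of the behavior being asked for, written as plain Python over a list of rows. It is not the plugin's code or API: the function name `promote_header` and the `skip` parameter are hypothetical, and the sample sheet is made up for illustration.

```python
def promote_header(rows, skip=0):
    """Drop the first `skip` rows, then treat the next row as the header.

    `rows` is a list of lists, as a sheet might be read cell by cell.
    Returns (header, records), where records are dicts keyed by header.
    """
    if skip < 0:
        raise ValueError("skip must be non-negative")
    body = rows[skip:]
    if not body:
        # Nothing left after skipping: no header and no data.
        return [], []
    header = [str(cell) for cell in body[0]]
    records = [dict(zip(header, row)) for row in body[1:]]
    return header, records


# Hypothetical sheet: a title line and a blank line precede the real header.
sheet = [
    ["Quarterly report", ""],
    ["", ""],
    ["region", "sales"],
    ["EMEA", 120],
    ["APAC", 95],
]
header, records = promote_header(sheet, skip=2)
```

With `skip=2`, the third row becomes the header and the remaining rows become records. The same idea could be applied as a post-processing step on a dataset read with row 1 treated as data.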
-
Integration with Microsoft Fabric and its OneLake
Hi, I couldn't find anything in the Dataiku docs or release notes about an integration with Microsoft Fabric through OneLake yet. Is this coming soon? From my reading of the Microsoft docs, I understand we can't connect directly via ADLS, only via APIs or SDKs. Thanks in advance, Jonathyan. Operating system used: RHEL 8
-
Using Neo4j plug-in to create relationships duplicates nodes
Hi, I have created unique identifier(s) for two types of nodes in my graph. I first push the node data into Neo4j using the Export nodes recipe, with the primary key set to a column containing the node's unique identifier. Then I push the data using the Export relationships recipe. Primary keys for source and target are…
-
How to get the handle of the current plugin?
I'm developing a plugin and I'd like to get the handle of the current plugin to get its name and settings. Similar to client.get_default_project() to get the current project. Also, is there a way to know which scenario is triggering the plugin? Is there any way to achieve this?
-
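About the behavior of Time series resampling in the Time Series Preparation plugin
I am trying to resample dates using Time series resampling in the Time Series Preparation plugin. As shown in the image below, the data is monthly and per category, with missing rows, and my goal is to add entries for the months that have no data in each category. My understanding is that Time series resampling can do this, but the behavior for monthly data differs from what I expected, hence this question. As shown in the attached image, my understanding is that setting Time step to '1' and Unit to 'Months' under Resampling parameters in the recipe settings should give monthly resampling, but…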
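As an illustration of the goal described in this post (one entry per month for each category, with gaps made explicit), here is a minimal standard-library sketch. It is not how the Time Series Preparation plugin works. The function names, the `(category, date, value)` row layout, and the sample data are all assumptions, and it fills only between each category's own first and last month.

```python
from datetime import date


def month_range(start, end):
    """Yield the first day of each month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def fill_missing_months(rows):
    """Insert a (category, month, None) row for every month with no data.

    `rows` is a list of (category, date, value) tuples. Dates are
    normalized to the first day of their month.
    """
    by_category = {}
    for category, day, value in rows:
        month_start = date(day.year, day.month, 1)
        by_category.setdefault(category, {})[month_start] = value

    filled = []
    for category in sorted(by_category):
        observed = by_category[category]
        for month in month_range(min(observed), max(observed)):
            filled.append((category, month, observed.get(month)))
    return filled


# Hypothetical data: category A has no rows for February and March.
rows = [
    ("A", date(2024, 1, 1), 10),
    ("A", date(2024, 4, 1), 40),
    ("B", date(2023, 12, 1), 5),
    ("B", date(2024, 1, 1), 7),
]
result = fill_missing_months(rows)
```

Stepping by calendar month rather than by a fixed number of days keeps every generated date on the first of the month, which is one place where monthly resampling can differ from expectations.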
-
Custom Params Dataiku Plugin Recipe
Hello, I am developing a plugin recipe in Python. I want the user to be able to provide some input, which I expose as params. One of the params I want is a datetime, which is not a default param type for Dataiku plugins. Therefore, I wish to build a custom param. Preferably, I would like to develop a…