-
Spark Cluster mode
Hello, as we are using Spark heavily, we are running into slow application launches in YARN cluster mode. The slowness comes from the many DSS-related files and jar files that have to be uploaded for every single Spark application. We checked the feature of using cluster mode. However, we know that…
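One option worth testing, if the uploads are dominated by the Spark runtime jars themselves, is to pre-stage them on HDFS once and point the YARN-specific Spark properties at that location so each application reuses the cached copy instead of re-uploading. This is only a sketch: the HDFS path below is an example, and where you set these keys (spark-defaults.conf or the DSS Spark configuration) depends on your setup.

    # example entries, e.g. in spark-defaults.conf or the DSS Spark config
    # (the hdfs path is illustrative only)
    spark.yarn.archive    hdfs:///apps/spark/spark-libs.jar
    # or, alternatively, a glob of individual jars:
    spark.yarn.jars       hdfs:///apps/spark/jars/*.jar

With spark.yarn.archive, the archive is typically built once from the jars directory of the Spark distribution and uploaded to HDFS; it does not address DSS-specific files, only the Spark jar upload.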
-
How do I find an Archived Project?
I changed the Status of a project to "Archive" on several project home pages. I then realized that I want to look at my archived projects. However, I cannot find them. Does anyone know how to resurrect an archived project? Operating system used: Sonoma 14.3.1
-
ERROR: poppler not found on PATH variable
Hello everyone, I'm facing an issue with a Python function embedded in a Dataiku API. The business case involves applying OCR to PDFs, and I'm encountering two problems: * An error is thrown stating that the poppler-utils library cannot be found on the PATH environment variable. * Failed: Failed to run…
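If the OCR code rasterizes the PDFs with pdf2image (a common choice, though the post does not say which library is used), poppler does not strictly have to be discoverable through PATH: convert_from_path accepts a poppler_path argument, and PATH can also be extended from inside the function. A minimal sketch, where the install location is purely an assumption:

    import os
    from pdf2image import convert_from_path  # assumes pdf2image is the rasterizer in use

    # hypothetical location of poppler-utils inside the code env / container
    POPPLER_BIN = "/opt/poppler/bin"

    # option 1: make the binaries visible to subprocess lookups via PATH
    os.environ["PATH"] = POPPLER_BIN + os.pathsep + os.environ.get("PATH", "")

    # option 2: point pdf2image at poppler explicitly, bypassing PATH entirely
    pages = convert_from_path("/tmp/input.pdf", dpi=300, poppler_path=POPPLER_BIN)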
-
Snowflake dataset override default connection details
Hi, I was wondering what the exact names of the parameters are that need to be added in the 'specificSettings' to get a new managed dataset on Snowflake materialized in a catalog and schema different from the connection's defaults. I added a screenshot of the UI (I am looking for the database and schema…
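One way to find the exact key names without guessing is to create one Snowflake dataset with the desired database/schema through the UI, then dump its raw definition with the public API and see which fields carry the overrides. A hedged sketch; the dataset name is a placeholder:

    import json
    import dataiku

    client = dataiku.api_client()  # works from inside DSS
    project = client.get_project(dataiku.default_project_key())
    dataset = project.get_dataset("my_snowflake_dataset")  # example name

    # the keys that appear under 'params' (database/schema/catalog and friends)
    # are the names the dataset actually uses, and can be reused programmatically
    print(json.dumps(dataset.get_settings().get_raw(), indent=2))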
-
Combine split and get in a formula
Hello everybody, it's my first post on the community. I have a question: I am trying to transform a column that stores dates in the format dd/MM/yyyy HH:mm:ss, and I want to split it to get the hour. I tried with asDate, get and split but without success. Can somebody help me? Thanks in advance. HP
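Since the column is text, one approach that stays within split and get is to split twice: first on the space to isolate the time part, then on ":" to isolate the hour. A hedged one-liner, assuming the column is named mydate and that get is 0-indexed on the arrays that split returns:

    get(split(get(split(mydate, " "), 1), ":"), 0)

asDate would only come into play if a real date object is needed rather than the hour as text.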
-
DataFrame write with PySpark keeps running without yielding any results
Hi everyone, I have a challenge with a Jupyter notebook using PySpark. The trouble appears when I try to write a dataframe with the write_with_schema instruction. The complete code is: import dataiku from dataiku import spark as dkuspark from pyspark import SparkContext from pyspark.sql import SQLContext sc =…
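For reference, the usual pattern in a DSS PySpark notebook looks roughly like the sketch below (dataset names are placeholders). Spark is lazy, so nothing is computed until write_with_schema is called, which is why that line can appear to hang while all the upstream work actually runs; the Spark/YARN UI usually shows the job progressing.

    import dataiku
    from dataiku import spark as dkuspark
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext.getOrCreate()
    sqlContext = SQLContext(sc)

    # read the input dataset as a Spark dataframe (names are examples)
    input_ds = dataiku.Dataset("my_input_dataset")
    df = dkuspark.get_dataframe(sqlContext, input_ds)

    # the whole pipeline only executes when this action is triggered
    output_ds = dataiku.Dataset("my_output_dataset")
    dkuspark.write_with_schema(output_ds, df)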
-
Shapley feature importance
Hello Dataiku team, thanks for this tool; it has been useful for my school projects. I am currently trying to map the features that are influential in my model using Shapley feature importance, but the results are not changing across the different target classes. How do I resolve this? Again, this is a very useful tool. Best, Jyothikamalesh.S
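If the built-in view does not let you switch classes, one workaround is to compute Shapley values directly with the shap package in a notebook, where the per-class values are explicit for multiclass models. A rough sketch on a scikit-learn-style classifier; the model, the feature matrix X and the availability of shap in the code env are all assumptions:

    import shap  # assumes the shap package is installed in the code env

    # `model` is a trained multiclass classifier and `X` the feature matrix,
    # both assumed to exist already in the notebook
    explainer = shap.Explainer(model, X)
    shap_values = explainer(X)

    # for multiclass models the explanation has one slice per class;
    # pick the class of interest explicitly
    class_index = 2  # example
    shap.plots.beeswarm(shap_values[:, :, class_index])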
-
How to learn Dataiku
Hi guys, I want to use Dataiku as an administrator but I don't know where to start. Do you know where I can start? The documentation is not really my style of learning.
-
Running multiple SQL statements within a Python recipe to return a dataframe
Hi, here is some SQL our internal Python recipe creates dynamically as it starts up, to check whether the database, schema and table have the proper permissions before performing a function: show grants on database MYDB; create or replace temporary table db_check as select EXISTS( select * from table(result_scan(-1)) where "privilege" =…
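In a Dataiku Python recipe, one way to run several statements and still get a dataframe back is SQLExecutor2: the setup statements go in pre_queries and the final SELECT in the main query. A hedged sketch; the connection name and the final SELECT are placeholders, and since result_scan(-1) depends on the statements sharing a session, it is worth validating that the pre-queries run on the same session on your connection:

    import dataiku
    from dataiku import SQLExecutor2

    # connection name is an example; use your Snowflake connection name in DSS
    executor = SQLExecutor2(connection="my_snowflake_connection")

    pre_queries = [
        "show grants on database MYDB",
        # the dynamically built "create or replace temporary table db_check ..." goes here
    ]

    # the main query must be a single SELECT so it can come back as a dataframe
    df = executor.query_to_df("select * from db_check", pre_queries=pre_queries)
    print(df.head())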
-
Wrong date format when reading Excel file
Hi, I have an issue when reading Excel files stored in a folder. I import one file per month into my folder, and I create a dataset that reads all the files to obtain one stacked table as a result. For some of them (but not all), the format is not read the way it is stored: * In my Excel file, the column is stored in dd/MM/yyyy format…
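If the dataset-level parsing keeps flip-flopping between files, one fallback is to read the workbooks in a Python recipe, force the column in as text, and convert it with an explicit day-first format so month and day can never be swapped. A small sketch with pandas; the folder id, file extension and column name are assumptions, and it presumes the column is stored as text in Excel:

    import dataiku
    import pandas as pd

    folder = dataiku.Folder("excel_folder")  # folder id is an example

    frames = []
    for path in folder.list_paths_in_partition():
        if not path.lower().endswith(".xlsx"):
            continue
        with folder.get_download_stream(path) as stream:
            # read the date column as plain text so nothing reinterprets it
            frames.append(pd.read_excel(stream, dtype={"my_date_column": str}))

    df = pd.concat(frames, ignore_index=True)
    # parse with an explicit dd/MM/yyyy format instead of letting inference guess
    df["my_date_column"] = pd.to_datetime(df["my_date_column"], format="%d/%m/%Y")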