-
Invoking library code from webapp isn't working
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
import time

def write_random(name: str):
    sample = dataiku.Dataset(name)
    sample.write_schema([{"name":"data", "type":"double"}])
    with sample.get_continuous_writer("source-id-string-dummy") as sample_writer:
        while True:
            val = np.random.rand()…
-
Dataiku JEK port (Dataiku & Spark)
Hello, I am using Dataiku 12.5.2 and Spark 3.2.2. I am utilizing PySpark in Dataiku. Upon checking the log, I see that it is initialized in the form of "Init: running in flow, JEK port=32929" from dku.spark.context. Is there a way to set the JEK port mentioned here to a fixed port or to allocate it within a specific range?…
-
Create SQL table for Dataset using Python API
Using the Python API, I can create an SQL Dataset, or clear it using the DSSDataset.clear() method, but afterwards I have to manually click the "Create Table Now" button in the settings tab of the dataset before using it in recipes. Is there a way to achieve the same effect as clicking the button using the Python API? I checked the…
-
Bug in Stack Recipe
Hi All, I am sharing below a minimal reproducible project that triggered an error in one of our larger workflows involving stack recipes. We have been seeing these errors for Snowflake tables (they may exist in others) around string length and truncation. The culprit seems to be that Dataiku is automatically recognizing…
-
How to prevent DSS from replacing NA with null?
Hi, I'm using a Python recipe to query and insert data into the output SQL Server dataset, as below.
import dataiku
import pandas as pd
from dataiku import SQLExecutor2
# Read recipe inputs
p787PDMItem = dataiku.Dataset("_P787PDMItem_src")
p787PDMItem_df = p787PDMItem.get_dataframe()
# Initialize an empty DataFrame to collect all…
-
UI looks wrong when joining a dataset with itself
Hello, For a project I'm currently working on, I need to join a dataset with itself using different filters and computed fields each time. => something that would look like SELECT * FROM (SELECT computed_field_1 FROM data WHERE filter_1) AS data_1 JOIN (SELECT computed_field_2 FROM data WHERE filter_2) AS data_2 ON…
-
Dataset in Email Body
Hi Team, Can we use the data of any DSS table when creating an email body? I want data (prepared in my DSS workflow) to be displayed in tabular form in the email body. I am fine if I need to write custom Python code for it. Thanks in advance
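One possible approach, sketched with the Python standard library only: render the rows as an HTML table and attach it as the HTML alternative of an email message. The column names, rows, and addresses below are hypothetical placeholders; reading the rows from a DSS dataset (for example via `dataiku.Dataset(...).get_dataframe()`) and actually sending the message (for example with `smtplib`) are deliberately left out.

```python
from email.message import EmailMessage
from html import escape


def rows_to_html_table(columns, rows):
    """Render column names and row tuples as a simple HTML table."""
    header = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f'<table border="1"><tr>{header}</tr>{body}</table>'


def build_email(subject, sender, recipient, columns, rows):
    """Build (but do not send) a message with a plain-text and an HTML part."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # Plain-text fallback for mail clients that do not render HTML.
    msg.set_content("This message contains an HTML table.")
    msg.add_alternative(rows_to_html_table(columns, rows), subtype="html")
    return msg


# Hypothetical sample data standing in for rows read from a dataset.
columns = ["region", "sales"]
rows = [("North", 120), ("South & East", 95)]
message = build_email(
    "Daily report", "dss@example.com", "team@example.com", columns, rows
)
html_part = message.get_body(preferencelist=("html",)).get_content()
```

Escaping each cell with `html.escape` keeps values such as `South & East` from breaking the markup.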
-
Managed Folder contents location unicity
Good day! Are managed folder contents entered into any type of logging system, backup system, or version control? I.e., can those contents be found in places other than the managed folders themselves? I'm assuming that's not the case, and the actual managed folder is the only place where the contents/data are actually stored…
-
Finding flow zones by tag name?
Hi, I would like to be able to find flow zones by tag name using the Python DSS, such that users can select matching flow zones for execution using their own parameterized variables. Is this possible? Thanks. Operating system used: Windows 10
-
Calculating percentage individual sales based on monthly total sales
I have a dataset with two columns representing individual sales values and another column representing the month. I want to create an additional column that calculates the percentage of individual sales values relative to the total sales values for each month.
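A minimal sketch of the calculation, assuming each row carries a month and a single sales value (the key names `month` and `sales` are hypothetical, since the question does not name the columns): sum the sales per month first, then divide each row's value by its month's total.

```python
from collections import defaultdict


def add_monthly_share(rows):
    """Return copies of rows with a 'pct_of_month' key added."""
    # First pass: total sales per month.
    totals = defaultdict(float)
    for row in rows:
        totals[row["month"]] += row["sales"]

    # Second pass: each row's share of its month's total, in percent.
    result = []
    for row in rows:
        total = totals[row["month"]]
        pct = 100.0 * row["sales"] / total if total else 0.0
        result.append({**row, "pct_of_month": pct})
    return result


# Hypothetical sample data.
rows = [
    {"month": "2024-01", "sales": 50.0},
    {"month": "2024-01", "sales": 150.0},
    {"month": "2024-02", "sales": 80.0},
]
shares = add_monthly_share(rows)
```

With a pandas DataFrame `df`, the same idea can be written as `df["sales"] / df.groupby("month")["sales"].transform("sum") * 100`, again assuming those column names.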