Using Dataiku

521 - 530 of 5.2k
  • Hello, while following this link, https://knowledge.dataiku.com/latest/plugins/development/tutorial-first-plugin-recipe-processor.html#test-the-preparation-processor I would like to completely delete …
    Answered ✓
    Started by SunghoPark
    Most recent by SunghoPark
    Solution by SunghoPark

    Hello,

    I already deleted the plugin in DATA_DIR/plugins/installed, but the Hide-color processor still appears in the processor library of the Prepare recipe.
    I think I need to find the folder containing this processor library and delete it, but I can't find the path.

    The same processor appears in other projects as well and needs to be deleted.


  • Hi team, I currently encountered an urgent issue with our project automation pipeline. In a Python code recipe, I used builder to create a new python recipe, and have added a dataset as input, 3 new d…
    Question
    Started by AnnaZ
    Most recent by sanshekh
    Last answer by sanshekh

    Hi All,

    I'm trying to export the final df of a Python recipe to a managed folder, and I get the error below:

    Job failed: Error in Python process: At line 35: <class 'Exception'>: Managed folder EyABfYFjV cannot be used : declare it as input or output of your recipe

    Is there a code snippet I can use to push my df into a CSV file in the managed folder, so that when I go back to the recipe the managed folder icon shows up with my file inside?

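As the error message says, the managed folder must be declared as an output of the recipe before it can be written to. Once that is done, one way to push a dataframe into the folder as a CSV is sketched below; the folder name "my_output_folder" and file name are placeholders, and the dataiku calls are commented out so the snippet also runs outside DSS:

```python
import io

import pandas as pd
# import dataiku  # available inside a DSS Python recipe

# Example dataframe standing in for the recipe's final df.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# Serialize the dataframe to CSV bytes in memory.
buf = io.BytesIO(df.to_csv(index=False).encode("utf-8"))

# Inside DSS (the folder must be declared as an OUTPUT of the recipe
# in the recipe's Input/Output tab):
# folder = dataiku.Folder("my_output_folder")  # placeholder folder name
# folder.upload_stream("final_df.csv", buf)

csv_text = buf.getvalue().decode("utf-8")
```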

  • I have a folder of data which can be converted to a desired format using a .exe application. The input data and the .exe are given by a 3rd party. How can I do this in DSS? Operating system used: Cloud
    Question
    Started by ziriamaze
    Most recent by ziriamaze
    Last answer by ziriamaze

    Thanks, I will try doing it that way.
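The accepted approach is not shown in this excerpt. One common pattern for running a third-party converter from DSS is to shell out from a Python recipe; the sketch below assumes the executable is accessible from the DSS server and takes an input path and an output path as arguments (adapt the argument list to the real .exe):

```python
import subprocess
from pathlib import Path


def convert_folder(input_dir: Path, output_dir: Path, converter: str) -> list:
    """Run a third-party converter on every file in input_dir, one call per file."""
    output_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(p for p in input_dir.iterdir() if p.is_file()):
        dst = output_dir / (src.stem + ".out")
        # The converter's CLI (input path, output path) is an assumption;
        # adapt to the real executable's arguments.
        subprocess.run([converter, str(src), str(dst)], check=True)
        converted.append(dst)
    return converted
```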

  • Hello Dataiku Community. I am connecting to an FTP folder to build a dataset. The data I got is the union of all the data in all the files in that directory. What I want is to process only the new fil…
    Question
    Started by alain008
    Most recent by Turribeach
    Last answer by Turribeach

    In terms of only loading new files, there is no built-in way to do this, so you will have to develop custom Python code. Ideally you would have subfolders within your folder and move files as you process them: files arrive in ./landing, get moved to ./processing for loading, and after a successful load are moved to ./loaded. That way you always know which files you have processed, and if a load fails, the file is right there in ./processing so you can fix the code or the file and retry the same file. This post will give you some guidance, but you will need Python skills to achieve this requirement.
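The landing/processing/loaded flow described above can be sketched in plain Python; the directory names and the load_fn hook are illustrative:

```python
import shutil
from pathlib import Path


def load_new_files(root: Path, load_fn) -> list:
    """Move each file landing -> processing -> loaded as it is successfully loaded."""
    landing = root / "landing"
    processing = root / "processing"
    loaded = root / "loaded"
    for d in (landing, processing, loaded):
        d.mkdir(parents=True, exist_ok=True)

    done = []
    for src in sorted(p for p in landing.iterdir() if p.is_file()):
        staged = processing / src.name
        shutil.move(str(src), str(staged))   # claim the file
        load_fn(staged)                      # if this raises, the file stays in ./processing
        shutil.move(str(staged), str(loaded / src.name))
        done.append(src.name)
    return done
```

If a run fails, the offending file is left in ./processing, so rerunning after a fix picks it up without reloading anything in ./loaded.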

  • Hi there - new to Dataiku. Let's say I have an Excel sheet of 2 columns, where one has app reviews and the other has the dates they were posted. Is there a video tutorial anywhere or example where I can cre…
    Question
    Started by James17001
    Most recent by AdrienL
    Last answer by AdrienL

    The simplest would be to use the Classify Text recipe showcased in this tutorial.

    It does require an LLM connection, which can be either

  • Hi, I am using a python recipe to get embeddings from my text features in Dataiku and everything worked out fine, but the embeddings did not come out with comma delimiters when I write the output to a…
    Question
    Started by afolaba4success
  • Hello Community, I am actually helping one of my team work on preparation of data with: - Prepare recipe (running in Partially in database) engine. (Due to multiple data connections) - with some simpl…
    Answered ✓
    Started by Islam
    Most recent by Grixis
    Solution by JuanE

    Please avoid duplicate posting here and in the Support channel. We'll get back to you there in due course. Thanks.

  • We are considering a migration to Dataiku Online. To all Dataiku Online users: What database do you use with your Dataiku Online instance? How much data are you loading into that database? How do you find …
    Question
    Started by tgb417
  • Hi, I have a table that resides in data warehouse. Once I create a connection to that table in Dataiku, I would like to directly explore the data without doing any additional processing, so I was hopi…
    Question
    Started by Doga
    Most recent by JordanB
    Last answer by JordanB

    Hi @DogaS,

    This error usually occurs if there are null values in the input data (in this case, it would be the date column). Can you please check? If there are nulls, I'd recommend filling or removing them.

    Note: you can fill empty values in a Prepare recipe using the "fill empty cells..." processor.
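If you are handling the nulls in a Python recipe rather than a Prepare recipe, the same two fixes look like this in pandas (the column name "date" and the sentinel value are assumptions):

```python
import pandas as pd

# Toy input with a null in the date column.
df = pd.DataFrame({"date": ["2024-01-01", None, "2024-01-03"],
                   "value": [10, 20, 30]})

# Option 1: drop rows whose date is null.
dropped = df.dropna(subset=["date"])

# Option 2: fill null dates with a sentinel value.
filled = df.fillna({"date": "1970-01-01"})
```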

    Thanks!

  • Hello, I would like a Python recipe to easily upload my XML file into Dataiku. If anyone has this magic recipe, I would appreciate it. Many thanks
    Question
    Started by Devian31_M
    Most recent by JordanB
    Last answer by JordanB

    Hi @Devian31_M,

    The simplest way to upload an XML file to DSS is by creating a dataset from the UI. In this case, DSS auto-detects the format (XML) and the schema (XPath).

    For example, you could upload XML files to a managed folder (or connect your folder to an external database with XML files) > select the menu icon on the XML file > "create a dataset". In the next window, select "test & get schema" then "load preview"

    Screenshot 2024-06-17 at 4.44.18 PM.png

    And, the output should look something like this:

    Screenshot 2024-06-17 at 4.42.47 PM.png

    This is the simplest way to load an XML file. To do this in a Python recipe instead, you will need to use pandas read_xml, in which case you need to specify the XPath yourself. You will also need a code env with pandas >= 1.3 and the library "lxml" installed.

    import dataiku
    import pandas as pd
    
    # Read recipe inputs
    XML_folder = dataiku.Folder("3trhpQ3P")
    
    with XML_folder.get_download_stream("books.xml") as f:
        books_df = pd.read_xml(f, xpath='/catalog/book')
    
    # Write recipe outputs
    books = dataiku.Dataset("books")
    books.write_with_schema(books_df)

    Let us know if you have questions!

    Thanks
