Re-build a dataset from a Dataiku webapp

rakesh99
rakesh99 Registered Posts: 5 ✭✭✭✭

I am trying to create a project where I can recalculate my data based on user input. Below is the flow in my Dataiku project:

Data - this dataset was uploaded from a CSV file

webapp_input - this dataset is loaded from the webapp (built with HTML, JS, and a Python backend)

Output - this dataset is created by a Python recipe

Problem statement:

Whenever I add more data to webapp_input through the Dataiku webapp, I have to run the Python recipe manually to re-build the "output" dataset.

I am trying to trigger this from the webapp itself. As soon as I update the data in the UI, it should be stored in the webapp_input dataset (which is working fine) and the "output" dataset should be re-built.

I have tried the following Python script for this:

import pandas as pd
import numpy as np
import dataiku
import json
import dataikuapi
import time  # needed for time.sleep() in the polling loop

client = dataiku.api_client()

def jobrun():
    project = client.get_project('Testing')
    definition = {
        "type" : "NON_RECURSIVE_FORCED_BUILD",
        "outputs" : [{
            "id" : "output",
            "partition" : "NP"
        }]
    }
    job = project.start_job(definition)
    # Poll until the job reaches a terminal state
    state = ''
    while state != 'DONE' and state != 'FAILED' and state != 'ABORTED':
        time.sleep(1)
        state = job.get_status()['baseStatus']['state']

Even though the dataset is present in the project, it gives the error message below:

DataikuException: java.lang.IllegalArgumentException: Dataset not found or not buildable: Testing.output

I have also tried a second option:

from dataiku.scenario import Scenario
client = dataiku.api_client()
scenario.build_dataset("output ")

Neither option is working. Could you help me solve this issue, or suggest another approach for my requirement?

Thanks

Answers

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    Hi,

    Are you sure that your project key is "Testing"? The project key is not the name displayed on the home page, but the portion that appears in the URL. It is usually all caps.
  • rakesh99
    rakesh99 Registered Posts: 5 ✭✭✭✭
    Hi Stenac,

    Thanks for your quick response.
    "Testing" is my project name, and I am writing this script within the Dataiku webapp. Do I still need to get the project key for that?

    Can you please suggest where I should put my project key?

    import pandas as pd
    import numpy as np
    import dataiku
    import json
    import dataikuapi
    import os.path
    from flask import Flask

    handle = dataiku.Folder("my")
    path = handle.get_path()

    def jobrun():
        client = dataiku.api_client()
        project = client.get_project('Testing')
        definition = {
            "type" : "NON_RECURSIVE_FORCED_BUILD",
            "outputs" : [{
                "id" : "output",
                "partition" : "NP"
            }]
        }
        job = project.start_job(definition)
        #state = ''
        #while state != 'DONE' and state != 'FAILED' and state != 'ABORTED':
        #    time.sleep(1)
        #    state = job.get_status()['baseStatus']['state']

    jobrun()

    I was referring to the link below:

    https://doc.dataiku.com/dss/latest/publicapi/client-python/jobs.html
  • rakesh99
    rakesh99 Registered Posts: 5 ✭✭✭✭
    Thanks, Stenac.

    I am able to run it now.
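
The thread does not show the final working code, so the following is only a minimal sketch of the approach discussed above: build the same job definition as in the question, start it on a project handle obtained with the project key (not the display name), and poll until the job finishes. The helper names build_job_definition and rebuild_dataset, the timeout parameter, and the TimeoutError behaviour are illustrative assumptions, not part of the Dataiku API. Only start_job(), get_status() and the 'baseStatus'/'state' fields are taken from the code posted in the question.

```python
import time

# Terminal job states, as used in the polling loop from the question
TERMINAL_STATES = {"DONE", "FAILED", "ABORTED"}


def build_job_definition(dataset_name, build_type="NON_RECURSIVE_FORCED_BUILD"):
    """Return a job definition dict shaped like the one in the question."""
    return {
        "type": build_type,
        "outputs": [{"id": dataset_name, "partition": "NP"}],
    }


def rebuild_dataset(project, dataset_name, poll_seconds=1.0, timeout_seconds=600.0):
    """Start a build job and wait until it reaches a terminal state.

    `project` is any object exposing start_job(definition), for example the
    handle returned by client.get_project(<PROJECT KEY>). Returns the final
    state string; raises TimeoutError if the job is still running after
    timeout_seconds (a hypothetical safeguard added for this sketch).
    """
    job = project.start_job(build_job_definition(dataset_name))
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = job.get_status()["baseStatus"]["state"]
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"build of {dataset_name!r} still {state} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
```

Inside a webapp backend this could be called as rebuild_dataset(client.get_project(dataiku.default_project_key()), "output"), assuming the webapp lives in the same project as the dataset. Using the current project's key avoids hard-coding the display name, which appears to be what caused the "Dataset not found or not buildable" error. Note that polling blocks the backend request until the build ends, so for long builds it may be preferable to start the job and return immediately, as the commented-out loop in the second script does.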