Rebuild a dataset from a Dataiku webapp
I am trying to create a project where I can recalculate my data based on user input. Below is the flow in my Dataiku project:
Data - this dataset was uploaded from a CSV file
webapp_input - this dataset is loaded from a webapp (built with HTML, JS and a Python backend)
Output - this dataset is created by a Python recipe
Problem statement:
Whenever I add more data to webapp_input through the Dataiku webapp, I have to run the Python recipe manually to rebuild the "output" dataset.
I am trying to trigger the rebuild from the webapp itself. As soon as I update the data in the UI, it should be stored in the webapp_input dataset (which is working fine) and the "output" dataset should be rebuilt.
I have tried the following Python script for this.
import time  # needed for time.sleep() in the polling loop
import pandas as pd
import numpy as np
import dataiku
import json
import dataikuapi

client = dataiku.api_client()

def jobrun():
    project = client.get_project('Testing')
    # Forced, non-recursive build of the unpartitioned "output" dataset
    definition = {
        "type" : "NON_RECURSIVE_FORCED_BUILD",
        "outputs" : [{
            "id" : "output",
            "partition" : "NP"
        }]
    }
    job = project.start_job(definition)
    # Poll until the job reaches a terminal state
    state = ''
    while state != 'DONE' and state != 'FAILED' and state != 'ABORTED':
        time.sleep(1)
        state = job.get_status()['baseStatus']['state']
Even though the dataset is present in the project, it gives the error message below.
DataikuException: java.lang.IllegalArgumentException: Dataset not found or not buildable: Testing.output
I have also tried a second option:
from dataiku.scenario import Scenario
client = dataiku.api_client()
scenario.build_dataset("output ")
Neither option works. Could you help me solve this issue, or suggest another approach for my requirement?
Thanks
Answers
Hi,
Are you sure that your project key is "Testing"? The project key is not the name displayed on the home page, but the identifier that appears in the URL. It is usually all caps.
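As a rough sketch of the point above: instead of hard-coding a display name, the backend can ask DSS for the key of the project it is running in. This assumes the code runs inside DSS (for example in the webapp's Python backend), where `dataiku.default_project_key()` is available. The helper names `build_definition` and `rebuild_dataset` are made up for illustration, and the job definition is copied from the question.

```python
def build_definition(dataset_name):
    """Job definition for a forced, non-recursive build of one unpartitioned dataset."""
    return {
        "type": "NON_RECURSIVE_FORCED_BUILD",
        "outputs": [{"id": dataset_name, "partition": "NP"}],
    }


def rebuild_dataset(dataset_name):
    """Start a build job in the current project (assumption: running inside DSS)."""
    import dataiku  # imported lazily because the package only exists inside DSS

    client = dataiku.api_client()
    # The project key is the identifier from the URL, not the display name.
    project = client.get_project(dataiku.default_project_key())
    return project.start_job(build_definition(dataset_name))
```

Calling `rebuild_dataset("output")` from the backend would then start the build without the project key appearing in the code at all.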
Hi Stenac,
Thanks for your quick response.
"Testing" is my project name, and I am writing this script within the Dataiku webapp. Do I still need to get the project key for that?
Can you please suggest where I should put my project key?
import os.path
import pandas as pd
import numpy as np
import dataiku
import json
import dataikuapi
from flask import Flask

handle = dataiku.Folder("my")
path = handle.get_path()

def jobrun():
    client = dataiku.api_client()
    project = client.get_project('Testing')
    # Forced, non-recursive build of the unpartitioned "output" dataset
    definition = {
        "type" : "NON_RECURSIVE_FORCED_BUILD",
        "outputs" : [{
            "id" : "output",
            "partition" : "NP"
        }]
    }
    job = project.start_job(definition)
    #state = ''
    #while state != 'DONE' and state != 'FAILED' and state != 'ABORTED':
    #    time.sleep(1)
    #    state = job.get_status()['baseStatus']['state']

jobrun()
I was referring to the link below:
https://doc.dataiku.com/dss/latest/publicapi/client-python/jobs.html
Thanks, Stenac.
I am able to run it now.
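For anyone who wants the webapp to wait for the build instead of starting it and moving on (the polling loop is commented out in the last snippet), here is a minimal sketch of a wait helper. It reads the state the same way the loop in the question does, via `job.get_status()['baseStatus']['state']`, and treats the same three states as final. The function name `wait_for_job` and the `timeout_seconds` parameter are my own additions, not part of the Dataiku API.

```python
import time

# Final states, taken from the polling loop in the question
TERMINAL_STATES = {"DONE", "FAILED", "ABORTED"}


def wait_for_job(job, poll_seconds=1.0, timeout_seconds=600.0):
    """Poll job.get_status() until the job reaches a final state and return it."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = job.get_status()["baseStatus"]["state"]
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "job still in state %r after %s seconds" % (state, timeout_seconds)
            )
        time.sleep(poll_seconds)
```

Usage would be something like `state = wait_for_job(project.start_job(definition))`, followed by a check that `state == "DONE"` before telling the UI the data is ready. In a webapp backend a long wait blocks the request, so a short timeout or a separate status endpoint may be the better fit.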