General Discussion
- Hello everyone! So, I am working on a project where we have a web app in Bokeh and an ML pipeline in the flow section of Dataiku. So, the application has several input fields which should be supplied to the m…

Last answer by pmasiphelps:
Hi,
Here's some Python code using the API that assumes your flow starts with an "Uploaded File" type dataset. It also assumes you've already written the code to read in an uploaded file from the webapp user; the snippet then places this file in the initial dataset of your flow.
```python
import dataiku
from dataiku import pandasutils as pdu
import pandas as pd

client = dataiku.api_client()
proj = client.get_project("PROJECT_KEY")
ds = proj.get_dataset("UPLOADED_FILE_DATASET_NAME")

# clear any existing file from this dataset
ds.clear()

upload_file_path = "your_path_here"
with open(upload_file_path, "rb") as f:
    ds.uploaded_add_file(f, upload_file_path)
```
Then, assuming you've created a scenario in your project that runs your ML pipeline, you can run this scenario via another API call in your webapp.
```python
scenario = proj.get_scenario("SCENARIO_ID")
trigger_fire = scenario.run()
```
There are a number of variants for running scenarios via the API - doc here: https://doc.dataiku.com/dss/latest/python-api/scenarios.html#run-a-scenario
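For instance, if the webapp needs to block until the pipeline finishes rather than just firing the trigger, the `run_and_wait` variant from the docs above can be wrapped in a small helper. This is a sketch: the helper name and the `SCENARIO_ID` placeholder are illustrative, and `project` is the same `DSSProject` handle obtained earlier via `client.get_project(...)`.

```python
def run_pipeline_and_wait(project, scenario_id="SCENARIO_ID"):
    """Trigger a scenario and block until the run completes.

    `project` is a dataikuapi DSSProject handle. run_and_wait()
    raises if the run does not succeed (unless no_fail=True is
    passed), so reaching the return means the pipeline finished.
    """
    scenario = project.get_scenario(scenario_id)
    # Blocks the caller until the scenario run is done
    run = scenario.run_and_wait()
    return run
```

Because this blocks the webapp backend, for long pipelines you may prefer the plain `scenario.run()` shown above plus periodic polling from the frontend.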
If using Bokeh, the way to load the latest trained model from the flow would indeed be via the API.
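Inside a DSS webapp backend (Bokeh included), a minimal sketch of that lookup might look like the following. The function name and `SAVED_MODEL_ID` placeholder are illustrative; `dataiku.Model(...).get_predictor()` resolves to the saved model's currently active version, so retrained models are picked up automatically.

```python
def predict_with_latest_model(input_df, model_id="SAVED_MODEL_ID"):
    """Score a pandas DataFrame with the flow's saved model.

    Assumes this runs inside DSS, where the `dataiku` internal
    package is available to the webapp backend.
    """
    import dataiku  # available inside a DSS webapp backend

    # Resolves to the active (latest deployed) version of the model
    model = dataiku.Model(model_id)
    predictor = model.get_predictor()
    # predict() returns the input rows with prediction columns added
    return predictor.predict(input_df)
```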
If you're interested in developing applications using flow components (replacing files in datasets, running scenarios, pulling ML model info) without having to code them yourself using the APIs - project applications would be a great thing to check out. Here's some hands-on tutorials: https://academy.dataiku.com/dataiku-applications-tutorials-open
Best,
Pat
- Hello! We want to import multiple projects, and while importing into the new environment we would like to bulk-replace the connection from "file system" to "HDFS". Standard Export - Import only allows to choos…

Last answer by Manuel:
Even if there are multiple projects, at the project level it is fairly easy to change the connection in bulk:
- In your flow,
- (bottom left) View > Connections > connection > Select all checked
- (bottom right) Other Actions > Change Connection
See the image below.
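If repeating the UI steps above across many imported projects becomes tedious, the same change can be scripted per project via the public API. This is a sketch under two assumptions: `project` is a dataikuapi `DSSProject` handle, and the dataset params carry a top-level `connection` field, which holds for filesystem/HDFS-style managed datasets; the function name and connection names are illustrative.

```python
def bulk_change_connection(project, old_conn, new_conn):
    """Point every dataset on `old_conn` at `new_conn` instead.

    Iterates the project's datasets and rewrites the `connection`
    entry in the raw dataset params where it matches.
    """
    for item in project.list_datasets():
        dataset = project.get_dataset(item["name"])
        settings = dataset.get_settings()
        params = settings.get_raw_params()
        # Only touch datasets that actually sit on the old connection
        if params.get("connection") == old_conn:
            params["connection"] = new_conn
            settings.save()
```

Note that changing the connection does not move existing data; you would still rebuild the affected datasets (e.g. via a scenario) after the switch.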
I hope this helps.
- I have a PySpark recipe which reads a dataset and extracts a column based on the first index (first row). In a scenario where the input dataset partition is empty, it throws a normal error: 'index out of …

Last answer by