Using Dataiku
- I'm creating a python function endpoint with this script: And I don't know how to deal with this error: Dev server deployment FAILED Failed to initiate function server : <class 'Exception'> : Default …Last answer by Velichka
Hello all,
I am still looking for a solution to my problem. I have the following Jupyter Notebook:
import dataiku
import pickle
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import dataikuapi


def load_data(folder_name="Recommender"):
    managed_folder = dataiku.Folder(folder_name)
    with managed_folder.get_download_stream("cosine_similarity.pkl") as stream:
        cosine_sim = pickle.load(stream)
    with managed_folder.get_download_stream("tfidf_vectorizer.pkl") as stream:
        vectorizer = pickle.load(stream)
    with managed_folder.get_download_stream("tfidf_matrix.pkl") as stream:
        X_tfidf = pickle.load(stream)
    sachnummer = dataiku.Dataset("LTM_prep")
    df = sachnummer.get_dataframe()
    df.drop(['lieferant_name', 'lieferant_ort', 'LIEFERANT_NAME_ORT', 'LT_GEBINDE_NUMMER', 'MDI'], axis=1, inplace=True)
    return cosine_sim, vectorizer, X_tfidf, df
def recommend_filtered1(input_bennenung, vectorizer, X_tfidf, df, top_n=10):
    try:
        if not input_bennenung:
            return {"error": "Die Eingabe-Benennung darf nicht leer sein."}
        input_bennenung = input_bennenung.upper()
        input_vector = vectorizer.transform([input_bennenung])
        similarities = cosine_similarity(input_vector, X_tfidf).flatten()
        top_indices = similarities.argsort()[-top_n:][::-1]
        recommendations = [
            {"test": df.iloc[idx]['test'],
             "test2": df.iloc[idx]['test2'],
             "SIMILARITY_SCORE": round(similarities[idx], 2)}
            for idx in top_indices if similarities[idx] > 0
        ]
        return recommendations if recommendations else {"message": "Keine ähnlichen Benennungen gefunden."}
    except Exception as e:
        return {"error": f"Fehler: {str(e)}"}
def recommend_from_input(input_bennenung):
    folder_name = "Recommender"
    if not input_bennenung:
        return {"error": "Fehlender Parameter 'input_bennenung'"}
    try:
        # Load all required objects
        cosine_sim, vectorizer, X_tfidf, df = load_data(folder_name)
        # Compute the recommendation
        return recommend_filtered1(input_bennenung, vectorizer, X_tfidf, df)
    except Exception as e:
        return {"error": f"Fehler beim Laden der Daten oder der Empfehlung: {str(e)}"}
and I want to call the method recommend_from_input from it. I am in the API Designer. I have a managed folder called "Recommender", which I can also see in the Flow. The structure in the folder is: [screenshot not preserved]
Under the folder's Settings, I see that the type is Amazon S3; a connection is set, and I can also see the path in the bucket. When I put the following in the API Designer code section:
def recommend_from_input(input_bennenung):
    return input_bennenung
and run it with the test query
{
"input_bennenung": "Stern"
}
there are no errors and I get "Stern" back. I then pasted my notebook code into the API Designer code section, and when I run it I get an error:
Result:
{"error":"Fehler beim Laden der Daten oder der Empfehlung: Default project key is not specified (no DKU_CURRENT_PROJECT_KEY in env)"}
The logs contain no errors, only INFO and DEBUG entries.
I would appreciate any help. I have already read the documentation on Exposing Python Functions, but I still don't know where my mistake is.
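For context on the error message, here is a minimal sketch of what it appears to describe. It assumes (inferred from the message text only, not from the dataiku source) that a default project key is read from the DKU_CURRENT_PROJECT_KEY environment variable whenever no key is passed explicitly. The helper name resolve_project_key and the key "RECO_PROJECT" are hypothetical and only serve the illustration.

```python
import os


def resolve_project_key(explicit_key=None, env=None):
    """Hypothetical helper: prefer an explicit key, else fall back to the environment."""
    env = os.environ if env is None else env
    if explicit_key:
        return explicit_key
    key = env.get("DKU_CURRENT_PROJECT_KEY")
    if not key:
        # Same wording as the error returned in the API Designer test.
        raise RuntimeError(
            "Default project key is not specified "
            "(no DKU_CURRENT_PROJECT_KEY in env)"
        )
    return key


# With the variable present (assumed to be the notebook situation),
# the fallback succeeds.
print(resolve_project_key(env={"DKU_CURRENT_PROJECT_KEY": "RECO_PROJECT"}))

# Without the variable, only an explicitly passed key avoids the error.
print(resolve_project_key("RECO_PROJECT", env={}))
```

If that assumption holds, one thing to try is passing the key explicitly through the project_key argument of dataiku.Folder and dataiku.Dataset, for example dataiku.Folder(folder_name, project_key="RECO_PROJECT"). Whether the API endpoint can then actually reach the folder and dataset on the DSS instance is a separate question that this sketch does not cover.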
- I am starting this thread to learn about how others are using code studios (such as VSCode, JupyterLab, and Streamlit) and for what purposes. In our organization, we were initially excited about the f…
- I've followed the tutorial here: Importing serialized scikit-learn pipelines as Saved Models for MLOps - Dataiku Developer Guide and I've been able to develop a model using the darts==0.30.0 library, …
- I am trying to install pytorch in python3 in a code environment in data science studio. I can install it in the python3.5 install on the system that Data Science Studio is installed on. I've tried put…
- While going through the tutorials, I am getting following error for training the model: [2018/06/17-18:42:07.394] [MRT-164] [ERROR] [dku.analysis.ml.python] - Processing failed com.dataiku.dip.io.Sock…
- I'm working on something that would benefit greatly from using a newer version of pandas. I am aware that pandas is fixed to version 0.20.3 because that is the version the dataiku module requires. I h…
- Hi there guys! I just installed Dataiku on MacOS (so gpu processing is not supported). Trying to build a Deep learning model with keras and tensorflow, got an import error for tensorflow (using the de…
- I have a need to create an R code environment through the Dataiku Python API. I am running the following code to do so: import dataiku client = dataiku.api_client() params = { "desc": { "usableByAll":…
- Hi, I tried to train a model using code env of python3 and sklearn, but it failed with ImportError: cannot import name 'logsumexp' So, is the python3 supported for training?
- Hi, I am trying to install the "quanteda" package in my code environment but getting an error message. The error message is as follows: In addition: Warning messages:1: In install.packages(toInstall, …