Job failed: Error in python process: At line 16: Python process is running remotely

Ashlin Registered Posts: 2 ✭✭✭

Dear Experts,

I am currently using Dataiku Online. I am trying to read the cleaned dataset, train a Word2Vec model on it, and store the model in a managed folder.

I am new to the Dataiku API and would appreciate some help here.

I then want to use the model in Dataiku for testing and to create an API.

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
import gensim
import nltk
from gensim.models import Word2Vec
from nltk import word_tokenize

# Read recipe inputs
CleansedData = dataiku.Folder("YEM8QpBl")
#CleansedData_info = CleansedData.get_info()
Source_Path = CleansedData.get_path()
path_Of_CSV = os.path.join(folder_path, "CleanDataForSentence2Vec.csv")
df = pd.read_csv(path_of_csv)  #Problem here

# get array of titles
titles = df['title'].values.tolist()
# tokenize the each title
tok_titles = [word_tokenize(title) for title in titles]

# refer to here for all parameters:
# https://radimrehurek.com/gensim/models/word2vec.html
model = Word2Vec(tok_titles, sg=1, size=100, window=5, min_count=5, workers=4, iter=100)
#model.save('./data/job_titles.model')

# Write recipe outputs  #Problem here too
Model = dataiku.Folder("rlACbXYw");
path = Model.get_path();
model.save(path/'job_titles.model')
Model_info = Model.get_info()

Please find the code above. Any suggestions would be much appreciated!

Br

Ash

Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker

    Hi Ash,

    Based on context from another channel, what is happening here is that your managed folders are not local: they are hosted remotely, so get_path() will not work.

    This means you will need to use the managed folder read/write APIs instead, e.g. get_download_stream() and upload_stream(). Please see the suggested changes in the code below.

    https://knowledge.dataiku.com/latest/courses/folders/managed-folders.html

    Let me know if that works for you!

    import dataiku
    import pandas as pd, numpy as np
    from dataiku import pandasutils as pdu
    import gensim
    import nltk
    from gensim.models import Word2Vec
    from nltk import word_tokenize
    from io import BytesIO

    # Read recipe inputs
    CleansedData = dataiku.Folder("YEM8QpBl")
    #CleansedData_info = CleansedData.get_info()
    # you can also use list_paths_in_partition()

    # change the read to something like this (no get_path() on a remote folder)
    with CleansedData.get_download_stream("/CleanDataForSentence2Vec.csv") as stream:
        df = pd.read_csv(stream)

    # get array of titles
    titles = df['title'].values.tolist()
    # tokenize each title
    tok_titles = [word_tokenize(title) for title in titles]

    # refer to here for all parameters:
    # https://radimrehurek.com/gensim/models/word2vec.html
    model = Word2Vec(tok_titles, sg=1, size=100, window=5, min_count=5, workers=4, iter=100)

    # Write recipe outputs
    # change this part to something like: serialize in memory, then upload
    Model = dataiku.Folder("rlACbXYw")
    bytes_container = BytesIO()
    model.save(bytes_container)
    bytes_container.seek(0)
    Model.upload_stream("saved_model.model", bytes_container)
    Model_info = Model.get_info()
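    To use the stored model later (e.g. in a scoring recipe or an API endpoint), you can read it back through the same stream API. A minimal sketch, assuming the folder id "rlACbXYw" and file name "saved_model.model" from the code above; since gensim's load() expects a file path, the stream is first written to a temporary local file:

    ```python
    import os
    import tempfile

    import dataiku
    from gensim.models import Word2Vec

    model_folder = dataiku.Folder("rlACbXYw")

    # Download the serialized model from the remote managed folder
    # into a local temporary file, then hand that path to gensim.
    with model_folder.get_download_stream("saved_model.model") as stream:
        with tempfile.NamedTemporaryFile(delete=False, suffix=".model") as tmp:
            tmp.write(stream.read())
            tmp_path = tmp.name

    model = Word2Vec.load(tmp_path)
    os.remove(tmp_path)  # clean up the temporary copy

    # the model is now usable, e.g. model.wv.most_similar(...)
    ```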

