
Translating a column from Russian to English with TextBlob in a Dataiku Python recipe


Hi there,

I am trying to make use of the TextBlob package within a Dataiku recipe.

More specifically, I'm trying to create a Python recipe which translates a column "Description" from Russian to English using this package.

I'm basing this on a script which I found here in the context of a Kaggle competition:

I wanted to see how I could incorporate this into a Dataiku recipe (I took out the references to the progress bar, which I don't need here).



My input is "translate_2", which consists of two columns:

- "ID": integers

- "Description": Russian words with a few missing values

My output is "output".




I have reworked the code into the result below to integrate it into Dataiku:


# -*- coding: utf-8 -*-
import sys

import dataiku
import numpy as np
import pandas as pd
import textblob
from dataiku import pandasutils as pdu

# Read recipe inputs
train_Raw_filtered = dataiku.Dataset("translate_2")
x = train_Raw_filtered.get_dataframe()



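# Takes a data frame as input, then fills missing descriptions with
# недостающий (Russian for "missing")
def desc_missing(x):
    if x['Description'].isnull().sum() > 0:
        # Fill step is missing from the post; reconstructed from the comment above
        x['Description'] = x['Description'].fillna("недостающий")
        return x
    else:
        return x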




def translate(x):
    # The try/except lines are missing from the post; reconstructed from the indentation
    try:
        return textblob.TextBlob(x).translate(to="en")
    except:
        return x





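# Map to new column (mapping line, calls and write step are missing from the post; reconstructed)
def map_translate(x):
    x['en_desc'] = x['Description'].map(translate)
    return x

x = desc_missing(x)
x = map_translate(x)

# Write recipe outputs to dataiku
train_Raw_Translated = dataiku.Dataset("output")
train_Raw_Translated.write_with_schema(x)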




The code runs without error. It does impute the "missing" value, but I don't manage to write the actual translation to the Dataiku recipe output. It just inherits the original values:


When I take a look at the logs I find this line which I don't know how to interpret at this point:


Bottom line:

  • I would expect the en_desc column to contain the translation, but it does not.

  • Do you have any input on what I'm doing wrong here? I can't seem to figure it out.

Any help would be appreciated.

Thanks a million.


Kind Regards,



1 Reply
Dataiker Alumni
Hi Tim,

This is a Python question rather than a Dataiku DSS one. Actually, the log is fine, and the way you read and write through the dataiku package is correct.

So it is a matter of debugging your code.

We advise prototyping in a Jupyter notebook first so that you can execute the code block by block interactively. Some advice: prototype on a smaller sample, add print statements, and never use an except clause without reporting the error it caught. Otherwise your code could be failing without you being able to see it.
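To illustrate the point about except clauses, here is a minimal sketch of a wrapper that keeps the error visible instead of silently returning the input. The names translate_verbose and fake_translator are hypothetical; fake_translator only stands in for the real TextBlob call so that the snippet runs without network access.

```python
def translate_verbose(text, translator):
    """Return (translation, error_message) so that failures stay visible."""
    try:
        return str(translator(text)), None
    except Exception as exc:  # catch broadly, but report what went wrong
        return text, f"{type(exc).__name__}: {exc}"


def fake_translator(text):
    # Hypothetical stand-in for textblob.TextBlob(text).translate(to="en")
    lookup = {"привет": "hello"}
    return lookup[text]  # raises KeyError for words it does not know


sample = ["привет", "недостающий"]
results = [translate_verbose(t, fake_translator) for t in sample]
for original, (translated, error) in zip(sample, results):
    print(original, "->", translated, "| error:", error)
```

With the real translator passed in instead of the stand-in, the second element of each result should show why a given row was not translated.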

In particular I would inspect the behaviour of your translate function.
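One quick way to inspect it, sketched below under the assumption that the function falls back to returning its input on any failure, is to run it on a small sample and count how many values come back unchanged. Both count_unchanged and swallowing_translate are hypothetical names used only for this illustration; the failure is simulated.

```python
def count_unchanged(originals, translated):
    """Count values whose 'translated' output is identical to the input."""
    return sum(1 for o, t in zip(originals, translated) if o == t)


def swallowing_translate(text):
    # Mimics a translate function that hides every failure by returning its input
    try:
        raise RuntimeError("simulated translation failure")
    except Exception:
        return text


sample = ["привет", "мир", "недостающий"]
output = [swallowing_translate(t) for t in sample]
unchanged = count_unchanged(sample, output)
print(f"{unchanged} of {len(sample)} values came back unchanged")
```

If every value comes back unchanged, the translation call is most likely raising an exception on each row, and the next step is to print that exception.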

