Issues in Running Similarity Search Plug-in

rennyjosetm
rennyjosetm Partner, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 6 Partner

I am getting the following error when I try to run the Similarity Search plugin in Dataiku.

We have tried multiple methods (GloVe, ELMo, FastText, Word2Vec) but still end up with the same error. The Sentence Embedding works fine, but when we try to use Similarity Search it throws the error below.

Any pointers would help.

----------------------------------------------- Error Logs --------------------------------------------------------

search_managed/bin/python","cpuUserTimeMS":0,"cpuSystemTimeMS":0,"cpuChildrenUserTimeMS":0,"cpuChildrenSystemTimeMS":0,"cpuTotalMS":0,"cpuCurrent":0.0,"vmSizeMB":120,"vmRSSMB":4,"vmHWMMB":4,"vmRSSAnonMB":1,"vmDataMB":1,"vmSizePeakMB":121,"vmRSSPeakMB":4,"vmRSSTotalMBS":0,"majorFaults":2,"childrenMajorFaults":0}}

[2021/05/20-09:07:13.930] [FRT-33-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_R0KAzeJy_NP - Error file found, trying to throw it: /home/dataiku/dss/jobs/PLUGIN_NLP/Build_word2vec_NNSI_2021-05-20T09-07-05.683/compute_R0KAzeJy_NP/custom-python-recipe/pyout2BA6E19kip4w/error.json

[2021/05/20-09:07:13.930] [FRT-33-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_R0KAzeJy_NP - Raw error is{"errorType":"\u003cclass \u0027TypeError\u0027\u003e","message":"a bytes-like object is required, not \u0027str\u0027","detailedMessage":"At line 35: \u003cclass \u0027TypeError\u0027\u003e: a bytes-like object is required, not \u0027str\u0027","stackTrace":[]}

[2021/05/20-09:07:13.931] [FRT-33-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_R0KAzeJy_NP - Now err: {"errorType":"\u003cclass \u0027TypeError\u0027\u003e","message":"Error in Python process: a bytes-like object is required, not \u0027str\u0027","detailedMessage":"Error in Python process: At line 35: \u003cclass \u0027TypeError\u0027\u003e: a bytes-like object is required, not \u0027str\u0027","stackTrace":[]}

[2021/05/20-09:07:13.934] [FRT-33-FlowRunnable] [INFO] [dku.flow.activity] act.compute_R0KAzeJy_NP - Run thread failed for activity compute_R0KAzeJy_NP

com.dataiku.common.server.APIError$SerializedErrorException: Error in Python process: At line 35: <class 'TypeError'>: a bytes-like object is required, not 'str'

at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleErrorFile(AbstractCodeBasedActivityRunner.java:221)

at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleExecutionResult(AbstractCodeBasedActivityRunner.java:186)

at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:103)

at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:48)

at com.dataiku.dip.recipes.customcode.CustomPythonRecipeRunner.run(CustomPythonRecipeRunner.java:71)

at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)

[2021/05/20-09:07:13.990] [ActivityExecutor-28] [INFO] [dku.flow.activity] running compute_R0KAzeJy_NP - activity is finished

[2021/05/20-09:07:13.992] [ActivityExecutor-28] [ERROR] [dku.flow.activity] running compute_R0KAzeJy_NP - Activity failed

com.dataiku.common.server.APIError$SerializedErrorException: Error in Python process: At line 35: <class 'TypeError'>: a bytes-like object is required, not 'str'

at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleErrorFile(AbstractCodeBasedActivityRunner.java:221)

at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleExecutionResult(AbstractCodeBasedActivityRunner.java:186)

at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:103)

at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:48)

at com.dataiku.dip.recipes.customcode.CustomPythonRecipeRunner.run(CustomPythonRecipeRunner.java:71)

at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)

[2021/05/20-09:07:13.992] [ActivityExecutor-28] [INFO] [dku.flow.activity] running compute_R0KAzeJy_NP - Executing default post-activity lifecycle hook

[2021/05/20-09:07:13.994] [ActivityExecutor-28] [INFO] [dku.flow.activity] running compute_R0KAzeJy_NP - Done post-activity tasks

Answers

  • KimmyC
    KimmyC Dataiker Posts: 34 Dataiker

    Hi,

    Which DSS version are you using? Please note that to use this plugin, you need to be on DSS 8.0.2 or higher.

    If you are on DSS 8.0.2 or above, could you please open a new support ticket and attach an instance diagnosis?

    Thanks,

    Kim

  • rennyjosetm
    rennyjosetm Partner, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 6 Partner

    Unfortunately we are on DSS version 8.0.0. Is there any workaround we can use, or are there any similar plugins that we could look at?

    Appreciate the quick response

  • KimmyC
    KimmyC Dataiker Posts: 34 Dataiker

    Hi @rennyjosetm,

    Unfortunately there are no workarounds for this but to upgrade. There are no similar plugins as an alternative either.

    Thanks,

    Kim

  • larabrian
    larabrian Registered Posts: 1 ✭✭✭

    The reason for this error is that in Python 3, strings (str) are Unicode text, whereas data read from a binary file or transmitted over the network is bytes, and the two types cannot be mixed. The message "a bytes-like object is required, not 'str'" means a string was passed where bytes were expected. Converting between the two is explicit: str.encode() turns a string into bytes, and bytes.decode() turns bytes into a string. In Python 3 the default encoding is "utf-8", so to decode a bytes object you can write:

    b"python byte to string".decode("utf-8")

    Python makes a clear distinction between bytes and strings. Bytes objects contain raw data (a sequence of octets), whereas strings are Unicode sequences. Conversion between these two types is explicit: you encode a string to get bytes, specifying an encoding (which defaults to UTF-8), and you decode bytes to get a string. Clients of these functions should be aware that such conversions may fail, and should consider how failures are handled.
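
To make the distinction above concrete, here is a minimal sketch that reproduces the same TypeError message and shows the two conversions. It is only an illustration: what line 35 of the plugin recipe actually does is not shown in the logs, and the helper names to_bytes and to_text are hypothetical, not part of Dataiku or the plugin.

```python
def to_bytes(text, encoding="utf-8"):
    """Encode a str into bytes (hypothetical helper for illustration)."""
    return text.encode(encoding)


def to_text(raw, encoding="utf-8"):
    """Decode bytes into a str, replacing undecodable bytes instead of failing."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # One possible failure policy: substitute U+FFFD for bad bytes.
        return raw.decode(encoding, errors="replace")


# Reproduce the error: a bytes method is given a str argument.
try:
    b"similarity search".split(" ")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Fix: make both sides bytes (or, alternatively, both sides str).
parts = b"similarity search".split(to_bytes(" "))
words = [to_text(p) for p in parts]

print(error_message)
print(words)
```

Whether to encode the argument or decode the data depends on which side of the call should be text. In a plugin recipe you do not control, the practical fix remains the upgrade mentioned above.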
