Correcting Typos - Text Preparation Plugin?

Rhizom Registered Posts: 3 ✭✭

Hi everyone,

Context: I have data from a survey. One question in the survey is multiple choice with predefined answers, but there is no data validation built into the survey. As a result, I have typos in the data. For example, a column "Genre" can include "Rcok, Clasisc, Jaz".

Question: Is there a smart/quick way to correct spelling mistakes in Dataiku?

Attempted solution: I tried using the Text Preparation plugin, but I get the following error: Error in Python process: At line 4: <class 'ModuleNotFoundError'>: No module named 'regex'. Could this be related to the code environment? I'm using Python 3.11, but the Text Preparation plugin seems to work with version 3.6 or 3.7.

Thanks!
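
One way to test the code-environment suspicion is to run a short check inside the same code environment the plugin uses. The following is only a minimal sketch: the helper name `describe_env` is hypothetical, and the supported-version set is copied from the plugin's desc.json as quoted in the reply in this thread (3.6 to 3.9), not queried from the plugin itself.

```python
import importlib.util
import sys

# Assumption: versions copied from the plugin's desc.json as quoted in this thread.
SUPPORTED = {(3, 6), (3, 7), (3, 8), (3, 9)}


def describe_env(version=None):
    """Return (major.minor string, version supported?, 'regex' importable?)."""
    # Fall back to the running interpreter when no version tuple is given.
    major, minor = (version or sys.version_info)[:2]
    # find_spec returns None when the module is not installed in this environment.
    has_regex = importlib.util.find_spec("regex") is not None
    return f"{major}.{minor}", (major, minor) in SUPPORTED, has_regex
```

Calling `describe_env()` with no argument reports on the interpreter that runs it. If the second value is False, the environment is outside the listed versions; if the third is False, the `regex` package is missing from that environment.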

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,248 Dataiker

    Hi,

    The plugin works with the Python versions listed here: https://github.com/dataiku/dss-plugin-nlp-preparation/blob/main/code-env/python/desc.json
    Currently: "PYTHON36", "PYTHON37", "PYTHON38", "PYTHON39"
    Could you try with one of these Python versions and see if it works for you?

    Thanks
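
For the original question of correcting typos against a known list of answers, one option that does not depend on the plugin is fuzzy matching with `difflib` from the Python standard library. This is a sketch under assumptions, not a Dataiku feature: the `VALID_GENRES` list, the helper names `correct_value` and `correct_cell`, and the 0.6 cutoff are all illustrative, and the survey's real predefined answers would replace the list.

```python
import difflib

# Hypothetical list of predefined answers; replace with the survey's real choices.
VALID_GENRES = ["Rock", "Classic", "Jazz"]


def correct_value(value, choices=VALID_GENRES, cutoff=0.6):
    """Return the closest valid choice, or the stripped input if nothing is close enough."""
    # Compare case-insensitively, but return the canonical spelling.
    lookup = {choice.lower(): choice for choice in choices}
    matches = difflib.get_close_matches(
        value.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else value.strip()


def correct_cell(cell, sep=","):
    """Correct every entry in a multi-answer cell such as "Rcok, Clasisc, Jaz"."""
    parts = [part for part in cell.split(sep) if part.strip()]
    return ", ".join(correct_value(part) for part in parts)
```

With the example from the question, `correct_cell("Rcok, Clasisc, Jaz")` gives "Rock, Classic, Jazz". Values that are not close to any choice are left unchanged so they can be reviewed by hand; a higher cutoff makes the matching stricter.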
