Plugin Tesseract

safa94 Registered Posts: 3 ✭✭✭

Hello! I have a problem running the Tesseract plugin with the text extraction recipe.

Even though pytesseract 0.3.7 is installed in the managed environment, I get an error saying that it is not installed or not found in the path.

Answers

  • StanG
    StanG Dataiker, Registered Posts: 52 Dataiker

    Hi,
    In addition to the Python package pytesseract, the Tesseract system package must be installed on the machine that runs Dataiku (this is stated in the "How to setup" section of the plugin webpage: https://www.dataiku.com/product/plugins/tesseract-ocr/).

    The Python package is only a wrapper that calls the Tesseract system package, and Dataiku cannot install that system package for you.

    You can check that Tesseract has been installed by typing the tesseract command in your terminal.
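
The same check can be scripted. The following is a minimal sketch, assuming it is run with a Python interpreter on the machine (or container) that runs Dataiku. `find_tesseract` is a hypothetical helper name used only for illustration; it is not part of the plugin or of pytesseract, and it relies only on the standard library.

```python
import shutil
import subprocess


def find_tesseract(binary="tesseract"):
    """Return (path, version_line) for the binary, or (None, None) if it is not on PATH.

    Hypothetical helper for illustration; not part of the plugin or pytesseract.
    """
    path = shutil.which(binary)
    if path is None:
        return None, None
    # `tesseract --version` prints a version banner; depending on the build
    # it may be written to stdout or stderr, so both are checked.
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    output = (result.stdout or result.stderr).strip()
    first_line = output.splitlines()[0] if output else ""
    return path, first_line


if __name__ == "__main__":
    path, version = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH; the system package is probably missing")
    else:
        print(f"tesseract found at {path}: {version}")
```

If the script reports that the binary is missing while pytesseract is installed in the code environment, that matches the situation described above: the wrapper is present but the system package is not.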

  • safa94
    safa94 Registered Posts: 3 ✭✭✭

    Hi,

    Thank you for your response. Do you know how to get to the terminal of Dataiku to install Tesseract on the machine?

  • StanG
    StanG Dataiker, Registered Posts: 52 Dataiker

    Sorry, but you need admin access to the machine on which Dataiku is installed, and you need to install Tesseract there yourself.
    This cannot be done directly from Dataiku.

  • safa94
    safa94 Registered Posts: 3 ✭✭✭

    Hello,

    Thank you for your answer. I am in the administrators group in Dataiku, but I don't know how to get to the terminal of the machine. Is being in that group sufficient?

  • importthepandas
    importthepandas Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 115 Neuron

    @StanG
    Bumping this to keep the info consolidated.

    Would we benefit from containerized execution with this plugin? If so, I'm assuming we'll need to install the OS libraries in the base images as well?

  • StanG
    StanG Dataiker, Registered Posts: 52 Dataiker

    Hi,

    Yes, exactly. If you install the Tesseract library in the base image and build the plugin code env for your container image, then containerized execution should work.
