Hi,
In addition to the Python package pytesseract, the Tesseract system package must be installed on the machine that runs Dataiku (this is stated in the "How to setup" section of the plugin webpage: https://www.dataiku.com/product/plugins/tesseract-ocr/).
The Python package is only a wrapper that calls the Tesseract system package, and Dataiku cannot install that system package for you.
You can check that Tesseract is installed by running the tesseract command in a terminal on that machine.
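If you would rather run the check from Python (for example in a notebook running on the same machine), here is a minimal sketch using only the standard library. The function name `tesseract_version` is my own, not part of the plugin or of pytesseract, and it assumes the binary is called `tesseract` and is on the PATH of the user running the code.

```python
import shutil
import subprocess


def tesseract_version(binary="tesseract"):
    """Return the first line printed by `<binary> --version`, or None if the binary is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # The version banner may land on stdout or stderr depending on the release, so accept either.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else path


print(tesseract_version() or "tesseract not found on PATH")
```

If this prints "tesseract not found on PATH", the system package is missing (or not visible to that user), and installing pytesseract alone will not be enough.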
Hi,
Thank you for your response. Do you know how to get to the terminal of Dataiku to install Tesseract on the machine?
Sorry, but you need admin access to the machine on which Dataiku is installed, and you have to install Tesseract yourself.
This cannot be done directly from Dataiku.
Hello,
Thank you for your answer. I am in the administrators group in Dataiku, but I don't know how to get to the terminal of the machine. I don't know if that is sufficient?
@StanG bumping this to keep info consolidated 🙂
Would we benefit from containerized execution with this plugin? If so, I'm assuming we'll need to install the OS libraries in the base images as well?
Hi,
Yes, exactly. If you install the Tesseract library in the base image and also build the plugin code env for your container image, then containerized execution should work.
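To confirm that both pieces made it into the image, a small preflight check could be run inside the container. This is only a sketch: `ocr_preflight` is a hypothetical helper name, and it assumes the wrapper is importable as `pytesseract` and the binary is named `tesseract`.

```python
import importlib.util
import shutil


def ocr_preflight(module="pytesseract", binary="tesseract"):
    """Report whether the Python wrapper and the system binary are both visible."""
    return {
        # Provided by the plugin code env built for the container image.
        "python_wrapper": importlib.util.find_spec(module) is not None,
        # Provided by the OS package installed in the base image.
        "system_binary": shutil.which(binary) is not None,
    }


print(ocr_preflight())
```

Both values should be True. If only `system_binary` is False, the base image is missing the OS package; if only `python_wrapper` is False, the code env was not built for the container image.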
thank you!