I have a Python script that uses the Tesseract engine to extract text from scanned PDF files. I have already tried the Tesseract OCR plugin, but the results aren't what I am looking for. The script works fine on my laptop. However, when I run the same code on the Dataiku server, I get this error.
Both the Python script and the Dataiku notebook error are attached here.
Please let me know how to fix this issue.
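The error message is quite clear. pytesseract is only a Python wrapper around the Tesseract program, so the program itself must also be present on the machine that runs the code. You need to install Tesseract version 3.05 or newer on the DSS server so that the pytesseract library can work properly. There are more details in the library documentation.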
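To confirm whether the DSS server actually has a usable Tesseract binary, something like the following could be run in a notebook on that server. This is only a sketch using the Python standard library: the helper names `parse_tesseract_version` and `check_tesseract` are made up for illustration, and it assumes the binary is called `tesseract` and should be found on the `PATH` of the user running DSS.

```python
import re
import shutil
import subprocess


def parse_tesseract_version(banner):
    """Return (major, minor) parsed from `tesseract --version` output, or None."""
    match = re.search(r"tesseract\s+v?(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def check_tesseract(minimum=(3, 5)):
    """Describe whether a recent enough tesseract binary is visible on PATH."""
    # Assumption: the executable is named "tesseract" and is expected on PATH.
    path = shutil.which("tesseract")
    if path is None:
        return "tesseract binary not found on PATH"

    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some builds print the version banner to stderr instead of stdout.
    banner = result.stdout or result.stderr
    version = parse_tesseract_version(banner)
    if version is None:
        return "could not parse a version from: {!r}".format(banner)
    if version < minimum:
        return "tesseract {}.{} at {} is older than required".format(*version, path)
    return "tesseract {}.{} found at {}".format(*version, path)


if __name__ == "__main__":
    print(check_tesseract())
```

If the binary is installed but lives outside the `PATH`, pytesseract lets you point to it explicitly by setting `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable before calling any OCR function.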