Tess4J-REST

OCR REST API using Tesseract OCR Engine (via Tess4J)

Docker Image

Docker image available: https://hub.docker.com/r/lgdd/tess4j-rest

To try it, run:

docker run -it --rm -p 8000:8000 lgdd/tess4j-rest

Usage

Run docker-compose up --build (also available as make dev).

Note: You can also run ./mvnw quarkus:dev (or quarkus dev). For this to work, set the environment variable TESSDATA_PREFIX to the absolute path of the folder that contains this project resource: src/test/resources/test-tessdata/eng.traineddata
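
As a minimal sketch of that setup, the variable could be exported from the project root before starting dev mode. This assumes TESSDATA_PREFIX should hold the folder containing eng.traineddata, in line with its description under Environment variables as the parent folder path for the data files:

```shell
# Run from the project root.
# Assumption: TESSDATA_PREFIX points at the folder that holds
# eng.traineddata, not at the file itself.
TESSDATA_PREFIX="$(pwd)/src/test/resources/test-tessdata"
export TESSDATA_PREFIX

# Then start dev mode from the same shell:
# ./mvnw quarkus:dev
```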

You can navigate to http://localhost:8000/q/swagger-ui and test uploading an image.

Or you can quickly test the endpoint with curl (from this project root):

curl -X 'POST' \
     'http://localhost:8000/detect-text' \
     -H 'accept: text/plain' \
     -H 'Content-Type: multipart/form-data' \
     -F 'file=@src/test/resources/test-data/eurotext.png'

Environment variables

# Parent folder path for tesseract data files
ENV TESSDATA_PREFIX="/opt/tesseract/tessdata"

# Suffix for the data repository to use.
# Either "best", "fast" or "".
# See https://github.com/tesseract-ocr/tessdata#readme
ENV TESSERACT_DATA_SUFFIX="best"

# Version of the data repository.
# See https://github.com/tesseract-ocr/tessdata#readme
ENV TESSERACT_DATA_VERSION="4.1.0"

# Additional languages to download on the application startup.
# For the possible values, see https://github.com/tesseract-ocr/tessdata
ENV TESSERACT_DATA_LANGS="fra,spa,deu"
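
Since the additional languages are downloaded on application startup, TESSERACT_DATA_LANGS can presumably be overridden per container with docker's -e flag. The sketch below wraps that in a small helper; run_tess4j is a hypothetical name and "ita,por" is only an illustrative language list:

```shell
# run_tess4j is a hypothetical helper, not part of the project.
# $1: comma-separated list of additional languages to download.
run_tess4j() {
  docker run -it --rm -p 8000:8000 \
    -e TESSERACT_DATA_LANGS="$1" \
    lgdd/tess4j-rest
}

# Usage: run_tess4j "ita,por"
```

Whether TESSERACT_DATA_SUFFIX and TESSERACT_DATA_VERSION can be overridden the same way depends on when the image reads them, which is not stated here.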

Health probes

Readiness: /q/health/ready

Liveness: /q/health/live

The application is ready and live once all additional languages have been downloaded.
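
Because language downloads delay readiness, a script that calls the API right after starting the container may want to wait for the readiness probe first. The following is a sketch only: wait_until_ready is a hypothetical helper, the attempt count and delay are arbitrary, and the URL uses the standard Quarkus path /q/health/ready on the port from the docker run example:

```shell
# Poll a readiness URL until it answers successfully or attempts run out.
# wait_until_ready is a hypothetical helper, not part of the project.
# $1: URL to poll, $2: maximum number of attempts (default 30).
wait_until_ready() {
  url="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses such as 503.
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}

# Usage: wait_until_ready http://localhost:8000/q/health/ready 60
```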

License

MIT
