[squeak-dev] Squeak and Tesseract
Ben Coman
btc at openinworld.com
Fri Nov 2 13:06:16 UTC 2018
On Fri, 2 Nov 2018 at 18:44, Edwin Ancaer <eancaer at gmail.com> wrote:
> Hello list,
>
> As I'm looking at a way to automate the search of documents in my humble
> administration, I read some articles about OCR. I came across an article
> about using Python with Tesseract to transform a scan of a document into
> searchable text.
>
> My question now is if I can do something similar with Squeak. To my
> inexperienced eye, it seems like I should use FFI to call the functions in
> the Tesseract API, but this API is in C++, and I don't know if it is
> possible to use FFI to call C++ functions?
>
You are right that C++ is difficult to call through FFI because of the name
mangling of function symbols, but fortunately I notice Tesseract has C
bindings...
https://github.com/tesseract-ocr/tesseract#for-developers
https://github.com/tesseract-ocr/tesseract/blob/master/src/api/capi.h
so it looks like you are in the clear.
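
To illustrate why the C bindings make this workable, here is a rough sketch
in Python's ctypes (not Squeak FFI, only because I can test it quickly). It
assumes libtesseract is installed where the system loader can find it, and it
calls just TessVersion() from capi.h. The helper names load_tesseract and
tesseract_version are my own, not part of Tesseract.

```python
import ctypes
import ctypes.util


def load_tesseract():
    """Return a ctypes handle to libtesseract, or None if it is not installed."""
    name = ctypes.util.find_library("tesseract")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def tesseract_version(lib):
    """Call `const char* TessVersion(void)` from the C API."""
    # The symbol is exported under its plain C name, so no demangling is needed.
    lib.TessVersion.restype = ctypes.c_char_p
    lib.TessVersion.argtypes = []
    return lib.TessVersion().decode("ascii")


lib = load_tesseract()
if lib is None:
    print("libtesseract not found on this machine")
else:
    print("Tesseract version:", tesseract_version(lib))
```

A Squeak FFI binding should follow the same pattern: declare each C function
from capi.h with its argument and return types, then call it by its
unmangled name.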
> Or should I forget the API and use OSProcess to start the tesseract
> program?
>
FFI will be more flexible.
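
For comparison, the external-process route would look roughly like this,
again sketched in Python rather than with OSProcess. It assumes the tesseract
executable is on the PATH and relies on the command line accepting "stdout"
as the output base. The function names build_command and ocr_via_cli are
hypothetical.

```python
import shutil
import subprocess


def build_command(image_path, lang="eng"):
    """Build the argument list for the tesseract command-line program."""
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_via_cli(image_path, lang="eng"):
    """Run tesseract as a child process and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

This is simpler to get going, but every call pays for a process start and you
only get what the command line exposes, whereas the C API is called
in-process.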
> Could anyone point me in the right direction, or just tell if the whole
> idea is insane?
>
I think it's a great idea, and actually Tesseract FFI is something I've
wanted to play with before but not had the time.
I'd be interested to hear how you go with it.
cheers -ben