I'm working with short-form free-text answers that people have typed by hand into online forms. As is typical with hand-entered data, there are typographical errors all over the place. A small percentage of the people who fill out these forms respond in languages other than English.
I'm trying to use the Dataiku Text Preparation plugin, but I'm finding a large number of errors. Language identification is of poor quality, and because spelling correction depends on knowing the language, multilingual spelling correction is poor as well. I haven't even tried language translation yet.
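To make that dependency concrete, here is a minimal sketch of the step order I have in mind: guess the language first, then spell-correct only the text that was identified as English. This is not how the plugin works internally. It uses only the Python standard library, and the word lists and the names `guess_language`, `correct_english`, and `prepare` are hypothetical placeholders for illustration; real word lists would need to be far larger.

```python
import difflib

# Hypothetical, tiny word lists for illustration only.
STOPWORDS = {
    "en": {"the", "and", "is", "to", "of", "it", "was", "very"},
    "es": {"el", "la", "y", "es", "de", "que", "fue", "muy"},
}
ENGLISH_VOCAB = ["the", "service", "was", "very", "helpful", "and", "friendly"]


def guess_language(text, min_hits=1):
    """Return the language whose stopwords overlap the text most, or None if unsure."""
    tokens = set(text.lower().split())
    scores = {lang: len(tokens & words) for lang, words in STOPWORDS.items()}
    ranked = sorted(scores.values(), reverse=True)
    # Refuse to guess when too few stopwords matched or the top two languages tie.
    if ranked[0] < min_hits or (len(ranked) > 1 and ranked[0] == ranked[1]):
        return None
    return max(scores, key=scores.get)


def correct_english(text, cutoff=0.8):
    """Replace each token with its closest vocabulary word, if one is similar enough.

    Note: the text is lowercased and split on whitespace, so case and
    punctuation handling are deliberately ignored in this sketch.
    """
    corrected = []
    for token in text.lower().split():
        match = difflib.get_close_matches(token, ENGLISH_VOCAB, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else token)
    return " ".join(corrected)


def prepare(text):
    """Correct spelling only when the text was identified as English."""
    lang = guess_language(text)
    if lang == "en":
        return lang, correct_english(text)
    # Non-English or unidentified text is passed through untouched.
    return lang, text
```

The point of the sketch is the ordering: if `guess_language` gets it wrong, an English dictionary is applied to non-English text (or never applied to English text), which is the kind of cascading error I'm seeing.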
I'm on a non-profit budget right now. Are there folks out there who are dealing with this kind of challenge without resorting to sending the data to service providers like Google, Amazon, or Azure? And if there is no other way to get this kind of thing done, which provider have folks found most economical?