this post was submitted on 28 Jun 2024
How good is "good", would you say?
We got pretty good results, with a CER of 4% and a WER of 15%!
This was on a limited dataset used for both training and testing, though. If you tested on a larger dataset with more variation in handwriting style, the numbers would most likely be even worse.
Very simplified: roughly one wrong character in every 25 and one wrong word in every 7. The SER was around 20%.
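For anyone who wants to check the arithmetic, here is a minimal sketch of how CER and WER are commonly computed: the edit distance between the reference text and the OCR output, divided by the length of the reference. The function names (`edit_distance`, `cer`, `wer`, `one_in`) are made up for illustration, and this is not necessarily how the numbers above were produced.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits per reference character."""
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits per reference word."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def one_in(rate):
    """Turn an error rate into 'about one error every N units'."""
    return round(1 / rate)


if __name__ == "__main__":
    print(cer("hello world", "hallo world"))  # 1 wrong character out of 11
    print(wer("hello world", "hallo world"))  # 1 wrong word out of 2
    print(one_in(0.04), one_in(0.15))         # 25 7
```

So a 4% CER works out to about one character in 25, and a 15% WER to about one word in 7.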
There's a reason why no one has released a good model for Western letters yet, and why companies pay up to 1€ for capturing the data from 10 handwritten pages.
It will come, but OCR isn't as sexy as developing text2image solutions.