ocr #1

Open
wants to merge 67 commits into
base: master

aidinkrmz

Hello,
I'm a software engineering student, and I use the Tesseract OCR engine in a university project. For Persian, the traineddata file produced by training Tesseract 4.00 with the LSTM method gives good results for Arial, but poor results for some specific Persian fonts. So my questions are:
1- Did you use specific fonts such as B Nazanin, B Roya, etc. when training Tesseract 4.00 with LSTM?
2- If they were not used, how can we use these fonts to get better results?
I prepared a text in which every letter form is repeated 10, 15, or more times. I also ran the Tesseract 3.05 training process on this text, but the output did not improve.
To achieve good results for Persian with the Tesseract OCR engine, we need your experience and your help.
Thanks for your attention.
Sincerely.

Ce Ge and others added 30 commits May 15, 2016 15:13
Corrected "One this" to "Once this" and added comma for proper punctuation on line 702.
Revert "Use open() instead of tf.gfile.FastGFile()"
Fix broken link in inception readme
added python3 support to read_label_file
For English News Corpus,
[Ling et al. (2015)](http://www.cs.cmu.edu/~lingwang/papers/emnlp2015.pdf)'s score is 
97.78 -> 97.44 (lower than SyntaxNet and Parsey Mcparseface)
according to [Andor et al. (2016)](http://arxiv.org/abs/1603.06042).
Fix POS tagging score of Ling et al.(2005)
"threads" declared twice, so delete one
Add Inception-ResNet-v2 pre-trained model
Fix comment of parameter "output_codes"
Fix end point collection to return a dict
panyx0718 and others added 30 commits October 27, 2016 21:37
Explicitly set state_is_tuple=False.
Differential privacy analysis for the privacy model tutorial
Added STREET model for FSNS dataset
Consolidate privacy/ and differential_privacy/.
Now differential_privacy and privacy are
under the same project.
Remove privacy/ after consolidation.
val_captions_file -> captions_val2014.json
Remove comment that TensorFlow must be built from source.
Update compression model README with results for comparison.
Adding list of maintainers
Changing model links to point to tensorflow/models repository.