|
Hello,
I am using the OCR method of MODI for character recognition. I want to recognize characters in languages other than English, so I pass the language as an argument to the OCR method, as below.
miDoc.OCR MODI.MiLANGUAGES.miLANG_RUSSIAN, True, True
But this gives the error "OCR : Bad language". The error occurs for every language other than English.
Only English works:
miDoc.OCR MODI.MiLANGUAGES.miLANG_ENGLISH, True, True
Could you please suggest a solution? I want to recognize characters in languages other than English using the OCR method of MODI.
Thanks in advance. | | Snawvy Tuesday, January 20, 2009 7:39 AM |
Hi Snawvy,
Welcome to MSDN forums!
To support non-English languages in OCR, you need to install the corresponding Language Pack.
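Until the pack for the requested language is available, the call fails as described in the question, so a caller may want to catch the error and fall back to a language that works. Below is a minimal Python sketch of that pattern, not MODI's own API: `run_ocr` is a hypothetical stand-in for a call such as `miDoc.OCR`, `fake_ocr` simulates a machine with only English available, and the language names are plain strings chosen for illustration.

```python
from typing import Callable, Iterable, Optional


def ocr_with_fallback(
    run_ocr: Callable[[str], None],
    languages: Iterable[str],
) -> Optional[str]:
    """Try each language in order and return the first one that OCR accepts.

    run_ocr is a hypothetical stand-in for a call such as miDoc.OCR; it is
    assumed to raise an exception when the language is rejected.
    """
    for language in languages:
        try:
            run_ocr(language)
        except Exception as exc:  # a real COM wrapper would raise a narrower type
            print(f"OCR rejected {language}: {exc}")
            continue
        return language
    return None  # no language in the list was accepted


def fake_ocr(language: str) -> None:
    """Simulated engine with only English available, as in the question."""
    if language != "miLANG_ENGLISH":
        raise RuntimeError("OCR : Bad language")


chosen = ocr_with_fallback(fake_ocr, ["miLANG_RUSSIAN", "miLANG_ENGLISH"])
print(chosen)
```

Logging the rejected language makes it easier to tell a missing Language Pack apart from a problem with the document itself.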
Please check the following article for detailed step-by-step instructions: Non-English OCR in Microsoft Office Document Imaging (MODI), http://technojumble.wordpress.com/2008/12/26/non-english-ocr-in-office-2007-document-imaging-modi/
- If the Microsoft Office Proofing Tools aren’t installed, install them.
- If you want to do OCR of English, French, or Spanish, you’re set: it’s included by default with MODI.
- If you want to do OCR of an East Asian language (Chinese (simplified or traditional), Japanese, Korean, etc.), make sure Windows East Asian language support is installed. To install it (or to check whether it’s installed):
  - Open the Control Panel.
  - Run the “Regional and Language Options” applet.
  - Select the “Languages” tab.
  - If the “Install files for East Asian languages” box is NOT checked:
    - Check the box.
    - Press the “OK” button.
    - Follow the install prompts (you will need your original install media).
- Install the Office 2007 Language Pack for the language you want to do OCR on.
- Install the service pack for the Language Pack you just installed.
  - Important: Every Language Pack has its own separate service pack. Installing SP1 for Office, or the English version of SP1 for Office Language Packs, won’t do it; you need the service pack for your specific Language Pack.
  - To find the Language Pack service pack you need:
    - Search the Microsoft downloads site for “Office 2007 Language Pack SP1”.
    - Click through to the page with information about the service pack, but don’t click the “Download” button.
    - In your browser address bar, the address for the page you’re on will end with “en”. Change this to “xx”, where “xx” is the two-letter code for the language you want to do OCR on.
      - For example, for Korean, change “en” to “kr”.
    - Press Enter to go to that page.
    - The page you go to will be in the language you want to do OCR on (not English). But the Download button is in the same place, so even if you can’t read the language, press that button.
- Open Microsoft Office Document Imaging.
  - (The following steps confirm the installation manually.)
  - Select Programs->Microsoft Office->Microsoft Office Tools->Microsoft Office Document Imaging from the Start menu.
- Select Options... from the Tools menu.
- Select the “OCR” tab.
- In the “OCR Language” combo box, select the language you want to perform OCR on.
  - If the language you want doesn’t appear in the list, then the Language Pack wasn’t installed properly. Go back to the Language Pack installation steps above.
- Make sure the “Auto rotate” checkbox is NOT checked.
  - I found that MODI tends to get this wrong for languages with non-Latin characters, and sometimes rotates the document upside-down before trying to perform OCR on it, which, predictably, doesn’t work very well.
- Press the “OK” button.
- Load or scan the document you want to perform OCR on.
- Select Recognize Text Using OCR... from the Tools menu.
- Select Send Text to Word... from the Tools menu.
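The address-bar step in the list above (changing the trailing “en” to the code of the target language) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `swap_language_suffix` is made up for this example, and the URL is a placeholder, not a real Microsoft download address.

```python
def swap_language_suffix(url: str, language_code: str, current: str = "en") -> str:
    """Return url with its trailing language code replaced by language_code."""
    if not url.endswith(current):
        raise ValueError(f"URL does not end with {current!r}: {url}")
    # Drop the old suffix and append the new two-letter code.
    return url[: len(url) - len(current)] + language_code


# Placeholder address for illustration only; the real page URL will differ.
english_page = "https://example.com/details.aspx?displaylang=en"
korean_page = swap_language_suffix(english_page, "kr")
print(korean_page)  # https://example.com/details.aspx?displaylang=kr
```

The explicit check on the suffix guards against editing a URL that does not follow the pattern described in the steps.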
Best regards, Martin Xie
- Marked As Answer by Martin Xie - MSFT, Moderator, Friday, January 23, 2009 4:15 AM
| | Martin Xie - MSFT Wednesday, January 21, 2009 9:09 AM |
|