OCR method of MODI

Hello,

I am using the OCR method of MODI for character recognition.
I want to recognize text in languages other than English,
so I pass the language as an argument to the OCR method, as below.

miDoc.OCR MODI.MiLANGUAGES.miLANG_RUSSIAN, True, True

But this gives the error "OCR : Bad language". The error occurs for every language other than English.

The call works only for English:

miDoc.OCR MODI.MiLANGUAGES.miLANG_ENGLISH, True, True

Can you please suggest a solution? I want to recognize characters in languages other than English using the OCR method of MODI.

Thanks in advance.
Snawvy  Tuesday, January 20, 2009 7:39 AM
Snawvy said:
I am using the OCR method of MODI for character recognition.
I want to recognize text in languages other than English,

so I pass the language as an argument to the OCR method, as below.
miDoc.OCR MODI.MiLANGUAGES.miLANG_RUSSIAN, True, True
But this gives the error "OCR : Bad language". The error occurs for every language other than English.

The call works only for English:
miDoc.OCR MODI.MiLANGUAGES.miLANG_ENGLISH, True, True

Can you please suggest a solution? I want to recognize characters in languages other than English using the OCR method of MODI.

Hi Snawvy,

Welcome to MSDN forums!

To support non-English languages in OCR, you need to install the corresponding Language Pack.

Please check the following article for detailed step-by-step instructions.
Non-English OCR in Microsoft Office Document Imaging (MODI)
http://technojumble.wordpress.com/2008/12/26/non-english-ocr-in-office-2007-document-imaging-modi/

  1. If the Microsoft Office Proofing Tools aren’t installed, install them.
  2. If you want to do OCR of English, French, or Spanish, you’re set - it’s included by default with MODI.
  3. If you want to do OCR of an East Asian Language (Chinese (simplified or traditional), Japanese, Korean, etc.), make sure Windows East Asian language support is installed. To install it (or to check to see if it’s installed):
    • Open the Control Panel.
    • Run the “Regional and Language Options” applet.
    • Select the “Languages” tab.
    • If the “Install files for East Asian languages” box is NOT checked:
      • Check the box.
      • Press the “OK” button.
      • Follow the install prompts (you will need your original install media).
  4. Install the Office 2007 Language Pack for the language you want to do OCR on.
  5. Install the service pack for the Language Pack you just installed.
    • Important: Every Language Pack has its own separate service pack. Installing SP1 for Office, or the English version of SP1 for Office Language Packs, won’t do it - you need to get the service pack for your specific Language Pack.
    • To find the service pack you need:
      • Search the Microsoft downloads site for “Office 2007 Language Pack SP1”.
      • Click through to the page with information about the service pack, but don’t click the “Download” button.
      • In your browser address bar, the address for the page you’re on will end with “en”. Change this to “xx”, where “xx” is the two-letter code for the language you want to do OCR on.
      • For example, for Korean, change “en” to “ko” (the two-letter language code for Korean; “kr” is the country code).
      • Press Enter to go to that page.
      • The page you go to will be in the language you want to do OCR on (not English). But the Download button is in the same place, so even if you can’t read the language, press that button.
    • (The remaining steps open Microsoft Office Document Imaging to confirm the installation manually.)
  6. Select Programs->Microsoft Office->Microsoft Office Tools->Microsoft Office Document Imaging from the Start menu.
  7. Select Options… from the Tools menu.
  8. Select the “OCR” tab.
  9. In the “OCR Language” combo box, select the language you want to perform OCR on.
    • If the language you want doesn’t appear in the list, then the Language Pack wasn’t installed properly. Go back to step 5, above.
  10. Make sure the “Auto rotate” checkbox is NOT checked.
    • I found that MODI tends to get this wrong for languages with non-Latin characters, and sometimes rotates the document upside-down before trying to perform OCR on it - which, predictably, doesn’t work very well.
  11. Press the “OK” button.
  12. Load or scan the document you want to perform OCR on.
  13. Select Recognize Text Using OCR… from the Tools menu.
  14. Select Send Text to Word… from the Tools menu.
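
Once the manual check above succeeds, the same language should also work from code. The following is only a minimal VB6-style sketch, not a tested solution: it assumes a project reference to the Microsoft Office Document Imaging type library, and the file path and procedure name are placeholders that do not come from the question. It passes False as the second argument (OCROrientImage) on the assumption that this corresponds to the "Auto rotate" option in step 10, and it traps the error so that a missing language pack is reported instead of stopping the program.

```vb
' Sketch only. Requires a reference to the
' "Microsoft Office Document Imaging" type library.
' RecognizeRussian and the file path are hypothetical names.
Public Sub RecognizeRussian()
    Dim miDoc As MODI.Document
    Dim miImage As MODI.Image

    Set miDoc = New MODI.Document
    miDoc.Create "C:\scans\page1.tif"   ' placeholder input image

    On Error GoTo OcrFailed
    ' Arguments: language, OCROrientImage, OCRStraightenImage.
    ' OCROrientImage is False so MODI does not try to rotate the page.
    miDoc.OCR MODI.MiLANGUAGES.miLANG_RUSSIAN, False, True
    On Error GoTo 0

    ' Read the recognized text of the first page.
    Set miImage = miDoc.Images(0)
    Debug.Print miImage.Layout.Text

    miDoc.Close False                   ' close without saving
    Exit Sub

OcrFailed:
    ' The question reports "OCR : Bad language" at this point
    ' when the requested language is not available.
    Debug.Print "OCR failed: " & Err.Description
    miDoc.Close False
End Sub
```

If the handler still prints "OCR : Bad language" after the language appears in the OCR Language list, recheck steps 4 and 5 for that specific language.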



Best regards,
Martin Xie

Martin Xie - MSFT  Wednesday, January 21, 2009 9:09 AM