Introduction

In a world where data drives every significant decision, large enterprises have concluded that meaningful information held in unstructured data, such as audio, video, and image files, should be extractable, readable, and searchable. Many algorithms have been introduced to meet this rising demand for data extraction and manipulation. One of them, Optical Character Recognition (OCR), has evolved over the years to turn visual text locked inside ambiguous, non-queryable data into clean, searchable form.


Concept

VIDIZMO uses a combination of powerful libraries to identify text characters in digital media such as videos, images, and documents. The fundamental process of VIDIZMO OCR Engine includes examining the text within media and translating the characters into code that can be used for data retrieval, such as search. To learn more about search in VIDIZMO, see: Understanding Search in VIDIZMO


How does OCR work in VIDIZMO?

Based on your app settings, content processing workflows start when content is ingested in an OCR-enabled portal. These workflows begin with pre-processing activities, such as aligning the orientation of detected text so that the algorithms can decipher it, and breaking a video into frames at 500 ms intervals so that each frame can be analyzed independently. After pre-processing, the content chunks are processed using multiple libraries to detect meaningful sequences of text within them. This text is then inserted into the database and indexed so that end-users can search it.
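
The 500 ms frame interval described above can be illustrated with a short Python sketch. This is not VIDIZMO's implementation: the function name frame_timestamps_ms, the choice to start sampling at 0 ms, and the choice to exclude the final instant of the video are all assumptions made for illustration.

```python
def frame_timestamps_ms(duration_ms: int, interval_ms: int = 500) -> list[int]:
    """Return the timestamps (in milliseconds) at which frames would be sampled.

    Assumption: sampling starts at 0 ms and stops before the end of the video.
    """
    if duration_ms < 0:
        raise ValueError("duration_ms must be non-negative")
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return list(range(0, duration_ms, interval_ms))


# A 2-second clip sampled every 500 ms yields four frames to analyze.
print(frame_timestamps_ms(2000))  # [0, 500, 1000, 1500]
```

Each timestamp identifies one frame that could then be passed to text detection on its own, which is what makes it possible to link recognized text back to a point in the video.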


After a successful OCR activity, end-users can search for words and get trackable results that take them to the page, or the point in a video, where the text was recognized on the screen. This helps when sifting through webinar presentations or searching for a keyword in the middle of a training session.
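
As a rough illustration of how recognized text can lead back to a point in a video, the sketch below builds a small keyword index in Python. It is a simplified model, not the VIDIZMO Search Engine: the functions build_index and search, the sample frame text, and the whitespace-based word splitting are hypothetical.

```python
from collections import defaultdict


def build_index(frame_texts: dict[int, str]) -> dict[str, list[int]]:
    """Map each lowercase word to the sorted timestamps (ms) of frames containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for timestamp_ms, text in frame_texts.items():
        for raw_word in text.lower().split():
            word = raw_word.strip(".,;:!?()\"'")  # drop surrounding punctuation
            if word:
                index[word].add(timestamp_ms)
    return {word: sorted(stamps) for word, stamps in index.items()}


def search(index: dict[str, list[int]], keyword: str) -> list[int]:
    """Return the timestamps (ms) where the keyword was recognized, if any."""
    return index.get(keyword.lower(), [])


# Hypothetical OCR output for three sampled frames, keyed by timestamp in ms.
frames = {
    0: "Quarterly Sales Review",
    500: "Quarterly Sales Review",
    1000: "Agenda: sales, hiring",
}
index = build_index(frames)
print(search(index, "Sales"))   # [0, 500, 1000]
print(search(index, "hiring"))  # [1000]
```

The returned timestamps play the role of the occurrence points that a user can click to jump to the matching moment in the video.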



Supported Languages

VIDIZMO OCR Engine supports more than 100 languages, including their dialects, for OCR generation. A few of the supported languages are listed below. In your app settings, you can choose one of the available languages to run processing workflows and extract text in that language.


 Afrikaans    Chinese    Greek       Macedonian  Spanish
 Albanian     Croatian   Hindi       Maltese     Swahili
 Arabic       Danish     Hungarian   Malay       Swedish
 Azerbaijani  Dutch      Indonesian  Norwegian   Tagalog
 Basque       English    Italian     Polish      Tamil
 Belarusian   Esperanto  Japanese    Portuguese  Telugu
 Bengali      Estonian   Kannada     Romanian    Thai
 Bulgarian    Finnish    Korean      Russian     Turkish
 Catalan      French     Latvian     Serbian     Ukrainian
 Czech        Galician   Lithuanian  Slovak      Urdu
 Cherokee     German     Malayalam   Slovenian   Vietnamese


How-to Guide

Enabling VIDIZMO OCR

To enable OCR, follow the steps below.


1. From the Portal Homepage:

i. Click on the navigation menu on the top left corner.

ii. Click on the Admin tab.

iii. Click on Portal Settings to open the portal settings page.



2. From the Portal Settings:

i. Click on the Apps option to expand it.

ii. Navigate to Content Processing and click to open it.

iii. Click on the gear icon next to VIDIZMO OCR.

3. The VIDIZMO OCR - Settings screen opens. On this screen:


i. Media Formats: Check the media formats for which you want to run OCR indexing.

ii. Default Language: Select the language in which you want to run OCR.

iii. Click on the Save Changes button.



4. Now, enable OCR using the toggle button.



Note: Only Manager+ portal users are authorized to enable VIDIZMO OCR.


Retrieving OCR Results

To retrieve results generated through VIDIZMO OCR, follow the steps below.

i. In the Search box, enter the keyword(s) you want to look up in the existing content in the Portal.

ii. Media items containing the keyword (if detected by OCR) are displayed on the library page. Click on the expand button to see the timeline of keyword occurrences.

iii. Click on any point in the timeline to navigate to that specific point in the video.



Note: The above applies to videos only. For a document or an image, the complete text is displayed.


Use Cases

1. Search Within Videos

An average worker spends 20% of their time, which is equivalent to one whole day in a week, searching for the information they need to do their job effectively.

Source: McKinsey & Company

As more and more businesses share information using videos, this waste of time will only worsen without a video search solution in place. VIDIZMO's OCR functionality addresses this requirement. VIDIZMO OCR indexes your videos (as well as images and documents) in such a way that the VIDIZMO Search Engine can find and return all words shown on-screen.


Few industries generate as much paperwork as the legal industry, so OCR has numerous applications here. Reams and reams of legal documents, filings, judgments, and affidavits, especially printed ones, can be stored and made searchable using VIDIZMO OCR Engine. For an industry that depends heavily on judicial precedent, fast access to legal documents from millions of past cases is undoubtedly valuable.


VIDIZMO OCR can extract images that contain text from documents and then process them, making the image text searchable within the document, so you do not miss any important piece of textual information.

Limitations

Though OCR is a powerful tool, it has some limitations.

  • It cannot recognize handwriting.
  • If a media file contains a language other than the Default Language, the results will not be useful.
  • Low-quality scans may produce low-quality OCR results.
  • It does not support text rotated at multiple angles.
  • It works on 2D media only.
  • Most of the media formatting may be lost during text scanning. The output from processed media is single-column plain text.