Roles & Permissions 

Users with the Manager role or above can configure settings to enable OCR from the VIDIZMO Indexer, and users with the Moderator role or above can use OCR capabilities on their content.


Introduction

As data grows in volume and variety, using it effectively is becoming crucial. VIDIZMO helps you manage your content, which can contain important data in different media formats. To make this media easier to search in the VIDIZMO portal, we have introduced an OCR (Optical Character Recognition) capability that analyzes the text present in your media, then extracts and saves it for further use. This text is inserted into the database and indexed so that end users can search for it.


Overview

The VIDIZMO Indexer can now perform OCR as well. OCR can be performed on videos, images, and documents. We trained a powerful OCR engine to cater to your business needs and to perform well on different types of data. You can also perform OCR on multilingual data within a single media item (English, Chinese, and one additional language of your choice). This functionality helps you manage media across the portal efficiently. For the most accurate OCR results, provide sharp, high-quality media.


Use Cases

Search Within Videos

More and more businesses share information through videos, and the time wasted looking for information inside them will only grow without a video search solution in place. VIDIZMO's OCR functionality addresses this requirement. VIDIZMO OCR indexes your videos (as well as images and documents) so that the VIDIZMO Search Engine can find and return all words shown on-screen.


Enhanced Use of Legal Paperwork

Few industries generate as much paperwork as the legal industry, so OCR has numerous applications here. Reams of legal documents, filings, judgments, and affidavits, especially printed ones, can be stored and made searchable using the VIDIZMO OCR Engine. For an industry that depends heavily on judicial precedent, fast access to legal documents from millions of past cases is undoubtedly an advantage.


OCR For Video

OCR can be configured to recognize optical characters appearing in your videos. Video OCR detects, extracts, and reads the areas of your digital video that contain characters or text. When you run Video OCR on a media item, frames are extracted from the video and passed to our OCR engine, which extracts the optical characters from the processed frames. Video OCR not only makes your media searchable within the VIDIZMO portal, but also lets you use the OCR data to jump to the timestamp at which a character appeared. The processing time of Video OCR depends on the duration and size of the video.
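The frame-to-timestamp idea described above can be illustrated with a minimal Python sketch. It is not VIDIZMO's implementation: the `FrameText` structure, the fixed sampling interval, and both function names are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FrameText:
    """Text recognized in one sampled frame (hypothetical structure)."""
    timestamp_seconds: float
    text: str


def sample_timestamps(duration_seconds: float, interval_seconds: float = 1.0) -> List[float]:
    """Return the timestamps at which frames would be sampled.

    Assumes a fixed sampling interval; the real sampling strategy is not documented here.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    count = int(duration_seconds // interval_seconds) + 1
    return [round(i * interval_seconds, 3) for i in range(count)]


def first_appearance(frames: List[FrameText], keyword: str) -> Optional[float]:
    """Return the timestamp of the earliest frame whose text contains the keyword."""
    needle = keyword.lower()
    for frame in sorted(frames, key=lambda f: f.timestamp_seconds):
        if needle in frame.text.lower():
            return frame.timestamp_seconds
    return None
```

Because every piece of recognized text keeps the timestamp of the frame it came from, a keyword match can be turned directly into a playback position.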


OCR For Image

VIDIZMO lets you run OCR on your images and make them searchable as well. These images are passed to the OCR engine, and the OCR data is then stored with the metadata of that image. Image OCR takes less time than Video OCR because an image does not consist of multiple frames.


OCR For Document

VIDIZMO Document OCR extracts textual data from a scanned document and uses it to make the document searchable on the basis of its content. With OCR data, you can also jump to the relevant page while browsing a multipage scanned document.

Because our OCR engine takes an image as input, for editable documents we create an image copy, pass it to the OCR engine, and then map the OCR data based on time onto your original media.


Document OCR data is segmented by line, and we also store the corresponding coordinates so that the exact position of the characters is recorded as well.
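To make the line-level segmentation concrete, here is a minimal Python sketch of how one recognized line and its coordinates might be represented. The `OcrLine` fields and the pixel-based bounding box are assumptions for illustration, not VIDIZMO's actual storage schema.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class OcrLine:
    """One recognized line of text with its position (hypothetical schema)."""
    page: int    # 1-based page number in the document
    text: str    # text recognized on this line
    x: int       # left edge of the bounding box, in pixels
    y: int       # top edge of the bounding box, in pixels
    width: int
    height: int


def lines_on_page(lines: Iterable[OcrLine], page: int) -> List[OcrLine]:
    """Return the lines of one page in reading order (top to bottom, then left to right)."""
    return sorted((line for line in lines if line.page == page), key=lambda line: (line.y, line.x))
```

Storing the page and bounding box with each line is what allows a search hit to be mapped back to an exact location in the document.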


Note: To perform OCR on any of the media types mentioned above, you must configure the relevant media type in the VIDIZMO Indexer.


How OCR in VIDIZMO Works

Once the OCR activity is triggered, it divides your media into multiple images (in the case of video and document OCR) and passes them to the OCR engine. The engine performs OCR and returns the OCR data, which is then mapped onto your media. The resulting OCR data is saved in the database so that it becomes searchable throughout the portal.
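The flow above (split the media into images, recognize text, map the results back to positions, save them) can be sketched in a few lines of Python. This is an illustrative outline only: `run_ocr`, the `Segment` type, and the `recognize` callable standing in for the OCR engine are all hypothetical names, and the returned list stands in for the database.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A segment is a (position, image bytes) pair. The position is a timestamp in
# seconds for video or a page number for a document (assumed convention).
Segment = Tuple[float, bytes]


def run_ocr(segments: Iterable[Segment], recognize: Callable[[bytes], str]) -> List[Dict[str, object]]:
    """Pass each image to the OCR engine and keep the text with its position.

    `recognize` is a placeholder for the OCR engine call; segments in which
    no text is found are skipped.
    """
    records: List[Dict[str, object]] = []
    for position, image in segments:
        text = recognize(image).strip()
        if text:
            records.append({"position": position, "text": text})
    return records
```

In a real deployment the records would be written to the database and indexed rather than returned in memory.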



Supported Languages 

Below is a list of some of the languages supported by the VIDIZMO OCR Engine. In your app settings, you can choose one of the available languages to run processing workflows and extract text in that language. The OCR Engine detects English and Chinese characters in your content by default, which means that even if you select a language other than English or Chinese, characters in these two languages are still detected automatically.



| Language | Abbreviation | Language | Abbreviation |
|---|---|---|---|
| Abaza | abq | Goan Konkani | gom |
| Afrikaans | af | Icelandic | is |
| Albanian | sq | Tabassaran | tab |
| Azerbaijani | az | Kurdish | ku |
| Belarusian | be | Irish | ga |
| Bosnian | bs | Lithuanian | lt |
| Chinese and English | ch | Arabic | ar |
| Chinese Traditional | ch_tra | Occitan | oc |
| Czech | cs | Latvian | lv |
| Danish | da | Malay | ms |
| Dutch | nl | Kabardian | kbd |
| English | en | Hindi | hi |
| French | fr | Uyghur | ug |
| German | german | Persian | fa |
| Italian | it | Marathi | mr |
| Japanese | japan | Urdu | ur |
| Korean | korean | Serbian (Latin) | rs_latin |
| Maltese | mt | Adyghe | ady |
| Mongolian | mn | Newari | new |
| Norwegian | no | Avar | ava |
| Polish | pl | Dargwa | dar |
| Portuguese | pt | Serbian (Cyrillic) | rs_cyrillic |
| Romanian | ro | Ingush | inh |
| Russian | ru | Bulgarian | bg |
| Saudi Arabia | sa | Hungarian | hu |
| Slovak | sk | Lak | lbe |
| Slovenian | sl | Lezghian | lez |
| Spanish | es | Nepali | ne |
| Swahili | sw | Maithili | mai |
| Swedish | sv | Bihari | bh |
| Tagalog | tl | Angika | ang |
| Tamil | ta | Indonesian | id |
| Telugu | te | Croatian | hr |
| Turkish | tr | Bhojpuri | bho |
| Ukrainian | uk | Estonian | et |
| Uzbek | uz | Magahi | mah |
| Vietnamese | vi | Nagpur | sck |
| Welsh | cy | Maori | mi |


Searching OCR Data

You can search on the basis of OCR data from both the Media Library and media playback.


OCR based Search in Media Library

When you are searching for all media containing similar keywords, or you don’t remember the title of your media, OCR-based search in your Media Library comes to the rescue. You can type any keyword that appeared as text in your video or image, or was part of your document, and the relevant results are displayed.

These search results display all occurrences of the searched keyword. Select the occurrence you want, and you are redirected to the point in the media where it appears.



OCR based Search From Media Playback

While browsing a document or watching a video, you can search for specific text, and all instances of that text are displayed. Select the one you want and you are redirected to the corresponding timestamp or page of your media, which makes viewing media more seamless.
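Both search modes can be thought of as the same lookup over stored OCR data, with playback search limited to a single media item. The following Python sketch shows that idea under stated assumptions: the `Occurrence` structure, the in-memory list used as an index, and the `search` function are hypothetical and do not describe the VIDIZMO Search Engine's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Occurrence:
    """One piece of OCR text and where it appears (hypothetical structure)."""
    media_id: str
    position: float  # seconds for a video, page number for a document
    text: str


def search(index: List[Occurrence], keyword: str, media_id: Optional[str] = None) -> List[Occurrence]:
    """Return every occurrence whose text contains the keyword, ignoring case.

    With media_id set, the search is limited to one media item, as in playback
    search; without it, all media are searched, as in the Media Library.
    """
    needle = keyword.casefold().strip()
    if not needle:
        return []
    hits = [
        occurrence
        for occurrence in index
        if needle in occurrence.text.casefold()
        and (media_id is None or occurrence.media_id == media_id)
    ]
    return sorted(hits, key=lambda occurrence: (occurrence.media_id, occurrence.position))
```

Each returned occurrence carries a position, which is what a player or document viewer would use to jump to the matching timestamp or page.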

 

