Overview
Artificial Intelligence has changed how organizations collect and distribute information. As uploading, targeted sharing, and access to content have become easier, organizations increasingly need advanced media processing capabilities for their content.
VIDIZMO offers AI-powered visual and audio insights for smarter analysis and search optimization using Amazon Rekognition and Amazon Transcribe. To learn more about its offerings, see Understanding Video Insights.
Follow the steps below to configure video insights in your VIDIZMO portal:
Before you start
- To configure video insights in the VIDIZMO Portal, make sure you log in as an administrator or manager.
- Generating and fetching video insights from AWS requires the Amazon Rekognition Full Access and Amazon Transcribe Full Access permissions. To learn more about permission requirements, see Amazon Rekognition Identity-Based Policy Examples.
- Ensure that you have the Redaction product package for redaction purposes.
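As a rough illustration of the permissions prerequisite, the sketch below compares a list of attached policy names against the two AWS managed policies that correspond to the permissions named above. The function name and the sample input are hypothetical; in practice the attached policies would come from the IAM console or API.

```python
# AWS managed policies matching the permissions listed above.
REQUIRED_POLICIES = {
    "AmazonRekognitionFullAccess",
    "AmazonTranscribeFullAccess",
}


def missing_policies(attached_policy_names):
    """Return the required policies that are not in the attached list (sorted)."""
    return sorted(REQUIRED_POLICIES - set(attached_policy_names))


# Hypothetical example: a user that only has Rekognition access so far.
print(missing_policies(["AmazonRekognitionFullAccess"]))
```

An empty result means both required policies are attached.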
AWS Indexer Configuration
1. Use this tutorial to get your AWS account's Access Key and Secret Access Key. You will need these keys in the VIDIZMO Portal Configuration section, so keep them in a safe place.
2. Create an Amazon S3 bucket. For instructions, see: How do I create an S3 Bucket. This storage bucket is a prerequisite for the next section.
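If you prefer to script the bucket creation in step 2, the following minimal sketch prepares the arguments for the boto3 S3 client's create_bucket call. The bucket name and Region are placeholders, the helper name is hypothetical, and the name check is a simplified version of the S3 naming rules, so treat it as a starting point rather than a complete validator.

```python
import re

# Simplified version of the S3 general-purpose bucket naming rules:
# 3-63 characters, lowercase letters, digits, dots, and hyphens,
# starting and ending with a letter or digit.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for the boto3 S3 client's create_bucket call."""
    if not BUCKET_NAME_RE.match(bucket_name):
        raise ValueError(f"Invalid bucket name: {bucket_name!r}")
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default Region and must not be passed as a constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


# With boto3 installed and credentials configured, the call would look like:
#   boto3.client("s3", region_name=region).create_bucket(**kwargs)
print(create_bucket_kwargs("my-video-insights-bucket", "eu-west-1"))
```

Note the Region you choose here, because the same Region must be selected later in the AWS Credentials section of the portal.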
VIDIZMO Portal Configuration
I. From the Portal's Homepage:
1. Click on the menu icon on the top left-hand corner of the screen to bring up the left navigation pane.
2. Then click on the down arrow to expand the Admin section.
3. Select Portal Settings from the navigation panel.
II. On the Portal Settings navigation pane:
1. Click Apps to expand the list.
2. Select Content Processing, where you can set up AWS Indexer.
3. Click the settings icon against AWS Indexer to connect its app to VIDIZMO and enable its services in your portal.
III. From the AWS Indexer - Settings screen:
The AWS Indexer App includes a range of preconfigured jobs, each supporting a specific feature. The configuration fields adjust dynamically, revealing options based on the features you select.
The settings are grouped into separate sections, each described below.
AWS Credentials
- Enter or paste the Access Key you copied in Step 1 of AWS Indexer Configuration.
- Enter or paste the Secret Access Key you copied in Step 1 of AWS Indexer Configuration.
- Select the Region of your Amazon S3 storage bucket. The Region of your Amazon account and S3 bucket must be the same. Otherwise, the Video Insights will not be generated.
- Enter the name of your already created AWS S3 bucket.
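Because a Region mismatch prevents insights from being generated, it can help to confirm the bucket's Region before saving. The sketch below normalizes the LocationConstraint value that the S3 GetBucketLocation API returns and compares it with the Region selected in the portal. The function names are hypothetical, and fetching the value (for example with boto3's get_bucket_location) is left out so the snippet runs without AWS access.

```python
def bucket_region(location_constraint):
    """Normalize the LocationConstraint returned by S3 GetBucketLocation."""
    # S3 reports buckets in us-east-1 with an empty (None) constraint,
    # and some older eu-west-1 buckets with the legacy value "EU".
    if not location_constraint:
        return "us-east-1"
    if location_constraint == "EU":
        return "eu-west-1"
    return location_constraint


def region_matches(selected_region, location_constraint):
    """True if the Region selected in the portal matches the bucket's Region."""
    return selected_region == bucket_region(location_constraint)


# Example: the portal is set to us-west-2 but the bucket lives in eu-central-1.
print(region_matches("us-west-2", "eu-central-1"))
```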
Transcription/CC
1. Choose the Transcription Language Mode from the following options:
- Specific Language: Opt for this mode if your media file consists of a single language. Here, you can designate a single language and generate your transcript exclusively in that chosen language.
- Auto Detect: This mode automatically identifies the primary language spoken in your media file if you're unsure about its language content. It then generates the transcript based on the identified language. For improved transcription accuracy, you have the option to select at least two dominant languages manually. If uncertain, you can leave this field empty.
- Auto-Detect Multi-Language: Enable this option if your media file includes multiple languages. This feature automatically identifies the languages spoken in your media file and generates the transcript accordingly. Additionally, you can manually select at least two primary languages to improve transcription accuracy if desired. If uncertain, you can leave this field empty.
2. Choose the languages for transcription generation. This selection is optional for the Auto Detect and Auto-Detect Multi-Language modes but required for Specific Language mode.
3. Custom Vocabulary Name: Use custom vocabularies to improve transcription accuracy for specific words. Enter the name of the custom vocabulary you created in the AWS Management Console. For further information, consult the documentation on custom vocabularies.
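The three language modes correspond closely to parameters of the Amazon Transcribe StartTranscriptionJob API. The sketch below shows one plausible mapping; how VIDIZMO builds the request internally is not documented here, so the function name and the mode labels ("specific", "auto", "auto_multi") are assumptions made for illustration.

```python
def transcribe_language_params(mode, languages=(), vocabulary_name=None):
    """Map portal-style language settings to StartTranscriptionJob parameters.

    The mode labels are hypothetical stand-ins for the three portal options.
    """
    languages = list(languages)
    if mode == "specific":
        if len(languages) != 1:
            raise ValueError("Specific Language mode needs exactly one language.")
        params = {"LanguageCode": languages[0]}
        if vocabulary_name:
            params["Settings"] = {"VocabularyName": vocabulary_name}
        return params
    if mode not in ("auto", "auto_multi"):
        raise ValueError(f"Unknown mode: {mode!r}")
    # Transcribe accepts LanguageOptions only with two or more language codes.
    if len(languages) == 1:
        raise ValueError("Select at least two languages or leave the field empty.")
    flag = "IdentifyLanguage" if mode == "auto" else "IdentifyMultipleLanguages"
    params = {flag: True}
    if languages:
        params["LanguageOptions"] = languages
    # With language identification, Transcribe takes custom vocabularies per
    # language through LanguageIdSettings; that is omitted from this sketch.
    return params


print(transcribe_language_params("auto", ["en-US", "es-US"]))
```

This also illustrates why the language field accepts either no selection or at least two languages in the auto-detect modes.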
Advanced Processing Option
- Comprehend: Enable this toggle to use the Amazon Comprehend API for detection, which helps you detect more PII entities. Comprehend does not support languages other than English and Spanish.
- Insights: Select the detection types you wish to identify, such as keywords, label detection, face recognition, tags, Personally Identifiable Information (PII), or PPE. Note that if no language is specified in the Auto Detect or Auto-Detect Multi-Language Transcription Language Mode, PII entities may not appear in the detection types.
- When you select Personally Identifiable Information (PII), the Redaction Type field becomes available, where you can specify the PII elements that require redaction.
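The language restriction on Comprehend can be expressed as a simple check. This is a minimal sketch that assumes language codes in the locale style used by Amazon Transcribe (for example en-US); the function name is hypothetical.

```python
# Languages for which Comprehend-based PII detection is available,
# as stated in this article: English and Spanish.
COMPREHEND_PII_LANGUAGES = {"en", "es"}


def pii_detection_supported(language_code):
    """True if the base language of a code such as 'en-US' is English or Spanish."""
    return language_code.split("-")[0].lower() in COMPREHEND_PII_LANGUAGES


for code in ("en-US", "es-ES", "fr-FR"):
    print(code, pii_detection_supported(code))
```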
Personally Identifiable Information (PII)
If the detection type selected is PII, the subsequent options essential for PII detection and redaction will become visible.
- Store Raw Data File: Enable this toggle to store the raw data files that the AI services generate during processing. These files contain detailed information, such as the spoken content broken down into parts with attributes like pronunciation, confidence levels, and timing.
- Confidence Threshold: The default value is 45, and you can adjust it within the range of 25-99. The indexer uses this value to decide whether a detected entity is accurate enough to be redacted.
- Time Interval Threshold: Enter the time interval threshold in milliseconds (ms) for audio redaction. This value determines the duration within which audio events are identified for redaction. The configured start time and end time corrections are applied to audio segments that fall within this interval. Choose the duration based on your use case: shorter for speech, longer for environmental sounds such as a door slamming or a car passing.
- Start Time Correction: Enter the start time correction in milliseconds (ms). This parameter adjusts the detected start time of audio events for redaction purposes.
- End Time Correction: Enter the end time correction in milliseconds (ms). This parameter adjusts the detected end time of audio events for redaction purposes.
- From the dropdown menu, select the Original File Preservation Preferences to determine the fate of the original file post-redaction. This functionality is applicable exclusively when automatic processing is enabled and the content file is uploaded onto the portal for automated redaction.
- Retain File: Opting for this setting will preserve the original file within the portal alongside the redacted version in the published tab.
- Delete and Move to Recycle bin: Selecting this option results in the deletion of the original file, relocating it to the recycle bin, while the redacted file remains accessible in the published tab of the portal.
- Overridden Original File: With this option, when you upload a file with automatic processing enabled, the system performs redaction directly on that file. No separate original file is kept; the original content is replaced by the redacted content during processing.
Note: When you use the AWS S3 Bucket app in VIDIZMO to ingest content and automatically redact it at the same time, the original file preservation preferences work as described above. The content is ingested according to the settings configured for the AWS S3 bucket, and original file preservation applies only to automated redaction.
Note: If you're using the Auto Detect or Auto Detect Multi-Language features to redact PII, keep in mind that if you don't choose a language, PII won't be detected in audio spoken in languages other than English and Spanish. This is because Comprehend doesn't support languages other than English and Spanish. However, transcription will still be generated. Also, make sure to enable the Comprehend toggle in the AWS indexer so that PII AI insights show up in the on-demand processing modal.
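To make the interplay of the Confidence Threshold, Time Interval Threshold, and time corrections more concrete, here is a hedged sketch of how such settings could be combined. It is not VIDIZMO's implementation. It assumes that entities below the confidence threshold are skipped, that corrections apply only to segments no longer than the interval threshold, and that corrections are signed offsets added to the segment boundaries. The entity field names are also hypothetical.

```python
def redaction_windows(entities, interval_threshold_ms, confidence_threshold=45,
                      start_correction_ms=0, end_correction_ms=0):
    """Return (start_ms, end_ms) audio windows to redact.

    Assumptions (not confirmed by the product documentation):
    - confidence is on a 0-100 scale and must reach the threshold;
    - corrections apply only to segments no longer than the interval threshold;
    - corrections are signed offsets added to the detected boundaries.
    """
    if not 25 <= confidence_threshold <= 99:
        raise ValueError("Confidence Threshold must be between 25 and 99.")
    windows = []
    for entity in entities:
        if entity["confidence"] < confidence_threshold:
            continue  # not accurate enough to redact
        start, end = entity["start_ms"], entity["end_ms"]
        if end - start <= interval_threshold_ms:
            start = max(0, start + start_correction_ms)
            end = end + end_correction_ms
        windows.append((start, end))
    return windows


# Hypothetical detections: one confident, one below the default threshold of 45.
detections = [
    {"confidence": 90, "start_ms": 1000, "end_ms": 1400},
    {"confidence": 30, "start_ms": 5000, "end_ms": 5300},
]
print(redaction_windows(detections, interval_threshold_ms=500,
                        start_correction_ms=-100, end_correction_ms=150))
```

In this example only the first detection is kept, and its window is widened from 1000-1400 ms to 900-1550 ms.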
Personal Protective Equipment (PPE)
If the detection type selected is PPE, the subsequent options essential for PPE detection will become visible.
- Frames Frequency per second: The number of frames per second of video time that should be processed for PPE. The system extracts frames at this frequency, which you can set to a value from 1 to 10.
- Min Track Detection Duration: The minimum duration, in seconds, for a tracked person. Only tracked "people" with a duration higher than this value are checked for PPE.
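The two PPE settings can be pictured as a sampling step followed by a filter. The sketch below is an illustration only: the function names and the track fields (start_s, end_s) are hypothetical, and the actual frame extraction in the indexer may differ.

```python
def sample_timestamps_ms(duration_s, frames_per_second):
    """Timestamps (ms) of the frames sampled at the configured frequency."""
    if not 1 <= frames_per_second <= 10:
        raise ValueError("Frames Frequency per second must be between 1 and 10.")
    frame_count = int(duration_s * frames_per_second)
    return [i * 1000 // frames_per_second for i in range(frame_count)]


def tracks_to_check(tracks, min_track_duration_s):
    """Keep only tracked people whose duration exceeds the minimum."""
    return [t for t in tracks if t["end_s"] - t["start_s"] > min_track_duration_s]


# Hypothetical example: a 2-second clip sampled at 2 frames per second,
# and two tracked people, only one of whom is visible for more than 2 seconds.
print(sample_timestamps_ms(2, 2))
tracks = [
    {"id": 1, "start_s": 0, "end_s": 5},
    {"id": 2, "start_s": 3, "end_s": 4},
]
print(tracks_to_check(tracks, min_track_duration_s=2))
```

A higher frame frequency gives finer coverage at the cost of more frames to process, while a higher minimum duration skips people who appear only briefly.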
Automatic Processing
- Automatic Processing: When this is On, insights run automatically on all videos. When it is Off, you must run insights manually on each video after uploading it.
- Click on the Save Changes button.
IV. From the Content Processing screen:
1. The toggle button against the indexer is now unlocked. Enable the toggle button to turn on video insights in your portal.
A notification will appear briefly stating that App Settings Updated Successfully.
Read Next
How to Enable and View Video Insights
How to Configure Video Insights in VIDIZMO Portal using Azure Media Indexer