Overview

VIDIZMO, a platform for enterprise video content and digital evidence management, integrates with AWS S3 so that organizations can ingest multimedia content stored in S3 buckets directly into the VIDIZMO Portal.

This article provides a step-by-step guide to ingesting content from an S3 bucket into VIDIZMO.


Prerequisites 

  • Administrator or Manager privileges in the VIDIZMO Portal, which are required to perform the necessary configurations.
  • Access to an active AWS account with the permissions needed to create and manage an S3 bucket.
  • An AWS S3 bucket from which you want to ingest content into the VIDIZMO Portal. Note the S3 bucket name, Access Key, and Secret Key, as these are required during configuration.
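
Before entering these values in the Portal, it can help to confirm that the keys reach the bucket and that the bucket sits in the region you expect. The following is a minimal sketch and not part of VIDIZMO. The function name bucket_region_matches is hypothetical, and it assumes you pass in an object that behaves like a boto3 S3 client (head_bucket and get_bucket_location are the boto3 calls it relies on).

```python
def bucket_region_matches(s3_client, bucket_name, expected_region):
    """Return True if the bucket is reachable and lives in expected_region.

    s3_client is assumed to behave like a boto3 S3 client, for example one
    created with boto3.client("s3", aws_access_key_id=...,
    aws_secret_access_key=..., region_name=expected_region).
    """
    # head_bucket raises an error if the bucket is missing or access is denied.
    s3_client.head_bucket(Bucket=bucket_name)

    location = s3_client.get_bucket_location(Bucket=bucket_name)
    # S3 reports buckets in us-east-1 with an empty LocationConstraint.
    actual_region = location.get("LocationConstraint") or "us-east-1"
    return actual_region == expected_region
```

If the function returns False, the region you plan to enter in the S3 Bucket App probably differs from the bucket's actual region.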


Note: In the VIDIZMO Portal, you can set up a storage provider of your choice. Configuring the AWS storage provider specifically is not required in order to ingest content from an AWS S3 bucket.


Ingesting Content from S3 Bucket

To ingest content from an S3 bucket into VIDIZMO, follow these steps:

  1. Log in to the Portal and click the Menu icon in the top-left corner of the screen to open the left navigation pane.

  2. Expand the Admin section by clicking the down arrow.

  3. Click Portal Settings in the navigation pane.




  4. On the Portal Settings navigation pane, select Apps.
  5. Go to Content Ingestion.
  6. Select the gear icon to configure the S3 Bucket App.

Configuring S3 Bucket App



  1. Enter the Access Key of the AWS account that contains the S3 bucket from which you intend to ingest content. Together with the Secret Key, it is used to authenticate and access the specified AWS resources.
  2. Enter the Secret Key of the same AWS account.
  3. Specify the AWS region. The region entered here must match the region of the S3 bucket.
  4. Provide the name of the AWS S3 bucket from which you intend to ingest content.
  5. Ingestion Method
    1. Item Import Mode: Select the import mode for displaying ingested content in VIDIZMO. Choose "Hierarchy" to maintain the original folder structure or "Flat" to import each item individually without folders.
    2. Click the Add button to include folders in content ingestion.
    3. Specify the path of the folder in the text field that appears. Example: "Users\Username1\Documents\ProjectFiles".
    4. To add more folders, click the Add button again. Each click generates an additional text field for another folder.
    5. Similarly, specify the path of any folder you want to exclude from content ingestion.
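
The effect of included and excluded folders can be pictured as a prefix filter over the object keys in the bucket. The sketch below is illustrative only and does not describe VIDIZMO's internal logic. The function names are hypothetical, and it assumes that an empty include list means "everything" and that an excluded folder takes precedence over an included one.

```python
def normalize_folder(path):
    """Convert a folder path into an S3-style key prefix ending in '/'."""
    return path.replace("\\", "/").strip("/") + "/"


def select_keys(keys, include_folders, exclude_folders):
    """Return the keys that fall under an included folder and no excluded one."""
    includes = [normalize_folder(p) for p in include_folders]
    excludes = [normalize_folder(p) for p in exclude_folders]

    selected = []
    for key in keys:
        # Assumption: with no included folders, every key is a candidate.
        included = not includes or any(key.startswith(p) for p in includes)
        # Assumption: exclusion wins when a key matches both lists.
        excluded = any(key.startswith(p) for p in excludes)
        if included and not excluded:
            selected.append(key)
    return selected


keys = ["Projects/a.mp4", "Projects/Old/b.mp4", "Other/c.mp4"]
print(select_keys(keys, ["Projects"], ["Projects\\Old"]))
```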


Note: Files in the S3 bucket whose names contain special characters may not be ingested. Special characters include, but are not limited to, symbols, spaces, and certain non-alphanumeric characters. Files with such names are skipped during ingestion.
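
To find files that are likely to be skipped, you could scan the bucket's object keys before enabling ingestion. The sketch below is one way to do that. The exact set of characters VIDIZMO rejects is not listed in this article, so the "safe" set used here (letters, digits, period, underscore, hyphen) is an assumption, and the function name is hypothetical.

```python
import re

# Assumption: only these characters are considered safe in a file name.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def has_special_characters(key):
    """Return True if the file-name part of an S3 key has characters outside the safe set."""
    file_name = key.rsplit("/", 1)[-1]
    return SAFE_NAME.fullmatch(file_name) is None


for key in ["videos/intro_clip.mp4", "videos/intro clip.mp4", "videos/take#2.mp4"]:
    print(key, has_special_characters(key))
```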



6. Content Organization Method 

As part of your content ingestion process, you can configure file grouping to organize your files better. Choose a File Group Type to determine your preferred method for organizing content:

  • None: By default, the system groups files with the same name together.
  • Substring: Group files based on a common substring of the file name, defined by a start index and a character count.
  • Regular Expression: Group files using a specific pattern.
  • Last Folder: Group files based on the last folder in the file path.

Note: To group files by a File Group Type such as Substring, Regular Expression, or Last Folder, the file or folder names being grouped must share a common part. Without such commonality, the content may not be ingested successfully. For example, audio123.mp3, audio45.vtt, and audio89.mp4 share the prefix "audio".


If you selected Substring, configure the following fields:

  1. Start Index: Specify the numeric start index for substring grouping. The substring is extracted from the file name starting at this position.
  2. Character Count: Specify the number of characters to take from the file name, beginning at the Start Index.
  3. Minimum File Count in a Group: Set the minimum number of files required to form a group. This field is mandatory regardless of the File Group Type chosen. For optimal content ingestion in VIDIZMO, a minimum count of 3 files per group is recommended.
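
The three fields above can be read as a simple slicing rule. The following sketch shows the idea and is not VIDIZMO's implementation. The function name is hypothetical, and it assumes that the Start Index is zero-based and that groups with fewer files than the minimum are not formed.

```python
import posixpath
from collections import defaultdict


def group_by_substring(keys, start_index, char_count, min_file_count):
    """Group S3 keys by a fixed slice of each file name."""
    groups = defaultdict(list)
    for key in keys:
        file_name = posixpath.basename(key)
        # Assumption: start_index counts from 0.
        group_name = file_name[start_index:start_index + char_count]
        groups[group_name].append(key)

    # Assumption: groups below the minimum file count are discarded.
    return {name: files for name, files in groups.items()
            if len(files) >= min_file_count}


keys = ["case01_video.mp4", "case01_audio.wav", "case01_notes.txt", "case02_video.mp4"]
print(group_by_substring(keys, start_index=0, char_count=6, min_file_count=3))
```

With a Start Index of 0 and a Character Count of 6, the three "case01" files share a group name, while the single "case02" file falls below the minimum of 3.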



If you chose Regular Expression, configure the following fields:

  1. Regex Pattern: Define the regex pattern for grouping media files. 

Sample Regex: (?<GroupName>(\d|[a-zA-Z])+)\.(mp4|vtt|json|txt|wav|png|ext)

The "?<GroupName>" part of the regular expression is mandatory. Immediately after it, provide a regex pattern that extracts the common string from the names of the files that belong to a group. In other words, the regex provided for grouping must contain "(?<GroupName>your_pattern)", where "your_pattern" is replaced with the desired pattern; the text it matches is then used as the group name. "?<GroupName>" acts as a variable that holds the name of the group to be created. A group is equivalent to a mashup, so multiple groups mean multiple mashups are ingested.


Users are responsible for verifying that the regex they enter is valid.

 

  • Example 1: (?<GroupName>[a-z].*)\.wav

The above regex uses as the group name the part of the file name that begins with a lowercase letter and ends just before the ".wav" extension. This is because the "(?<GroupName>...)" group appears to the left of "\.".

  • Example 2: .*(?<GroupName>best).*\.wav

The above regex creates a group from file names containing the word "best". The group name will be "best".

  • Example 3: .*(?<GroupName>best|ant).*\.wav

The above regex creates a group from the file name containing either "best" or "ant". Files containing "best" would be grouped in the "best" group, and the same applies to "ant".

 

Replace the ".wav" extension with the extension of the files you wish to group. Multiple patterns can be given, separated by the pipe operator "|".


  2. Minimum File Count in a Group: Set the minimum number of files in a group.
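
One way to check a grouping regex before entering it in the Portal is to run it against a few sample file names. The sketch below does this in Python and is only an approximation of how the Portal evaluates patterns. Python's re module writes named groups as (?P<GroupName>...) rather than (?<GroupName>...), so the hypothetical helper to_python_pattern rewrites the pattern first. Searching anywhere in the file name, rather than requiring a full match, is also an assumption.

```python
import re
from collections import defaultdict


def to_python_pattern(pattern):
    """Rewrite the article's (?<GroupName>...) syntax into Python's (?P<GroupName>...)."""
    return pattern.replace("(?<GroupName>", "(?P<GroupName>")


def group_by_regex(file_names, pattern):
    """Group file names by the text captured in the GroupName group."""
    compiled = re.compile(to_python_pattern(pattern))
    groups = defaultdict(list)
    for name in file_names:
        match = compiled.search(name)  # Assumption: unanchored search.
        if match:
            groups[match.group("GroupName")].append(name)
    return dict(groups)


files = ["my_best_song.wav", "ant_farm.wav", "best_of.wav", "intro.wav"]
print(group_by_regex(files, r".*(?<GroupName>best|ant).*\.wav"))
```

Here the pattern from Example 3 places the two "best" files in one group and "ant_farm.wav" in another, while "intro.wav" matches neither alternative and is left ungrouped.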



If you chose Last Folder, configure this field:

  1. Minimum File Count in a Group: This is the only field required for the Last Folder type. Set the minimum number of files in a group.
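
Last Folder grouping can be pictured as using the name of the folder that directly contains each file as the group name. The sketch below illustrates that reading. The function name is hypothetical, and dropping groups that fall below the minimum file count is an assumption.

```python
import posixpath
from collections import defaultdict


def group_by_last_folder(keys, min_file_count):
    """Group S3 keys by the name of the folder that directly contains each file."""
    groups = defaultdict(list)
    for key in keys:
        last_folder = posixpath.basename(posixpath.dirname(key))
        groups[last_folder].append(key)

    # Assumption: groups below the minimum file count are discarded.
    return {folder: files for folder, files in groups.items()
            if len(files) >= min_file_count}


keys = ["cases/case01/video.mp4", "cases/case01/audio.wav", "cases/case02/video.mp4"]
print(group_by_last_folder(keys, min_file_count=2))
```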



7. Mapping Rules for Metadata Files


Media File Sections Rule: Define rules for mapping associated metadata files post-content ingestion.

  1. Regex Pattern: Specify the regex pattern for the media file section rule. Example: the regex pattern .*\.mp4 selects all .mp4 files and stores them in the chosen Media File Section.
  2. Media File Sections: From the drop-down menu, choose the media file section in which the matching files are stored. Press "Add" to add further rules.

Media File Sections is a VIDIZMO function dedicated to ensuring the integrity of user files. These sections consist of content file parts, each responsible for storing specific types of files. Users can define rules to determine which file type belongs to which part. Multiple rules can be specified, each with its own criteria and associated content file part. If a file meets any of the provided criteria, it is placed into the corresponding content part.

Note: To include all files, input ".*" in the Regex pattern field.

The flexibility of this function allows users to store files in various formats, including .vtt and .json. No strict linkage exists between a file format and a specific part; it is the user's choice to specify which format of a file should be placed in a particular part. For instance, a user might instruct the system to move .vtt files to the Thumbnails part of the content file. However, users need to exercise caution when defining rules to avoid placing files in parts that are not intended for such formats, as it may lead to malfunctioning.

The available media file sections are as follows:
  • Audio PCM: Reserved for digitally encoded audio data using PCM.
  • Closed Caption:  Designated to store closed captions associated with video content.
  • Content: A section dedicated to the primary content files.
  • Supporting Files: This section is capable of storing files that support the main content, such as metadata, additional documentation, or related files.
  • Thumbnails: Designated for storing thumbnail images associated with the content.
  • Original Content: Reserved for the storage of the original content file.


Note: At least one media file section rule that uses the "Original Content" section is mandatory. Moreover, if no media file section rule applies to a file, the file is placed in the Supporting Files section.
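
The rule mechanism described in this section can be sketched as an ordered list of (regex, section) pairs. The code below is an illustration rather than VIDIZMO's implementation. The function names are hypothetical, and it assumes that the first matching rule wins and that a pattern must match the whole file name.

```python
import re

DEFAULT_SECTION = "Supporting Files"


def validate_rules(rules):
    """Raise ValueError unless at least one rule targets Original Content."""
    if not any(section == "Original Content" for _, section in rules):
        raise ValueError("at least one rule must use the Original Content section")


def section_for(file_name, rules):
    """Return the media file section for a file, or the default if no rule matches."""
    for pattern, section in rules:
        # Assumption: the first rule whose pattern matches the whole name wins.
        if re.fullmatch(pattern, file_name):
            return section
    return DEFAULT_SECTION


rules = [(r".*\.mp4", "Original Content"), (r".*\.vtt", "Closed Caption")]
validate_rules(rules)
for name in ["talk.mp4", "talk.vtt", "notes.pdf"]:
    print(name, "->", section_for(name, rules))
```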



8. Post Ingestion

  1. Select the post-ingestion action from the drop-down menu:
    • Leave as is: The content in the bucket remains unchanged after ingestion.
    • Delete: The content in the bucket is deleted automatically after it has been successfully ingested.
    • Move to Folder: The content is relocated to a designated folder path within a target AWS bucket after ingestion. Selecting this option requires the following additional information:
      1. AWS Post Ingestion Bucket: Enter the name of the AWS bucket to which the content will move after ingestion.
      2. Folder Path: Specify the S3 bucket directory to which the content should be relocated after it is ingested into the portal. Example: folder/subfolder.

  2. Use the drop-down menu to specify the content state after ingestion:
    • Publish: Automatically publish ingested content.
    • Drafted: Retain ingested content in the draft tab.

  3. Choose the viewing access for ingested content.

  4. Specify the time interval, in seconds, for which the system rests with no active tasks or operations after completing one ingestion cycle.
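
The dependencies between these post-ingestion fields can be summarized as a small checklist. The sketch below models them as a data class for planning purposes only. The class and field names are hypothetical and are not VIDIZMO identifiers, and the rule that the interval must not be negative is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

ACTIONS = ("Leave as is", "Delete", "Move to Folder")
CONTENT_STATES = ("Publish", "Drafted")


@dataclass
class PostIngestionSettings:
    action: str
    content_state: str
    interval_seconds: int
    post_ingestion_bucket: Optional[str] = None
    folder_path: Optional[str] = None

    def problems(self) -> List[str]:
        """Return a list of human-readable problems; empty means consistent."""
        issues = []
        if self.action not in ACTIONS:
            issues.append(f"unknown action: {self.action!r}")
        if self.content_state not in CONTENT_STATES:
            issues.append(f"unknown content state: {self.content_state!r}")
        if self.interval_seconds < 0:
            issues.append("interval must not be negative")
        # Move to Folder is the only action that needs a target location.
        if self.action == "Move to Folder":
            if not self.post_ingestion_bucket:
                issues.append("Move to Folder requires a post-ingestion bucket")
            if not self.folder_path:
                issues.append("Move to Folder requires a folder path")
        return issues


print(PostIngestionSettings("Move to Folder", "Publish", 60).problems())
```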


 

9. Select Save Changes.


Enabling S3 Bucket App

  1. Initiate the content ingestion process by enabling the toggle button on the content ingestion screen.



  2. Click the progress option to view the status of content ingestion.



The application operates in three distinct states: 

  • Iteration Start: This initial state indicates that the ingestion process has started. As content begins transferring from the AWS S3 bucket to VIDIZMO, the current state shows Importing Content.
  • Importing Content: In this phase, the modal displays the following information on the content being ingested:
    • Files Discovered to Ingest: The count of files identified in the bucket for ingestion into the portal.
    • Ingested Content Count: The number of files ingested so far, along with the content file parts that will be ingested into Media File Sections.
    • Total Content to ingest in iteration: The overall content count, reflecting the total number of files in the system in an iteration.
  • Iteration Completed: The final state signifies the completion of content ingestion from the AWS S3 bucket to VIDIZMO.



For a complete understanding of this feature, refer to our article Ingesting Content from an AWS S3 Bucket in VIDIZMO.