Introduction

Protecting sensitive personal data is critical. VIDIZMO's Spoken PII Detection and Redaction feature lets users upload audio or video files, which are then automatically scanned for Personally Identifiable Information (PII) such as names, addresses, Social Security numbers (SSNs), phone numbers, and other sensitive data. The system detects speech and extracts any PII in the content that falls within the pre-defined categories. Once detected, the sensitive information is automatically redacted.


Concept

VIDIZMO streamlines PII detection and redaction by letting users upload batches of audio and video files to the VIDIZMO portal. If a media file includes transcribed content, the system automatically detects and redacts PII in the media playback and in the associated transcription file when the file is uploaded.


Within VIDIZMO, you can also redact specific audio segments from any video or audio file by providing the exact start and end times of each segment. Any redacted content is also removed from the associated transcription.
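As an illustration of timing-based redaction, the following minimal Python sketch masks every transcript word that overlaps a redacted time range. It is not VIDIZMO's implementation; the Word class, the redact_transcript function, and the "[REDACTED]" mask are hypothetical names chosen for this example, and times are assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One transcript word with its timing (hypothetical structure)."""
    text: str
    start: float  # seconds from the beginning of the media
    end: float    # seconds from the beginning of the media


def overlaps(word: Word, seg_start: float, seg_end: float) -> bool:
    """True when the word's time span intersects the segment."""
    return word.start < seg_end and word.end > seg_start


def redact_transcript(words: List[Word],
                      segments: List[Tuple[float, float]],
                      mask: str = "[REDACTED]") -> str:
    """Replace words inside any redacted segment with a single mask token."""
    out: List[str] = []
    for w in words:
        if any(overlaps(w, s, e) for s, e in segments):
            # Collapse a run of redacted words into one mask token.
            if not out or out[-1] != mask:
                out.append(mask)
        else:
            out.append(w.text)
    return " ".join(out)


words = [
    Word("my", 0.0, 0.2), Word("name", 0.2, 0.5), Word("is", 0.5, 0.6),
    Word("Jane", 0.6, 1.0), Word("Doe", 1.0, 1.4), Word("thanks", 1.5, 2.0),
]
print(redact_transcript(words, [(0.6, 1.4)]))
```

The same (start, end) pairs would be the ones used to mute or bleep the audio track, so the transcript and the playback stay consistent.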


The approach for handling Personally Identifiable Information (PII) within speech differs from other redaction use cases. Dedicated articles on this topic can be found in the guide titled "How to Redact Audio."


The Spoken PII feature automates the redaction of sensitive personal information. It operates on batches of audio or video files stored in an Amazon S3 bucket, without requiring any transcoding. When a user uploads a batch of audio or video files, the system automatically runs detection to identify any PII in the content. It then redacts that content and generates a redacted transcription at the same time.


To enable this capability within VIDIZMO, integration with the AWS Indexer application is required. VIDIZMO uses Amazon Transcribe for this integration.
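For readers curious about what PII redaction looks like at the Amazon Transcribe level, the sketch below builds the parameters for a transcription job with content redaction turned on. This article does not state the exact request the AWS Indexer sends, so treat this as an assumption-laden illustration: the function name, job name, and S3 URI are hypothetical, while the ContentRedaction fields follow Amazon Transcribe's StartTranscriptionJob API.

```python
from typing import Dict, List, Optional

# Entity types accepted by Amazon Transcribe's PiiEntityTypes field.
TRANSCRIBE_PII_TYPES = {
    "BANK_ACCOUNT_NUMBER", "BANK_ROUTING", "CREDIT_DEBIT_NUMBER",
    "CREDIT_DEBIT_CVV", "CREDIT_DEBIT_EXPIRY", "PIN", "EMAIL",
    "ADDRESS", "NAME", "PHONE", "SSN", "ALL",
}


def build_redaction_job_request(job_name: str,
                                media_uri: str,
                                entity_types: Optional[List[str]] = None) -> Dict:
    """Build (but do not send) a StartTranscriptionJob request with PII redaction."""
    types = list(entity_types) if entity_types else ["ALL"]
    unknown = [t for t in types if t not in TRANSCRIBE_PII_TYPES]
    if unknown:
        raise ValueError(f"Unsupported PII entity types: {unknown}")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",            # the feature supports English only
        "Media": {"MediaFileUri": media_uri},
        "ContentRedaction": {
            "RedactionType": "PII",
            "RedactionOutput": "redacted",  # only the redacted transcript is produced
            "PiiEntityTypes": types,
        },
    }


# Hypothetical job name and bucket, for illustration only.
request = build_redaction_job_request(
    "support-call-001", "s3://example-bucket/calls/call-001.mp3", ["NAME", "SSN"]
)
print(request["ContentRedaction"])

# Sending the request would require AWS credentials and the boto3 package:
# import boto3
# boto3.client("transcribe").start_transcription_job(**request)
```

Building the request separately from sending it keeps the example runnable without AWS access and makes the redaction settings easy to inspect.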


On-Demand PII Detection and Redaction 

In VIDIZMO, users can initiate automatic Personally Identifiable Information (PII) detection and redaction from the process modal of a media file, which allows on-demand processing. This gives administrators precise control over PII detection and processing at the time video and audio files are uploaded. To do so, use the Processing Option available in the media file settings in the library or in the portal settings.

 

Once enabled, both moderators and regular users can upload media files and then select PII detection and redaction preferences in the media settings tab. This allows PII redaction to be applied even after media files have been uploaded to the portal, so privacy and compliance measures are maintained throughout the content management process.


PII Detection and Redaction in Studio Space


In VIDIZMO's Studio Space, users can access a suite of tools for managing Personally Identifiable Information (PII) within their media content. Users can search for PII by PII type, giving them precise control over sensitive data within their media assets.


Within Studio Space, detected PII is presented as audio segments that users can review and manage. Users can edit or delete automatically detected PII segments to meet privacy regulations and content security requirements.


These edits and deletions can also be applied to the transcription, keeping it consistent with the redacted media.


Users can also filter PII segments to refine the review process. The filter selects PII segments based on their confidence scores, which indicate how confident the system is that a detected segment is PII. Users can adjust the confidence score used by the filter as needed, tailoring PII detection to their specific requirements.
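Conceptually, confidence-based filtering amounts to keeping only the segments whose score meets a chosen threshold, optionally restricted to certain PII types. The Python sketch below illustrates the idea; the PiiSegment structure, the filter_segments function, and the 0-to-1 score scale are assumptions made for this example and do not describe Studio Space internals.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class PiiSegment:
    """A detected PII audio segment (hypothetical structure)."""
    entity_type: str
    start: float       # seconds
    end: float         # seconds
    confidence: float  # assumed to range from 0.0 to 1.0


def filter_segments(segments: Iterable[PiiSegment],
                    min_confidence: float = 0.0,
                    entity_types: Optional[Set[str]] = None) -> List[PiiSegment]:
    """Keep segments at or above the threshold, optionally limited by type."""
    return [
        s for s in segments
        if s.confidence >= min_confidence
        and (entity_types is None or s.entity_type in entity_types)
    ]


detected = [
    PiiSegment("NAME", 0.6, 1.4, 0.97),
    PiiSegment("PHONE", 5.0, 7.2, 0.58),
    PiiSegment("SSN", 9.1, 11.0, 0.85),
]

high_confidence = filter_segments(detected, min_confidence=0.8)
print([s.entity_type for s in high_confidence])
```

Raising the threshold reduces false positives at the cost of possibly missing real PII, so lower-scoring segments are good candidates for manual review.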


VIDIZMO's PII detection features are designed to be user-friendly, helping users maintain compliance, improve data security, and streamline the management of sensitive information within media content.


To get familiar with the VIDIZMO Studio Space tool, see "Redaction using Studio Space: A Comprehensive Walkthrough."


Language Support

Currently, the Spoken PII feature supports the English language only.


PII Entity Types

A PII entity represents a distinct category of personally identifiable information (PII). In VIDIZMO, the system detects and redacts the following PII entities:


  • PERSON_NAME
  • ADDRESS
  • CREDIT_DEBIT_CVV
  • CREDIT_DEBIT_NUMBER
  • CREDIT_DEBIT_EXPIRY
  • EMAIL
  • CREDIT_DEBIT_PIN
  • PHONE_NUMBER
  • BANK_ACCOUNT_NUMBER
  • BANK_ROUTING_NUMBER
  • SSN


Enabling Advanced PII will result in the detection and redaction of additional PII entities:

  • BANK_ACCOUNT_NUMBER
  • BANK_ROUTING
  • CREDIT_DEBIT_NUMBER
  • CREDIT_DEBIT_CVV
  • CREDIT_DEBIT_EXPIRY
  • PIN
  • NAME
  • ADDRESS
  • PHONE
  • EMAIL
  • AGE
  • USERNAME
  • PASSWORD
  • URL
  • AWS_ACCESS_KEY
  • AWS_SECRET_KEY
  • IP_ADDRESS
  • MAC_ADDRESS
  • SSN
  • PASSPORT_NUMBER
  • DRIVER_ID
  • DATE_TIME


Use Cases

In enterprise operations, the detection and redaction of Personally Identifiable Information (PII) is a key part of data security and privacy management. A widely recognized application is customer support, particularly in call centers, telemarketing, and similar customer-facing contexts.


Consider customer support calls, for instance. Agents and customers routinely exchange sensitive information during these calls, so PII such as names and other personal details that surfaces in the conversation needs to be identified and removed promptly.


To address this requirement, organizations can use specialized services designed to recognize and redact PII. By automatically identifying and removing PII, businesses can preserve the confidentiality of customer interactions and support compliance with data protection regulations.


This use case underscores the critical role that PII detection and Redaction play in enhancing data security, privacy compliance, and overall operational integrity within customer support and similar customer-centric endeavors.


Next Step

For instructions on using this feature, follow our step-by-step guide titled "How to Detect and Redact PII in Speech."