
Face, Voice, and Emotion Recognition - Artificial Intelligence (AI) Tools

Rajiv Ramanjani | Senior AI Architect, FIS Global Automation and Testing Services

December 03, 2018

Benefits and Uses of AI Recognition Tools

This white paper highlights advanced Artificial Intelligence (AI) tools that are used for verification/recognition. This includes the ability of AI tools to recognize faces, voices, and even emotions. We include examples of where such tools are used in banking and other industries today and highlight potential benefits that can be achieved by their usage.

Digital Authentication Challenges and Opportunities

We live in an era of ever-expanding digital footprints, with more and more customers making digital transactions over the internet. Security is essential because most transactions involve a financial component as well as personally identifiable information. The costs of identity theft and fraud are rising every year, so authentication and verification solutions must stay ahead of emerging threats.

Digital applications and systems also solicit user responses, and they affect customer satisfaction. Such feedback is typically gathered through surveys and calls, which can at times annoy users. Emotion recognition provides a high-tech, automated alternative that supplies relevant data without sending surveys or bothering customers with unnecessary calls.

With the emergence of advanced Artificial Intelligence (AI) technologies, we can make system authentication and verification more personal by adding layers of security such as face and voice verification. Soliciting feedback can become immediate, automated, and personal by employing Emotion Recognition AI tools. An expanded use of this technology is customizing workflows based on the interpreted emotions of the end user/customer.

This white paper is organized as follows:

  • Face and Voice Recognition Benefits and Use Cases
  • Emotion Recognition Benefits and Use Cases
  • Case Study: Proposed FISecure Solution

Face and Voice Recognition Benefits and Use Cases

Following are various Face and Voice Recognition/Verification use cases, the benefits the AI solutions can provide, and actual usage in the financial industry today.

Online Payment Verification

Traditional online payment solutions rely on passwords, RSA tokens, security questions, etc. to authenticate and verify a genuine user. These mechanisms often depend on user memory and can be tedious at times. Face and Voice Verification gets around this challenge by relying on two biometric parameters to verify users: the face of the user and the user’s voice.

Industry example: WorldCore is an online payment service based in Prague that uses Face and Voice verification; it also gives customers the option of replacing their password with an easy-to-remember passphrase.

Caller Identification and Verification

Rather than relying on security questions/answers and entry of PINs, customer verification and workflow can be automated via voice recognition when a customer calls into a customer service center. Subsequent call flows can be quickly customized based on the identity of the verified customer.

Industry example: Upwire is an organization that does just this. Upwire provides a drag-and-drop Interactive Voice Response (IVR) workflow builder for communication workflows. This type of voice biometric authentication solution can be added to your institution’s call center workflow within minutes – without the need for any code.

Employee Time and Attendance

In many organizations, false employee log-ins, false attendance records, and “buddy punching” of time cards are common occurrences. For example, employees share their PINs so that a colleague can log in on their behalf and mark their attendance by proxy. One solution to this problem is fingerprint-based biometric attendance technology. Likewise, employees could use a simple mobile device for face- and voice-based check-ins and check-outs.

Industry example: TEAMSoftware has deployed a face- and voice-based Time and Attendance solution to automate timekeeping processes, control labor costs, and manage distributed workforces effectively.

U.S. Regulatory Compliant Healthcare Data Access

In the United States, the Health Insurance Portability and Accountability Act of 1996 (HIPAA) regulations related to patient information require that only authorized persons have access to patient data. While not directly tied to banking, banks may have access to employee-related healthcare information (e.g., for insurance/benefits purposes), and this data is protected under HIPAA. If institution users unofficially share credentials (such as user IDs and passwords), there is a potential breach of the HIPAA regulation. Biometric face and voice verification can be used to enforce compliance in this regard, ensuring that the user is who they purport to be and can access only the data that he or she is authorized to see.

Industry example: Ankota is a mobile app that uses biometric verification to ensure that patient data is accessed by authorized personnel only. Their Electronic Visit Verification (EVV) management software includes telephony, GPS, fixed number generator fobs, biometric authentication and verification schema.

Banking in India

Aadhaar is a 12-digit unique identification number issued by the Indian government to every individual resident of India. In 2017 the Indian government announced that Indian citizens would need to link Aadhaar to PAN (Permanent Account Numbers – issued by the Central Board of Direct Taxes) for bank accounts. The same obligation applies to a diverse range of savings and investment schemes. One of the major Aadhaar changes was the launch of facial authentication in July 2018.

Given that the Aadhaar number is inextricably linked to the holder's unique biometric data, marrying it to key financial accounts provides a powerful means of meeting KYC (Know Your Customer) obligations, as well as tackling modern menaces such as money laundering and tax evasion. Approximately 558 million bank accounts have already been linked with Aadhaar, out of an estimated 1.1 billion accounts in the country.

Emotion Recognition Benefits and Use Cases

The following use cases illustrate the potential benefits of AI Emotion Recognition technology.

Facial Emotion in Interviews

A candidate-interviewer interaction is susceptible to many categories of judgment and subjectivity which may make it hard to determine whether a candidate's personality is a good fit for the job. Identifying what a candidate is trying to say involves multiple layers of language interpretation, cognitive biases, and the context that lies in between. AI Emotion Recognition technology can measure the candidate's facial expressions to help assess their moods and personality traits. Employee morale can also be perceived using this technology by holding and recording interactions on the job. In the Human Resources (HR) field, this tool can be useful for recruiting strategies and potentially to help design HR policies that bring about best performance from employees.

Industry example: Unilever is starting to incorporate this technology into their recruitment process. This technology provides the recruiter with an overall confidence level regarding the interviewee to help decide whether the candidate would perform well at client-facing jobs.

Emotion Detection of Car Drivers

Car manufacturers worldwide are increasingly focused on making cars more personal and safer to drive. In their pursuit of smarter car features, it makes sense for automakers to use AI to help identify human emotions. Using facial emotion detection, smart cars can alert the driver when he or she appears drowsy. This can be life-saving, as the U.S. Department of Transportation estimates that driving-related errors cause 95% of fatal road accidents. Facial Emotion Detection can find subtle changes in facial micro-expressions that precede drowsiness and send personalized messages to alert the driver (e.g., to suggest a coffee break or changing music or temperature settings).

Product Testing and Client Feedback

Recording of product testing and client feedback sessions, coupled with AI Emotion Recognition technology, provides an excellent means of gauging user response to new product rollouts. The technology can detect facial emotions during the product testing interaction with the client study groups, average the different emotions detected, and gauge the predominant emotions that were elicited. This is prominently used for testing video games.
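The averaging step described above can be illustrated with a minimal Python sketch. It assumes that an emotion recognition service has already returned one dictionary of emotion scores per video frame; the emotion names, score values, and function names here are hypothetical and are not taken from any particular product or API.

```python
from collections import defaultdict


def average_emotions(frames):
    """Average per-emotion scores over a list of per-frame score dicts."""
    if not frames:
        return {}
    totals = defaultdict(float)
    for scores in frames:
        for emotion, value in scores.items():
            totals[emotion] += value
    # Divide by the total frame count, so an emotion that is missing
    # from a frame contributes 0 for that frame.
    return {emotion: total / len(frames) for emotion, total in totals.items()}


def predominant_emotion(averages):
    """Return the emotion with the highest average score, or None if empty."""
    if not averages:
        return None
    return max(averages, key=averages.get)


# Hypothetical scores for three frames of one product-testing session.
session = [
    {"happiness": 0.70, "surprise": 0.20, "neutral": 0.10},
    {"happiness": 0.50, "surprise": 0.10, "neutral": 0.40},
    {"happiness": 0.60, "surprise": 0.30, "neutral": 0.10},
]

averages = average_emotions(session)
print(predominant_emotion(averages))
```

The same aggregation could be run per participant and then across a study group; whether to weight frames equally, as this sketch does, is a design choice that depends on the study.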

Market Research

The same concept can be used for market research, including when institutions host events organized for senior executives from their organizations and/or client base. The AI Emotion Recognition tools can be used to see how well a certain idea/product is being received by the audience.

Traditional market research companies have employed verbal methods, primarily in the form of surveys, to find the consumers’ wants and needs. However, such methods assume that consumers can formulate their preferences verbally and that the stated preferences correspond to future actions, which may not always be the case.

Another popular approach in market research is to employ behavioral methods that let users act while trying the product. Such methods are considered more objective than verbal methods. Behavioral methods use video feeds of a user interacting with the product, and the video is then analyzed manually to observe the users’ reactions and emotions. This can quickly become very labor-intensive and expensive. A better alternative may be to use facial emotion recognition AI technology to automatically detect facial expressions on the users’ faces and automate the video analysis completely. Market research companies can use this technology to scale their data collection efforts and rely on the technology to quickly perform the data analysis.

Summary of Benefits

Several aspects of the banking industry can benefit from the selective use of AI tools for Face, Voice, and Emotion recognition, and these technologies are being actively integrated into software solutions and product roadmaps. Perhaps the most immediate benefit will be realized by utilizing this technology in products that interface with bank customers to process mobile-based payments and other related banking solutions. Since most mobile devices today have built-in cameras and microphones, the requirement for additional hardware is minimal.

Case Study: FISecure Solution

As a case in point, following is a description of the FISecure solution which leverages various cloud-based API solutions to deliver face and voice verification in banking software. This solution includes:

  • Microsoft Cognitive Services for Emotion Detection
  • Voice IT services for Face and Voice Verification
  • Google Maps service for Geo-coding and User Location Detection
  • SendGrid for Email service

All the above can be used as proprietary services.

Note that this solution can be implemented either as a product or as an API solution. As an API solution it is device independent and can be leveraged by any device. Following is the architecture for this approach:

The core face and voice authentication flow is depicted below:

Key Features of FISecure

The HTTPS protocol is used for all communications with the API host servers. All third-party API keys are stored on the SQL server on Azure and retrieved only after a successful user login. Because the keys are not embedded in the client application, this architecture guards against recovering them by reverse-engineering.

A key feature of the tool is three-factor security authentication: User ID/Password authentication, Face authentication, and Voice authentication.
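As a rough illustration of how three factors might be combined into one decision, the following Python sketch requires all three checks to pass. It is not the FISecure implementation: the class name, the confidence scores, and the 0.85 thresholds are hypothetical, and it assumes the face and voice services each return a confidence value between 0 and 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactorResults:
    """Outcome of each factor for one login attempt (hypothetical shape)."""
    password_ok: bool        # User ID/Password check passed
    face_confidence: float   # assumed 0.0-1.0 score from a face service
    voice_confidence: float  # assumed 0.0-1.0 score from a voice service


def is_authenticated(results, face_threshold=0.85, voice_threshold=0.85):
    """Grant access only when all three factors pass.

    The thresholds are illustrative assumptions, not vendor guidance.
    """
    return (
        results.password_ok
        and results.face_confidence >= face_threshold
        and results.voice_confidence >= voice_threshold
    )


# A correct password with a weak voice match is still rejected.
attempt = FactorResults(password_ok=True, face_confidence=0.93, voice_confidence=0.40)
print(is_authenticated(attempt))
```

Requiring every factor to pass, rather than averaging the scores, means a strong result on one factor cannot compensate for a failure on another.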

If someone attempts to break into a primary customer’s account, the tool reports the attempt by sending an email to the primary customer. This email includes details of the device, the location, and a picture of the person who tried to log in to the account.
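The contents of such an alert can be sketched with Python's standard email library. This only builds the message and does not send it; the sender address, subject line, field names, and sample values are hypothetical, and actual delivery through an email service is outside the scope of the sketch.

```python
from email.message import EmailMessage


def build_alert_email(recipient, device, location, snapshot_jpeg):
    """Build an alert with device, location, and a snapshot attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Failed login attempt on your account"
    msg["From"] = "alerts@example.com"  # hypothetical sender address
    msg["To"] = recipient
    msg.set_content(
        "A login attempt on your account failed verification.\n"
        f"Device: {device}\n"
        f"Location: {location}\n"
        "A picture of the person who attempted the login is attached.\n"
    )
    # Attach the captured image; the bytes would come from the device camera.
    msg.add_attachment(
        snapshot_jpeg, maintype="image", subtype="jpeg", filename="attempt.jpg"
    )
    return msg


# Placeholder bytes stand in for a real JPEG capture.
alert = build_alert_email(
    "customer@example.com",
    "Android phone",
    "Bengaluru, India",
    b"\xff\xd8\xff\xe0 placeholder bytes",
)
print(alert["Subject"])
```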

The video authentication service provides “liveness” recognition. This means that a person’s photo and voice recording cannot be used to log in successfully with this tool. The person in front of the camera must be physically present and responding live, not a static photo or a recorded voice.

The FISecure solution can be used internationally by customers who speak various languages, such as English, French, Spanish, Portuguese, Chinese, Japanese, Kannada, Tamil, and Hindi. In fact, the solution works with 80+ languages.
