Hybrid Training of Speaker and Sentence Models for One-Shot Lip Password


Ruengprateepsang, Kavin, Wangsiripitak, Somkiat and Pasupa, Kitsuchart (2020) Hybrid Training of Speaker and Sentence Models for One-Shot Lip Password. In: Neural Information Processing, Lecture Notes in Computer Science. Springer International Publishing, pp. 363-374.

Abstract

Lip movement can be used as an alternative approach to biometric authentication. We describe a novel method for lip password authentication, using end-to-end 3D convolution and bidirectional long short-term memory. By employing triplet loss to train deep neural networks to learn lip motions, the representation of each class becomes more compact and better isolated, so less classification error is achieved in one-shot learning of new users with our baseline approach. We further introduce a hybrid model that combines features from two different models: a lip reading model that learns what phrases are uttered by the speaker, and a speaker authentication model that learns the identity of the speaker. On a publicly available dataset, AV Digits, we show that our hybrid model achieved a 9.0% equal error rate, improving on the 15.5% of the baseline approach.
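The two ingredients named in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it shows a standard margin-based triplet loss (which pulls embeddings of the same speaker/phrase together and pushes different classes apart, yielding the compact, isolated class representations described) and a simple feature-concatenation step standing in for the hybrid combination of the lip reading and speaker models. Function names, the margin value, and the use of concatenation as the fusion operator are all assumptions for illustration.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss on embedding vectors.

    anchor/positive share a class (same speaker saying the same password);
    negative comes from a different class. Minimizing this loss pulls the
    positive closer to the anchor than the negative, by at least `margin`.
    """
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance, same class
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance, other class
    return max(d_pos - d_neg + margin, 0.0)

def hybrid_features(sentence_feat, speaker_feat):
    """Illustrative fusion of the two models' embeddings by concatenation.

    sentence_feat: embedding from the lip reading (phrase) model.
    speaker_feat: embedding from the speaker authentication model.
    """
    return np.concatenate([sentence_feat, speaker_feat])
```

In one-shot enrollment, a new user's single sample is embedded once; verification then compares a probe's hybrid embedding against that stored embedding by distance, which is why well-separated classes under the triplet loss translate directly into a lower equal error rate.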

Item Type:

Book Section

Identification Number (DOI):

Deposited by:

Automated system

Date Deposited:

2021-09-06 03:38:22

Last Modified:

2021-10-04 02:55:58
