Abstract
Speaker verification is a rapidly maturing technology that is becoming available for commercial applications. In this paper, we investigate the application of data fusion methods to sub-word implementations of speaker verification. At a sub-word level, we utilize the diversity of the information provided by the neural tree network and Gaussian mixture model to provide a more robust sub-word model. The phrase-level scores for each modeling approach are obtained and then combined. The data fusion method we use for combining the model scores is the linear opinion pool. In addition to using the diversity of the model scores, we also apply the concept of redundancy by using a leave-one-out approach to partition the input data. This allows us to generate several models and accommodate the small training sample issues imposed by our specific applications. The theoretical results of the above analysis have been integrated into a system that has been tested with several databases that were collected within landline and cellular environments. These results are included in this paper. We have found that the proper data fusion techniques will typically reduce the error rate by a factor of two.
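As a rough illustration of the two ideas named in the abstract, the sketch below shows a linear opinion pool (a weighted sum of model scores, with non-negative weights that sum to one) and a leave-one-out partition of enrollment data. The abstract does not give the weights, the score scale, or the number of training repetitions, so the equal weights, the example scores, and the names `linear_opinion_pool` and `leave_one_out_partitions` are assumptions made for illustration only, not the authors' implementation.

```python
from typing import List, Sequence


def linear_opinion_pool(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-model scores; weights must be non-negative and sum to 1."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * s for w, s in zip(weights, scores))


def leave_one_out_partitions(items: Sequence[str]) -> List[List[str]]:
    """Return one training subset per item, each omitting exactly one item."""
    return [[x for j, x in enumerate(items) if j != i] for i in range(len(items))]


# Hypothetical phrase-level scores from the two model types (assumed scale: 0 to 1).
ntn_score = 0.80
gmm_score = 0.60

# Equal weights are an assumption; the abstract does not state how weights are chosen.
fused = linear_opinion_pool([ntn_score, gmm_score], [0.5, 0.5])

# Hypothetical enrollment repetitions of one pass phrase.
subsets = leave_one_out_partitions(["rep1", "rep2", "rep3"])
```

Each subset in `subsets` could be used to train one model, so that three repetitions yield three models whose scores are then pooled in the same way.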
Original language | English (US) |
---|---|
Pages | 531-540 |
Number of pages | 10 |
State | Published - 1997 |
Event | Proceedings of the 1997 7th IEEE Workshop on Neural Networks for Signal Processing, NNSP'97, Amelia Island, FL, USA; Duration: Sep 24 1997 → Sep 26 1997 |
All Science Journal Classification (ASJC) codes
- Signal Processing
- Software
- Electrical and Electronic Engineering