Robust speaker identification using pole filtering

Devang Naik, Khaled Assaleh, Richard Mammone

Research output: Contribution to conference (Paper)

Abstract

In this paper we introduce a new philosophy for extracting robust features in speech systems, based on intelligent processing of the eigenmodes of speech. Intrinsic to this philosophy is an explanation of why linear predictive (LP) cepstra of speech provide a powerful feature set for recognition systems. The poles, or eigenmodes, of a frame of speech are investigated under mismatches created by varying channel conditions for speaker identification systems. The study of the modes of speech has led to two related processing techniques, each of which provides a measurable degree of robustness in cross-channel environments. One technique emphasizes processing of speech in the interframe domain (across many speech frames), while the other carries out an adaptive cepstral weighting of the intraframe (within a speech frame) LP spectral components. Experiments for the interframe techniques are presented using speech from the TIMIT database processed through a telephone channel simulator and a part of the San Diego portion of the King database. Experiments for the intraframe technique are presented on the San Diego portion of the King database. The techniques are shown to offer improved speaker identification performance compared to related common methods in the interframe and intraframe domains.
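
The link between poles and LP cepstra mentioned in the abstract can be illustrated with a short sketch. For an all-pole model 1/A(z) with poles z_i inside the unit circle, the cepstrum satisfies c_n = (1/n) * sum_i z_i^n, so modifying the poles directly changes the cepstral features. The code below is an illustrative sketch only, not the authors' algorithm: the function names, the fixed radius threshold of 0.9, and the choice to clamp pole radii before averaging across frames are all assumptions made for the example.

```python
import numpy as np

def cepstrum_from_poles(poles, n_ceps):
    """LP cepstrum c_1..c_n of an all-pole model 1/A(z), computed from its poles.

    Uses the identity c_n = (1/n) * sum_i z_i**n, which holds for poles
    inside the unit circle.
    """
    poles = np.asarray(poles, dtype=complex)
    n = np.arange(1, n_ceps + 1)
    powers = poles[None, :] ** n[:, None]  # shape (n_ceps, n_poles)
    return np.real(powers.sum(axis=1)) / n

def clamp_pole_radii(poles, max_radius=0.9):
    """Hypothetical pole filter: pull poles with radius above max_radius
    back to max_radius (this broadens their bandwidth) and keep their angles
    (center frequencies) unchanged. The threshold is an assumed value."""
    poles = np.asarray(poles, dtype=complex)
    radii = np.minimum(np.abs(poles), max_radius)
    return radii * np.exp(1j * np.angle(poles))

def pole_filtered_mean_subtraction(frames_poles, n_ceps, max_radius=0.9):
    """Interframe sketch: subtract the across-frame mean of the pole-filtered
    cepstra from the ordinary LP cepstra of each frame."""
    ceps = np.array([cepstrum_from_poles(p, n_ceps) for p in frames_poles])
    filtered = np.array([
        cepstrum_from_poles(clamp_pole_radii(p, max_radius), n_ceps)
        for p in frames_poles
    ])
    return ceps - filtered.mean(axis=0)
```

In practice the poles of each frame could be obtained as `np.roots(a)`, where `a = [1, a_1, ..., a_p]` holds the LP polynomial coefficients of that frame; how the frames are windowed and how the LP order is chosen are left open here.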

Original language: English (US)
Pages: 225-230
Number of pages: 6
State: Published - Jan 1 2019
Event: ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification, ASRIV 1994 - Martigny, Switzerland
Duration: Apr 7 1994 - Apr 9 1994

Conference

Conference: ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification, ASRIV 1994
Country: Switzerland
City: Martigny
Period: 4/7/94 - 4/9/94

Fingerprint

Poles
Processing
Telephone
Identification (control systems)
Simulators
Experiments

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Software
  • Human-Computer Interaction

Keywords

  • Bandwidths
  • Center frequencies
  • Cepstrum
  • Eigenmodes
  • Interframe
  • Intraframe
  • Poles
  • Residues

Cite this

Naik, D., Assaleh, K., & Mammone, R. (2019). Robust speaker identification using pole filtering. 225-230. Paper presented at ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification, ASRIV 1994, Martigny, Switzerland.
@conference{21bfbdd3e9ba41e6980d7a59a8252be9,
title = "Robust speaker identification using pole filtering",
keywords = "Bandwidths, Center frequencies, Cepstrum, Eigenmodes, Interframe, Intraframe, Poles, Residues",
author = "Devang Naik and Khaled Assaleh and Richard Mammone",
year = "2019",
month = "1",
day = "1",
language = "English (US)",
pages = "225--230",
note = "ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification, ASRIV 1994 ; Conference date: 07-04-1994 Through 09-04-1994",

}

TY - CONF
T1 - Robust speaker identification using pole filtering
AU - Naik, Devang
AU - Assaleh, Khaled
AU - Mammone, Richard
PY - 2019/1/1
Y1 - 2019/1/1
KW - Bandwidths
KW - Center frequencies
KW - Cepstrum
KW - Eigenmodes
KW - Interframe
KW - Intraframe
KW - Poles
KW - Residues
UR - http://www.scopus.com/inward/record.url?scp=85073353697&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073353697&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85073353697
SP - 225
EP - 230
ER -