9th Speech in Noise Workshop, 5-6 January 2017, Oldenburg

Modulating speaker- and language-specific effects in speech intelligibility

Sabine Hochmuth(a), Marc René Schädler(b)
Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, Germany

Birger Kollmeier
Cluster of Excellence Hearing4all, Universität Oldenburg, Germany

(a) Presenting
(b) Attending

Recently, the simulation framework for auditory discrimination experiments (FADE) was presented, which successfully predicted speech reception thresholds (SRTs) for the German matrix sentence test in stationary, speech-shaped noise. In essence, FADE consists of a standard automatic speech recognition system that uses separable Gabor filter bank features as front-end and hidden Markov models as back-end. In this contribution, we investigated whether FADE could also predict speaker- and language-specific effects on SRTs measured with normal-hearing listeners using matrix sentence tests in different languages. Such effects were recently reported for speech materials recorded with bilingual Spanish/German and German/Russian speakers who had almost accent-free pronunciation in both of their languages. The data showed that SRT differences between speakers within one language were generally larger than differences between languages. Speaker-specific intelligibility transferred across languages, i.e., speakers who were highly intelligible in one of their languages were also highly intelligible in the other. It also transferred across noise types and was observed for stationary noise, amplitude-modulated noise, and multi-talker babble. A systematic language effect was observed in that Spanish had consistently higher SRTs. The present study tested FADE on these data. The model was generally able to predict speech-in-noise perception for different talkers and different languages. In stationary speech-shaped noise, performance was generally overestimated by about 3 dB. In modulated noise and multi-talker babble, the predictions matched the data well. Across all noise types, speakers, and languages, FADE could predict 86% of the observed SRT variance. FADE predictions are also compared to predictions of the speech intelligibility index (SII).
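
As an illustration of how a figure such as "86% of the observed SRT variance" and a bias of "about 3 dB" can coexist, the following minimal sketch computes both quantities for paired measured and predicted SRTs. It assumes that explained variance is taken as the squared Pearson correlation, which the abstract does not state; the function names and the SRT values are hypothetical placeholders, not data from the study.

```python
import math

def explained_variance(measured, predicted):
    """Squared Pearson correlation between measured and predicted SRTs.

    Assumption: 'explained variance' is read here as r^2; the abstract
    does not specify how the reported percentage was computed.
    """
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_m * var_p)
    return r ** 2

def mean_bias(measured, predicted):
    """Mean of (predicted - measured) in dB; negative values mean the
    model predicts lower SRTs, i.e. overestimates performance."""
    return sum(p - m for m, p in zip(measured, predicted)) / len(measured)

# Hypothetical placeholder SRTs in dB SNR (NOT values from the study).
measured_srt = [-8.0, -7.0, -6.5, -5.0]
# Predictions shifted by a constant 3 dB toward lower SRTs.
predicted_srt = [m - 3.0 for m in measured_srt]

r_squared = explained_variance(measured_srt, predicted_srt)
bias_db = mean_bias(measured_srt, predicted_srt)
```

With a purely constant offset, the squared correlation stays at its maximum while the bias equals the offset, so the two measures capture different aspects of prediction quality.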

Last modified 2017-01-04 23:51:47