Speech summarization - Information extraction from speech

Sadaoki Furui and Chiori Hori

Department of Computer Science
Tokyo Institute of Technology
2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8552 Japan
furui@cs.titech.ac.jp

Processing spontaneous speech requires a paradigm shift from speech recognition to speech understanding. In this paradigm, the speaker's underlying messages are extracted instead of transcribing every spoken word. As a step toward speech understanding, we have proposed a new method of automatically summarizing speech by extracting a limited number of relatively important words from its automatic transcription, according to a target compression ratio defined on the number of characters. To determine the set of words to extract, we define a summarization score consisting of a topic score (a significance measure) for each word and a linguistic score (linguistic likelihood) for the word concatenation. The set of words maximizing this score is selected efficiently using a dynamic programming (DP) technique. The method was applied to Japanese broadcast news speech, and very encouraging results were obtained. This talk also briefly introduces a new Japanese national project entitled "Spontaneous Speech: Corpus and Processing Technology".
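
The word selection described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several simplifying assumptions: the summary length is given as a number of words m rather than derived from a character-based compression ratio, the linguistic score is modeled as a pairwise score between adjacent selected words, and the two scores are combined by a plain weighted sum. The names extract_words, topic, link, and weight are hypothetical and introduced only for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def extract_words(
    words: Sequence[str],
    topic: Sequence[float],
    link: Callable[[str, str], float],
    m: int,
    weight: float = 1.0,
) -> Tuple[float, List[str]]:
    """Select m words, kept in their original order, that maximize
    sum(topic scores) + weight * sum(link scores of adjacent selected words).

    Assumption: `link` stands in for the linguistic score of a word
    concatenation; a real system would use a language model here.
    """
    n = len(words)
    if len(topic) != n:
        raise ValueError("topic must have one score per word")
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= len(words)")

    neg_inf = float("-inf")
    # best[k][j]: best score of a summary of k + 1 words whose last word is j.
    best = [[neg_inf] * n for _ in range(m)]
    # back[k][j]: index of the previous selected word on that best path.
    back = [[-1] * n for _ in range(m)]

    for j in range(n):
        best[0][j] = topic[j]

    for k in range(1, m):
        for j in range(k, n):
            for i in range(k - 1, j):
                score = best[k - 1][i] + weight * link(words[i], words[j]) + topic[j]
                if score > best[k][j]:
                    best[k][j] = score
                    back[k][j] = i

    # Pick the best final word, then follow the back-pointers.
    end = max(range(m - 1, n), key=lambda j: best[m - 1][j])
    total = best[m - 1][end]
    picked = []
    j = end
    for k in range(m - 1, -1, -1):
        picked.append(j)
        j = back[k][j]
    picked.reverse()
    return total, [words[i] for i in picked]
```

This sketch runs in time proportional to m times the square of the number of words. Converting a character-based compression ratio into a length constraint, and replacing the pairwise link score with a higher-order language model, are left out to keep the example short.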