English Abstract |
Speaker recognition is an important biometric identification method. Its main advantage is its minimal hardware requirement: a microphone is sufficient. It is therefore widely deployed in mobile phones and call centers. This thesis builds a text-dependent speaker verification system and compares the results of three approaches. The dynamic time warping (DTW) approach applies forced alignment and then compares the MFCCs of the digits at enrollment with those of the digits at testing. The sentence-level approach uses cosine similarity or PLDA to score the two sets of i-vectors extracted from the enrollment and test audio, respectively. The digit-level approach applies forced alignment and then uses cosine similarity or PLDA to score the i-vector of each digit in the audio. |