Thursday, December 20, 2018

以知識表徵方法建構台語聲調群剖析器

Computational Linguistics and Chinese Language Processing, Vol. 22, No. 2, December 2017, pp. 73-86
The Association for Computational Linguistics and Chinese Language Processing

 A Knowledge Representation Method to Implement A Taiwanese Tone Group Parser 

張佑竹 Yu-Chu Chang 

摘要 
聲調群剖析器是台閩語語音輸出系統的主要元件之一。本文提出聲調管轄假說, 主張先將句內語詞定調,亦能決定台閩語聲調群分界的觀點,並以聲調群剖析 器實作加以驗證。除了敘述如何應用預設調型、預設詞類和模式三種標記符號, 將語言知識和經驗轉換為知識庫,並說明經由推論引擎與知識庫的連結,完成 語詞定調的運作過程。目前內部測試平均變調正確率為 98.5%。外部測試平均 變調正確率為 94%。本研究的實驗數據也顯示一個重要的線索:符號系統標記 比規則推論對變調正確性有相對較高的貢獻率。

關鍵詞:台灣話,變調,聲調群剖析器,知識表徵,模擬

Abstract 
A tone group parser is one of the main components of a Taiwanese text-to-speech system. In this paper, we propose the tonal government hypothesis, which holds that once the allotone of each word in a sentence has been selected, the tone group boundaries within that sentence are determined as well, and we support this view with an implementation of a Taiwanese tone group parser. We describe how a symbol system is used to convert linguistic expertise and heuristic knowledge into a knowledge base that works with a frame-based corpus and a tone sandhi processor, and we discuss how the inference engine is connected to the knowledge base to perform allotone selection. In the current version of the tone group parser, the average tone sandhi accuracy is 98.5% on the inside test and 94% on the outside test. The experimental data also reveal an important clue: marking with the symbol system contributes more to tone sandhi accuracy than rule inference does.

Keywords: Taiwanese, Tone Sandhi, Tone Group Parser, Knowledge Representation, Simulation
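The central claim of the abstract, that selecting an allotone for every word also fixes the tone group boundaries, can be illustrated with a minimal sketch. It is not the parser described in the paper. It assumes the common description of Taiwanese tone sandhi in which the last element of a tone group keeps its citation tone while the preceding elements take their sandhi tone, and it simplifies to the word level. The function name `split_tone_groups`, the boolean flag, and the placeholder words `w1` to `w5` are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical input format: each word is paired with a flag that is True
# when the word was assigned its citation (base) allotone and False when it
# was assigned its sandhi allotone. How the flags are produced (knowledge
# base, inference engine) is outside the scope of this sketch.
TaggedWord = Tuple[str, bool]


def split_tone_groups(tagged: List[TaggedWord]) -> List[List[str]]:
    """Close a tone group after every word that keeps its citation tone."""
    groups: List[List[str]] = []
    current: List[str] = []
    for word, keeps_citation in tagged:
        current.append(word)
        if keeps_citation:
            # A citation-tone word marks the right edge of a tone group.
            groups.append(current)
            current = []
    if current:
        # Trailing sandhi-tone words with no closing citation-tone word.
        groups.append(current)
    return groups


# Placeholder sentence; the words and flags are illustrative only.
sentence = [("w1", False), ("w2", True), ("w3", False), ("w4", False), ("w5", True)]
print(split_tone_groups(sentence))  # [['w1', 'w2'], ['w3', 'w4', 'w5']]
```

Under these assumptions no separate boundary-detection step is needed: the boundaries fall out of the per-word tone decisions, which is the point the hypothesis makes.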
