A Model of Prosody Generation for Sentence Synthesis in Mandarin Chinese (汉语语句合成的韵律生成模型)

Title	汉语语句合成的韵律生成模型
Alternative title	A Model of Prosody Generation for Sentence Synthesis in Mandarin Chinese
Author	Cheng Zongjun (程宗军)
Date	2003-07
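Abstract	Building on work in text processing and prosody generation, this study explores, at the sentence level, how to map input Chinese sentence text automatically to its prosodic features; that is, it builds a prosody generation model for Chinese text-to-speech systems from the results of text processing and a synthesis of prosodic rules. For text processing, special-purpose, regular, and comprehensive dictionaries were designed first, each with its own characteristics; a dictionary-based maximum matching algorithm was then used to segment the sentence text into words and tag parts of speech; shallow parsing was used to obtain sentence chunks; and finally the text was normalized and annotated with pinyin. For prosody generation, rules for setting prosodic boundaries, pauses, and stress were integrated on the basis of existing research, and the prosodic model of the sentence text was designed in detail. Prosodic levels were set and prosodic parameters generated automatically from the sentence chunks and their grammatical features obtained in text processing. The results show that hierarchical word segmentation in text processing effectively reduced the miss rate and error rate for out-of-vocabulary words. Combining prosodic words on the basis of grammatical-word segmentation also sped up the simulation of the prosodic hierarchy. In addition, random factors that dynamically simulate prosodic features were introduced into prosody generation, turning qualitatively described prosodic rules into quantitative ones, which helps improve the naturalness of synthesized speech. This study applies the results of basic research to engineering; although it is immature in many respects, it is hoped that it will contribute to the development of speech synthesis technology in China.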
Alternative abstract	Building on work in text processing and prosody generation, the present research explores the automatic mapping from Chinese sentence text to its corresponding prosodic features. Based on the results of text processing and a synthesis of prosodic rules, a prosody generation model was developed for Mandarin Chinese text-to-speech systems. First, a comprehensive dictionary, a special-purpose dictionary, and a regular dictionary were designed for text processing. Second, word segmentation and part-of-speech (POS) tagging of the sentence text were carried out with a dictionary-based maximum matching algorithm. Third, sentence chunks were obtained by shallow parsing. Finally, the sentence text was normalized and annotated with pinyin. For prosody generation, rules for setting prosodic boundaries, pauses, and stress were integrated on the basis of previous research, and the prosodic model of the sentence text was designed in detail. Prosodic levels were set and prosodic parameters generated automatically from the sentence chunks and their syntactic features. The results showed that hierarchical word segmentation in text processing reduced both the miss rate and the error rate for out-of-vocabulary words, and that building prosodic words on top of grammatical-word segmentation sped up the simulation of the prosodic hierarchy. In addition, random factors that dynamically simulate prosodic features were introduced into prosody generation, so that qualitatively stated prosodic rules were quantified. This was considered helpful for improving the naturalness of synthetic speech. This research applied the results of basic research to engineering. Although some problems remain to be resolved, it is hoped that the work will contribute to the development of speech synthesis technology in China.
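Keywords	text; prosody; pause; stress
Degree type	Master's
Language	Chinese
Degree discipline	Basic Psychology
Degree-granting institution	Graduate School of the Chinese Academy of Sciences (中国科学院研究生院)
Place of degree conferral	Beijing
Document type	Thesis
Identifier	http://ir.psych.ac.cn/handle/311026/21690
Collection	Cognitive and Developmental Psychology Laboratory (认知与发展心理学研究室)
Author affiliation	Institute of Psychology, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	程宗军. 汉语语句合成的韵律生成模型[D]. 北京. 中国科学院研究生院,2003.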

Files in this item
File name/size	Document type	Version type	Access type	License
程宗军-硕士学位论文.pdf (7878KB)	Thesis		Restricted access	CC BY-NC-SA
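
The abstract mentions dictionary-based maximum matching for word segmentation and part-of-speech tagging. The following is a minimal sketch of that general technique, not the thesis's implementation. It assumes the forward (left-to-right) variant, and the toy dictionary `TOY_DICT`, the tag names, the `max_len` limit, and the `"unk"` fallback are all hypothetical choices made for illustration; the record does not describe the actual dictionaries or tag set.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary mapping words to POS tags (illustration only).
TOY_DICT: Dict[str, str] = {
    "汉语": "n",
    "语句": "n",
    "合成": "v",
    "韵律": "n",
    "生成": "v",
    "模型": "n",
}


def forward_max_match(
    text: str, lexicon: Dict[str, str], max_len: int = 4
) -> List[Tuple[str, str]]:
    """Segment text by taking the longest dictionary word at each position."""
    tokens: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                tokens.append((candidate, lexicon[candidate]))
                i += size
                break
        else:
            # No dictionary entry starts here: emit one character as unknown.
            tokens.append((text[i], "unk"))
            i += 1
    return tokens


if __name__ == "__main__":
    for word, tag in forward_max_match("汉语语句合成的韵律生成模型", TOY_DICT):
        print(word, tag)
```

In this sketch, characters that match no dictionary entry fall through as single-character unknown tokens. That is one simple way to surface out-of-vocabulary material for a later pass; the hierarchical segmentation the abstract refers to may handle such words differently.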