Vari-gram language model based on word clustering

Source: Journal of Central South University (English Edition)
The category-based statistical language model is an important method for addressing the sparse data problem, but it has two bottlenecks: 1) Word clustering. It is hard to find a clustering method that combines good performance with low computational cost. 2) Loss of predictive power. A class-based method tends to lose the ability to adapt to text from different domains. To address these problems, a definition of word similarity based on mutual information is presented, and from it a definition of word set similarity is derived. Experiments show that the similarity-based word clustering algorithm outperforms the conventional greedy clustering method in both speed and performance, reducing the perplexity from 283 to 218. In addition, an absolute weighted difference method is presented and used to construct a vari-gram language model with good predictive ability. Compared with the category-based model, the vari-gram model reduces the perplexity from 234.65 to 219.14 on Chinese corpora and from 195.56 to 184.25 on English corpora.
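As background for the perplexity figures quoted above, the following is a minimal sketch of a generic class-based bigram model and its perplexity. It is not the clustering algorithm or the vari-gram model of the paper, whose details the abstract does not give. It assumes the standard factorization P(w | w_prev) = P(c | c_prev) * P(w | c), add-alpha smoothing on the class transitions, a hand-written word-to-class map, and evaluation text whose words all occur in the training text. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def train_class_bigram(tokens, word2class):
    """Collect the counts needed for P(w | c) and P(c | c_prev)."""
    classes = [word2class[w] for w in tokens]
    word_counts = Counter(tokens)
    class_counts = Counter(classes)
    class_bigrams = Counter(zip(classes, classes[1:]))
    # How often each class occurs as a bigram history (every position but the last).
    history_counts = Counter(classes[:-1])
    return word_counts, class_counts, class_bigrams, history_counts


def class_bigram_prob(prev, word, word2class, model, n_classes, alpha=1.0):
    """P(word | prev) = P(c | c_prev) * P(word | c), with add-alpha smoothing
    applied to the class transition only (an assumption of this sketch)."""
    word_counts, class_counts, class_bigrams, history_counts = model
    c_prev, c = word2class[prev], word2class[word]
    p_class = (class_bigrams[(c_prev, c)] + alpha) / (
        history_counts[c_prev] + alpha * n_classes
    )
    p_word = word_counts[word] / class_counts[c]
    return p_class * p_word


def perplexity(tokens, word2class, model, n_classes):
    """Perplexity = 2 ** (negative mean log2 probability per predicted word)."""
    log_sum = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        log_sum += math.log2(
            class_bigram_prob(prev, word, word2class, model, n_classes)
        )
    return 2 ** (-log_sum / (len(tokens) - 1))


# Toy data, invented purely for illustration.
tokens = "the cat sat on the mat the dog sat on the rug".split()
word2class = {
    "the": "DET", "on": "PREP", "sat": "VERB",
    "cat": "NOUN", "dog": "NOUN", "mat": "NOUN", "rug": "NOUN",
}
n_classes = len(set(word2class.values()))
model = train_class_bigram(tokens, word2class)
print(perplexity(tokens, word2class, model, n_classes))
```

Sharing statistics across the words of a class is what eases data sparsity: the model estimates one transition table over classes instead of one over all word pairs. The price is that words in the same class become interchangeable as histories, which is the loss of predictive power that the abstract attributes to class-based methods.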