Resource list
classifier-1.12
- Clusters text retrieved from Google search results; includes a Java package and the source code that calls it.
CJCorpus
- A Japanese-Chinese parallel bilingual corpus containing 4,053 sentences.
AboutWiz_src
- Source code for an intelligent Pinyin program. It is easy to use and works quite well; hope you like it.
TaskVision
- A management system that can help you learn C# easily and become a skilled programmer.
xdgf
- Character processing: a Java-based program that provides word segmentation, N-gram statistics, paragraph splitting, sentence splitting, and related functions, with support for multiple languages.
ictclas_Source_Code
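- Introduction to ICTCLAS, the Chinese lexical analysis system from the Institute of Computing Technology. The word is the smallest meaningful linguistic unit that can be used independently. Chinese, however, takes the character as its basic written unit, and there are no explicit delimiters between words, so word analysis is the foundation and key of Chinese information processing. For this reason, the Institute of Computing Technology of the Chinese Academy of Sciences, building on years of research, spent one year developing the Chinese lexical analysis system ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System). Its functions are: Chinese word segmentation; part-of-speech tagging; unknown-word recognition. Segmentation accuracy is as high as 97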
ICTCLASCaller
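- JNI calling interface file for ICTCLAS: Title: ICTCLAS Caller * <p>Description: do Chinese word segmentation. Don't change the package and class name, or else you can't use it. * Please do not change the package name, class name, or native method names; otherwise the calls will fail. * Because ICTCLAS itself has many robustness problems, when calling segSentence, strin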
CheckNum
- Extracts Chinese-character numerals from a corpus and converts them to English numerals (for use in information extraction).
Pwswnr
- A person-name recognition program that can be used in systems that need to search for personal names.
SpellChecker
- A .NET control that returns the first letter of the Pinyin for Chinese characters.
GB2Spell
- Java code for converting Chinese to Pinyin, collected from the web. The author could not be identified; apologies.
CDevideSentence
- A word segmentation algorithm written in C++; simple and practical. See the bundled help file for details.