Resource search list
SogouW.20061127
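- This Internet lexicon comes from a statistical analysis of the Chinese-language web corpus indexed by the Sogou search engine. The statistics were compiled in October 2006 and cover more than 100 million web pages. The result is about 150,000 high-frequency entries, each annotated with its word frequency and its common parts of speech. Significance: the statistics reflect word frequency and part-of-speech usage in the Chinese-language Internet environment. Applications: Chinese part-of-speech tagging, word-frequency analysis, and so on. Part-of-speech categories: N noun, V verb, ADJ adjective, ADV adverb, CLAS classifier (measure word), ECHO onomatopoeia.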
POSTagger
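- (1) Training stage: from a corpus already annotated with parts of speech, collect the bigram transition matrix of the part-of-speech tags, the number of times each word occurs with a given tag, and related statistics. (2) Use a dynamic programming algorithm to quickly select the tag path and obtain the tagging result. (3) Different part-of-speech tag sets can be selected.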
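The training and decoding steps described above can be sketched roughly as follows. This is a minimal illustration, not the code of the listed package: it assumes a bigram hidden Markov model with add-one smoothing and Viterbi decoding as the dynamic programming step, and the function names (`train`, `log_prob`, `viterbi`) and the `<s>` start marker are hypothetical.

```python
import math
from collections import defaultdict

START = "<s>"  # hypothetical sentence-start marker


def train(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit


def log_prob(counts, key, size):
    """Add-one smoothed log probability (smoothing is an assumption here)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + size))


def viterbi(words, trans, emit):
    """Return the highest-scoring tag sequence for a list of words."""
    if not words:
        return []
    tags = list(emit)
    vocab_size = len({w for t in emit for w in emit[t]}) + 1  # +1 for unseen words
    n_tags = len(tags)
    # best[i][tag] = (score of best path ending in tag at position i, previous tag)
    best = [{t: (log_prob(trans[START], t, n_tags)
                 + log_prob(emit[t], words[0], vocab_size), None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            score, prev = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags), p)
                for p in tags
            )
            column[t] = (score + log_prob(emit[t], words[i], vocab_size), prev)
        best.append(column)
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


corpus = [
    [("the", "DET"), ("dog", "N"), ("barks", "V")],
    [("the", "DET"), ("cat", "N"), ("sleeps", "V")],
    [("a", "DET"), ("dog", "N"), ("sleeps", "V")],
]
trans, emit = train(corpus)
print(viterbi(["the", "cat", "barks"], trans, emit))
```

Swapping in a different tag set only changes the labels in the training data, since the tag inventory is read from the corpus.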
POS_tagging_and_HMM
- 词性标注与隐马尔可夫模型.ppt (Part-of-Speech Tagging and Hidden Markov Models), a very good explanatory presentation.
wordpos
- Given a corpus with word segmentation and part-of-speech annotations, counts the frequency of each word and outputs the words sorted by number of occurrences.
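A counting step of this kind might look like the sketch below. The input format is an assumption: each line holds whitespace-separated `word/TAG` tokens, which is a common layout for segmented and tagged corpora but is not confirmed for this package. The function name `word_frequencies` is hypothetical.

```python
from collections import Counter


def word_frequencies(lines):
    """Count words in 'word/TAG' tokens and sort by descending frequency."""
    counts = Counter()
    for line in lines:
        for token in line.split():
            # Split on the last slash so words containing '/' stay intact.
            word, sep, _tag = token.rpartition("/")
            if not sep or not word:
                continue  # skip tokens without a tag
            counts[word] += 1
    # Ties are broken alphabetically to keep the output deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


sample = ["the/DET dog/N barks/V", "the/DET cat/N"]
for word, freq in word_frequencies(sample):
    print(word, freq)
```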
postag_convert
- A powerful toolkit that combines word segmentation, part-of-speech tagging, and format conversion.
svm_hmm
- SVMhmm: Learns a hidden Markov model from examples. Training examples (e.g. for part-of-speech tagging) specify the sequence of words along with the correct assignment of tags (i.e. states). The goal is to predict the tag sequences for new sentences.
AutoSummary-0.1.0a-src.tar
- AutoSummary uses Natural Language Processing to generate a contextually-relevant synopsis of plain text. It uses statistical and rule-based methods for part-of-speech tagging, word sense disambiguation, sentence deconstruction and semantic anal
broadvoice 32
- BroadVoice is a family of speech coding algorithms created by Broadcom and standardized by CableLabs, SCTE, and ANSI for Voice over IP applications in cable telephony. BroadVoice is also part of the ITU-T Recommendations J.161 and J.361. To encourage
Part_Of_Speech_Label.rar
- A Java implementation of part-of-speech tagging based on a hidden Markov model. Uses supervised learning and includes a corpus. Provided for reference.
detectendpoint1.rar
- Computes the zero-crossing rate and short-time energy, adjusts the energy threshold, and performs endpoint detection, all of which are indispensable parts of speech signal processing.
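The quantities named above can be illustrated with a small sketch. It is not the listed program: it assumes non-overlapping frames and a fixed energy threshold passed in by the caller (the listed code reportedly adjusts the threshold, which is not reproduced here), and all function names are hypothetical.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / max(len(frame) - 1, 1)


def detect_endpoints(signal, frame_len, energy_threshold):
    """Return (start, end) sample indices of the active region, or None.

    A frame is active when its short-time energy exceeds the threshold.
    """
    active = [
        start
        for start in range(0, len(signal) - frame_len + 1, frame_len)
        if short_time_energy(signal[start:start + frame_len]) > energy_threshold
    ]
    if not active:
        return None
    return active[0], active[-1] + frame_len


# Silence, then an alternating burst, then silence again.
signal = [0.0] * 8 + [1.0, -1.0] * 4 + [0.0] * 8
print(detect_endpoints(signal, frame_len=4, energy_threshold=0.5))
print(zero_crossing_rate(signal[8:12]))
```

In practice the zero-crossing rate is often used alongside energy to catch low-energy unvoiced sounds at word boundaries; this sketch only reports it.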
svm_perf.tar.gz
- SVMstruct is a Support Vector Machine (SVM) algorithm for predicting multivariate or structured outputs. It performs supervised learning by approximating a mapping h: X --> Y using labeled training examples (x1,y1), ..., (xn,yn). Unlike regula
preprocess0
- The preprocessing stage that precedes speech signal processing, including pre-emphasis, framing, and windowing. A good reference program for getting started with speech signal programming.
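The three preprocessing steps can be sketched as below. The parameter choices are assumptions rather than values from the listed program: a pre-emphasis coefficient of 0.97, a Hamming window, and overlapping frames defined by a hop size. Function names are hypothetical.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[i] - alpha * signal[i - 1] for i in range(1, len(signal))
    ]


def split_frames(signal, frame_len, hop):
    """Cut the signal into frames of frame_len samples, hop samples apart."""
    return [
        signal[s:s + frame_len]
        for s in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming(n):
    """Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def preprocess(signal, frame_len=4, hop=2, alpha=0.97):
    """Pre-emphasize, frame, and window a signal."""
    window = hamming(frame_len)
    frames = split_frames(pre_emphasis(signal, alpha), frame_len, hop)
    return [[x * w for x, w in zip(frame, window)] for frame in frames]


frames = preprocess([float(i % 3) for i in range(10)])
print(len(frames), len(frames[0]))
```

Real speech front ends typically use frames of 20 to 30 ms with about 10 ms hops; the tiny defaults here only keep the example readable.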
CRF1-2
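- CRF1.2, a conditional random field software package. A popular, easy-to-use text classification tool that can be used for natural language processing, labeling, classification, and part-of-speech discovery. Users only need to focus on constructing feature functions. Experimental results and applications indicate that CRF outperforms hidden Markov models. Implemented in Java.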
lingpipe-3.6.0
- An open-source Java toolkit for natural language processing. LingPipe already offers a rich set of features, including topic classification, named entity recognition, part-of-speech tagging, sentence detection, query spell checking, interesting phrase detection, clustering, character language modeling, ...
2
- C# source code for Chinese word segmentation based on word frequency, part of speech, and similar features. It can extract a user-defined number of keywords.
posTagger
- This file includes a part-of-speech tagger for text in natural language processing.
POStag
- Part-of-speech tagging. First trains a model on a corpus, then uses the resulting model to tag sentences that have no part-of-speech annotations.
speech-emotion-recognition-system
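- A speech emotion recognition system based on a GMM. A GMM is only a mathematical model that fits the shape of the data, so some discrepancy between the model and the data distribution you observe is normal: when EM is used to estimate the GMM parameters, the observed data are generally assumed to be incomplete (that is, the observed distribution is assumed not to be the true one, and the computation "fills in" the missing, or hidden, part of the data).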
Chinese-part-of-speech-tagging
- A C/C++ program for Chinese part-of-speech tagging in natural language processing, with example experiments.
speech-recognition-system-
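- This project relies mainly on speech recognition technology to control a wax figure so that it performs corresponding actions and reactions in response to different voice commands. The project consists of two systems: the speech recognition system and the wax figure's motion control system. The hardware work for the latter is essentially complete and only needs software to be written when it is integrated with the speech recognition part, so the project's focus is on speech recognition technology.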