Search results: resource list
ProbWordSeg1
- Maximum-probability word segmentation. The program first loads an .mdb database (a dictionary with word-frequency statistics), then reads the .txt file to be segmented.
wordpos
- Given a corpus annotated with word segmentation and part-of-speech tags, computes the frequency of each word and outputs the words sorted by number of occurrences.
seg_delphi
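- A Delphi implementation of a word segmentation algorithm based on a word-frequency dictionary; the dictionary is in the dict directory. It also has some ability to recognize out-of-vocabulary words.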
ictclas4j_0[1].9.1
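- A Java-based word segmentation system that can annotate part of speech, word frequency, and other information; it can serve as a base for further development.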
MFC dictionary lookup, word segmentation, and word-frequency statistics program
- An MFC program that looks up words in a dictionary (users can import their own text), segments text, counts word frequencies, and can save the results. It was the final assignment for our MFC course. Strongly recommended!
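Chinese word segmentation technology and its latest developments
- A search engine usually consists of two parts: information collection and information retrieval. In English, words are separated by spaces, which makes retrieval convenient, so computers process text word by word, greatly reducing the workload of both user and computer. Chinese is considerably more complicated. There are no delimiters between Chinese words, so building a word-based index requires a dedicated technique, known as "Chinese word segmentation". Depending on whether word segmentation is used, Chinese search engines can be divided into character-based and word-based search engines. Because of the particular nature of Chinese information processing, developing a Chinese search engine is by no means as simple as localizing Western software into Chinese. …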
MFC program: dictionary lookup (users can import their own text), word segmentation, word-frequency statistics
- AppWizard has created this RMM application for you. This application not only demonstrates the basics of using the Microsoft Foundation classes but is also a starting point for writing your application. This file contains a summary of what you
TF/IDF algorithm
- Segments documents into words, counts word frequencies, and computes TF-IDF values; implemented in Java.
tfidf
- Computes word frequencies for text using Lucene's word segmentation tools; implemented in Java.
ChineseSplit
- A Chinese word segmentation and keyword extraction system developed in VB.NET, implemented with bidirectional maximum matching, word-frequency statistics, quicksort, and other algorithms.
word-frequency
- Word-frequency statistics written in Java; includes the 极易分词 segmentation package and the Lucene package. The program has been debugged and runs.
2
- C# source code for Chinese word segmentation based on word frequency, part of speech, and similar features; it can extract a user-defined number of keywords.
ictclaszyfc-v2009
- The Chinese Academy of Sciences word segmentation system, including features such as adding vocabulary and counting word frequencies.
WindowsApplication1
- Input: a corpus that has already been segmented and POS-tagged. Output: word-frequency counts sorted in descending order.
1
- Maximum-probability word segmentation. The word-frequency dictionary used is the Beijing Language and Culture University (北语) version, and the dictionary itself may also be a source of problems.
zhengdike
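- (Original work) "Automatic Classification of Chinese Web Pages". Techniques involved: word segmentation, word-frequency counting, stripping certain special characters from web pages (with regular expressions), extracting a training set, and so on. Commercial use of this software is prohibited; copyright belongs to "qyTT论坛--www.qyclass.org/bbs". Source: qyTT forum, http://www.qyclass.org/bbs. Our mission: let the world know qyTT, and let qyTT know the world! Idea behind the result analysis: compare the obtained word frequencies against each category in the constructed lexicon; if one category has the highest degree of match, take that category as the result; if …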
ngrams
- Natural language processing programs for word segmentation and word-frequency statistics.
Course design assignment
- Segments an article with a word segmentation package, then counts how many times each word occurs.
wordseg
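- Chinese word segmentation in R, producing word-frequency statistics and a word cloud for visual presentation.
Python word-frequency statistics and word segmentation
- Segments the text in a CSV file, counts word frequencies, and saves the result as a TXT file; useful for research.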
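
Many of the entries in this list describe the same basic pipeline: dictionary-based word segmentation followed by word-frequency counting in descending order. A minimal Python sketch of that pipeline is shown here. It assumes a toy dictionary and uses greedy forward maximum matching, which is a simpler stand-in for the maximum-probability and bidirectional matching methods named in the entries; it is not the method of any listed package, and the names `forward_max_match`, `word_frequencies`, and `toy_dictionary` are hypothetical.

```python
from collections import Counter


def forward_max_match(text, dictionary, max_len=4):
    """Greedy forward maximum matching over a set of known words."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when no dictionary word matches.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def word_frequencies(words):
    """Return (word, count) pairs sorted by count, highest first."""
    return Counter(words).most_common()


# Toy dictionary for illustration only (an assumption, not a real lexicon).
toy_dictionary = {"中文", "分词", "词频", "统计"}
tokens = forward_max_match("中文分词词频统计分词", toy_dictionary)
frequencies = word_frequencies(tokens)
```

A real system would load a much larger dictionary with frequencies, which is what allows the maximum-probability approach to choose between competing segmentations instead of always taking the longest match.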