File name: Stemmer
Description (the downloaded content comes from the internet; for usage questions, please search Baidu yourself)
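In English, one word is often a variant of another, for example happy => happiness, where happy is called the stem of happiness. In information retrieval systems, a common step during term normalization is stemming, that is, removing the inflectional endings of English words.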
The most widely used suffix-stripping stemming algorithm, of moderate complexity, is the Porter stemming algorithm, also known as the Porter Stemmer. See the official website for details. The stemming filters in popular retrieval systems such as Lucene and Whoosh use the Porter stemming algorithm.
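To illustrate what suffix stripping means in practice, here is a minimal sketch of only the first rule group (step 1a, plural handling) of Porter's published algorithm. It is not the contents of the listed Stemmer.java, and the class name PorterStep1aSketch and the method name step1a are hypothetical names chosen for this example. A full Porter stemmer applies several further steps with conditions on the remaining stem.

```java
// Hypothetical illustration: only step 1a of the Porter algorithm, not a full stemmer.
final class PorterStep1aSketch {

    // Applies the step 1a rules in order; the first matching suffix wins:
    //   SSES -> SS, IES -> I, SS -> SS, S -> (removed)
    static String step1a(String word) {
        if (word.endsWith("sses")) {
            return word.substring(0, word.length() - 2); // caresses -> caress
        }
        if (word.endsWith("ies")) {
            return word.substring(0, word.length() - 2); // ponies -> poni
        }
        if (word.endsWith("ss")) {
            return word;                                 // caress -> caress
        }
        if (word.endsWith("s")) {
            return word.substring(0, word.length() - 1); // cats -> cat
        }
        return word;                                     // no plural suffix found
    }

    public static void main(String[] args) {
        String[] samples = {"caresses", "ponies", "caress", "cats"};
        for (String w : samples) {
            System.out.println(w + " => " + step1a(w));
        }
    }
}
```

The rules are tested longest suffix first, so "caresses" matches SSES before the plain S rule can apply. Note that the result ("poni") need not be a dictionary word; for retrieval it only matters that related forms map to the same stem.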
(Generated automatically by the system; you can review the download contents before downloading)
Download file list
Stemmer.java
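This website is a search site that collects and introduces programming resources and source code; copyright belongs to the original authors! 粤ICP备11031372号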
1999-2046 搜珍网 All Rights Reserved.