In English, one word is often a variant of another, for example happy => happiness, where happy is called the stem of happiness. In information retrieval systems, a common step in term normalization is stemming: stripping the inflectional endings from English words. The most widely used stemming algorithm, of moderate complexity and based on suffix stripping, is the Porter Stemming Algorithm, also known as the Porter Stemmer. See the official website for details. The stem filters in popular retrieval systems such as Lucene and Whoosh use the Porter stemming algorithm.
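To make the idea of suffix stripping concrete, here is a minimal Python sketch. It is not the Porter algorithm, which applies several ordered rule phases with conditions on the shape of the remaining stem. The rule list, the minimum stem length, and the name `toy_stem` are all assumptions chosen for illustration.

```python
# Toy suffix-stripping stemmer. This is NOT the Porter algorithm; it only
# illustrates the general idea of removing word endings.
# SUFFIX_RULES is a hypothetical, hand-picked list; longer suffixes come first
# so that "iness" is tried before "ness" and "s".
SUFFIX_RULES = [
    ("iness", "y"),  # happiness -> happy
    ("ness", ""),    # kindness  -> kind
    ("ies", "y"),    # queries   -> query
    ("ing", ""),     # indexing  -> index
    ("ed", ""),      # indexed   -> index
    ("s", ""),       # terms     -> term
]

# Assumed guard: refuse to strip if fewer than this many letters would remain.
MIN_STEM_LEN = 3


def toy_stem(word: str) -> str:
    """Return a crude stem by applying the first suffix rule that fits."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)]
            if len(stem) >= MIN_STEM_LEN:
                return stem + replacement
    return word


if __name__ == "__main__":
    for w in ["happiness", "happy", "queries", "indexing", "terms"]:
        print(w, "=>", toy_stem(w))
```

With these rules, happiness and happy both map to happy, so a query for one would match documents containing the other. A sketch this simple makes mistakes a real stemmer is designed to reduce (for example, it turns business into busy), so a production system would normally rely on an existing Porter implementation instead.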