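File name: IR
Category:
Tags:
Upload date: 2012-11-16
File size: 3.64 MB
Downloads: 0
Provider:
Related links: none
Download notes: do not download with Xunlei; if a download fails, try again (re-downloading does not cost points).
Description (the download content comes from the web; for usage problems, please search Baidu yourself):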
Index term selection
1. Word segmentation and term-frequency counting: use the chosen word segmentation software to segment the documents and count term frequencies, producing the DocIndex file with the structure: document number, frequency, term. Keep the intermediate results and store them in a suitable data structure.
2. Term weighting: assign term weights in two ways, normalized term frequency (tf_i = tf_i / Max(tf)) and tf*idf. Generate the DocIndex(tf) and DocIndex(tf*idf) files from the DocIndex file. Take care in choosing the threshold and in deciding which terms to keep or discard.
3. Building the inverted file: convert the DocIndex(tf) and DocIndex(tf*idf) files into the DocInvert(tf) and DocInvert(tf*idf) files.
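The counting and weighting steps above can be sketched as follows. This is only an illustrative sketch, not the code shipped in the archive: the class name `TermWeightSketch` and its methods are hypothetical, the input is assumed to be already segmented into tokens, idf is assumed to be ln(N / df), and the tf*idf variant is assumed to reuse the normalized tf. The description fixes none of these choices.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of steps 1 and 2; names are not taken from the archive.
final class TermWeightSketch {

    /** Step 1: raw term counts for one already-segmented document. */
    static Map<String, Integer> countTerms(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    /** Step 2a: normalized term frequency, tf_i / Max(tf), within one document. */
    static Map<String, Double> normalizedTf(Map<String, Integer> counts) {
        int max = 0;
        for (int c : counts.values()) {
            max = Math.max(max, c);
        }
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            weights.put(e.getKey(), e.getValue() / (double) max);
        }
        return weights;
    }

    /** Step 2b: tf*idf per document. Assumes idf = ln(N / df) and the normalized tf above. */
    static List<Map<String, Double>> tfIdf(List<Map<String, Integer>> docs) {
        // Document frequency: in how many documents each term occurs.
        Map<String, Integer> df = new HashMap<>();
        for (Map<String, Integer> counts : docs) {
            for (String term : counts.keySet()) {
                df.merge(term, 1, Integer::sum);
            }
        }
        List<Map<String, Double>> result = new ArrayList<>();
        for (Map<String, Integer> counts : docs) {
            Map<String, Double> weights = normalizedTf(counts);
            for (Map.Entry<String, Double> e : weights.entrySet()) {
                double idf = Math.log(docs.size() / (double) df.get(e.getKey()));
                e.setValue(e.getValue() * idf);
            }
            result.add(weights);
        }
        return result;
    }

    /** Drops terms whose weight is below the threshold; the cut-off is a tuning choice. */
    static void prune(Map<String, Double> weights, double threshold) {
        weights.values().removeIf(w -> w < threshold);
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> docs = Arrays.asList(
                countTerms(Arrays.asList("search", "system", "search")),
                countTerms(Arrays.asList("system", "index")));
        System.out.println(normalizedTf(docs.get(0)));
        System.out.println(tfIdf(docs));
    }
}
```

With this idf, a term that occurs in every document gets weight 0 under tf*idf, which is one reason the threshold for the two weighting schemes may need to be chosen separately.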
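The inversion step can be sketched in the same spirit. The `Row` and `Posting` classes below are hypothetical stand-ins for one DocIndex record (document number, weight, term) and one inverted-file entry; the actual on-disk format of the DocIndex and DocInvert files is not given in the description, so this works on in-memory lists only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of step 3; names are not taken from the archive.
final class InvertSketch {

    /** One DocIndex record: document number, weight, term (assumed layout). */
    static final class Row {
        final int docId;
        final double weight;
        final String term;

        Row(int docId, double weight, String term) {
            this.docId = docId;
            this.weight = weight;
            this.term = term;
        }
    }

    /** One posting: a document that contains the term, with the term's weight there. */
    static final class Posting {
        final int docId;
        final double weight;

        Posting(int docId, double weight) {
            this.docId = docId;
            this.weight = weight;
        }

        @Override
        public String toString() {
            return "(" + docId + ", " + weight + ")";
        }
    }

    /** Regroups document-ordered rows by term; the TreeMap keeps terms sorted. */
    static Map<String, List<Posting>> invert(List<Row> rows) {
        Map<String, List<Posting>> inverted = new TreeMap<>();
        for (Row r : rows) {
            inverted.computeIfAbsent(r.term, k -> new ArrayList<>())
                    .add(new Posting(r.docId, r.weight));
        }
        return inverted;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(1, 1.0, "search"),
                new Row(1, 0.5, "system"),
                new Row(2, 1.0, "system"));
        for (Map.Entry<String, List<Posting>> e : invert(rows).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Postings keep the order of the input rows, so if the DocIndex rows are sorted by document number, each posting list comes out sorted by document number as well.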
(Generated automatically by the system; you can review the contents before downloading.)
Download file list
信息检索/ir_work1/.classpath
信息检索/ir_work1/.project
信息检索/ir_work1/.settings/org.eclipse.core.resources.prefs
信息检索/ir_work1/.settings/org.eclipse.jdt.core.prefs
信息检索/ir_work1/bin/org/main/CreateIndexDocument.class
信息检索/ir_work1/bin/org/main/CreateInvertDocument.class
信息检索/ir_work1/bin/org/main/Util$1.class
信息检索/ir_work1/bin/org/main/Util.class
信息检索/ir_work1/bin/pojo/Token.class
信息检索/ir_work1/src/org/main/CreateIndexDocument.java
信息检索/ir_work1/src/org/main/CreateInvertDocument.java
信息检索/ir_work1/src/org/main/Util.java
信息检索/ir_work1/src/pojo/Token.java
信息检索/paoding/analyzer.bat
信息检索/paoding/analyzer.sh
信息检索/paoding/build.bat
信息检索/paoding/build.xml
信息检索/paoding/classes/net/paoding/analysis/analyzer/estimate/Estimate$CToken.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/estimate/Estimate$LinePrintGate.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/estimate/Estimate$PrintGate.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/estimate/Estimate$PrintGateToken.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/estimate/Estimate$StringReaderEx.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/estimate/Estimate.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/estimate/TryPaodingAnalyzer.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/impl/CompiledFileDictionaries$1.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/impl/CompiledFileDictionaries.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/impl/MaxWordLengthTokenCollector.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/impl/MostWordsModeDictionariesCompiler$1.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/impl/MostWordsModeDictionariesCompiler.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/impl/MostWordsTokenCollector$LinkedToken.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/impl/MostWordsTokenCollector.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/impl/SortingDictionariesCompiler.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/PaodingAnalyzer.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/PaodingAnalyzerBean.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/PaodingTokenizer.class
信息检索/paoding/classes/net/paoding/analysis/analyzer/TokenCollector.class
信息检索/paoding/classes/net/paoding/analysis/Constants.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/BinaryDictionary.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/Dictionary.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/DictionaryDelegate.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/HashBinaryDictionary$SubDictionaryWrap.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/HashBinaryDictionary.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/Hit.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/detection/Detector$1.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/detection/Detector.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/detection/Difference.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/detection/DifferenceListener.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/detection/ExtensionFileFilter.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/detection/Node.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/detection/Snapshot$InnerNode.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/detection/Snapshot.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/filewords/FileWordsReader.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/filewords/ReadListener.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/filewords/SimpleReadListener.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/support/filewords/SimpleReadListener2.class
信息检索/paoding/classes/net/paoding/analysis/dictionary/Word.class
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/BoldFormatter.class
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch1/English.class
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch1/text.txt
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch2/Chinese.class
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch2/text.txt
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch3/Chinese.class
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch3/text.txt
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch4/Chinese.class
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch4/text.txt
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch5/Chinese.class
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ch5/text.txt
信息检索/paoding/classes/net/paoding/analysis/examples/gettingstarted/ContentReader.class
信息检索/paoding/classes/net/paoding/analysis/exception/PaodingAnalysisException.class
信息检索/paoding/classes/net/paoding/analysis/knife/Beef.class
信息检索/paoding/classes/net/paoding/a
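This site is a search site that collects and describes programming resources and source code; all copyright belongs to the original authors. 粤ICP备11031372号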
1999-2046 搜珍网 All Rights Reserved.