Resource list
heritrix-1.6.0-src
- An excellent search engine robot (crawler) written in Java, for Linux.
News Search3.01
- A news search program.
MTGV
- 1. Fully resolves the thumbnail problems for source websites. 2. Adds the hyperlink tools that webmasters commonly use. 3. Adds custom pages, plus a tools column in the home page layout to help with optimization. 4. Adds a classified catalogue, label
sxt_Lucene.rar
- A very good search engine development case study from Shangxuetang (尚学堂), with detailed development documentation and source code included.
1
- Code for Chapter 3 of the book 自己动手写搜索引擎 ("Write Your Own Search Engine"), taken from the CD-ROM that accompanies the book. The full contents are too large, so they are uploaded in separate parts.
LuceneInAction_SourceCode
- Lucene is an open-source tool used in search engines. It can write crawled web pages into an index and run fast searches over the finished index.
LuceneInActionSRC.tar
- Source code for a book on the Lucene search engine; genuinely helpful when reading that book.
ICTCLAS50_Windows_32_C
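- ICTCLAS, the analysis system from the Chinese Academy of Sciences. Its main functions are Chinese word segmentation, part-of-speech tagging, named entity recognition, new word recognition, and user dictionaries.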
kgramjac
- Computes the Jaccard coefficient of the k-grams of two strings, an application of information retrieval theory to judging how similar two strings are.
lucene-2.1.0
- Source code of an open-source search engine, including index building and search functionality.
JavaSpider
- A book introducing search engines of the kind exemplified by Google and Baidu.
NetBotJava
- A hard-to-find reference book on Java spider development. It is comprehensive and has fairly complete examples, and what it develops can be used directly after minor modification. Title: 网络机器人Java编程指南 ("Guide to Programming Web Robots in Java").
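
The kgramjac entry in the list above describes computing the Jaccard coefficient over the k-grams of two strings. The following is a minimal sketch of that idea, not the code in the archive: the function names `kgrams` and `kgram_jaccard`, the default `k = 2`, and the convention of returning 0.0 when neither string has any k-grams are all assumptions made for illustration.

```python
def kgrams(s: str, k: int) -> set:
    """Return the set of all length-k substrings (k-grams) of s."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Strings shorter than k yield an empty set.
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def kgram_jaccard(a: str, b: str, k: int = 2) -> float:
    """Jaccard coefficient |A & B| / |A | B| of the k-gram sets of a and b."""
    ga, gb = kgrams(a, k), kgrams(b, k)
    union = ga | gb
    if not union:
        # Assumed convention: no k-grams on either side means similarity 0.
        return 0.0
    return len(ga & gb) / len(union)


if __name__ == "__main__":
    print(kgram_jaccard("night", "nacht", k=2))
```

With `k = 2`, "night" and "nacht" share only the bigram "ht" out of seven distinct bigrams, so the coefficient is 1/7. Using sets means repeated k-grams count once; a multiset variant would weight repetitions.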