Resource list
lucene3.1.0
- Lucene 3.1.0 full-text search, for developing website search engines.
SearchEngine
- A search engine implemented in Java, with a web crawler, query service, Chinese word segmentation, and index building.
lab2-indexing
- Implements the index-building part of a search engine; detailed documentation is included.
Classics_of_web_development_search_engine_code
- Classic text search engine code for web development.
chentian.fenci
- Implements dictionary-based Chinese word segmentation for Nutch; this part contains its DLL files.
daima
- Simple search ranking and index building, based on the vector space model (VSM).
spider1.20PforPwindows
- A microblog crawler that connects to a database and crawls Sina blog user data. The database must be configured. Version 1.00 beta, runs normally. Crawler for Sina blog, version 3.5 or 4. Adds an on/off switch for the image download channel.
building_search_applications
- This book compares several well-known open-source search engines and examines in depth some of the core techniques involved in developing a search engine.
Patent
- A small tool for searching domestic patents. The patent specification browser must be installed.
DeepWeb_Search
- Research on key technologies for DeepWeb classified search engines. kdh
domainSpider
- A domain name scanner I wrote in Java that scans the network for unregistered domain names. The character set, length range, and domain organization name can be set in the configuration file. Scan results are stored in a MySQL database and also written to a log file; the database creation statements are included in the archive.
MSNIMRobot
- IMRobot, an MSN robot; well worth studying.