Resource list
Google search algorithm source code
- Allegedly the ranking algorithm of the Google search engine. I looked through it and could not really follow it.
Baidu word segmentation dictionary
- Allegedly the Chinese word segmentation dictionary that Baidu used in the past. Hope it is of some help to everyone; grab it while you can.
Nalanda-iVia-Crawler-1.0.1.tar
- Source code for a topical (focused) crawler. A classic, and very helpful to anyone studying topical crawling.
lzsearch
- A word segmentation system written in JavaScript. It can address the poor Chinese search support found on many websites today. No extraction password is required.
xapian-core-0.9.2.tar
- Xapian, an open source search engine.
LuceneInAction_SourceCode
- Lucene is an open source tool used in search engines. It can write crawled web pages into an index, and the finished index can then be searched quickly.
openwebspider-0.5
- An open source Web spider that can download Web pages using multiple threads.
Word segmentation module
- A very useful word segmentation module, and a worthwhile reference for anyone studying search engines.
spidergui
- The source code is simple and easy to follow. It makes a convenient programming reference for Java beginners and suits anyone studying search engines.
WebCrawler
- The source code is simple and easy to follow. It makes a convenient programming reference for Java beginners and suits anyone studying search engines.
Web_Spider_src
- C# spider source code: a web crawler.
so
- A program written in Java: a search engine interface, much like Baidu's.