Resource List
Search Engine
- A search engine written in VC (Visual C++).
google Search Engine
- A Google search engine written in PHP; the code is simple and the functionality is powerful.
EasyXSpider
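- EasyXSpider is more than a simple crawler for Linux. It also includes index building, retrieval, word segmentation (English, plus Chinese via bigram splitting), an implementation of the Google PageRank algorithm, and a CGI query interface. It can be regarded as a complete small search engine.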
Webloup
- WebLoupe is a Java-based tool for analysis, interactive visualization (sitemap), and exploration of the information architecture and specific properties of local or publicly accessible websites. It is based on web spider (web crawler) technology. An open-source search crawler.
Information Retrieval Report
- Information Retrieval (IR) is the discipline that deals with retrieval of unstructured data, especially textual documents, in response to a query or topic statement, which may itself be unstructured, e.g., a sentence or even another document, or which may be s
PDFBox-0.6.7a
- A Java program for processing PDF documents. It can extract plain text from PDF files and can be combined with the Lucene search engine.
rj588_tongyicjuniveralgatsy
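- Universal Gather System (UGS) is a set of PHP classes designed for content-gathering programs. It is easy to use and runs on various Unix systems, Linux, and Windows 2000/XP/2003. Class functions: steal gathers page text; cut/cutpro cut text; filt/filtx filter text; change modifies text; getenterkey returns an Array of links at key positions; _striplinks and _striptext produce Arrays of anchors and text.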
lucene-1.4.3
- Java word segmentation (tokenization). Only English tokenization is implemented, but the algorithm is a classic (from Apache).
BlueSearch
- Search data is taken from Baidu (www.baidu.com). Supports both in-site search and Internet search, and is very fast.
1575465
- Shows where your website ranks in search engine results for given keywords, quickly locating your site's position across dozens of search engines. Includes two versions: ASP and ASP+ASP.net.
aadfd
- Search data is taken from Baidu. Supports both in-site search and Internet search, and is very fast.
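
Note on the EasyXSpider entry: Chinese bigram segmentation (二元法切词) is commonly understood as splitting a run of characters into overlapping two-character tokens. The Python sketch below assumes that interpretation. The function name `bigram_segment` is hypothetical and this is not EasyXSpider's actual code.

```python
def bigram_segment(text):
    """Split text into overlapping two-character tokens (whitespace ignored).

    Assumption: every non-space character is treated as one unit, which is
    a simplification suitable for runs of Chinese characters only.
    """
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        # A single character (or nothing) cannot form a bigram.
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


if __name__ == "__main__":
    print(bigram_segment("搜索引擎"))  # ['搜索', '索引', '引擎']
```

Overlapping bigrams need no dictionary, which is likely why small search engines use them; the cost is a larger index with many tokens that are not real words.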