Resource list
heritrix-1.14.2-src
- heritrix-1.14.2-src is the source code of the latest version of the Heritrix web crawler. Hopefully it is useful to everyone.
Clucene
- CLucene is a C++ port of Lucene, a high-performance full-text search engine written in Java. Because CLucene is written in C++, it should in theory be faster than Lucene.
MyLucene
- A self-written search engine built on Lucene: the design and implementation of a simple search engine.
interleaver
- interleaver research
searchengine
- These documents cover search engine design using Web data mining techniques, together with research material on personalized search engines. The content is extensive and worth a look.
heritrix-1.14.0-src
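- Source code of a well-known web spider. It can download the content of an entire site, is highly extensible, and can download dynamic web pages.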
ZeroCrawler
- This program extracts all the links from a given web page. It is suitable for beginners learning to write crawlers.
paoding-analysis-2.0.4
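- Paoding Chinese word segmentation is a component developed in Java that can be integrated into Lucene applications, providing word segmentation for Chinese search engines on the Internet and on enterprise intranets. Paoding fills a gap in open-source Chinese word segmentation components in China and aims to become the preferred open-source Chinese word segmentation component for websites. It pursues high segmentation efficiency and a good user experience.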
PDFBox-0.6.7a
- A program written in Java for processing PDF documents. It can extract plain text from PDF documents and can be combined with the Lucene search engine.
clucene-0.9.8
- CLucene is the C++ port of Lucene. It is a library for building indexes and searching them.
bolangjiaoyu
- Source code for a feature-rich education portal website, built with ASP and Access. Well suited as a reference.