Resource list
JTextPro-1.0.tar
- JTextPro: A Java-based Text Processing tool that includes sentence boundary detection (using maximum entropy classifier), word tokenization (following Penn conventions), part-of-speech tagging (using CRFTagger), and phrase chunking (using CRFChunker
heritrix1.14.4
- The heritrix1.14.4.zip release, available for download.
luceneAndnutch
- Contents of the companion CD for the book on building a search engine with Lucene and Nutch, including the source code.
Lucene+Nutch
- The book first describes how to configure the development platform, then covers Lucene and Nutch development in detail.
yssfor
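- 1. A real search engine. 2. A flexible and efficient web spider. 3. Controllable main-text extraction. 4. Controllable Chinese word segmentation and new-word learning. 5. Unattended operation. 6. B/S (browser/server) architecture with virtual hosting support. 7. Powerful features that are simple to use. 8. Personalization. 9. Strengthens a website's soft power.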
nutch
- Nutch video tutorial: a simple environment setup for building your own search engine, explained on video and easy to follow.
PHPSou_v1.2_GBK_20111226
- A search engine developed in PHP, including a spider crawling system and more; suitable for personal search.
introduce-to--search-engine
- "Into the Search Engine" (走进搜索引擎), the classic introductory book on search engines by Liang Bin. The author graduated from Nanjing University and is now pursuing a Ph.D. at Tsinghua University.
webcrawler
- A web crawler developed in Java with fairly powerful data collection features.
bbk2818
- Video tutorial on developing your own search engine with Nutch; simple, and covers environment setup.
demo
- Implements a Java web crawler; the content is detailed and includes several interfaces reserved for future features.
lucence
- Code from the study CD on building a search engine with Lucene.