Search Resource List
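Detailed Analysis of the Nutch Source Code
- A walkthrough of the Nutch source code that helps readers understand the code quickly; intended for secondary development.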
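Summary of Nutch Secondary Development
- A summary of Nutch secondary development. When customizing Nutch, the main work is adjusting the interface and the data it displays. The length of the summary shown in Nutch query results can be changed; this is done through configuration, in the file nutch-site.xml.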
OReilly.Hadoop.The.Definitive.Guide.June.2009.RETA
- Hadoop got its start in Nutch. A few of us were attempting to build an open source web search engine and having trouble managing computations running on even a handful of computers.
Hadoopsource
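- Google's core competitive technology is its computing platform. A similar solution has emerged at Apache, and its components now all belong to Apache's Hadoop project. The correspondences are: Chubby --> ZooKeeper, GFS --> HDFS, BigTable --> HBase, MapReduce --> Hadoop. Many open source projects are built on similar ideas, and Hadoop is the most popular framework among them. This article briefly introduces a development workflow for Hadoop.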
Nutch
- Apache Nutch 1.3 study notes: a very complete set of notes with comprehensive coverage.
Nutch
- The process of using Nutch, from setup through to solution, to help you gain a deeper understanding.
Hadoop-based-distributed-crawler
- This article discusses the basic technology of search engines and the basic principles of web crawlers, and analyzes Nutch, a technical prototype for distributed crawlers.
Nutch-Teach
- A tutorial on the Nutch search engine architecture. Anyone who needs to build a crawler can learn from its design ideas.