Resource list
Page98PageRank
- A detailed explanation of Google's PageRank algorithm. Google's two founders applied for a US patent on PageRank; this is the paper they published on the algorithm.
pagerank
- Many people are researching search engines these days, but building one yourself is hard, so I am uploading this search engine to help others with their research.
stop-words-list
- Stop words (words ignored in search), provided as two documents, one Chinese and one English. Covers essentially all commonly seen stop words.
bpageloader
- This program was developed in VC6.0. You can use it to download all the pages of an entire website and keep the data for use by a search engine.
heritrix-1.14.4
- heritrix-1.14.4: an open-source web crawler written in pure Java.
TwitterData-csharp
- A program for crawling data from online social networks, written in C#. It implements basic functions such as making API connections and requesting data, and is suited to beginners for learning and exchange.
vbXML
- VB source code: reads web page content via XML and parses it to extract the required data.
SearchEngine
- A complete implementation of a simple search engine on the Java platform.
cn2
- A classic paper on the sequential covering algorithm, a classification algorithm in data mining.
SearchCrawler
- A web crawler written in Java for retrieving website resources and information; a multi-threaded example.
Video-Crawler_tools
- A video crawler that automatically searches the Internet for video files in MS and Real formats.
UindexWeb_OpenCpu
- The latest version of the search engine, open-source software. See the website: http://www.opencpu.com