Resource list
kooxoo
- Source code for online data collection; early kooxoo code, provided for study and research.
Chinesewordsegmentationalgorithm
- Chinese word segmentation algorithm. Like Kingsoft PowerWord, it automatically splits a sentence into words when the mouse hovers over it.
JWikiDocs-1.0.tar
- A tool for crawling and downloading Wikipedia documents.
SearchEngine1.0
- Implements the most basic search engine functions: downloading web pages, building an inverted index, and keyword querying. The implementation relies on the libcurl library.
deploy
- This system centrally manages frequently updated information such as company updates, corporate news, new product releases, promotions, and industry news. It classifies items by their shared attributes and publishes them to the website in a systematic, standardized way, and also provides news search and links to related sites.
WebCrawler
- A simple web crawler that also computes page weights.
spider_engine
- Parses web page code, extracts URLs and hashes them, submits them to the client program for deduplication, and stores them in the client database; it then traverses the web using the URL list in the database.
Lucene_Course
- Lucene e-book (PDF version), covering the use of Lucene from beginner to advanced level.
hao123_5.0
- Source code for the hao123 site navigation page.
the_rank_of_search_engine
- Everyone has used search engines, but do we know their ranking rules? How can you get your own site to the top of the search results? This book answers these questions in detail.
p2pDLL
- P2P search, DLL source code; includes the required Super module (超级模块) and Jingyi module (精易模块).
Lucene.Net-1.9.1.doc
- Search engine.
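
For readers unfamiliar with the inverted index and keyword query mentioned in the SearchEngine1.0 entry, the idea can be sketched roughly as follows. This is only an illustrative Python sketch, not code from any of the listed packages: the function names, the sample page names, and the whitespace tokenizer are all assumptions, and the download step (which SearchEngine1.0 is described as doing with libcurl) is replaced by in-memory strings.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each lowercase token to the set of page ids that contain it.

    `pages` is a hypothetical dict of page id -> already-downloaded text.
    """
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Naive whitespace tokenizer (an assumption; real engines do more).
        for token in text.lower().split():
            index[token].add(page_id)
    return index


def search(index, query):
    """Return the sorted page ids that contain every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return []
    # Start from the posting set of the first token, then intersect.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


# Hypothetical sample pages standing in for downloaded documents.
pages = {
    "a.html": "simple web crawler",
    "b.html": "web search engine",
    "c.html": "inverted index for search",
}
index = build_inverted_index(pages)
```

With this sample data, `search(index, "web search")` intersects the posting sets for "web" and "search", so only pages containing both words are returned. Text without spaces between words, such as Chinese, would first need a word segmentation step in place of the whitespace tokenizer.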