Search results: resource list
Linux-C-Spider
- Crawls email addresses from web pages; implemented in C for Linux.
WebCrawler
- A simple crawler that, starting from user input, extracts candidate links and keeps crawling. The total number of pages crawled can be capped, or the crawl can stop once a specific keyword is found.
main
- A simple web crawler that fetches not only a page's text content but also the images on it.
ZhihuDown
- A web crawler written in Java that can crawl text from Zhihu and other websites. It is simple and easy to follow, and the key fields are easy to modify for crawling other sites.
weather
- A simple Python web crawler that fetches data from a website; run it directly from the command line.
Zhihu-master
- Uses Python and recursion to crawl Zhihu user information.
pa3
- For images that cannot be downloaded directly from some URLs, this code poses as a browser and batch-downloads the images on a web page.
xici_proxy
- Crawls the first 10 pages of Xici (adjust the total_page parameter to change how many pages are crawled) for high-anonymity proxy IPs with a validity period of more than one day, tests that they work, and saves the result to Proxies.json (Unicode). To use it, load the file and pick one proxy IP at random.
213
- Efficient Linux shell programming, Linux system administration, and basic bash scripts; automatically fetches the photo from the Bing home page every day.
juchaozixun
- Crawls data from a website; the example crawls listed-company data from the cninfo (巨潮资讯) website.
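
The WebCrawler entry above describes a crawl that follows extracted links and stops at a page cap or on a keyword. The following is a minimal sketch of that idea in Python, not the code from that resource. The names `crawl`, `LinkParser`, `max_pages`, and `stop_keyword` are hypothetical, and fetching is left to a caller-supplied `fetch` function (an assumption, so the sketch makes no network calls itself).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10, stop_keyword=None):
    """Breadth-first crawl starting at start_url.

    fetch is a caller-supplied function mapping a URL to its HTML text.
    Stops after max_pages pages, or right after visiting a page whose
    text contains stop_keyword. Returns visited URLs in visit order.
    """
    queue = deque([start_url])
    seen = {start_url}  # URLs already queued, to avoid revisiting
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        visited.append(url)
        if stop_keyword and stop_keyword in html:
            break
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

In real use, `fetch` could wrap `urllib.request.urlopen` and decode the response; passing it in as a parameter keeps the crawl logic testable offline.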