File name: BuptCrawl
Upload date: 2013-11-19
File size: 5.41 MB
Downloads: 0
Description:
A web crawler demo written in Java. It converts the crawled web pages into a uniform XML format, parses the resulting XML files, and assigns a number to each DOM node. The content of any element node can then be retrieved by its node number.
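The node-numbering idea in the description can be sketched as follows. This is a minimal illustration, not the archive's actual code: the class name `NumberedDocument` and its methods are hypothetical, it uses the JDK's built-in DOM parser rather than dom4j (which the archive's `Dom4JUtils` class suggests the project relies on), and it assumes the page has already been cleaned into well-formed XML. Numbering elements in document order starting at 1 is also an assumption; the original scheme is not documented here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: numbers element nodes in document (pre-order) order, starting at 1. */
class NumberedDocument {
    // Index i holds the element whose node number is i + 1.
    private final List<Element> elements = new ArrayList<>();

    NumberedDocument(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            number(doc.getDocumentElement());
        } catch (Exception e) {
            // Parser configuration, I/O, and SAX errors are all reported the same way here.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Pre-order walk: record the element first, then visit its element children.
    private void number(Element element) {
        elements.add(element);
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                number((Element) child);
            }
        }
    }

    /** Number of element nodes that received a number. */
    int size() {
        return elements.size();
    }

    /** Tag name of the element with the given node number. */
    String name(int id) {
        return elements.get(id - 1).getTagName();
    }

    /** Concatenated text content of the element with the given node number. */
    String text(int id) {
        return elements.get(id - 1).getTextContent();
    }

    public static void main(String[] args) {
        NumberedDocument doc = new NumberedDocument(
                "<html><head><title>Demo</title></head>"
                + "<body><p>Hello</p><p>World</p></body></html>");
        for (int id = 1; id <= doc.size(); id++) {
            System.out.println(id + " " + doc.name(id) + ": " + doc.text(id));
        }
    }
}
```

Storing the elements in a list keeps lookup by number a constant-time operation, at the cost of holding the whole document in memory, which seems reasonable for a demo that processes one page at a time.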
File list (generated automatically; review the contents before downloading):
BuptCrawl/
BuptCrawl/.classpath
BuptCrawl/.project
BuptCrawl/.settings/
BuptCrawl/.settings/org.eclipse.core.resources.prefs
BuptCrawl/.settings/org.eclipse.jdt.core.prefs
BuptCrawl/bin/
BuptCrawl/bin/com/
BuptCrawl/bin/com/bupt/
BuptCrawl/bin/com/bupt/crawler/
BuptCrawl/bin/com/bupt/crawler/Controller.class
BuptCrawl/bin/com/bupt/crawler/dom4j/
BuptCrawl/bin/com/bupt/crawler/dom4j/Dom4JUtils.class
BuptCrawl/bin/com/bupt/crawler/dom4j/Downloader.class
BuptCrawl/bin/com/bupt/crawler/dom4j/HtmlClean.class
BuptCrawl/bin/com/bupt/crawler/dom4j/HtmlCodeUtil.class
BuptCrawl/bin/com/bupt/crawler/MyCrawler.class
BuptCrawl/bin/edu/
BuptCrawl/bin/edu/uci/
BuptCrawl/bin/edu/uci/ics/
BuptCrawl/bin/edu/uci/ics/crawler4j/
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/Configurable.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/CrawlConfig.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/CrawlController$1.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/CrawlController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/Page.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/WebCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/basic/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/basic/BasicCrawlController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/basic/BasicCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/imagecrawler/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/imagecrawler/Cryptography.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/imagecrawler/ImageCrawlController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/imagecrawler/ImageCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/CrawlStat.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/Downloader.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/LocalDataCollectorController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/LocalDataCollectorCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/multiple/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/multiple/BasicCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/multiple/MultipleCrawlerController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/shutdown/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/shutdown/BasicCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/shutdown/ControllerWithShutdown.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/statushandler/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/statushandler/StatusHandlerCrawlController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/statushandler/StatusHandlerCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/CustomFetchStatus.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/IdleConnectionMonitorThread.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/PageFetcher$1.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/PageFetcher$GzipDecompressingEntity.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/PageFetcher.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/PageFetchResult.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/Counters$ReservedCounterNames.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/Counters.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/DocIDServer.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/Frontier.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/InProcessPagesDB.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/WebURLTupleBinding.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/WorkQueues.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/BinaryParseData.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/ExtractedUrlAnchorPair.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/HtmlContentHandler$Element.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/HtmlContentHandler$HtmlFactory.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/HtmlContentHandler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/HtmlParseData.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/ParseData.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/Parser.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/TextParseData.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/HostDirectives.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/RobotstxtConfig.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/RobotstxtParser.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/RobotstxtServer.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/RuleSet.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/
BuptCrawl/bin/edu/uci/ics/crawler4j/url/TLDList.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/URLCanonicalizer.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/UrlResolver$Url.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/UrlResolver.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/WebURL.class
BuptCrawl/bin/edu/uci/ics/crawler4j/util/
BuptCrawl/bin/edu/uci