Search resource list
ContentAnalyzer
- A body-text extraction program for search engines. It uses HTML analysis and regular expressions to strip the HTML markup and keep the main text of a web page. As written it works only on Chinese text, but it can be used for English with slight modification.
joyhtml-0.2.2
- HTML body-text extraction; the main text of a page is extracted by matching.
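
The ContentAnalyzer description (strip the HTML with regular expressions, keep the Chinese body text) could look roughly like the sketch below. This is not code from ContentAnalyzer or joyhtml: the function name `extract_chinese_body`, the regular expressions, and the `min_ratio` and `min_len` thresholds are all assumptions chosen for illustration.

```python
import html
import re

# Illustrative patterns only; they are assumptions, not taken from either package.
SCRIPT_STYLE = re.compile(r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
BLOCK_BREAK = re.compile(r"</?(?:p|div|br|li|tr|td|ul|ol|table|h[1-6])\b[^>]*>", re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")
CJK = re.compile(r"[\u4e00-\u9fff]")  # common CJK ideographs


def extract_chinese_body(page: str, min_ratio: float = 0.5, min_len: int = 10) -> str:
    """Strip markup, then keep only lines that are mostly Chinese characters."""
    text = SCRIPT_STYLE.sub("", page)   # drop scripts and stylesheets with their content
    text = COMMENT.sub("", text)        # drop HTML comments
    text = BLOCK_BREAK.sub("\n", text)  # turn block-level tags into line breaks
    text = TAG.sub("", text)            # remove every remaining tag
    text = html.unescape(text)          # decode entities such as &amp;

    kept = []
    for line in text.splitlines():
        line = line.strip()
        if len(line) < min_len:
            continue  # too short to be body text (hypothetical threshold)
        ratio = len(CJK.findall(line)) / len(line)
        if ratio >= min_ratio:
            kept.append(line)
    return "\n".join(kept)
```

The Chinese-character ratio is what makes this sketch Chinese-only: navigation bars and link lists in Latin script fall below the threshold. Adapting it to English would mean replacing that ratio test with another signal, for example line length or the share of text that sits outside links.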