Resource List
R
- R language fundamentals, with a comprehensive reference to syntax, statements, and functions, including practical applications of R.
data
- Corpus files for training NER (named entity recognition), fully annotated, with four fields.
Chinese2SequenceFile
- Converts Chinese documents to the SequenceFile format for convenient use on Hadoop. Java code.
R Graphics Cookbook 2013
- A book about data visualization in R. It covers many of the same topics as the Graphs and Data Manipulation sections of the author's website, but goes into more depth and covers a broader range of techniques.
R web crawler tutorial for beginners: source code and scraped data
- Introductory study material on web crawlers, compiled from website content. If you have any questions, please contact the original author. For exchange and learning purposes only.
FM algorithm
- The factorization machine (FM) algorithm is a machine learning algorithm based on matrix factorization and is commonly used for recommendation.
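For readers unfamiliar with FM, the prediction rule can be sketched in a few lines. This is a minimal illustration of the standard second-order factorization machine formula, not the code in the FM algorithm package. The function name `fm_predict` and the parameter names `w0`, `w`, and `v` are assumptions chosen for this example.

```python
from typing import List


def fm_predict(x: List[float], w0: float, w: List[float],
               v: List[List[float]]) -> float:
    """Second-order FM prediction for one feature vector x.

    w0 is the global bias, w[i] the linear weight of feature i, and
    v[i] the k-dimensional latent vector of feature i (names are
    illustrative assumptions, not taken from the listed package).
    """
    n = len(x)
    k = len(v[0]) if v else 0

    # Linear part: sum_i w_i * x_i
    linear = sum(w[i] * x[i] for i in range(n))

    # Pairwise part, sum_{i<j} <v_i, v_j> x_i x_j, computed in O(k*n) as
    # 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)

    return w0 + linear + pairwise
```

In a recommendation setting, `x` would typically be a sparse encoding of a user, an item, and any side features, so the pairwise term models user-item interactions through the latent vectors.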
DeepLearning
- MATLAB source code for the Stanford deep learning course, covering a range of mainstream algorithms.
crawler
- Implements web crawlers in Python and R to collect the required data.
wordcount
- Hadoop wordcount program developed in Eclipse.
wordcount3
- Hadoop wordcount program that removes punctuation and some stop words.
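The cleaning step described for wordcount3 can be illustrated outside Hadoop. The sketch below is plain Python, not the package's Hadoop code. The stop word list and the function name `word_count` are hypothetical and exist only for this example.

```python
import string
from collections import Counter

# Hypothetical stop word list for illustration; the listed program's
# actual list is not known.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is"}


def word_count(text: str, stop_words=STOP_WORDS) -> Counter:
    """Count words after removing ASCII punctuation and stop words."""
    # Replace every ASCII punctuation character with a space so that
    # "cat,hat" splits into two tokens instead of merging into one.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = text.lower().translate(table).split()
    return Counter(t for t in tokens if t not in stop_words)
```

In a MapReduce job, the same cleaning would normally run in the mapper before each token is emitted, and the reducer would sum the counts per word.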
hearder.py
- Uses Python to extract Douban user review information for a single movie.
lolksubroutine
- Uses the branch-and-bound algorithm. The main program is intopt, and its usage is similar to linprog.