Resource list
keyword_find
- Converts PDF files to TXT and runs a keyword extraction algorithm on each section separately.
svmMLiA
- The support vector machine is one of the most commonly used classifiers. It maximizes the classification margin by solving a quadratic optimization problem. This example uses the SMO algorithm, which can greatly optimize the run.
fuzzy
- Fuzzy clustering analysis is a method that clusters objects by establishing a fuzzy similarity relation based on their features, degree of closeness, and similarity.
kmeans
- Applies k-means clustering to articles in order to extract the main content of web pages.
Hadoop
- Developed with Hadoop. Counts the frequency of keywords that appear in the input files and ranks the different texts by keyword frequency. Users must define the keywords and the input files themselves.
Part1
- Data mining on 500 New York Times news articles, including data preprocessing and basic data statistics.
Subsample_MaxNeighborDistance_R
- A subsample of an input population with maximum pairwise distance and minimum projection error.
algorithm
- Multilinear SVM classifier that also clusters the data well while performing classification.
Apriori-Algorithm
- The Apriori algorithm for frequent pattern mining, implemented in C++.
spider-(2)
- Python code for crawling Baidu Index news.
chapter11code
- Python data mining, chapter 11.
svm
- Python implementation of the SVM classifier, one of the most classic machine learning methods.