Resource search results
gmeans
- gmeans: clustering with first variation and splitting. Gmeans is a text clustering algorithm that uses three similarity functions: cosine, Euclidean, and KL. The text data is stored in sparse-matrix form.
RepeatedForms
- Deduplicates by similarity: texts that are very similar to one another are removed. An implementation of a VSM-based (vector space model) algorithm.
CheckText
- C# source code for hierarchical and partitional text clustering algorithms used to check text similarity.
pLSA_EM
- PLSA EM algorithm: a tool that works on the matrix between texts and words to measure their similarity.
CompareText
- Compares the similarity of two texts/strings using the LD matrix algorithm.
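The entry above does not spell out "LD"; assuming it stands for Levenshtein distance, the matrix computation could be sketched roughly as follows. The function names `levenshtein` and `similarity`, and the normalization by the longer string's length, are illustrative assumptions and are not taken from the CompareText source.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string as the row to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    # prev holds one row of the dynamic-programming matrix.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 minus distance over the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

Multiplying the result of `similarity` by 100 gives a percentage score of the kind several entries in this list report.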
Character-recognition
- A self-made program that recognizes handwritten characters by measuring character similarity with a Euclidean-distance algorithm. The development environment is MATLAB. Handwritten characters must be added to the character library before they can be recognized.
WordSimilarity
- Computes the similarity of Chinese words based on HowNet. It implements the algorithm from the paper 《基于<知网>的词汇语义相似度计算》 ("Lexical Semantic Similarity Computation Based on HowNet").
ImproveStringSimilarity_src
- Computes the similarity between two pieces of text in order to avoid duplicate information.
ShortText_Similarity
- This program implements short-text similarity computation, which is widely used in information retrieval and related fields. Implemented in Python.
cpp
- Source code for article duplicate checking. It applies the shortest edit distance algorithm and similarity-computation principles: the two texts to compare are entered in two text boxes, the comparison is run, and the similarity of the two articles is output as a percentage.
Program1
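- An implementation of a word segmentation algorithm for Chinese information processing, described as highly accurate and practically useful. It is the core of the segmentation algorithm and a valuable reference. The English description differs: research and implementation of a text clustering algorithm based on text similarity computation, an important branch of Chinese information processing.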
SimHash
- An implementation of the simhash algorithm, which can quickly compare the similarity of texts.
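For readers unfamiliar with simhash, a minimal sketch of the general idea follows. It is not the code from the SimHash resource: whitespace tokenization, term-frequency weights, MD5 as the token hash, and a 64-bit fingerprint are all assumptions chosen for illustration.

```python
import hashlib
from collections import Counter

BITS = 64  # assumed fingerprint width


def simhash(text: str) -> int:
    """Return a 64-bit fingerprint; similar texts tend to share most bits."""
    weights = Counter(text.lower().split())  # token -> term frequency
    votes = [0] * BITS
    for token, weight in weights.items():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:BITS // 8], "big")  # first 64 bits of the hash
        for i in range(BITS):
            # Each token votes +weight or -weight on every bit position.
            votes[i] += weight if (h >> i) & 1 else -weight
    fingerprint = 0
    for i, vote in enumerate(votes):
        if vote > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(x: int, y: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(x ^ y).count("1")
```

Two texts are then compared by the Hamming distance between their fingerprints; the threshold that counts as "similar" depends on the data and has to be tuned.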
Kl
- Computes text similarity and outputs the KL distance and JS distance between texts.
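As a rough illustration of the two measures named above, the sketch below computes the KL divergence and the JS divergence between word distributions. The helper `word_distribution`, its add-alpha smoothing, and the whitespace tokenization are assumptions made for this example; the Kl resource may build its distributions differently.

```python
import math
from collections import Counter


def word_distribution(text, vocabulary, alpha=1.0):
    """Smoothed word probabilities over an ordered vocabulary (assumed helper)."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocabulary) + alpha * len(vocabulary)
    return [(counts[w] + alpha) / total for w in vocabulary]


def kl_divergence(p, q):
    """KL(p || q) in nats; requires q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by log 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

KL is asymmetric and undefined when `q` assigns zero probability to a word that `p` uses, which is why the sketch smooths the counts; JS avoids both problems by comparing each distribution with their average.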
WIP3
- Code for the Kaggle competition "Can your AI smarter than a 8th grade student?". It uses text similarity computation to automatically answer questions from US eighth-grade science exams (four-option multiple choice).
DocDistance
- A text similarity system implemented in Java, using the vector space model and the cosine similarity formula. In testing, it computes the similarity of two texts with reasonable results.
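The vector space model with cosine similarity can be sketched in a few lines. The example below is written in Python rather than Java and is not taken from DocDistance; raw term-frequency vectors and whitespace tokenization are assumptions (a real system might use TF-IDF weights and a proper tokenizer, especially for Chinese text).

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between the term-frequency vectors of two texts."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

With non-negative term counts the score lies between 0 (no shared terms) and 1 (identical term proportions).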
相似度检测 (Similarity Detection)
- Computes text similarity for text in any language.
mn
- Classification and testing for text sentiment analysis, together with similarity judgment.
btm-master
- The BTM model, a model for handling short-text similarity; computes the similarity of short texts.