Search results: resource list
distributed_word_embedding-master
- The Distributed Word Embedding tool is a parallelization of the Word2Vec algorithm on top of our DMTK parameter server. It provides an efficient word embedding solution that scales to industry size.
NLP_word2vec
- Natural language processing: an implementation of word2vec in R (CBOW + negative sampling).
word2vec-2014-10-29.tar
- Clustering source code that compiles and runs on Linux; it can be used to cluster a vocabulary.
demo.py
- Uses word2vec to generate a word vector space in preparation for later steps such as classification or clustering, laying the groundwork for various machine learning methods.
k-means.py
- A Python k-means implementation: feed in the vector space data generated by word2vec to obtain the clustering results directly. The number of output clusters is configurable.
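As a rough illustration of the clustering step this entry describes, here is a minimal pure-Python k-means sketch. It is not the code from `k-means.py`; the function name `kmeans`, its parameters, and the toy 2-d vectors are all assumptions made for the example, and real word2vec vectors would have many more dimensions.

```python
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Cluster equal-length vectors into k groups; returns one label per vector."""
    rng = random.Random(seed)
    # Start from k distinct input vectors as the initial centroids.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy 2-d "word vectors" (made up for illustration, not real word2vec output).
words = ["cat", "dog", "car", "bus"]
vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = kmeans(vectors, k=2)

# Group the words by their cluster label.
clusters = {}
for word, label in zip(words, labels):
    clusters.setdefault(label, []).append(word)
```

The `k` argument plays the role of the configurable number of output clusters; which numeric label each group receives depends on the random initialization.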
word2vec
- A machine learning tool for training word vectors on Linux.
word2vec
- A Java implementation for finding similar Chinese words; everyone is welcome to use it.
word2vec C implementation code
- Implements word feature expansion and similar-word lookup; relation mining; the vectors can also serve as initial input features for a series of tasks.
words_1025_dic.txt
- DBSCAN and word2vec for Chinese words. Do not download for now: it contains errors and will be cleaned up later.
code
- Improved data-cleaning functions for certain datasets in natural language processing.
3_14
- Some TensorFlow learning examples, including CNN, RNN, LSTM, and word2vec, as well as examples from the Udacity course.
kmeans
- Segments Chinese text with jieba, converts the segmented tokens into word vectors with word2vec, and clusters the Chinese texts with k-means.
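The entry does not say how word vectors are combined into one vector per text before clustering. One common choice, assumed here purely for illustration, is to average the vectors of a text's tokens. In the sketch below, the helper `document_vector`, the toy `embeddings` table, and the pre-segmented token lists are all hypothetical stand-ins for jieba output and trained word2vec vectors.

```python
def document_vector(tokens, embeddings):
    """Average the vectors of known tokens; returns None if no token is known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return None
    return [sum(col) / len(known) for col in zip(*known)]


# Hypothetical toy embeddings; a real run would load trained word2vec vectors.
embeddings = {
    "足球": [1.0, 0.0],
    "比赛": [0.8, 0.2],
    "股票": [0.0, 1.0],
    "市场": [0.2, 0.8],
}

# Token lists stand in for the output of a segmenter such as jieba.
docs = [["足球", "比赛"], ["股票", "市场"], ["未知"]]
doc_vectors = [document_vector(d, embeddings) for d in docs]
```

Texts whose tokens are all out of vocabulary yield `None` and would need to be filtered out before the resulting vectors are passed to a clustering step.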
TianCheng-master_chusai_qingyu
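- Preliminary-round solution for the 2018 Sweet Orange Finance Cup (甜橙金融杯) Big Data Modeling Competition: it models whether a UID is part of a black-market (illicit) chain by tracking changes in attributes such as time, device, IP, and latitude/longitude. Code notes: gen_stat_feat.py builds statistical features; gen_w2v_feat.py builds word2vec features; lgb_train.py trains the lgb (LightGBM) model. Blending the models built on the two feature sets with an 8:2 weighting scores 0.792+; modeling with the statistical features alone plus the UID column scores 0.795.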
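The 8:2 blend mentioned in this entry can be sketched as a weighted average of two models' predicted scores. This is only an illustration under assumptions: the function name `blend` and the sample scores are made up, and the entry does not state which of the two models receives the weight of 0.8.

```python
def blend(pred_a, pred_b, weight_a=0.8):
    """Weighted average of two equally long lists of predicted scores."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    weight_b = 1.0 - weight_a
    return [weight_a * a + weight_b * b for a, b in zip(pred_a, pred_b)]


# Made-up scores for three UIDs from two hypothetical models.
stat_scores = [0.90, 0.10, 0.50]
w2v_scores = [0.70, 0.30, 0.50]

# Assumption: the first model gets weight 0.8 and the second gets 0.2.
blended = blend(stat_scores, w2v_scores, weight_a=0.8)
```

Because the weights sum to 1, a blended score always lies between the two input scores for the same UID.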