Resource list
VIPS
- VIPS: a Vision-based Page Segmentation algorithm for web pages.
proWordSegment
- Forward maximum matching Chinese word segmentation, C++ source code; debugged and working in Visual Studio 2008.
PLSA
- A Java implementation of PLSA; can be used for image processing, text classification, text clustering, and similar tasks.
pinyin
- Converts Chinese characters to full pinyin spelling; written in C and usable in iPhone development.
big5togb
- An example of converting BIG5 to GB, written in VC.
JDBC-
- JDBC lecture slides (PPT): an overview of JDBC key points and the JDBC framework.
stop_wordslk
- A Chinese stop-word list, suitable for academic research and software development.
ISN
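- Common Chinese internal encodings include GB2312, GBK, and BIG5 (used in Taiwan); programming materials from Taiwan sometimes display as garbled text. Covers conversion ("exchange") of the ISN.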
Taobao1
- Multilingual Taobaoke (Taobao affiliate) site; the program automatically translates into multiple languages according to the domain name. For example, http://en.test.com is English and http://ru.test.com is Russian.
textFCM
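- Applies the FCM (fuzzy c-means clustering) algorithm to text clustering; computes text similarity using two methods; uses ShootSeg for word segmentation; uses the Sogou internet lexicon to simplify feature value computation.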
SW_I_WordSegment
- SW-I Chinese word segmentation algorithm, an MFC program; debugged and working in Visual Studio 2008. The default dictionary is in mdb format; because of its size it is not included in the source files, so please download an mdb-format dictionary yourself.
OpenCNSegmenter
- Chinese word segmentation: splits Chinese sentences into words; an excellent algorithm, obtained from the internet