Resource list
zzbds.rar
- Processes a corpus with regular expressions and can handle up to 500 sentences; if you want more features, watch for V2.0. A rough sketch of this kind of processing follows below.
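As a minimal sketch of regex-based corpus processing (not the uploaded program; the punctuation set and file handling here are assumptions for illustration), a raw text can be split into sentences and capped at 500 entries:

```python
# -*- coding: utf-8 -*-
# Sketch only: split a raw corpus into sentences with a regular expression
# and keep at most 500 of them, mirroring the limit described above.
import re

def split_sentences(text, limit=500):
    # Split on common Chinese/Western sentence-ending punctuation.
    parts = re.split(r'[。！？!?\.]+', text)
    sentences = [s.strip() for s in parts if s.strip()]
    return sentences[:limit]

if __name__ == '__main__':
    corpus = u"今天天气很好。我们去公园散步!How are you? Fine."
    for s in split_sentences(corpus):
        print(s)
```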
WordSegmentation.rar
- A maximum-probability word segmentation program I wrote a long time ago; the bundled corpus is fairly large. The core idea is sketched below.
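A minimal sketch of the maximum-probability idea (not the uploaded program): dynamic programming over a toy dictionary of word probabilities, choosing the segmentation whose words have the highest total log-probability. The dictionary entries and probabilities below are invented for illustration.

```python
# -*- coding: utf-8 -*-
import math

# Toy word-probability table; a real system would estimate these from a corpus.
WORD_PROB = {u'有': 0.018, u'意见': 0.001, u'有意': 0.0005, u'见': 0.002, u'分歧': 0.0001}

def segment(sentence):
    n = len(sentence)
    best = [(-float('inf'), 0)] * (n + 1)   # (score, start index of last word)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - 4), end):   # assume words of length <= 4
            word = sentence[start:end]
            prob = WORD_PROB.get(word)
            if prob is None and len(word) > 1:
                continue                     # unknown multi-character strings are skipped
            score = best[start][0] + math.log(prob if prob else 1e-8)
            if score > best[end][0]:
                best[end] = (score, start)
    words, end = [], n                       # backtrack to recover the best path
    while end > 0:
        start = best[end][1]
        words.append(sentence[start:end])
        end = start
    return list(reversed(words))

print(u'/'.join(segment(u'有意见分歧')))      # 有/意见/分歧
```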
57320svd.rar
- Source code for singular value decomposition; a good program, you are welcome to use it.
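For reference, a minimal sketch of what an SVD routine produces, using NumPy rather than the uploaded source: the factors U, S, Vt satisfy A = U·diag(S)·Vt.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0], [0.0, 2.0]])
U, S, Vt = np.linalg.svd(A, full_matrices=False)

print("singular values:", S)
# Reconstruct A from the factors to verify the decomposition.
print("max reconstruction error:", np.max(np.abs(U @ np.diag(S) @ Vt - A)))
```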
JPEG_Hardware_Compressor_Encod
- JPEG hardware compressor (encoder).
ID3.rar
- Code experiment with the ID3 decision tree algorithm; the compiled ID3 code can be used directly. A sketch of the attribute-selection step follows below.
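As background on what ID3 computes at each node (this sketch is not the uploaded code; the toy dataset is invented), the attribute with the highest information gain is chosen to split on:

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    base = entropy(labels)
    remainder = 0.0
    for value in set(row[attr_index] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr_index] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return base - remainder

# Toy data: (outlook, windy) -> play
rows = [('sunny', 'no'), ('sunny', 'yes'), ('rain', 'no'), ('rain', 'yes')]
labels = ['no', 'no', 'yes', 'no']
print('gain(outlook) =', information_gain(rows, labels, 0))
print('gain(windy)   =', information_gain(rows, labels, 1))
```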
SentenceSimilar.rar
- Segments the sentences into words first, then compares sentence similarity based on those words; the algorithm is clear and easy to understand. Welcome to download! A sketch of the word-based comparison follows below.
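A minimal sketch of the word-based comparison described above (not the uploaded implementation; the Dice coefficient is just one common choice of overlap measure): segment both sentences into words, then score the two word sets.

```python
# -*- coding: utf-8 -*-
def similarity(words_a, words_b):
    # Dice coefficient over the two word sets.
    set_a, set_b = set(words_a), set(words_b)
    if not set_a or not set_b:
        return 0.0
    return 2.0 * len(set_a & set_b) / (len(set_a) + len(set_b))

# Assume the sentences have already been segmented into word lists.
s1 = [u'我', u'喜欢', u'自然', u'语言', u'处理']
s2 = [u'我', u'热爱', u'自然', u'语言', u'处理']
print(similarity(s1, s2))   # 0.8
```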
windows_c_32.rar
- The latest version of the Chinese Academy of Sciences' Chinese-language analysis program; it supports word segmentation, part-of-speech tagging, and more.
UEWP2009.rar
- 1. The practical test consists of 10 questions; answering them in order is recommended. 2. The practical test is worth 45 points with a 20-minute time limit; the system submits and grades automatically when time runs out, or you can press [Submit] to hand in early. 3. Do not switch between questions too quickly; if a question heading does not match the actual test content, moving the scroll bar to refresh the question fixes it. 4. Be careful about submitting early: once the paper is handed in, you cannot answer again.
Chinese-Segmentation.rar
- Chinese word segmentation source code I wrote myself in VC++, with complete documentation and a standard segmentation dictionary.
xlwt-0.7.1.win32.rar
- xlwt is a Python package for generating Excel files; together with xlrd (for reading) it forms a complete read/write toolkit. Extremely convenient. A short usage sketch follows below.
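A minimal usage sketch for the xlwt/xlrd pair: write a small .xls file with xlwt, then read the same cells back with xlrd (the file name demo.xls is arbitrary).

```python
# -*- coding: utf-8 -*-
import xlwt
import xlrd

# Write an .xls workbook with xlwt.
book = xlwt.Workbook(encoding='utf-8')
sheet = book.add_sheet('Sheet1')
sheet.write(0, 0, u'hello')
sheet.write(0, 1, 3.14)
book.save('demo.xls')

# Read the same cells back with xlrd.
rbook = xlrd.open_workbook('demo.xls')
rsheet = rbook.sheet_by_index(0)
print(rsheet.cell_value(0, 0), rsheet.cell_value(0, 1))
```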
stopwords.rar
- A Chinese stop-word list; it can be used as a lookup dictionary for removing stop words in Chinese information processing. A sketch of that usage follows below.
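A minimal sketch of how such a list is typically applied (the file name stopwords.txt and its one-word-per-line UTF-8 format are assumptions): load the list into a set and drop those words from an already segmented text.

```python
# -*- coding: utf-8 -*-
def load_stopwords(path):
    # Assumed format: one stop word per line, UTF-8 encoded.
    with open(path, encoding='utf-8') as f:
        return set(line.strip() for line in f if line.strip())

def remove_stopwords(words, stopwords):
    return [w for w in words if w not in stopwords]

if __name__ == '__main__':
    stopwords = load_stopwords('stopwords.txt')
    words = [u'我们', u'的', u'实验', u'是', u'成功', u'的']
    print(remove_stopwords(words, stopwords))
```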
lindatanetwork1.rar
- Text mining algorithms and analysis for data mining, including hierarchical clustering and the vector space model; it handles both web pages and plain text. A sketch of the vector space model follows below.
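A minimal sketch of the vector space model mentioned above (not the uploaded package; the toy documents are invented): represent segmented documents as term-frequency vectors over a shared vocabulary and compare them with cosine similarity.

```python
# -*- coding: utf-8 -*-
import math
from collections import Counter

def to_vector(words, vocabulary):
    # Term-frequency vector over a fixed vocabulary ordering.
    counts = Counter(words)
    return [counts[t] for t in vocabulary]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

doc1 = [u'文本', u'挖掘', u'算法']
doc2 = [u'文本', u'聚类', u'算法']
vocab = sorted(set(doc1) | set(doc2))
print(cosine(to_vector(doc1, vocab), to_vector(doc2, vocab)))  # ~0.67
```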