Search Resources - Corpus (语料) - 搜珍网

Search results

yulao.SogouT.min

Downloads: 0
A corpus from Sogou for text and web page classification; a good corpus.
Category: Software Engineering
- Release date: 2008-10-13
- File size: 844.99 KB
- Provider: 马龙

LJClusterDemo

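Downloads: 0
Text clustering is an automatic clustering technique based on similarity algorithms: it groups large numbers of unlabeled documents, places documents with similar content in the same cluster, and automatically generates characteristic topic words for each cluster. It suits applications such as automatically generating hot-topic public-opinion features, tracking major news events, and visual analysis of intelligence. Lingjoin (灵玖, www.lingjoin.com) states that its core feature discovery technology overcomes the high space consumption and long processing time of traditional clustering methods, giving fast clustering, high accuracy, and low memory use, and that it is especially suited to clustering very large corpora and short-text corpora. Main features of the Lingjoin document clustering component: 1. Speed: it can handle massive-scale …
Category: Software Engineering
- Release date: 2017-04-10
- File size: 1.05 MB
- Provider: lingjoin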

CHMM

Downloads: 0
Uses a cascaded hidden Markov model to solve the named entity recognition problem; includes training and test corpora.
Category: File Formats
- Release date: 2017-03-26
- File size: 482.57 KB
- Provider: 糊涂虫

Language_model_learning_in_chinese

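Downloads: 0
Papers on language model learning (in Chinese): 基于最大熵方法的统计语言模型.pdf (statistical language models based on the maximum entropy method); 基于对话回合衰减的cache语言模型在线自适应研究.pdf (online adaptation of cache language models based on dialogue-turn decay); 基于Web网页语料构建动态语言模型.pdf (building dynamic language models from Web page corpora); 统计语言模型综述.pdf (a survey of statistical language models).
Category: Development Research
- Release date: 2017-04-10
- File size: 1.19 MB
- Provider: wen6860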

Collection-and-Analysis

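Downloads: 0
Funded by the European Community project LC-STAR, this paper first carries out the collection and analysis of a large-scale balanced Chinese corpus. The main goal is to build an annotated Chinese corpus that reflects the characteristics of modern Chinese and is suitable for Chinese language analysis, speech recognition, and speech synthesis, and on that basis to build a corresponding information lexicon.
Category: Software Engineering
- Release date: 2017-03-29
- File size: 254.1 KB
- Provider: 叶眸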

199801

Downloads: 3
Annotated corpus of the People's Daily (人民日报) for January 1998. I found it very good, so I am sharing it; anyone researching natural language should find it useful.
Category: File Formats
- Release date: 2017-05-10
- File size: 2.12 MB
- Provider: lixiaolong

20news-bydate.tar

Downloads: 1
A usable English corpus containing 20 major categories; it can serve as a text classification corpus.
Category: File Formats
- Release date: 2017-06-02
- File size: 13.79 MB
- Provider: liuhaichun

master_thesis

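Downloads: 0
Research on Chinese entity relation extraction in the music domain (dissertation for the Master Degree in Engineering). The task of entity relation extraction is to extract predefined semantic relations between two or more entities from text. This thesis frames entity relation extraction as a classification problem and focuses on the Chinese music domain. It first builds a Chinese music entity relation corpus, then applies an unsupervised method based on sequential pattern mining and a supervised method based on feature extraction.
Category: Development Research
- Release date: 2017-05-03
- File size: 1.38 MB
- Provider: xz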

computer-voice-input

0下载：
将语音录入问题分为三个模块进行研究：语音识别模块、字转换模块和语料库建立模块。-Voice recording is divided into three modules for research: speech recognition module, word conversion module and corpus creation module.
所属分类：software engineering
- 发布日期：2017-11-14
- 文件大小：3.13mb
- 提供者：lhj

616341

Downloads: 0
A Chinese text corpus suited to Chinese text classification, integrated with the naive Bayes algorithm.
Category: Project Management
- Release date: 2017-04-29
- File size: 86.7 KB
- Provider: Sirius GY

PMl-IR

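Downloads: 0
The rapid growth of blog information sources and volume poses new challenges for Chinese text classification. This paper proposes four sentiment classification methods based on the PMI-IR algorithm for classifying the sentiment of blog text. Centering on sentiment words, the approach uses search engine results to compute the pointwise mutual information between sentiment elements in the text and background sentiment words, and classifies the text accordingly. It was tested on the 2008 blog corpus of the Network Media Language Branch of the National Language Resources Monitoring and Research Center and on the COAE2008 corpus. Compared with traditional methods, both precision and recall improved considerably.
Category: Software Engineering
- Release date: 2017-05-03
- File size: 661.46 KB
- Provider: guwei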

word2vec

Downloads: 0
word2vec: Google's open-source project that converts words to vectors (word to vector). It runs on Linux and needs a fairly large training corpus to show good results (Chinese and English both work). It can also measure the distance between two words (expressed as a cosine value), cluster words, and so on.
Category: Software Engineering
- Release date: 2017-03-29
- File size: 110.67 KB
- Provider: sherlydunn

RDF3X-a-RISCstyle

Downloads: 0
RDF is a data representation for schema-free information; it has developed rapidly in Semantic Web corpora, the life sciences, and Web 2.0 platforms.
Category: Software Engineering
- Release date: 2017-04-16
- File size: 247.83 KB
- Provider: 冯佳颖

jrxbck

Downloads: 0
A finance "cell" word dictionary (细胞词库) for data analysis; it collects financial-industry terms in detail for user corpus analysis.
Category: File Formats
- Release date: 2017-04-25
- File size: 129.54 KB
- Provider: guochao

Southeast-Asia

Downloads: 0
A partial collection of China-related corpus material from Southeast Asia; it can be used to analyze the situation between Southeast Asia and China.
Category: Development Research
- Release date: 2017-04-29
- File size: 474.13 KB
- Provider: Jenny

Corpus

Downloads: 0
About 100,000 dialogue entries; can be used to train conversational robots.
Category: Software Testing
- Release date: 2017-05-25
- File size: 7.71 MB
- Provider: 马威力

hownet

Downloads: 0
The full version of HowNet (知网), with assorted related papers and documents; a Chinese corpus.
Category: Software Engineering
- Release date: 2017-12-10
- File size: 17.3 MB
- Provider: smith

webquestions.examples.train

Downloads: 0
Question-answering corpus for knowledge graphs, knowledge bases, and QA systems; mainly training data (WebQuestions examples, QA data for KBQA).
Category: Articles/Documents
- Release date: 2017-12-22
- File size: 117 KB
- Provider: lbda1

chnsenticorp

Downloads: 3
Chinese sentiment analysis corpus covering three categories: hotel, book, and product reviews.
Category: Articles/Documents
- Release date: 2018-04-22
- File size: 5.57 MB
- Provider: json123

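Advanced Corpus for Text Processing (文本处理高级语料库)

Downloads: 0
Natural language processing corpus code; it provides a large amount of introductory material across many directions.
Category: Programming Documents
- Release date: 2022-07-14
- File size: 7.51 MB
- Provider: 1312484580@qq.com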


搜珍网 www.dssz.com

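This site is a search site that collects and introduces programming resources and source code; copyright belongs to the original authors. 粤ICP备11031372号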

1999-2046 搜珍网 All Rights Reserved.