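- Database management systems generally store structured data and excel at numerical computation, querying, statistics, and mining. With the rapid growth of computer storage and computing power, more and more unstructured text data is being stored in databases, and text search within databases is increasingly common. At present, text search in databases generally relies on the LIKE operator in SQL statements or on the full-text indexing built into the database system. LIKE operations are often very time-consuming: once the data exceeds 100,000 records, queries frequently cause network connection timeouts and cannot meet the needs of online search. Moreover, a LIKE query is only simple string matching and ignores language semantics: a search for "和服" (kimono) also hits "产品和服务" (products and services). Currently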
- Introduction to Information Retrieval is the first textbook with a coherent treatment of classical and web information retrieval, including web search and the related areas of text classification and text clustering. Written from a computer sci
- Wumpus is an information retrieval system developed at the University of Waterloo. Its main purpose is to study issues that arise in the context of indexing dynamic text collections in multi-user environments. One particular scenario that we are stud
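- Information retrieval systems have evolved from the earliest purely manual systems into today's systems supported by information technology. Throughout this evolution, adapting to the retrieval environment shaped by new information resources and technologies, and improving recall, precision, and system response time, have remained constant themes, and extracting the most useful information from large numbers of texts has always been a major goal of information processing. This work designs a text retrieval system built around the vector space model. After introducing the vector space model, it presents the general structural framework of an information retrieval system based on it, describes the function of each component, and discusses the key technologies involved. The vector space model is used for feature representation, and TF-IDF (Term-Frequency Inverse-Docume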
Oracle Text Application Developer's Guide 11g Release 1 (11.1)
- Oracle Text enables you to build text query applications and document classification applications. Oracle Text provides indexing, word and theme searching, and viewing capabilities for text. To design an Oracle Text application, first determine the
- Query by content, or content-based retrieval, has recently been proposed as an alternative to text-based retrieval for media such as images, video and audio. Text-based retrieval is no longer appropriate for indexing such media, for several reasons. Firstly