
Shanghai Jiao Tong University: Mining Massive Datasets course teaching resources (PPT lecture slides), Lecture 06: Search Engines

▪ Architecture of Search Engines
▪ Index Construction
▪ Boolean Retrieval
▪ Vector Space Model for Ranked Retrieval


Document contents (about 85 pages)

Architecture: Indexing Process (diagram)
▪ Inputs: e-mail, Web pages, news articles, memos, letters
▪ Components: Text Acquisition, Text Transformation, Index Creation
▪ Stores: Document data store, Index

Architecture: Indexing Process
▪ Text acquisition
  ▪ identifies and stores documents for indexing
▪ Text transformation
  ▪ transforms documents into index terms
▪ Index creation
  ▪ takes index terms and creates data structures (indexes) to support fast searching
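The three indexing steps above can be illustrated with a minimal sketch. It is not taken from the slides: the in-memory `documents` dictionary, the regex tokenizer, and the function names `tokenize` and `build_index` are all assumptions chosen for illustration, and a real engine would add steps such as stopword removal, stemming, and compressed postings.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Text transformation: lowercase the text and split it into index terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Index creation: map each term to the sorted ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

# Text acquisition: a hypothetical in-memory store stands in for a crawler or feed.
documents = {
    1: "Web pages and news articles",
    2: "E-mail, memos, and letters",
    3: "News articles about web search",
}

index = build_index(documents)
print(index["news"])   # [1, 3]
print(index["memos"])  # [2]
```

The resulting dictionary is a simple inverted index: looking up a term returns the documents that contain it without scanning every document at query time.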

Architecture: Query Process (diagram)
▪ Components: User interaction, Ranking, Evaluation
▪ Stores: Document data store, Index, Log data

Architecture: Query Process
▪ User interaction
  ▪ supports creation and refinement of query, display of results
▪ Ranking
  ▪ uses query and indexes to generate ranked list of documents
▪ Evaluation
  ▪ monitors and measures effectiveness and efficiency (primarily offline)
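To make the ranking step concrete, here is a small sketch of how a query and an index can produce a ranked list. The scoring rule (summing the frequency of each query term in a document) is an assumption picked for brevity, not the ranking function used in the lecture, and the names `build_tf_index` and `rank` are hypothetical.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_tf_index(documents):
    """Map each term to {doc_id: number of occurrences in that document}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index

def rank(query, index):
    """Score each document by the summed frequency of the query terms it contains."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties are broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

documents = {
    1: "search engines rank documents",
    2: "ranking uses the query and the index",
    3: "search search engines",
}

index = build_tf_index(documents)
print(rank("search engines", index))  # [(3, 3), (1, 2)]
```

Only documents that share at least one term with the query receive a score, so document 2 does not appear in the result.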

Architecture Details: Text Acquisition
▪ Crawler
  ▪ Identifies and acquires documents for search engine
  ▪ Many types: web, enterprise, desktop
  ▪ Web crawlers follow links to find documents
    ▪ Must efficiently find huge numbers of web pages (coverage) and keep them up-to-date (freshness)
    ▪ Single site crawlers for site search
    ▪ Topical or focused crawlers for vertical search
  ▪ Document crawlers for enterprise and desktop search
    ▪ Follow links and scan directories
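The link-following behaviour of a web crawler can be sketched as a breadth-first traversal. This example is an assumption-laden illustration: the `LINKS` dictionary is a made-up in-memory link graph standing in for real HTTP fetching and HTML parsing, and `crawl`, `get_links`, and `max_pages` are hypothetical names. It does not address freshness, politeness, or robots.txt handling.

```python
from collections import deque

def crawl(seed, get_links, max_pages=100):
    """Breadth-first crawl: visit each page once, following links from the seed."""
    seen = {seed}              # URLs already queued, so no page is fetched twice
    frontier = deque([seed])   # URLs waiting to be visited
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

# Hypothetical link graph: each key is a page, each value the links found on it.
LINKS = {
    "a.example/": ["a.example/news", "a.example/about"],
    "a.example/news": ["a.example/", "a.example/news/1"],
    "a.example/about": [],
    "a.example/news/1": ["a.example/news"],
}

order = crawl("a.example/", lambda url: LINKS.get(url, []))
print(order)
# ['a.example/', 'a.example/news', 'a.example/about', 'a.example/news/1']
```

The `seen` set is what keeps the crawler from looping on pages that link back to each other, and `max_pages` bounds the crawl, which a single-site or focused crawler might replace with a URL or topic filter.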
