
Research and Application of Massive Data Processing Based on Hadoop: Design and Implementation of the Search Engine Component.doc

About 65 pages, DOC format

ID: 68-245488, Size: 1.92 MB
Category: Papers > Computer Science Papers

Description

This document was published by member danusha

Complete treatment of the topic

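Abstract (translated from the Chinese)
Microblogging has become an important tool for communication and has made everyday life considerably more convenient. At the same time, users communicating on microblog platforms generate massive amounts of unstructured data, and processing and exploiting this data has become a popular research topic. This thesis describes how to use massive microblog data to search for users who share the same interests and to rank the results, which we call interest search.
The main work of the thesis is as follows:
First, to address the storage and processing of massive microblog data, the thesis studies and discusses the relevant technologies for massive data storage and processing. It introduces Google's three core technologies (BigTable, the GFS distributed file system, and the MapReduce distributed programming model) and places particular emphasis on search engine principles and the Solr platform.
Second, the system designed and implemented in this work combines the open-source frameworks Hadoop, HBase, and Solr, and the thesis studies and discusses each of them.
Finally, to solve the problem at hand, namely searching for users with the same interests and ranking the results, we combine Hadoop, HBase, and Solr into the following architecture: the raw microblog data is stored in HBase, Hadoop's distributed architecture is used to process the raw data and build the index, and the index is finally written to the index store of the Solr system. We also propose a ranking algorithm for microblog interest search that balances the weights of microblog content and user information and ranks the results at search time. Together, these realize an application that finds users with the same interests based on microblog content.

Keywords: massive data processing; Hadoop; Solr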
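
The indexing stage described in the abstract (raw posts processed in a map/reduce fashion to build an index) can be illustrated with a minimal single-process sketch. This is not the thesis's implementation: the sample rows, the tuple layout, and the function names `map_post`, `reduce_postings`, and `build_index` are all hypothetical, and plain Python stands in for Hadoop, HBase, and Solr.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for microblog posts read from HBase:
# (post_id, user_id, text). The real table schema is not given in the abstract.
POSTS = [
    ("p1", "alice", "hadoop makes batch indexing easy"),
    ("p2", "bob", "solr search on hadoop"),
    ("p3", "alice", "solr ranking tips"),
]


def map_post(post):
    """Map step: emit one (term, (user_id, post_id)) pair per distinct term."""
    post_id, user_id, text = post
    for term in sorted(set(text.lower().split())):
        yield term, (user_id, post_id)


def reduce_postings(term, values):
    """Reduce step: collect the sorted posting list for one term."""
    return term, sorted(values)


def build_index(posts):
    """Run map, shuffle (group by term), and reduce in a single process."""
    grouped = defaultdict(list)
    for post in posts:
        for term, value in map_post(post):
            grouped[term].append(value)
    return dict(reduce_postings(term, values) for term, values in grouped.items())


index = build_index(POSTS)
```

Looking up a term such as `index["solr"]` returns the users and posts that mention it. In a real deployment the reduce output would be sent to the search server rather than kept in a dictionary.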

Abstract
Microblogging has become an important communication tool in people's lives and has brought significant convenience. Meanwhile, the many users communicating through microblogging produce massive amounts of unstructured data, so how to process and use this data has become a hot topic. This dissertation introduces how to use massive microblogging data to search for users with the same interests and to display a sorted result, which we call interest search.
The main work of this dissertation is as follows:
Firstly, we must solve the problem of storing and processing massive microblogging data. This dissertation studies and discusses the relevant techniques for massive data storage and processing. We introduce the three core technologies of Google: BigTable, the Google File System, and MapReduce. We also give particular attention to search engine principles and the Solr platform.
Secondly, in the system we designed and implemented, we combine several excellent open-source frameworks: Hadoop, HBase, and Solr. We discuss each of them in turn.
Finally, to solve the problem we are facing, namely how to find users with the same interests and return a sorted result, we combine Hadoop, HBase, and Solr. Our main idea is that the raw microblogging data is stored in HBase and processed in Hadoop to build the index for Solr. Meanwhile, we design a viable algorithm to rank the search results, setting different weights for microblogging content and user information. In this way, we implement an application that searches for users with the same interests based on massive microblogging data.

Key Words: Massive data processing; Hadoop; Solr
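
The abstract says only that the ranking algorithm sets different weights for microblogging content and user information; it does not give the formula. As one possible reading, the sketch below combines the two scores linearly. The `Candidate` fields, the `rank` function, the default weights 0.7 and 0.3, and the sample scores are all assumptions made for illustration, not values from the dissertation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    user_id: str
    content_score: float  # assumed: relevance of the user's posts to the query, in [0, 1]
    user_score: float     # assumed: score derived from user information, in [0, 1]


def rank(candidates, content_weight=0.7, user_weight=0.3):
    """Sort candidates by a weighted sum of content and user scores, best first."""
    def combined(candidate):
        return (content_weight * candidate.content_score
                + user_weight * candidate.user_score)
    return sorted(candidates, key=combined, reverse=True)


# Hypothetical candidates returned by an interest search.
candidates = [
    Candidate("alice", content_score=0.9, user_score=0.4),
    Candidate("bob", content_score=0.6, user_score=0.9),
    Candidate("carol", content_score=0.3, user_score=0.5),
]

ranked_ids = [c.user_id for c in rank(candidates)]
user_only_ids = [c.user_id for c in rank(candidates, content_weight=0.0, user_weight=1.0)]
```

Changing the weights changes the order: with the default weights the content score dominates, while putting all the weight on user information moves the user with the strongest profile score to the top.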

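Contents
Chapter 1 Introduction
1.1 Research background
1.2 Current state of research and open problems
1.3 Main work of the thesis
1.4 Organization of the thesis
Chapter 2 Overview of Related Technologies
2.1 Core technologies and principles of massive data storage and processing
2.1.2 BigTable: technology and principles
2.1.3 GFS: technology and principles
2.1.4 The MapReduce programming model: technology and principles
2.2 The Hadoop platform
2.2.1 Introduction to Hadoop
2.2.2 The HDFS file system
2.3 Core technologies and principles of search engines
2.3.1 Introduction to full-text search
2.3.2 Indexing
2.3.3 Querying
2.4 Chapter summary
Chapter 3 The Solr Platform
3.1 Introduction to Solr
3.2 Solr architecture
3.3 Key aspects of Solr
3.3.1 solrconfig.xml explained
3.3.2 schema.xml explained
3.3.3 How the Solr service works
3.4 Chapter summary
Chapter 4 Design and Implementation of Microblog Interest Search Based on Massive Data Processing
4.1 System architecture
4.2 Index generation
4.2.1 Collecting microblog data and storing it in HBase
4.2.2 Building the index with MapReduce
4.2.3 Core Solr configuration for indexing
4.3 Search process
4.3.1 Query analysis
4.3.2 Presenting query results
4.4 Chapter summary
Chapter 5 System Operation and Analysis
5.1 Experimental environment
5.2 Setting up the experimental platform
5.3 Running the experiments
5.3.1 Experimental data
5.3.2 Running the Solr server
5.4 Experimental results
5.4.1 Indexing results
5.4.2 Search results
5.5 Chapter summary
Chapter 6 Conclusions and Future Work
6.1 Summary of the thesis
6.2 Future work
References
Acknowledgements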
