Fast deduplication data transmission scheme on a big data real-time platform

Sheng Tzong Cheng, Jian Ting Chen, Yin Chun Chen

Research output: Conference contribution

Abstract

In the information era, it is difficult to exploit and process large volumes of data efficiently. MapReduce is no longer adequate for handling more data in less time, let alone in real time. In-memory computing (IMC) was therefore introduced to address the shortcomings of Hadoop MapReduce. As its name suggests, IMC performs computation in memory to avoid the cost incurred by Hadoop's excessive disk access, and it can be distributed to perform iterative operations. However, distributed IMC still faces a bottleneck: network bandwidth, which limits how fast information can be received from the source and dispersed to each node. Observation shows that some data from sensor devices may be duplicated because of temporal or spatial dependence. Deduplication, a technique for eliminating duplicated data that can improve data utilization, is therefore a promising solution. This study presents an optimization of Spark Streaming, a distributed real-time IMC platform. It uses deduplication to eliminate possible duplicate blocks at the source, and it is expected to reduce redundant data transmission and improve the throughput of Spark Streaming.
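
The idea of dropping duplicate blocks at the source can be illustrated with a minimal sketch. This is not the scheme from the paper, whose details the abstract does not give; it assumes fixed-size blocks, SHA-256 fingerprints, and a sender and receiver that each keep their own table of blocks already seen. The names `BLOCK_SIZE`, `encode`, and `decode` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size in bytes, kept tiny for illustration


def encode(data: bytes, seen: set) -> list:
    """Sender side: split data into fixed-size blocks and replace any block
    whose fingerprint was already sent with a short reference."""
    messages = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            messages.append(("ref", digest))  # duplicate: send only the fingerprint
        else:
            seen.add(digest)
            messages.append(("raw", block))   # first occurrence: send the bytes
    return messages


def decode(messages: list, store: dict) -> bytes:
    """Receiver side: rebuild the original bytes, remembering every raw block
    so that later references can be resolved."""
    out = bytearray()
    for kind, payload in messages:
        if kind == "raw":
            store[hashlib.sha256(payload).digest()] = payload
            out += payload
        else:
            out += store[payload]
    return bytes(out)
```

With the input `b"AAAABBBBAAAACCCCBBBB"`, the third and fifth blocks repeat earlier ones, so only three of the five blocks travel as raw bytes. In practice a reference only saves bandwidth when blocks are much larger than the 32-byte fingerprint.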

Original language: English
Title of host publication: BMSD 2017 - Proceedings of the 7th International Symposium on Business Modeling and Software Design
Editor: Boris Shishkov
Publisher: SciTePress
Pages: 155-166
Number of pages: 12
ISBN (electronic): 9789897582387
Publication status: Published - 2017
Event: 7th International Symposium on Business Modeling and Software Design, BMSD 2017 - Barcelona, Spain
Duration: 3 July 2017 to 5 July 2017

Publication series

Name: BMSD 2017 - Proceedings of the 7th International Symposium on Business Modeling and Software Design

Other

Other: 7th International Symposium on Business Modeling and Software Design, BMSD 2017
Country/Territory: Spain
City: Barcelona
Period: 2017-07-03 to 2017-07-05

All Science Journal Classification (ASJC) codes

  • Modelling and Simulation
  • Software
