Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System

Hadoop is widely adopted for big data processing because it runs on commodity hardware and completes jobs in a reasonable time. Hadoop implements asynchronous blocking concurrency with the Thread and Future classes. As a result, in cases such as a network link or hardware failure, a running task may block other task...

Main Authors: Khoiruddin, A.A., Zakaria, N., Alhussian, H.
Format: Article
Institution: Universiti Teknologi Petronas
Record Id: utp-eprints.23113
Published: Insight Society 2020
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85097534444&doi=10.18517%2fijaseit.10.5.9073&partnerID=40&md5=06bb7f5641e7773e25a91dc64089e2e0
http://eprints.utp.edu.my/23113/
citation Khoiruddin, A.A. and Zakaria, N. and Alhussian, H. (2020) Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System. International Journal on Advanced Science, Engineering and Information Technology, 10 (5). pp. 1913-1919. © Insight Society 2020. Article, NonPeerReviewed. http://eprints.utp.edu.my/23113/
institution Universiti Teknologi Petronas
collection UTP Institutional Repository
description Hadoop is widely adopted for big data processing because it runs on commodity hardware and completes jobs in a reasonable time. Hadoop implements asynchronous blocking concurrency with the Thread and Future classes. As a result, in cases such as a network link or hardware failure, a running task may block other tasks from running (the task becomes a straggler). Hadoop releases include algorithms for handling straggler tasks. However, these algorithms treat Map and Reduce tasks alike, even though the root cause of straggling may differ between the two. This paper proposes the Asynchronous Non-Blocking (ANB) method to improve performance and avoid blocking of Reduce tasks in Hadoop. Instead of a single queue, the approach uses two queues: a task queue and a callback queue. When a task is not ready or is detected as a straggler, it is removed from the main task queue and temporarily placed in the callback queue. When the task is ready to run, it is returned to the main task queue. The algorithm is compared with rTuner, the most recent work the authors found on handling stragglers among Reduce tasks. In this comparison, ANB consistently completes faster because unready tasks are moved directly to the callback queue without blocking other tasks. In addition, rTuner incurs high overhead because it must check each task's straggler status and determine why the task became a straggler.
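The two-queue idea in the description can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the authors' implementation and not Hadoop code: the names `Task`, `make_task`, and `run_two_queue` are hypothetical, and "readiness" is modelled as a simple poll counter rather than a real straggler check.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    is_ready: Callable[[], bool]  # hypothetical readiness / straggler check
    run: Callable[[], str]


def make_task(name: str, ready_after: int) -> Task:
    """Build a toy task that reports ready only after `ready_after` polls."""
    polls = {"n": 0}

    def is_ready() -> bool:
        polls["n"] += 1
        return polls["n"] > ready_after

    return Task(name, is_ready, lambda: name)


def run_two_queue(tasks: List[Task], max_rounds: int = 1000) -> List[str]:
    """Run tasks using a main task queue plus a callback queue.

    Unready tasks are parked in the callback queue instead of blocking
    the tasks behind them; they return to the main queue once ready.
    `max_rounds` is an assumed safety limit for tasks that never get ready.
    """
    task_queue = deque(tasks)
    callback_queue = deque()
    completed = []
    rounds = 0
    while (task_queue or callback_queue) and rounds < max_rounds:
        rounds += 1
        # Re-check each parked task once; ready ones rejoin the main queue.
        for _ in range(len(callback_queue)):
            parked = callback_queue.popleft()
            if parked.is_ready():
                task_queue.append(parked)
            else:
                callback_queue.append(parked)
        if not task_queue:
            continue  # nothing runnable this round; poll again
        task = task_queue.popleft()
        if task.is_ready():
            completed.append(task.run())
        else:
            callback_queue.append(task)  # park instead of blocking
    return completed


demo_order = run_two_queue([
    make_task("reduce-0", 2),  # not ready for the first two polls
    make_task("reduce-1", 0),
    make_task("reduce-2", 0),
])
print(demo_order)
```

In this toy run the unready task `reduce-0` is parked, so `reduce-1` and `reduce-2` finish first and `reduce-0` completes once it reports ready. A real scheduler would react to events rather than poll, and would run tasks concurrently.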
format Article
author Khoiruddin, A.A.
Zakaria, N.
Alhussian, H.
title Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System
publisher Insight Society
publishDate 2020