Scheduling of Big Data Workflows in the Hadoop Framework with Heterogeneous Computing Cluster.
| Title: | Scheduling of Big Data Workflows in the Hadoop Framework with Heterogeneous Computing Cluster. |
|---|---|
| Authors: | Rahmani, Amir Masoud (rahmania@yuntech.edu.tw); Chamzini, Ehsan Yazdani (Ehsan.yazdani@sco.iaun.ac.ir); Pourshaban, Mohsen (Pourshaban@sco.iaun.ac.ir); Hosseinzadeh, Mehdi (mehdihosseinzadeh@duytan.edu.vn) |
| Source: | Arabian Journal for Science & Engineering (Springer Science & Business Media B.V.), Aug 2025, Vol. 50, Issue 15, pp. 12449-12461 (13 pages). |
| Subjects: | Big data, Heterogeneous computing, Cloud computing, Workflow management systems, Load balancing (Computer networks), Resource allocation, Scheduling |
| Abstract: | Recently, resource allocation in cloud computing has become a popular research topic. Hi-WAY is a scientific workflow management system that facilitates workflows involving large-scale inputs such as big data. Hadoop, a framework designed to implement distributed systems, allows Hi-WAY to be run on thousands of computing nodes with desirable fault tolerance. Task scheduling is not difficult in a homogeneous Hadoop system, where computing nodes have identical specifications. However, task scheduling could be problematic in heterogeneous systems, where specifications such as processor power, memory, and bandwidth may vary from node to node. This paper introduces a workflow scheduler on the Hadoop framework (WSH), accounting for system heterogeneity when scheduling computing- and IO-intensive jobs. WSH uses a training task to collect information before distributing jobs. The results demonstrate effective job allocation and load balancing improvement in Hadoop, leading to increased resource efficiency and reduced makespan. Based on various experiments and the use of different workflows, the proposed method improves the scheduling length ratio by 42%, reduces makespan by 20%, and enhances speedup by approximately 37% compared to the algorithm. [ABSTRACT FROM AUTHOR] |
| Copyright: | Copyright of Arabian Journal for Science & Engineering (Springer Science & Business Media B.V.) is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
| Database: | Engineering Source |
| DOI: | 10.1007/s13369-024-09779-9 |
| ISSN (print): | 2193-567X |
| Language: | English |
| Publication Type: | Academic Journal |
| Accession Number: | 187091477 |
| Permalink: | https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=187091477 |