OSC: An Online Self-Configuring Big Data Framework for Optimization of QoS.

Bibliographic Details
Title: OSC: An Online Self-Configuring Big Data Framework for Optimization of QoS.
Authors: Bei, Zhendong (zd.bei@siat.ac.cn); Kim, Nam Sung (nskim@illinois.edu); Hwang, Kai (hwangkai@cuhk.edu.cn); Yu, Zhibin (zb.yu@siat.ac.cn)
Source: IEEE Transactions on Computers. Apr 2022, Vol. 71, Issue 4, pp. 809-823 (15 pages).
Subjects: Genetic algorithms, Building performance
Abstract: Big-data frameworks such as MapReduce/Hadoop or Spark have many performance-critical configuration parameters which may interact with each other in a complex way. Their optimal values for an application on a given cluster are affected by not only the application itself but also its input data. This makes offline auto-configuration approaches hard to be used in practice because the input data of an application may change at each run. To address this issue, we propose an Online Self-Configuring (OSC) approach that automatically determines the optimal parameter values for a given application. OSC synergistically integrates three key techniques. First, OSC leverages ensemble learning to build a precise performance model for a given application. Second, it quantifies the importance of the parameters and interaction intensity between them to accelerate the genetic algorithm for searching optimal configuration parameters. Third, OSC supports an incremental modeling approach to achieve low overhead of the models for online needs. These techniques allow OSC to effectively learn the characteristics of an application and optimize its performance by automatically adjusting the configurations at runtime. Our implementation of OSC atop MapReduce/Hadoop 2.6 improves performance by 60 percent on average and up to 120 percent compared with the state-of-the-art approach. Lastly, the performance benefit of an application running on OSC generally increases along with its input data size. [ABSTRACT FROM AUTHOR]
Copyright of IEEE Transactions on Computers is the property of IEEE and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
FullText Text:
  Availability: 0
Header DbId: egs
DbLabel: Engineering Source
An: 155774185
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 0
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=155774185
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1109/TC.2021.3063278
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 15
        StartPage: 809
    Subjects:
      – SubjectFull: Genetic algorithms
        Type: general
      – SubjectFull: Building performance
        Type: general
    Titles:
      – TitleFull: OSC: An Online Self-Configuring Big Data Framework for Optimization of QoS.
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Bei, Zhendong
      – PersonEntity:
          Name:
            NameFull: Kim, Nam Sung
      – PersonEntity:
          Name:
            NameFull: Hwang, Kai
      – PersonEntity:
          Name:
            NameFull: Yu, Zhibin
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 04
              Text: Apr2022
              Type: published
              Y: 2022
          Identifiers:
            – Type: issn-print
              Value: 00189340
          Numbering:
            – Type: volume
              Value: 71
            – Type: issue
              Value: 4
          Titles:
            – TitleFull: IEEE Transactions on Computers
              Type: main