Optimizing matrix-matrix multiplication on Intel's Advanced Vector Extensions multicore processor.

Bibliographic Details
Title: Optimizing matrix-matrix multiplication on Intel's Advanced Vector Extensions multicore processor.
Authors: Hemeida, A.M. [1] (ashraf@aswu.edu.eg); Hassan, S.A. [2]; Alkhalaf, Salem [3]; Mahmoud, M.M.M. [2]; Saber, M.A. [2]; Bahaa Eldin, Ayman M. [4]; Senjyu, Tomonobu [5]; Alayed, Abdullah H. [6]
Source: Ain Shams Engineering Journal. Dec 2020, Vol. 11, Issue 4, pp. 1179-1190. 12 p.
Subjects: Matrix multiplications, Multicore processors, Image processing, C++, Electronic data processing
Abstract: This paper focuses on Intel Advanced Vector Extensions (AVX), which grew out of recent developments in both AMD and Intel processors. AVX instructions process a chunk of data elements both individually and together. AVX supports a variety of applications, such as image processing. Our goal is to accelerate and optimize single-precision multiplication of large square matrices, with sizes ranging from 2080 to 4512. The optimization combines AVX instruction sets, OpenMP parallelization, and memory access optimization to overcome bandwidth limitations. This paper differs from other work by concentrating on several main techniques and their results. Guidelines for the parallel implementation of these algorithms are presented, taking into account the characteristics of the target architecture. The work includes a comparative study of two popular compilers, Intel C++ Compiler 17.0 and Microsoft Visual Studio C++ Compiler 2015, as well as a comparison between single-core and multicore platforms. The proposed optimized algorithms achieve performance improvements of 71%, 59%, and 56% for C = A·B, C = A·B^T, and C = A^T·B, respectively, compared with results obtained using the latest Intel Math Kernel Library 2017 SGEMV subroutines. [ABSTRACT FROM AUTHOR]
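Illustrative note: the abstract names three ingredients (AVX instructions, OpenMP parallelization, and memory access optimization) without giving code. The following is a minimal sketch of how they might be combined for C = A·B, not the authors' implementation. The function name `matmul_ikj`, the row-major `std::vector<float>` layout, and the i-k-j loop order are assumptions made for this example. The AVX path is compiled only when the compiler defines `__AVX__`, and the OpenMP pragma only when `_OPENMP` is defined; otherwise the same function runs as plain scalar, single-threaded code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef __AVX__
#include <immintrin.h>  // AVX intrinsics (_mm256_*)
#endif

// Hypothetical helper, not taken from the paper: computes C = A * B for
// square n x n single-precision matrices stored in row-major order.
void matmul_ikj(const std::vector<float>& A, const std::vector<float>& B,
                std::vector<float>& C, std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n);
    C.assign(n * n, 0.0f);
    const long long rows = static_cast<long long>(n);

    // Rows of C are independent, so each thread can own a subset of rows.
#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (long long i = 0; i < rows; ++i) {
        const std::size_t row = static_cast<std::size_t>(i);
        float* c_row = C.data() + row * n;
        // i-k-j order: the inner loop walks B and C along contiguous rows,
        // which is friendlier to caches than striding down a column of B.
        for (std::size_t k = 0; k < n; ++k) {
            const float a = A[row * n + k];
            const float* b_row = B.data() + k * n;
            std::size_t j = 0;
#ifdef __AVX__
            // Broadcast a into all 8 lanes and update 8 floats of C at once.
            const __m256 a8 = _mm256_set1_ps(a);
            for (; j + 8 <= n; j += 8) {
                __m256 c8 = _mm256_loadu_ps(c_row + j);
                const __m256 b8 = _mm256_loadu_ps(b_row + j);
                c8 = _mm256_add_ps(c8, _mm256_mul_ps(a8, b8));
                _mm256_storeu_ps(c_row + j, c8);
            }
#endif
            // Scalar tail (or the whole row when AVX is not enabled).
            for (; j < n; ++j) {
                c_row[j] += a * b_row[j];
            }
        }
    }
}
```

With GCC or Clang, building with flags such as `-O2 -mavx -fopenmp` would enable both optional paths. The sketch omits cache blocking and aligned loads, and it is not expected to approach the performance figures reported in the abstract.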
Copyright of Ain Shams Engineering Journal is the property of Elsevier B.V. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
Header DbId: egs
DbLabel: Engineering Source
An: 147504402
AccessLevel: 6
PubType: Academic Journal
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=147504402
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1016/j.asej.2020.01.003
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 12
        StartPage: 1179
    Subjects:
      – SubjectFull: Matrix multiplications
        Type: general
      – SubjectFull: Multicore processors
        Type: general
      – SubjectFull: Image processing
        Type: general
      – SubjectFull: C++
        Type: general
      – SubjectFull: Electronic data processing
        Type: general
    Titles:
      – TitleFull: Optimizing matrix-matrix multiplication on Intel's Advanced Vector Extensions multicore processor.
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Hemeida, A.M.
      – PersonEntity:
          Name:
            NameFull: Hassan, S.A.
      – PersonEntity:
          Name:
            NameFull: Alkhalaf, Salem
      – PersonEntity:
          Name:
            NameFull: Mahmoud, M.M.M.
      – PersonEntity:
          Name:
            NameFull: Saber, M.A.
      – PersonEntity:
          Name:
            NameFull: Bahaa Eldin, Ayman M.
      – PersonEntity:
          Name:
            NameFull: Senjyu, Tomonobu
      – PersonEntity:
          Name:
            NameFull: Alayed, Abdullah H.
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 12
              Text: Dec2020
              Type: published
              Y: 2020
          Identifiers:
            – Type: issn-print
              Value: 20904479
          Numbering:
            – Type: volume
              Value: 11
            – Type: issue
              Value: 4
          Titles:
            – TitleFull: Ain Shams Engineering Journal
              Type: main