TSF-DBSCAN: A Novel Fuzzy Density-Based Approach for Clustering Unbounded Data Streams.

Bibliographic Details
Title: TSF-DBSCAN: A Novel Fuzzy Density-Based Approach for Clustering Unbounded Data Streams.
Authors: Bechini, Alessio (AUTHOR) a.bechini@ing.unipi.it; Marcelloni, Francesco (AUTHOR) francesco.marcelloni@unipi.it; Renda, Alessandro (AUTHOR) alessandro.renda@unifi.it
Source: IEEE Transactions on Fuzzy Systems. Mar 2022, Vol. 30, Issue 3, pp. 623-637. 15 pp.
Subjects: Fuzzy algorithms, Application software, Data structures, Electronic data processing, Parallel algorithms, Algorithms
Abstract: In recent years, several clustering algorithms have been proposed with the aim of mining knowledge from streams of data generated at a high speed by a variety of hardware platforms and software applications. Among these algorithms, density-based approaches have proved to be particularly attractive, thanks to their capability of handling outliers and capturing clusters with arbitrary shapes. The streaming setting poses additional challenges that need to be addressed as well: data streams are potentially unbounded and affected by concept drift, i.e., a modification over time in the underlying data generation process. In this article, we propose temporal streaming fuzzy density-based spatial clustering of applications with noise (TSF-DBSCAN), a novel fuzzy clustering algorithm for streaming data. TSF-DBSCAN is an extension of the well-known DBSCAN algorithm, one of the most popular density-based clustering approaches. Fuzziness is introduced in TSF-DBSCAN to model the uncertainty about the distance threshold that defines the neighborhood of an object. As a consequence, TSF-DBSCAN identifies clusters with fuzzy overlapping borders. A fading model, which makes objects less relevant as they become more remote in time, endows TSF-DBSCAN with the capability of adapting to evolving data streams. The integration of the model in a two-stage approach ensures computational and memory efficiency: during the online stage, continuously arriving objects are organized in proper data structures that are later exploited in the offline stage to determine a fine-grained partition. An extensive experimental analysis on synthetic and real-world datasets shows that TSF-DBSCAN yields competitive performance when compared to other clustering algorithms recently proposed for streaming data. [ABSTRACT FROM AUTHOR]
Copyright of IEEE Transactions on Fuzzy Systems is the property of IEEE and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
Header DbId: egs
DbLabel: Engineering Source
An: 155696354
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 0
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=155696354
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1109/TFUZZ.2020.3042645
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 15
        StartPage: 623
    Subjects:
      – SubjectFull: Fuzzy algorithms
        Type: general
      – SubjectFull: Application software
        Type: general
      – SubjectFull: Data structures
        Type: general
      – SubjectFull: Electronic data processing
        Type: general
      – SubjectFull: Parallel algorithms
        Type: general
      – SubjectFull: Algorithms
        Type: general
    Titles:
      – TitleFull: TSF-DBSCAN: A Novel Fuzzy Density-Based Approach for Clustering Unbounded Data Streams.
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Bechini, Alessio
      – PersonEntity:
          Name:
            NameFull: Marcelloni, Francesco
      – PersonEntity:
          Name:
            NameFull: Renda, Alessandro
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 03
              Text: Mar2022
              Type: published
              Y: 2022
          Identifiers:
            – Type: issn-print
              Value: 10636706
          Numbering:
            – Type: volume
              Value: 30
            – Type: issue
              Value: 3
          Titles:
            – TitleFull: IEEE Transactions on Fuzzy Systems
              Type: main