Distributed and Collaborative Web Change Detection System.

Bibliographic Details
Title: Distributed and Collaborative Web Change Detection System.
Authors: Prieto, Víctor M. (victor.prieto@udc.es); Álvarez, Manuel (manuel.alvarez@udc.es); Carneiro, Víctor (victor.carneiro@udc.es); Cacheda, Fidel (fidel.cacheda@udc.es)
Source: Computer Science & Information Systems. Jan2015, Vol. 12 Issue 1, p91-114. 24p.
Subjects: Internet searching, Website access control, Search engines, Electronic indexes, Internet content, Downloading
Abstract: Search engines use crawlers to traverse the Web in order to download web pages and build their indexes. Keeping these indexes up to date is an essential task to ensure the quality of search results. However, changes in web pages are unpredictable. Identifying the moment when a web page changes as soon as possible and with minimal computational cost is a major challenge. In this article we present the Web Change Detection system, which, in a best-case scenario, is capable of detecting, almost in real time, when a web page changes. In a worst-case scenario, it will require, on average, 12 minutes to detect a change on a low PageRank web site and about one minute on a web site with high PageRank. Meanwhile, current search engines require more than a day, on average, to detect a modification in a web page (in both cases). [ABSTRACT FROM AUTHOR]
Copyright of Computer Science & Information Systems is the property of ComSIS Consortium and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
FullText Links:
  – Type: pdflink
Text:
  Availability: 0
Header DbId: egs
DbLabel: Engineering Source
An: 101573712
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 0
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: Distributed and Collaborative Web Change Detection System.
– Name: Author
  Label: Authors
  Group: Au
  Data: Prieto, Víctor M. (victor.prieto@udc.es); Álvarez, Manuel (manuel.alvarez@udc.es); Carneiro, Víctor (victor.carneiro@udc.es); Cacheda, Fidel (fidel.cacheda@udc.es)
– Name: TitleSource
  Label: Source
  Group: Src
  Data: Computer Science & Information Systems. Jan2015, Vol. 12 Issue 1, p91-114. 24p.
– Name: Subject
  Label: Subjects
  Group: Su
  Data: Internet searching, Website access control, Search engines, Electronic indexes, Internet content, Downloading
– Name: Abstract
  Label: Abstract
  Group: Ab
  Data: Search engines use crawlers to traverse the Web in order to download web pages and build their indexes. Keeping these indexes up to date is an essential task to ensure the quality of search results. However, changes in web pages are unpredictable. Identifying the moment when a web page changes as soon as possible and with minimal computational cost is a major challenge. In this article we present the Web Change Detection system, which, in a best-case scenario, is capable of detecting, almost in real time, when a web page changes. In a worst-case scenario, it will require, on average, 12 minutes to detect a change on a low PageRank web site and about one minute on a web site with high PageRank. Meanwhile, current search engines require more than a day, on average, to detect a modification in a web page (in both cases). [ABSTRACT FROM AUTHOR]
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=101573712
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.2298/CSIS131120081P
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 24
        StartPage: 91
    Subjects:
      – SubjectFull: Internet searching
        Type: general
      – SubjectFull: Website access control
        Type: general
      – SubjectFull: Search engines
        Type: general
      – SubjectFull: Electronic indexes
        Type: general
      – SubjectFull: Internet content
        Type: general
      – SubjectFull: Downloading
        Type: general
    Titles:
      – TitleFull: Distributed and Collaborative Web Change Detection System.
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Prieto, Víctor M.
      – PersonEntity:
          Name:
            NameFull: Álvarez, Manuel
      – PersonEntity:
          Name:
            NameFull: Carneiro, Víctor
      – PersonEntity:
          Name:
            NameFull: Cacheda, Fidel
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 01
              Text: Jan2015
              Type: published
              Y: 2015
          Identifiers:
            – Type: issn-print
              Value: 18200214
          Numbering:
            – Type: volume
              Value: 12
            – Type: issue
              Value: 1
          Titles:
            – TitleFull: Computer Science & Information Systems
              Type: main