Estimating the maximum

Saved in:
Bibliographic Details
Title: Estimating the maximum
Authors: Gum, Ben1 gum@cs.grinnell.edu, Lipton, Richard J.2 rjl@cc.gatech.edu, LaPaugh, Andrea3 aslp@cs.princeton.edu, Fich, Faith4 fich@cs.toronto.edu
Source: Journal of Algorithms. Jan2005, Vol. 54 Issue 1, p105-114. 10p.
Subjects: Statistical sampling, Algorithms, Algebra, Graph theory
Abstract: Estimating the maximum of a sampled dataset is an important and daunting task. We give a sampling algorithm for general datasets which gives estimates strictly better than the largest sample for an infinite family of datasets. Our algorithm overshoots the true maximum of the worst case dataset with probability at most (1/e)+O(1/k), where k is the size of our sample, which is much smaller than the size of the dataset. Our proof is the result of a new extremal graph coloring theorem: given any red/green coloring of the edges of a complete graph of n vertices, the probability that the edges among k randomly sampled vertices have a certain property is at most (1/e)+O(1/k). In addition, we show that if an algorithm gives an estimate strictly better than the largest sample for some dataset, then the algorithm overshoots the maximum on some other dataset with probability at least (1/e)-O(1/k). [Copyright &y& Elsevier]
Copyright of Journal of Algorithms is the property of Academic Press Inc. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
FullText Text:
  Availability: 0
Header DbId: egs
DbLabel: Engineering Source
An: 15647807
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 0
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: Estimating the maximum
– Name: Author
  Label: Authors
  Group: Au
  Data: <searchLink fieldCode="AR" term="%22Gum%2C+Ben%22">Gum, Ben</searchLink><relatesTo>1</relatesTo><i> gum@cs.grinnell.edu</i><br /><searchLink fieldCode="AR" term="%22Lipton%2C+Richard+J%2E%22">Lipton, Richard J.</searchLink><relatesTo>2</relatesTo><i> rjl@cc.gatech.edu</i><br /><searchLink fieldCode="AR" term="%22LaPaugh%2C+Andrea%22">LaPaugh, Andrea</searchLink><relatesTo>3</relatesTo><i> aslp@cs.princeton.edu</i><br /><searchLink fieldCode="AR" term="%22Fich%2C+Faith%22">Fich, Faith</searchLink><relatesTo>4</relatesTo><i> fich@cs.toronto.edu</i>
– Name: TitleSource
  Label: Source
  Group: Src
  Data: <searchLink fieldCode="JN" term="%22Journal+of+Algorithms%22">Journal of Algorithms</searchLink>. Jan2005, Vol. 54 Issue 1, p105-114. 10p.
– Name: Subject
  Label: Subjects
  Group: Su
  Data: <searchLink fieldCode="DE" term="%22Statistical+sampling%22">Statistical sampling</searchLink><br /><searchLink fieldCode="DE" term="%22Algorithms%22">Algorithms</searchLink><br /><searchLink fieldCode="DE" term="%22Algebra%22">Algebra</searchLink><br /><searchLink fieldCode="DE" term="%22Graph+theory%22">Graph theory</searchLink>
– Name: Abstract
  Label: Abstract
  Group: Ab
  Data: Estimating the maximum of a sampled dataset is an important and daunting task. We give a sampling algorithm for general datasets which gives estimates strictly better than the largest sample for an infinite family of datasets. Our algorithm overshoots the true maximum of the worst case dataset with probability at most <f>(1/e)+O(1/k)</f>, where <f>k</f> is the size of our sample, which is much smaller than the size of the dataset. Our proof is the result of a new extremal graph coloring theorem: given any red/green coloring of the edges of a complete graph of <f>n</f> vertices, the probability that the edges among <f>k</f> randomly sampled vertices have a certain property is at most <f>(1/e)+O(1/k)</f>. In addition, we show that if an algorithm gives an estimate strictly better than the largest sample for some dataset, then the algorithm overshoots the maximum on some other dataset with probability at least <f>(1/e)-O(1/k)</f>. [Copyright &y& Elsevier]
– Name: AbstractSuppliedCopyright
  Label:
  Group: Ab
  Data: <i>Copyright of Journal of Algorithms is the property of Academic Press Inc. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract.</i> (Copyright applies to all Abstracts.)
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=15647807
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1016/j.jalgor.2004.04.005
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 10
        StartPage: 105
    Subjects:
      – SubjectFull: Statistical sampling
        Type: general
      – SubjectFull: Algorithms
        Type: general
      – SubjectFull: Algebra
        Type: general
      – SubjectFull: Graph theory
        Type: general
    Titles:
      – TitleFull: Estimating the maximum
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Gum, Ben
      – PersonEntity:
          Name:
            NameFull: Lipton, Richard J.
      – PersonEntity:
          Name:
            NameFull: LaPaugh, Andrea
      – PersonEntity:
          Name:
            NameFull: Fich, Faith
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 01
              Text: Jan2005
              Type: published
              Y: 2005
          Identifiers:
            – Type: issn-print
              Value: 01966774
          Numbering:
            – Type: volume
              Value: 54
            – Type: issue
              Value: 1
          Titles:
            – TitleFull: Journal of Algorithms
              Type: main
ResultId 1