OpenWalnut  1.4.0
WValueSetHistogram Class Reference

Used to find the occurrence frequencies of values in a value set.

#include <WValueSetHistogram.h>

Public Member Functions

 WValueSetHistogram (boost::shared_ptr< WValueSetBase > valueSet, size_t buckets=1000)
 Constructor.
 WValueSetHistogram (const WValueSetBase &valueSet, size_t buckets=1000)
 Constructor.
 WValueSetHistogram (boost::shared_ptr< WValueSetBase > valueSet, double min, double max, size_t buckets=1000)
 Constructor.
 WValueSetHistogram (const WValueSetBase &valueSet, double min, double max, size_t buckets=1000)
 Constructor.
 WValueSetHistogram (const WValueSetHistogram &histogram, size_t buckets=0)
 Copy constructor.
 ~WValueSetHistogram ()
 Destructor.
WValueSetHistogram & operator= (const WValueSetHistogram &other)
 Copy assignment.
virtual size_t operator[] (size_t index) const
 Get the count of the bucket.
virtual size_t at (size_t index) const
 Get the count of the bucket.
virtual size_t size () const
 Returns the number of buckets in the histogram with the actual mapping.
virtual double getBucketSize (size_t index=0) const
 Return the size of one bucket.
virtual std::pair< double, double > getIntervalForIndex (size_t index) const
 Returns the actual interval associated with the given index.
virtual size_t getIndexForValue (double value) const
 Returns the right index to the bucket containing the given value.
virtual size_t getTotalElementCount () const
 This returns the number of value set entries added to the histogram.
virtual size_t accumulate (size_t startIndex, size_t endIndex) const
 Sums up the buckets in the specified interval.

Protected Member Functions

boost::shared_array< size_t > getInitialBuckets () const
 Return the initial buckets.
size_t getNInitialBuckets () const
 Return the number of initial buckets.
double getInitialBucketSize () const
 Return the size of one initial bucket.
virtual void insert (double value)
 Increments the count for the given value by one; contains the logic to find the element's place in the array.

Private Member Functions

void buildHistogram (const WValueSetBase &valueSet)
 Actually builds the histogram.

Private Attributes

double m_initialBucketSize
 Size of one bucket in the initial histogram.
boost::shared_array< size_t > m_initialBuckets
 Pointer to all initial buckets of the histogram.
size_t m_nInitialBuckets
 Number of buckets in the initial histogram.
boost::shared_array< size_t > m_mappedBuckets
 Pointer to all buckets of the mapped histogram.
size_t m_nMappedBuckets
 Tracks the number of buckets in the mapped histogram.
double m_mappedBucketSize
 Size of one bucket in the mapped histogram.
size_t m_nbTotalElements
 The number of elements distributed in the buckets.

Friends

class WValueSetHistogramTest

Detailed Description

Used to find the occurrence frequencies of values in a value set.

It implements a classical histogram but allows the bucket sizes to be changed without recalculating the whole histogram. The histogram uses right-open intervals for counting, which is why there is always a bucket at the end, from max to infinity, that holds all values equal to max.

Notes:
This histogram is different from WValueSetHistogram which is a generic histogram class.

Definition at line 47 of file WValueSetHistogram.h.
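
The counting scheme described above can be illustrated with a minimal, self-contained sketch. It is not the OpenWalnut implementation: countIntoBuckets is a hypothetical helper, and the bucket width of (max - min) / (buckets - 1) is an assumption chosen so that the last bucket starts exactly at max and collects the values equal to max.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, not the OpenWalnut implementation.
// Counts values into `buckets` right-open intervals of equal width.
// Assumption: width = (max - min) / (buckets - 1), so the final bucket
// starts exactly at max and holds all values equal to max.
std::vector<std::size_t> countIntoBuckets(const std::vector<double>& values,
                                          std::size_t buckets)
{
    assert(buckets > 1 && !values.empty());
    const auto mm = std::minmax_element(values.begin(), values.end());
    const double min = *mm.first;
    const double max = *mm.second;

    std::vector<std::size_t> counts(buckets, 0);
    if (max == min)
    {
        // Degenerate range: every value lands in bucket 0.
        counts[0] = values.size();
        return counts;
    }

    const double width = (max - min) / static_cast<double>(buckets - 1);
    for (double v : values)
    {
        // v >= min, so the quotient is non-negative and the cast is safe.
        const std::size_t idx = static_cast<std::size_t>((v - min) / width);
        ++counts[std::min(idx, buckets - 1)];
    }
    return counts;
}
```

With the values 0, 1, 2, 5, 9.9, 10, 10 and 6 buckets, the width under this assumption is 2, so the buckets are [0,2), [2,4), [4,6), [6,8), [8,10) and [10,inf), and both occurrences of 10 fall into the last one.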


Constructor & Destructor Documentation

WValueSetHistogram::WValueSetHistogram ( boost::shared_ptr< WValueSetBase > valueSet,
size_t  buckets = 1000 
) [explicit]

Constructor.

Creates the histogram for the specified value set.

Parameters:
valueSet  source of the data for the histogram
buckets  the number of buckets to use. If not specified, 1000 is used as default. Must be larger than 1.

Definition at line 38 of file WValueSetHistogram.cpp.

References buildHistogram().

WValueSetHistogram::WValueSetHistogram ( const WValueSetBase & valueSet,
size_t  buckets = 1000 
) [explicit]

Constructor.

Creates the histogram for the specified value set.

Parameters:
valueSet  source of the data for the histogram
buckets  the number of buckets to use. If not specified, 1000 is used as default. Must be larger than 1.

Definition at line 44 of file WValueSetHistogram.cpp.

References buildHistogram().

WValueSetHistogram::WValueSetHistogram ( boost::shared_ptr< WValueSetBase > valueSet,
double  min,
double  max,
size_t  buckets = 1000 
)

Constructor.

Creates a histogram from the specified value set but crops values below the given min and above the given max. Values below min are treated as exactly min and values above max as exactly max, and each is sorted into the corresponding bin. This is especially useful for filtering out outliers in the data.

Parameters:
valueSet  source data
min  the new minimum to use
max  the new maximum to use
buckets  the number of buckets to use. If not specified, 1000 is used as default. Must be larger than 1.

Definition at line 50 of file WValueSetHistogram.cpp.

References buildHistogram().
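
The cropping behaviour of the min/max constructors can be pictured as a clamp applied to every value before it is counted. The following is a minimal sketch under that reading; cropToRange is a hypothetical helper and not part of OpenWalnut.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not part of OpenWalnut.
// Values below min are treated as exactly min, values above max as exactly
// max, so outliers end up in the first or last bin instead of stretching
// the histogram range.
std::vector<double> cropToRange(std::vector<double> values, double min, double max)
{
    assert(min <= max);
    for (double& v : values)
    {
        v = std::max(min, std::min(v, max));
    }
    return values;
}
```

For example, cropping the values -5, 0.5 and 42 to the range [0, 1] yields 0, 0.5 and 1. No value is dropped, so the total element count stays the same.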

WValueSetHistogram::WValueSetHistogram ( const WValueSetBase & valueSet,
double  min,
double  max,
size_t  buckets = 1000 
)

Constructor.

Creates a histogram from the specified value set but crops values below the given min and above the given max. Values below min are treated as exactly min and values above max as exactly max, and each is sorted into the corresponding bin. This is especially useful for filtering out outliers in the data.

Parameters:
valueSet  source data
min  the new minimum to use
max  the new maximum to use
buckets  the number of buckets to use. If not specified, 1000 is used as default. Must be larger than 1.

Definition at line 56 of file WValueSetHistogram.cpp.

References buildHistogram().

WValueSetHistogram::WValueSetHistogram ( const WValueSetHistogram & histogram,
size_t  buckets = 0 
)

Copy constructor.

If another interval size is given, the histogram is matched to it using the initial bucket data.

Notes:
This does not deep copy the m_initialBuckets and m_mappedBuckets arrays, as these are shared_array instances.
Parameters:
histogram  another WValueSetHistogram
buckets  the new number of buckets. Must be larger than 1 if specified.

Definition at line 99 of file WValueSetHistogram.cpp.

References m_initialBuckets, m_mappedBuckets, m_mappedBucketSize, WHistogram::m_maximum, WHistogram::m_minimum, m_nInitialBuckets, and m_nMappedBuckets.
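
Re-bucketing from the initial bucket data, as the copy constructor does, avoids a second pass over the value set. The sketch below shows one way such a remapping could work; remapBuckets is a hypothetical helper, and the proportional index mapping is an assumption that may differ from the OpenWalnut implementation (for example in how the last bucket holding the max values is treated).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the OpenWalnut implementation.
// Merges fine-grained initial counts into fewer buckets by mapping each
// initial index proportionally onto the new index range.
std::vector<std::size_t> remapBuckets(const std::vector<std::size_t>& initial,
                                      std::size_t newCount)
{
    assert(newCount > 1 && newCount <= initial.size());
    std::vector<std::size_t> mapped(newCount, 0);
    for (std::size_t i = 0; i < initial.size(); ++i)
    {
        // Integer arithmetic: i * newCount / initial.size() is always < newCount.
        mapped[i * newCount / initial.size()] += initial[i];
    }
    return mapped;
}
```

Mapping the six initial counts 1, 2, 3, 4, 5, 6 onto three buckets gives 3, 7, 11. The total of 21 elements is preserved, which is the property any such remapping has to keep.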

WValueSetHistogram::~WValueSetHistogram ( )

Destructor.

Definition at line 167 of file WValueSetHistogram.cpp.


Member Function Documentation

size_t WValueSetHistogram::accumulate ( size_t  startIndex,
size_t  endIndex 
) const [virtual]

Sums up the buckets in the specified interval.

Especially useful for cumulative distribution functions or similar.

Parameters:
startIndex  the index at which summing starts, inclusive.
endIndex  the index at which summing ends, exclusive.
Returns:
the sum of all buckets in the interval.
Exceptions:
WOutOfBounds  if one of the indices is invalid.

Definition at line 228 of file WValueSetHistogram.cpp.

References m_mappedBuckets, and size().
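
A minimal sketch of the half-open summation and its use for a cumulative distribution follows. It operates on a plain vector of counts rather than on a WValueSetHistogram; accumulateCounts and cumulativeShare are hypothetical names, std::out_of_range stands in for WOutOfBounds, and the exact bounds check is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in: sums the counts in [startIndex, endIndex).
// std::out_of_range takes the place of WOutOfBounds in this sketch.
std::size_t accumulateCounts(const std::vector<std::size_t>& counts,
                             std::size_t startIndex, std::size_t endIndex)
{
    if (startIndex > endIndex || endIndex > counts.size())
    {
        throw std::out_of_range("bucket index out of range");
    }
    return std::accumulate(counts.begin() + static_cast<std::ptrdiff_t>(startIndex),
                           counts.begin() + static_cast<std::ptrdiff_t>(endIndex),
                           std::size_t(0));
}

// Cumulative distribution value for bucket i: the share of all elements
// that fall into buckets 0..i (inclusive).
double cumulativeShare(const std::vector<std::size_t>& counts, std::size_t i)
{
    const std::size_t total = accumulateCounts(counts, 0, counts.size());
    assert(total > 0);
    return static_cast<double>(accumulateCounts(counts, 0, i + 1)) /
           static_cast<double>(total);
}
```

Because endIndex is exclusive, passing the number of buckets as endIndex sums through the last bucket, and equal start and end indices give an empty sum of 0.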

size_t WValueSetHistogram::at ( size_t  index) const [virtual]

Get the count of the bucket.

Tests whether the position is valid.

Parameters:
index  the bucket whose count is to be returned; starts with 0, which is the bucket containing the smallest values.
Returns:
elements in the bucket

Implements WHistogram.

Definition at line 203 of file WValueSetHistogram.cpp.

References m_mappedBuckets, and m_nMappedBuckets.

Referenced by WValueSetHistogramTest::testCopyWithIntervalChanges().

void WValueSetHistogram::buildHistogram ( const WValueSetBase & valueSet) [private]

Actually builds the histogram.

This function exists only to avoid code duplication across the constructors.

Parameters:
valueSet  the value set.

Definition at line 62 of file WValueSetHistogram.cpp.

References WValueSetBase::getScalarDouble(), insert(), m_initialBuckets, m_initialBucketSize, m_mappedBuckets, m_mappedBucketSize, WHistogram::m_maximum, WHistogram::m_minimum, WHistogram::m_nbBuckets, m_nbTotalElements, m_nInitialBuckets, m_nMappedBuckets, and WValueSetBase::size().

Referenced by WValueSetHistogram().

double WValueSetHistogram::getBucketSize ( size_t  index = 0) const [virtual]

Return the size of one bucket.

Parameters:
index  the bucket whose width is queried.
Returns:
the size of a bucket.

Implements WHistogram.

Definition at line 186 of file WValueSetHistogram.cpp.

References m_mappedBucketSize.

size_t WValueSetHistogram::getIndexForValue ( double  value) const [inline, virtual]

Returns the right index to the bucket containing the given value.

If the value is larger than the maximum, the maximum index is returned. Likewise, if the value is smaller than the minimum, 0 is returned.

Parameters:
value  the value to search the index for
Returns:
the index of the bucket

Definition at line 267 of file WValueSetHistogram.h.

References m_mappedBucketSize, WHistogram::m_minimum, and m_nMappedBuckets.

Referenced by insert().
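
The clamping at both ends can be sketched as follows. indexForValue is a hypothetical free function, not the inline member itself; its parameters minimum, bucketSize and nBuckets mirror the referenced members m_minimum, m_mappedBucketSize and m_nMappedBuckets.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for the lookup, not the OpenWalnut code.
// minimum, bucketSize and nBuckets mirror m_minimum, m_mappedBucketSize
// and m_nMappedBuckets.
std::size_t indexForValue(double value, double minimum, double bucketSize,
                          std::size_t nBuckets)
{
    assert(bucketSize > 0.0 && nBuckets > 0);
    // Position on the bucket scale; the index is its floor.
    const double pos = std::floor((value - minimum) / bucketSize);
    if (pos <= 0.0)
    {
        // Below the minimum: clamp to 0. Converting a negative double to
        // size_t would be undefined behaviour, so this check comes first.
        return 0;
    }
    if (pos >= static_cast<double>(nBuckets - 1))
    {
        // At or above the maximum: clamp to the last bucket.
        return nBuckets - 1;
    }
    return static_cast<std::size_t>(pos);
}
```

With a minimum of 0, a bucket size of 2 and 6 buckets, the value 4.5 maps to index 2, the value -3 to index 0, and both 10 and 100 to the last index 5.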

boost::shared_array< size_t > WValueSetHistogram::getInitialBuckets ( ) const [protected]

Return the initial buckets.

Returns:
m_initialBuckets

Definition at line 171 of file WValueSetHistogram.cpp.

References m_initialBuckets.

double WValueSetHistogram::getInitialBucketSize ( ) const [protected]

Return the size of one initial bucket.

Returns:
m_initialBucketSize

Definition at line 181 of file WValueSetHistogram.cpp.

References m_initialBucketSize.

std::pair< double, double > WValueSetHistogram::getIntervalForIndex ( size_t  index) const [virtual]

Returns the actual interval associated with the given index.

The interval is right-open: getIntervalForIndex( i ).second == getIntervalForIndex( i + 1 ).first, and this upper bound no longer belongs to interval i itself, whereas every value smaller than getIntervalForIndex( i ).second, down to getIntervalForIndex( i ).first, does.

Parameters:
index  the index
Returns:
the right-open interval.

Implements WHistogram.

Definition at line 221 of file WValueSetHistogram.cpp.

References m_mappedBucketSize, and WHistogram::m_minimum.
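
The relation between neighbouring intervals can be made concrete with a small sketch. intervalForIndex is a hypothetical free function; it assumes uniform buckets, so that bucket i covers [minimum + i * bucketSize, minimum + (i + 1) * bucketSize).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in, assuming uniform bucket widths.
// Bucket i covers [minimum + i * bucketSize, minimum + (i + 1) * bucketSize).
std::pair<double, double> intervalForIndex(std::size_t index, double minimum,
                                           double bucketSize)
{
    assert(bucketSize > 0.0);
    // Both bounds are derived from the index rather than by adding the
    // width to the lower bound, so the upper bound of bucket i and the
    // lower bound of bucket i + 1 come from the same expression.
    const double lower = minimum + static_cast<double>(index) * bucketSize;
    const double upper = minimum + static_cast<double>(index + 1) * bucketSize;
    return std::make_pair(lower, upper);
}
```

With a minimum of 0 and a bucket size of 2, index 2 yields the pair (4, 6), and 6 is also the lower bound returned for index 3.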

size_t WValueSetHistogram::getNInitialBuckets ( ) const [protected]

Return the number of initial buckets.

Returns:
m_nInitialBuckets

Definition at line 176 of file WValueSetHistogram.cpp.

References m_nInitialBuckets.

size_t WValueSetHistogram::getTotalElementCount ( ) const [virtual]

This returns the number of value set entries added to the histogram.

This is especially useful to normalize the histogram counts.

Returns:
the number of elements distributed in the buckets.

Definition at line 216 of file WValueSetHistogram.cpp.

References m_nbTotalElements.
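
Normalizing with the total element count turns raw bucket counts into relative frequencies. A minimal sketch, using a plain vector of counts and a hypothetical helper named relativeFrequencies:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of OpenWalnut.
// Divides every bucket count by the total number of elements, so the
// results sum to 1 when totalElements equals the sum of all counts.
std::vector<double> relativeFrequencies(const std::vector<std::size_t>& counts,
                                        std::size_t totalElements)
{
    assert(totalElements > 0);
    std::vector<double> frequencies;
    frequencies.reserve(counts.size());
    for (std::size_t c : counts)
    {
        frequencies.push_back(static_cast<double>(c) /
                              static_cast<double>(totalElements));
    }
    return frequencies;
}
```

For the counts 1, 3, 4 and a total of 8 elements, the relative frequencies are 0.125, 0.375 and 0.5.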

void WValueSetHistogram::insert ( double  value) [protected, virtual]

Increments the count for the given value by one; contains the logic to find the element's place in the array.

Should only be used in the constructor, i.e. while iterating over WValueSet.

Parameters:
value  the value whose bucket count is incremented

Definition at line 192 of file WValueSetHistogram.cpp.

References getIndexForValue(), and m_mappedBuckets.

Referenced by buildHistogram().

WValueSetHistogram & WValueSetHistogram::operator= ( const WValueSetHistogram & other)

Copy assignment.

Copies the contents of the specified histogram to this instance.

Parameters:
other  the other instance
Returns:
this instance with the contents of the other one.
Notes:
This does not deep copy the m_initialBuckets and m_mappedBuckets arrays, as these are shared_array instances.

Definition at line 148 of file WValueSetHistogram.cpp.

References m_initialBuckets, m_initialBucketSize, m_mappedBuckets, m_mappedBucketSize, WHistogram::m_maximum, WHistogram::m_minimum, m_nInitialBuckets, and m_nMappedBuckets.

size_t WValueSetHistogram::operator[] ( size_t  index) const [virtual]

Get the count of the bucket.

Parameters:
index  the bucket whose count is to be returned; starts with 0, which is the bucket containing the smallest values.
Returns:
elements in the bucket.

Implements WHistogram.

Definition at line 197 of file WValueSetHistogram.cpp.

References m_mappedBuckets.

size_t WValueSetHistogram::size ( ) const [virtual]

Returns the number of buckets in the histogram with the actual mapping.

Returns:
number of buckets

Reimplemented from WHistogram.

Definition at line 211 of file WValueSetHistogram.cpp.

References m_nMappedBuckets.

Referenced by accumulate().


Member Data Documentation

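boost::shared_array< size_t > WValueSetHistogram::m_initialBuckets [private]

Pointer to all initial buckets of the histogram.

boost::shared_array< size_t > WValueSetHistogram::m_mappedBuckets [private]

Pointer to all buckets of the mapped histogram.

size_t WValueSetHistogram::m_nbTotalElements [private]

The number of elements distributed in the buckets.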

Definition at line 252 of file WValueSetHistogram.h.

Referenced by buildHistogram(), and getTotalElementCount().


The documentation for this class was generated from the following files: