Min-Wise Samplers

Ver. 1.0.0 (2023-04-16)

This module provides a ready-to-use stream sampler class, SamplerMinWise. The sampler groups incoming instances into clusters and assigns each instance a weight drawn uniformly at random between 0 and 1. Within each cluster, only the instance with the smallest weight is kept; all other instances are omitted. The algorithm was proposed by Suman Nath et al.
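
The selection rule described above can be illustrated with a minimal, standalone sketch. It does not use MLPro and is not the library's implementation: the function name min_wise_sample, its parameters, and the choice to still emit a sample for a trailing, incomplete cluster are all assumptions made for illustration.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def min_wise_sample(instances: Iterable[T], cluster_size: int, seed: int = 0) -> List[T]:
    """Hypothetical helper: keep, per cluster, the instance with the smallest weight."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")

    rng = random.Random(seed)      # local generator, so runs are reproducible
    kept: List[T] = []
    best_item = None
    best_weight = None
    count = 0                      # instances seen in the current cluster

    for item in instances:
        weight = rng.random()      # uniform weight in [0, 1)
        if best_weight is None or weight < best_weight:
            best_item, best_weight = item, weight
        count += 1
        if count == cluster_size:  # cluster complete: emit its minimum
            kept.append(best_item)
            best_item, best_weight, count = None, None, 0

    if count > 0:                  # assumption: a partial last cluster still yields one sample
        kept.append(best_item)
    return kept
```

Calling min_wise_sample(range(20), cluster_size=5) returns four values, one from each block of five consecutive inputs. Because the weights are random, which value is chosen in each block depends on the seed.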

class mlpro.bf.streams.samplers.min_wise.SamplerMinWise(p_num_instances: int = 0, p_cluster_size: float = 1, p_seed: int = 0)

Bases: Sampler

A ready-to-use min-wise sampler for data streams. Objects of this class can be used in Stream.

Parameters:
  • p_num_instances (int) – Number of instances. This parameter has no effect in this sampler method. Default = 0.

  • p_cluster_size (int) – Number of instances in a cluster. Default = 1.

  • p_seed (int) – Seed for the random number generator. Default = 0.

C_TYPE = 'Min-Wise Sampler'
C_SCIREF_TYPE_PROCEEDINGS = 'Proceedings'
C_SCIREF_TYPE = 'Proceedings'
C_SCIREF_AUTHOR = 'Suman Nath, Phillip B. Gibbons, Srinivasan Seshan, and Zachary R. Anderson'
C_SCIREF_TITLE = 'Synopsis Diffusion for Robust Aggregation in Sensor Networks'
C_SCIREF_YEAR = '2004'
C_SCIREF_ISBN = '1581138792'
C_SCIREF_PUBLISHER = 'Association for Computing Machinery'
C_SCIREF_URL = 'https://doi.org/10.1145/1031495.1031525'
C_SCIREF_DOI = '10.1145/1031495.1031525'
C_SCIREF_BOOKTITLE = 'Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems'
C_SCIREF_PAGES = '250–262'

reset()

A method to reset the sampler’s settings.

_omit_instance(p_inst: Instance) → bool

Custom method that filters incoming instances. It is called by the omit_instance() method.

Parameters:

p_inst (Instance) – An input instance to be filtered.

Returns:

False if the input instance is kept (not omitted); True if it is omitted.

Return type:

bool
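
The return convention (False keeps an instance, True omits it) and the role of reset() can be sketched with a small standalone filter. This is a hypothetical stand-in, not the MLPro class: the name MinWiseFilterSketch, its omit() method, and the strategy of drawing all weights for a cluster up front so that each decision can be made as the instance arrives are assumptions for illustration only.

```python
import random


class MinWiseFilterSketch:
    """Hypothetical stand-in for a min-wise stream filter; not the MLPro implementation."""

    def __init__(self, cluster_size: int = 10, seed: int = 0):
        if cluster_size < 1:
            raise ValueError("cluster_size must be at least 1")
        self._cluster_size = cluster_size
        self._seed = seed
        self.reset()

    def reset(self) -> None:
        """Restart the random sequence and begin a new cluster."""
        self._rng = random.Random(self._seed)
        self._position = 0    # index of the next instance within its cluster
        self._keep_index = 0  # index of the instance to keep in the current cluster

    def omit(self, inst) -> bool:
        """Return False if inst is kept, True if it is omitted."""
        if self._position == 0:
            # Weights do not depend on the instance content, so they can be
            # drawn for the whole cluster before its instances arrive.
            weights = [self._rng.random() for _ in range(self._cluster_size)]
            self._keep_index = weights.index(min(weights))
        omitted = self._position != self._keep_index
        self._position = (self._position + 1) % self._cluster_size
        return omitted


# Usage: filter 20 integers in clusters of 5; one value per cluster survives.
sampler = MinWiseFilterSketch(cluster_size=5, seed=0)
kept = [x for x in range(20) if not sampler.omit(x)]
```

In this sketch, calling reset() and replaying the same input reproduces the same selection, because the generator is re-seeded.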