Item Selection Methods – catsim.selection

All classes implemented in this module inherit from the abstract base class Selector. The Simulator accepts a custom selector during the simulation, as long as it also inherits from Selector.

Inheritance diagram of catsim.selection
class catsim.selection.AStratifiedBBlockingSelector(test_size)[source]

Bases: catsim.selection.StratifiedSelector

Implementation of the \(\alpha\)-stratified selector with \(b\) blocking proposed by [Chang2001], in which the item bank is sorted in ascending order of item difficulty and then separated into \(M\) strata, each stratum having a progressively higher average difficulty.

Each of the \(M\) strata is then separated into \(K\) sub-strata (\(K\) being the test size), according to item discrimination. The final item bank is then ordered such that the first sub-stratum of each stratum forms the first stratum of the new ordered item bank, and so on. This method tries to balance the distribution of both parameters across all strata, motivated by the observation that the two parameters are correlated.

_images/b-blocking.svg
Parameters:test_size – the number of items the test contains. The selector uses this parameter to create the correct number of strata.
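The two-level ordering described above can be sketched with NumPy as follows. This is only an illustration, not catsim's implementation: the function name `b_blocking_strata`, the `n_blocks` argument (the number \(M\) of difficulty blocks, which the text does not specify) and the column layout (discrimination in column 0, difficulty in column 1) are assumptions made for the example.

```python
import numpy as np


def b_blocking_strata(items, n_blocks, test_size):
    """Group item indexes into test_size strata using b-blocking.

    Assumption: column 0 of `items` holds discrimination (a) and
    column 1 holds difficulty (b); `n_blocks` is the number M of blocks.
    """
    # Sort the bank by ascending difficulty and cut it into M blocks.
    by_difficulty = np.argsort(items[:, 1], kind="stable")
    blocks = np.array_split(by_difficulty, n_blocks)

    strata = [[] for _ in range(test_size)]
    for block in blocks:
        # Inside each block, sort by ascending discrimination and cut
        # the block into K sub-strata.
        by_discrimination = block[np.argsort(items[block, 0], kind="stable")]
        for k, sub in enumerate(np.array_split(by_discrimination, test_size)):
            # The k-th sub-stratum of every block feeds the k-th stratum.
            strata[k].extend(int(i) for i in sub)
    return strata


a = [1.0, 0.4, 1.6, 0.8, 0.6, 1.8, 1.2, 0.2]
b = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
bank = np.column_stack([a, b])
strata = b_blocking_strata(bank, n_blocks=2, test_size=2)
```

In this toy bank each final stratum mixes easy and hard items, while the first stratum keeps the less discriminating items of every difficulty block.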

class catsim.selection.AStratifiedSelector(test_size)[source]

Bases: catsim.selection.StratifiedSelector

Implementation of the \(\alpha\)-stratified selector proposed by [Chang99], in which the item bank is sorted in ascending order of item discrimination and then separated into \(K\) strata (\(K\) being the test size), each stratum having a progressively higher average discrimination. The \(\alpha\)-stratified selector then selects the first non-administered item from stratum \(k\), where \(k\) is the position in the test of the item currently being presented to the examinee.

_images/alpha-strat.svg
Parameters:test_size – the number of items the test contains. The selector uses this parameter to create the correct number of strata.
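A minimal sketch of the stratification and selection steps described above, assuming discrimination is stored in column 0 of the item matrix and that the position \(k\) is counted from zero as the number of items already administered. The helper names `a_stratified_strata` and `select_from_stratum` are hypothetical and are not part of catsim.

```python
import numpy as np


def a_stratified_strata(items, test_size):
    """Split item indexes into test_size strata of ascending discrimination."""
    order = np.argsort(items[:, 0], kind="stable")
    return [stratum.tolist() for stratum in np.array_split(order, test_size)]


def select_from_stratum(strata, administered_items):
    """Return the first non-administered item of the current stratum."""
    k = len(administered_items)  # zero-based position of the next item
    if k >= len(strata):
        return None  # the test already has test_size items
    for idx in strata[k]:
        if idx not in administered_items:
            return idx
    return None


# Columns: discrimination, difficulty (illustrative values).
bank = np.array([[1.5, 0.0], [0.5, 0.0], [2.0, 0.0], [1.0, 0.0]])
strata = a_stratified_strata(bank, test_size=2)
first = select_from_stratum(strata, [])
second = select_from_stratum(strata, [first])
```

The first item comes from the least discriminating stratum, and the second from the next one, which is the behaviour the method relies on to save highly discriminating items for later in the test.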

class catsim.selection.ClusterSelector(clusters: list, method: str = 'item_info', r_max: float = 1, r_control: str = 'passive')[source]

Bases: catsim.simulation.Selector

Cluster-based Item Selection Method.

[Men15] Meneghetti, D. R. (2015). Metodologia de seleção de itens em testes adaptativos informatizados baseada em agrupamento por similaridade (Mestrado). Centro Universitário da FEI. Retrieved from https://www.researchgate.net/publication/283944553_Metodologia_de_selecao_de_itens_em_Testes_Adaptativos_Informatizados_baseada_em_Agrupamento_por_Similaridade
Parameters:
  • clusters – a list containing item cluster memberships
  • r_max – maximum exposure rate for items
  • method – one of the available methods for cluster selection. Given the estimated theta value at each step:

    item_info: selects the cluster which has the item with maximum information;

    cluster_info: selects the cluster whose items' sum of information is maximum;

    weighted_info: selects the cluster whose weighted sum of information is maximum. The weight is the number of items in the cluster, by which the sum is divided.

  • r_control – if passive and all items \(i\) in the selected cluster have \(r_i > r^{max}\), applies the item with maximum information; if aggressive, applies the item with smallest \(r\) value.
static avg_cluster_params(items: numpy.ndarray, c: list)[source]

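Returns the average values of item parameters by cluster.

Parameters:
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • c (list) – a list containing clustering memberships.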
Returns:

a matrix containing the average values of each parameter by cluster. Lines are clusters, columns are parameters.
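The per-cluster averaging can be illustrated with plain NumPy. This sketch assumes that cluster labels are the integers 0 to n-1 and that no cluster is empty; `average_params_by_cluster` is a hypothetical name, not the library method itself.

```python
import numpy as np


def average_params_by_cluster(items, memberships):
    """Average each parameter column over the items of each cluster.

    Assumption: labels are 0..n-1 and every cluster has at least one item.
    """
    memberships = np.asarray(memberships)
    n_clusters = memberships.max() + 1
    # One row per cluster, one column per item parameter.
    return np.vstack(
        [items[memberships == k].mean(axis=0) for k in range(n_clusters)]
    )


bank = np.array([[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]])
averages = average_params_by_cluster(bank, [0, 0, 1])
```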

select(index: int = None, items: numpy.ndarray = None, administered_items: list = None, est_theta: float = None, **kwargs) → int[source]

Returns the index of the next item to be administered.

Return type:

int

Parameters:
  • index (int) – the index of the current examinee in the simulator.
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • administered_items (list) – a list containing the indexes of items that were already administered
  • est_theta (float) – a float containing the current estimated proficiency
Returns:

index of the next item to be applied.

static sum_cluster_infos(theta: float, items: numpy.ndarray, clusters: list) → list[source]

Returns the sum of item information values for each cluster.

Return type:

list

Parameters:
  • theta (float) – an examinee’s \(\theta\) value
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • clusters (list) – a list containing item cluster memberships, represented by integers
Returns:

list containing the sum of item information values for each cluster

static sum_cluster_params(items: numpy.ndarray, c: list)[source]

Returns the sum of item parameter values for each cluster.

Parameters:
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • c (list) – a list containing clustering memberships.
Returns:

a matrix containing the sum of each parameter by cluster. Lines are clusters, columns are parameters.

static weighted_cluster_infos(theta: float, items: numpy.ndarray, clusters: list)[source]

Returns the weighted sum of item information values for each cluster. The weight is the number of items in each cluster.

Parameters:
  • theta (float) – an examinee’s \(\theta\) value
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • clusters (list) – a list containing item cluster memberships, represented by integers
Returns:

list containing the sum of item information values for each cluster, divided by the number of items in each cluster
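The cluster-level quantities used by the cluster_info and weighted_info methods can be sketched as below. The example is an assumption-laden illustration: it uses a three-parameter logistic information function written from scratch, assumes the first three columns of the item matrix are discrimination, difficulty and pseudo-guessing, and uses hypothetical function names rather than catsim's own.

```python
import numpy as np


def item_info(theta, items):
    """Item information under the 3PL model.

    Assumption: columns 0, 1, 2 of `items` are a, b and c.
    """
    a, b, c = items[:, 0], items[:, 1], items[:, 2]
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def cluster_info_sums(theta, items, clusters):
    """Sum of item information per cluster (labels assumed to be 0..n-1)."""
    return np.bincount(clusters, weights=item_info(theta, items))


def cluster_info_means(theta, items, clusters):
    """Per-cluster sum divided by the number of items in the cluster."""
    return cluster_info_sums(theta, items, clusters) / np.bincount(clusters)


bank = np.array([[2.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
clusters = [0, 0, 1]
sums = cluster_info_sums(0.0, bank, clusters)
means = cluster_info_means(0.0, bank, clusters)
best_cluster = int(np.argmax(means))
```

Dividing by the cluster size keeps large clusters of mediocre items from dominating small clusters of highly informative ones.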

class catsim.selection.IntervalIntegrationSelector(interval: float = None)[source]

Bases: catsim.simulation.Selector

Implementation of an interval integration selector in which, at every step of the test, the item that maximizes the integral of the information function over a predetermined interval extending \(\delta\) above and below the current \(\hat\theta\) is chosen.

\[\arg\max_{i \in I} \int_{\hat\theta - \delta}^{\hat\theta + \delta} I_i(\theta)\,d\theta\]
Parameters:interval – the half-width \(\delta\) of the integration interval. If no interval is passed, the integral is calculated over \((-\infty, \infty)\).
select(index: int = None, items: numpy.ndarray = None, administered_items: list = None, est_theta: float = None, **kwargs) → int[source]

Returns the index of the next item to be administered.

Return type:

int

Parameters:
  • index (int) – the index of the current examinee in the simulator.
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • administered_items (list) – a list containing the indexes of items that were already administered
  • est_theta (float) – a float containing the current estimated proficiency
Returns:

index of the next item to be applied or None if there are no more items in the item bank.
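As an illustration of the criterion, the integral of each item's information over \([\hat\theta - \delta, \hat\theta + \delta]\) can be approximated with the trapezoidal rule. The sketch below is not catsim's code: it assumes a finite \(\delta\), a 3PL information function with parameters a, b and c in the first three columns, and hypothetical function names.

```python
import numpy as np


def item_info(theta, items):
    """Item information under the 3PL model (columns a, b, c assumed)."""
    a, b, c = items[:, 0], items[:, 1], items[:, 2]
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def info_integrals(items, est_theta, delta, n_points=401):
    """Trapezoidal approximation of each item's information integral."""
    grid = np.linspace(est_theta - delta, est_theta + delta, n_points)
    info = np.array([item_info(t, items) for t in grid])  # (points, items)
    step = grid[1] - grid[0]
    return step * (info[:-1] + info[1:]).sum(axis=0) / 2.0


def select_by_interval(items, administered_items, est_theta, delta):
    """Pick the non-administered item with the largest integral."""
    areas = info_integrals(items, est_theta, delta)
    areas[np.asarray(administered_items, dtype=int)] = -np.inf
    if np.all(np.isneginf(areas)):
        return None  # every item has been administered
    return int(np.argmax(areas))


bank = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
wide = info_integrals(bank, 0.0, 10.0)
chosen = select_by_interval(bank, [], 0.0, 1.0)
```

For items without guessing, the integral over a very wide interval approaches the discrimination parameter, which gives a quick sanity check on the numerical integration.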

class catsim.selection.LinearSelector(indexes: list)[source]

Bases: catsim.simulation.FiniteSelector

Selector that returns item indexes in a linear order, simulating a standard (non-adaptive) test.

Parameters:indexes – the indexes of the items that will be returned in order
select(index: int = None, administered_items: list = None, **kwargs) → int[source]

Returns the index of the next item to be administered.

Return type:

int

Parameters:
  • index (int) – the index of the current examinee in the simulator.
  • administered_items (list) – a list containing the indexes of items that were already administered
Returns:

index of the next item to be applied or None if there are no more items in the item bank.

class catsim.selection.MaxInfoBBlockingSelector(test_size)[source]

Bases: catsim.selection.StratifiedSelector

Implementation of the maximum information stratification with \(b\) blocking (MIS-B) selector proposed by [Bar06], in which the item bank is sorted in ascending order of item difficulty and then separated into \(M\) strata, each stratum having a progressively higher average difficulty.

Each of the \(M\) strata is then separated into \(K\) sub-strata (\(K\) being the test size), according to the items' maximum information. The final item bank is then ordered such that the first sub-stratum of each stratum forms the first stratum of the new ordered item bank, and so on. This method tries to balance the distribution of both parameters across all strata and works better than the \(a\)-stratified with \(b\) blocking method by [Chang2001] for the three-parameter logistic (3PL) model of IRT, since under the 3PL an item's maximum information is not located at its difficulty on the proficiency scale. Although the authors do not mention it, this may also apply to the 4PL.

_images/mis-b.svg
Parameters:test_size – the number of items the test contains. The selector uses this parameter to create the correct number of strata.

class catsim.selection.MaxInfoSelector[source]

Bases: catsim.simulation.Selector

Selector that returns the first non-administered item with maximum information at the current estimated theta.

select(index: int = None, items: numpy.ndarray = None, administered_items: list = None, est_theta: float = None, **kwargs) → int[source]

Returns the index of the next item to be administered.

Return type:

int

Parameters:
  • index (int) – the index of the current examinee in the simulator.
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • administered_items (list) – a list containing the indexes of items that were already administered
  • est_theta (float) – a float containing the current estimated proficiency
Returns:

index of the next item to be applied or None if there are no more items in the item bank.
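A minimal sketch of maximum-information selection, assuming a 3PL information function and an item matrix whose first three columns are a, b and c. The names `item_info` and `select_max_info` are introduced here for illustration and are not catsim functions.

```python
import numpy as np


def item_info(theta, items):
    """Item information under the 3PL model (columns a, b, c assumed)."""
    a, b, c = items[:, 0], items[:, 1], items[:, 2]
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_max_info(items, administered_items, est_theta):
    """Return the most informative item not yet administered, or None."""
    info = item_info(est_theta, items)
    # Mask administered items so they can never win the argmax.
    info[np.asarray(administered_items, dtype=int)] = -np.inf
    if np.all(np.isneginf(info)):
        return None
    return int(np.argmax(info))


bank = np.array([[1.0, -1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 2.0, 0.0]])
first = select_max_info(bank, [], est_theta=0.0)
second = select_max_info(bank, [first], est_theta=0.0)
```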

class catsim.selection.MaxInfoStratificationSelector(test_size)[source]

Bases: catsim.selection.StratifiedSelector

Implementation of the maximum information stratification (MIS) selector proposed by [Bar06], in which the item bank is sorted in ascending order of the items' maximum information and then separated into \(K\) strata (\(K\) being the test size), each stratum containing items with progressively higher maximum information. The MIS selector then selects the first non-administered item from stratum \(k\), where \(k\) is the position in the test of the item currently being presented to the examinee.

_images/mis.svg

Its authors claim that this method works better than the \(a\)-stratified method by [Chang99] for the three-parameter logistic (3PL) model of IRT, since under the 3PL an item's maximum information is not located at its difficulty on the proficiency scale.

Parameters:test_size – the number of items the test contains. The selector uses this parameter to create the correct number of strata.
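The sorting key used by MIS can be sketched as follows. Under the 3PL model with the plain logistic function (no scaling constant, an assumption of this example), the information of an item peaks at \(\theta^{*} = b + \frac{1}{a}\ln\frac{1 + \sqrt{1 + 8c}}{2}\), which coincides with \(b\) only when \(c = 0\). The function names and the column layout (a, b, c in the first three columns) are assumptions, not catsim's API.

```python
import numpy as np


def item_info(theta, items):
    """Item information under the 3PL model (columns a, b, c assumed)."""
    a, b, c = items[:, 0], items[:, 1], items[:, 2]
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def max_info(items):
    """Maximum information of each item, evaluated at its own peak theta."""
    a, b, c = items[:, 0], items[:, 1], items[:, 2]
    theta_peak = b + np.log((1.0 + np.sqrt(1.0 + 8.0 * c)) / 2.0) / a
    return item_info(theta_peak, items)


def mis_strata(items, test_size):
    """Split item indexes into strata of ascending maximum information."""
    order = np.argsort(max_info(items), kind="stable")
    return [stratum.tolist() for stratum in np.array_split(order, test_size)]


bank = np.array([[2.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
peaks = max_info(bank)
strata = mis_strata(bank, test_size=3)
```

With \(c = 0\) the peak information reduces to \(a^2 / 4\), so in this toy bank the ordering matches the ordering by discrimination; the two orderings can differ once guessing is present.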

class catsim.selection.RandomSelector(replace: bool = False)[source]

Bases: catsim.simulation.Selector

Selector that randomly selects items for application.

Parameters:replace – whether an item that has already been administered to this examinee may be selected again.
select(index: int = None, items: numpy.ndarray = None, administered_items: list = None, **kwargs) → int[source]

Returns the index of the next item to be administered.

Return type:

int

Parameters:
  • index (int) – the index of the current examinee in the simulator.
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • administered_items (list) – a list containing the indexes of items that were already administered
Returns:

index of the next item to be applied or None if there are no more items in the item bank.
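The effect of the replace flag can be illustrated with a short sketch. `select_random` and its `n_items` and `rng` arguments are hypothetical names chosen for the example, which only assumes items are identified by their row index in the bank.

```python
import numpy as np


def select_random(n_items, administered_items, replace=False, rng=None):
    """Pick a random item index, optionally allowing repeats."""
    rng = np.random.default_rng() if rng is None else rng
    candidates = np.arange(n_items)
    if not replace:
        # Without replacement, administered items are no longer eligible.
        administered = np.asarray(administered_items, dtype=int)
        candidates = np.setdiff1d(candidates, administered)
    if candidates.size == 0:
        return None
    return int(rng.choice(candidates))


rng = np.random.default_rng(0)
pick = select_random(5, [0, 1, 2], rng=rng)
repeat = select_random(3, [0, 1, 2], replace=True, rng=rng)
```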

class catsim.selection.RandomesqueSelector(bin_size)[source]

Bases: catsim.simulation.Selector

Implementation of the randomesque selector proposed by [Kingsbury89], in which, at every step of the test, an item is randomly chosen from the \(n\) most informative items in the item bank, \(n\) being a predefined value (originally 5, but user-defined in this implementation).

Parameters:bin_size – the number of most informative items to be taken into consideration when randomly selecting one of them.

select(index: int = None, items: numpy.ndarray = None, administered_items: list = None, est_theta: float = None, **kwargs) → int[source]

Returns the index of the next item to be administered.

Return type:

int

Parameters:
  • index (int) – the index of the current examinee in the simulator.
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • administered_items (list) – a list containing the indexes of items that were already administered
  • est_theta (float) – a float containing the current estimated proficiency
Returns:

index of the next item to be applied or None if there are no more items in the item bank.
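A sketch of the randomesque idea under stated assumptions: information is computed with a 3PL function written for this example (columns a, b, c), administered items are excluded before the bin is formed, and `select_randomesque` is a hypothetical helper rather than the library method.

```python
import numpy as np


def item_info(theta, items):
    """Item information under the 3PL model (columns a, b, c assumed)."""
    a, b, c = items[:, 0], items[:, 1], items[:, 2]
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_randomesque(items, administered_items, est_theta, bin_size, rng=None):
    """Draw uniformly from the bin_size most informative remaining items."""
    rng = np.random.default_rng() if rng is None else rng
    info = item_info(est_theta, items)
    administered = set(administered_items)
    # Rank items from most to least informative, skipping administered ones.
    ranked = [
        int(i) for i in np.argsort(-info, kind="stable")
        if int(i) not in administered
    ]
    if not ranked:
        return None
    return int(rng.choice(ranked[:bin_size]))


bank = np.array([[0.5, 0.0, 0.0], [1.0, 0.0, 0.0], [1.5, 0.0, 0.0],
                 [2.0, 0.0, 0.0], [2.5, 0.0, 0.0]])
rng = np.random.default_rng(42)
picks = [select_randomesque(bank, [], 0.0, bin_size=2, rng=rng) for _ in range(20)]
```

Drawing from a small bin instead of always taking the single best item spreads exposure across several near-optimal items.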

class catsim.selection.The54321Selector(test_size)[source]

Bases: catsim.simulation.FiniteSelector

Implementation of the 5-4-3-2-1 selector proposed by [McBride83], in which, at each step \(k\) of a test of size \(K\), an item is chosen from a bin containing the \(K-k\) most informative items in the bank, given the current \(\hat\theta\). As the test progresses, the bin gets smaller, so more informative items have a higher probability of being chosen towards the end of the test, when \(\hat\theta\) is more precise. The 5-4-3-2-1 selector can be viewed as a specialization of the catsim.selection.RandomesqueSelector, in which the bin size of most informative items gets smaller as the test progresses.

Parameters:test_size – the number of items the test contains. The selector uses this parameter to set the bin size
select(index: int = None, items: numpy.ndarray = None, administered_items: list = None, est_theta: float = None, **kwargs) → int[source]

Returns the index of the next item to be administered.

Return type:

int

Parameters:
  • index (int) – the index of the current examinee in the simulator.
  • items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
  • administered_items (list) – a list containing the indexes of items that were already administered
  • est_theta (float) – a float containing the current estimated proficiency
Returns:

index of the next item to be applied or None if there are no more items in the item bank.
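The shrinking bin can be sketched as below. The example assumes that the step \(k\) is counted from zero as the number of items already administered, so the bin holds \(K - k\) items and shrinks to a single item on the last step. The 3PL information helper, the column layout (a, b, c) and the name `select_54321` are assumptions made for illustration.

```python
import numpy as np


def item_info(theta, items):
    """Item information under the 3PL model (columns a, b, c assumed)."""
    a, b, c = items[:, 0], items[:, 1], items[:, 2]
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_54321(items, administered_items, est_theta, test_size, rng=None):
    """Draw from a bin whose size shrinks by one at every step."""
    rng = np.random.default_rng() if rng is None else rng
    bin_size = test_size - len(administered_items)  # K - k, with k from 0
    if bin_size < 1:
        return None  # the test is already complete
    info = item_info(est_theta, items)
    administered = set(administered_items)
    ranked = [
        int(i) for i in np.argsort(-info, kind="stable")
        if int(i) not in administered
    ]
    if not ranked:
        return None
    return int(rng.choice(ranked[:bin_size]))


bank = np.array([[0.5, 0.0, 0.0], [1.0, 0.0, 0.0], [1.5, 0.0, 0.0],
                 [2.0, 0.0, 0.0], [2.5, 0.0, 0.0]])
rng = np.random.default_rng(7)
first = select_54321(bank, [], 0.0, test_size=3, rng=rng)
last = select_54321(bank, [4, 3], 0.0, test_size=3, rng=rng)
```

On the final step the bin contains one item, so the choice is deterministic and equals the most informative remaining item.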