Item Selection Methods – catsim.selection

All classes implemented in this module inherit from the abstract base class Selector. Simulator accepts a custom selector during the simulation, as long as it also inherits from Selector.

Inheritance diagram of catsim.selection
class catsim.selection.AStratBBlockSelector(test_size: int)

Bases: StratifiedSelector

Implementation of the \(\alpha\)-stratified selector with \(b\) blocking proposed by [Chang2001].

In this selector, the item bank is sorted in ascending order according to the items’ difficulty parameter and then separated into \(M\) strata, each stratum containing gradually higher average difficulty.

Each of the \(M\) strata is then separated again into \(K\) sub-strata (\(K\) being the test size), according to the items’ discrimination. The final item bank is then ordered such that the first sub-stratum of each stratum forms the first stratum of the new ordered item bank, and so on. This method tries to balance the distribution of both parameters across all strata, motivated by the observation that the two parameters are correlated.

_images/b-blocking.svg
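The reordering described above can be sketched in plain Python. This is only an illustration, not catsim's implementation: the function name b_blocking_order is hypothetical, items are assumed to be (a, b) tuples, and the bank size is assumed to be a multiple of the test size, so that each of the \(M\) blocks holds exactly \(K\) items and every sub-stratum is a single item.

```python
def b_blocking_order(items, test_size):
    """Group item indices into `test_size` strata using b-blocking.

    items: list of (a, b) tuples (discrimination, difficulty).
    Assumes len(items) is a multiple of test_size.
    """
    n_blocks = len(items) // test_size  # M blocks of K items each
    # 1. Sort item indices by difficulty (b), ascending.
    by_b = sorted(range(len(items)), key=lambda i: items[i][1])
    # 2. Cut the sorted list into M consecutive difficulty blocks.
    blocks = [by_b[m * test_size:(m + 1) * test_size] for m in range(n_blocks)]
    # 3. Inside each block, sort by discrimination (a), ascending.
    blocks = [sorted(block, key=lambda i: items[i][0]) for block in blocks]
    # 4. Stratum k collects the k-th item of every block.
    return [[block[k] for block in blocks] for k in range(test_size)]


example_items = [(1.2, -1.0), (0.5, -0.8), (0.9, 0.0),
                 (1.5, 0.1), (0.7, 1.0), (2.0, 1.2)]
strata = b_blocking_order(example_items, test_size=2)
print(strata)  # [[1, 2, 4], [0, 3, 5]]
```

Each resulting stratum spans the whole difficulty range, while discrimination increases from the first stratum to the last.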
Parameters:
test_size : int

The number of items the test contains. The selector uses this parameter to create the correct number of strata.

presort_items(item_bank: ItemBank) → ndarray[tuple[Any, ...], dtype[floating]]

Presort items in ascending order of the discrimination of each item, then sort each stratum according to item difficulty.

Parameters:
item_bank : ItemBank

An ItemBank containing item parameters.

Returns:
numpy.ndarray

The sorted item matrix.

class catsim.selection.AStratSelector(test_size: int)

Bases: StratifiedSelector

Implementation of the \(\alpha\)-stratified selector proposed by [Chang99].

In this selector, the item bank is sorted in ascending order according to the items’ discrimination parameter and then separated into \(K\) strata (\(K\) being the test size), each stratum containing gradually higher average discrimination. The \(\alpha\)-stratified selector then selects the first non-administered item from stratum \(k\), where \(k\) represents the position in the test of the current item the examinee is being presented.

This method helps control item exposure by ensuring items with different discrimination levels are distributed throughout the test.

_images/alpha-strat.svg
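A minimal sketch of the stratification and of the per-position lookup, in plain Python. The function names alpha_stratify and select_from_strata are hypothetical, and the bank size is assumed to be a multiple of the test size; catsim's own classes work on an ItemBank instead of a plain list.

```python
def alpha_stratify(discriminations, test_size):
    """Split item indices into `test_size` strata of increasing discrimination."""
    order = sorted(range(len(discriminations)), key=lambda i: discriminations[i])
    size = len(order) // test_size  # items per stratum
    return [order[k * size:(k + 1) * size] for k in range(test_size)]


def select_from_strata(strata, administered):
    """Return the first non-administered item of the stratum for the current position."""
    k = len(administered)  # position of the next item in the test
    if k >= len(strata):
        return None  # no more strata to draw from
    for item in strata[k]:
        if item not in administered:
            return item
    return None


a_values = [1.4, 0.6, 2.0, 0.9, 1.1, 0.3]
strata = alpha_stratify(a_values, test_size=3)
print(strata)                              # [[5, 1], [3, 4], [0, 2]]
print(select_from_strata(strata, []))      # 5
print(select_from_strata(strata, [5]))     # 3
```

Low-discrimination items are used early, when the ability estimate is still imprecise, and highly discriminating items are saved for the end of the test.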
Parameters:
test_size : int

The number of items the test contains. The selector uses this parameter to create the correct number of strata.

presort_items(item_bank: ItemBank) → ndarray[tuple[Any, ...], dtype[floating]]

Presort the item matrix in ascending order according to the discrimination of each item.

Parameters:
item_bank : ItemBank

An ItemBank containing item parameters.

Returns:
numpy.ndarray

Array of item indices sorted in ascending order by discrimination (a parameter).

class catsim.selection.ClusterSelector(clusters: list[int], method: str = 'item_info', r_max: float = 1, r_control: str = 'passive')

Bases: Selector

Cluster-based Item Selection Method.

This method groups items into clusters and selects items from clusters based on their information characteristics, helping to balance item exposure across the item bank [Men15].

[Men15]

Meneghetti, D. R. (2015). Metodologia de seleção de itens em testes adaptativos informatizados baseada em agrupamento por similaridade (Master’s thesis). Centro Universitário da FEI. Retrieved from https://www.researchgate.net/publication/283944553_Metodologia_de_selecao_de_itens_em_Testes_Adaptativos_Informatizados_baseada_em_Agrupamento_por_Similaridade

Parameters:
clusters : list[int]

A list containing item cluster memberships (integers representing cluster IDs).

method : str, optional

One of the available methods for cluster selection. Given the estimated theta value at each step:

  • ‘item_info’: selects the cluster which has the item with maximum information

  • ‘cluster_info’: selects the cluster whose items’ sum of information is maximum

  • ‘weighted_info’: selects the cluster whose weighted sum of information is maximum (weighted by the number of items in the cluster)

Default is ‘item_info’.

r_max : float, optional

Maximum exposure rate for items. Default is 1.

r_control : str, optional

Item exposure control method. If ‘passive’ and all items in the selected cluster have exposure rates > r_max, applies the item with maximum information. If ‘aggressive’, applies the item with smallest exposure rate. Default is ‘passive’.
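The three cluster selection methods can be compared with a small standalone sketch. It does not call catsim: the helpers item_info and choose_cluster are hypothetical, items are assumed to be (a, b, c, d) tuples, and the information function is the standard four-parameter logistic one. Exposure control (r_max, r_control) is left out.

```python
import math


def item_info(theta, a, b, c=0.0, d=1.0):
    """Item information under the 4PL logistic model."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return a**2 * (p - c)**2 * (d - p)**2 / ((d - c)**2 * p * (1.0 - p))


def choose_cluster(theta, items, clusters, method="item_info"):
    """Return the ID of the cluster preferred by the given method."""
    infos = [item_info(theta, *params) for params in items]
    scores = {}
    for cluster_id in sorted(set(clusters)):
        members = [infos[i] for i, cl in enumerate(clusters) if cl == cluster_id]
        if method == "item_info":        # best single item in the cluster
            scores[cluster_id] = max(members)
        elif method == "cluster_info":   # total information of the cluster
            scores[cluster_id] = sum(members)
        elif method == "weighted_info":  # total divided by cluster size
            scores[cluster_id] = sum(members) / len(members)
        else:
            raise ValueError(f"unknown method: {method}")
    return max(scores, key=scores.get)


# Cluster 0 holds one very informative item, cluster 1 five weaker ones.
items = [(2.0, 0.0, 0.0, 1.0)] + [(1.0, 0.0, 0.0, 1.0)] * 5
clusters = [0, 1, 1, 1, 1, 1]
for method in ("item_info", "cluster_info", "weighted_info"):
    print(method, choose_cluster(0.0, items, clusters, method))
# item_info 0, cluster_info 1, weighted_info 0
```

The example shows why the choice matters: summing favors large clusters, while the per-item and size-weighted variants favor clusters with strong individual items.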

static avg_cluster_params(item_bank: ItemBank, c: list[int]) → ndarray[tuple[Any, ...], dtype[float64]]

Return the average values of item parameters by cluster.

Parameters:
item_bank : ItemBank

An ItemBank containing item parameters.

c : list[int]

A list containing clustering memberships (cluster IDs for each item).

Returns:
numpy.ndarray

A matrix containing the average values of each parameter by cluster. Rows are clusters, columns are parameters (a, b, c, d).
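The shape of the result can be illustrated without catsim. The sketch below is an assumption-laden stand-in: the function name mean_params_by_cluster is hypothetical, items are plain (a, b, c, d) tuples, and cluster IDs are assumed to be the integers 0 to n-1 with no empty cluster.

```python
def mean_params_by_cluster(items, clusters):
    """Return one row per cluster holding the mean of each item parameter."""
    n_clusters = max(clusters) + 1
    n_params = len(items[0])
    sums = [[0.0] * n_params for _ in range(n_clusters)]
    counts = [0] * n_clusters
    for params, cluster_id in zip(items, clusters):
        counts[cluster_id] += 1
        for j, value in enumerate(params):
            sums[cluster_id][j] += value
    # Divide each parameter sum by the size of its cluster.
    return [[s / counts[k] for s in sums[k]] for k in range(n_clusters)]


items = [(1.0, -1.0, 0.2, 1.0), (2.0, 1.0, 0.0, 1.0), (0.5, 0.0, 0.1, 1.0)]
means = mean_params_by_cluster(items, [0, 0, 1])
print(means)  # [[1.5, 0.0, 0.1, 1.0], [0.5, 0.0, 0.1, 1.0]]
```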

property clusters: list[int]

Return the clusters each item belongs to.

property method: str

Return the method used for cluster selection.

property r_control: str

Return the item exposure control method.

property r_max: float

Return the maximum exposure rate for items the selector accepts.

select(index: int | None = None, item_bank: ItemBank | None = None, administered_items: list[int] | None = None, est_theta: float | None = None, **kwargs: Any) → int | None

Return the index of the next item to be administered.

Parameters:
index : int or None, optional

The index of the current examinee in the simulator. Default is None.

item_bank : ItemBank or None, optional

An ItemBank containing item parameters (see: ItemBank.generate_item_bank()). Default is None.

administered_items : list[int] or None, optional

A list containing the indexes of items that were already administered. Default is None.

est_theta : float or None, optional

A float containing the current estimated ability. Default is None.

**kwargs

Additional keyword arguments.

Returns:
int or None

Index of the next item to be applied.

static sum_cluster_infos(theta: float, item_bank: ItemBank, clusters: list[int]) → ndarray[tuple[Any, ...], dtype[floating]]

Return the sum of item information values, separated by cluster.

Parameters:
theta : float

An examinee’s \(\theta\) value.

item_bank : ItemBank

An ItemBank containing item parameters.

clusters : list[int]

A list containing item cluster memberships, represented by integers.

Returns:
numpy.ndarray

Array containing the sum of item information values for each cluster.

static sum_cluster_params(item_bank: ItemBank, c: list[int]) → ndarray[tuple[Any, ...], dtype[float64]]

Return the sum of item parameter values for each cluster.

Parameters:
item_bank : ItemBank

An ItemBank containing item parameters.

c : list[int]

A list containing clustering memberships (cluster IDs for each item).

Returns:
numpy.ndarray

A matrix containing the sum of each parameter by cluster. Rows are clusters, columns are parameters (a, b, c, d).

static weighted_cluster_infos(theta: float, item_bank: ItemBank, clusters: list[int]) → ndarray[tuple[Any, ...], dtype[floating]]

Return the weighted sum of item information values, separated by cluster.

The weight is the number of items in each cluster, providing an average information value per cluster that accounts for cluster size.

Parameters:
theta : float

An examinee’s \(\theta\) value.

item_bank : ItemBank

An ItemBank containing item parameters.

clusters : list[int]

A list containing item cluster memberships, represented by integers.

Returns:
numpy.ndarray

Array containing the average information values for each cluster (sum of item information divided by the number of items in each cluster).

class catsim.selection.IntervalInfoSelector(interval: float | None = None)

Bases: Selector

Selects the item that maximizes the integral of the information function at a predetermined interval.

The interval is defined by a parameter \(\delta\) above and below the current \(\hat\theta\), like so:

\[\underset{i \in I}{\operatorname{arg\,max}} \int_{\hat\theta - \delta}^{\hat\theta + \delta} I_i(\theta) \, d\theta\]
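The integral criterion can be approximated numerically. The sketch below uses the trapezoidal rule and plain tuples; it is an illustration under stated assumptions, not catsim's code. The names item_info, interval_info and select_by_interval are hypothetical, and items are assumed to be (a, b, c, d) tuples.

```python
import math


def item_info(theta, a, b, c=0.0, d=1.0):
    """Item information under the 4PL logistic model."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return a**2 * (p - c)**2 * (d - p)**2 / ((d - c)**2 * p * (1.0 - p))


def interval_info(est_theta, delta, a, b, c=0.0, d=1.0, steps=200):
    """Trapezoidal approximation of the information integral over
    [est_theta - delta, est_theta + delta]."""
    lo = est_theta - delta
    h = 2.0 * delta / steps
    total = 0.5 * (item_info(lo, a, b, c, d) + item_info(est_theta + delta, a, b, c, d))
    for s in range(1, steps):
        total += item_info(lo + s * h, a, b, c, d)
    return total * h


def select_by_interval(items, administered, est_theta, delta):
    """Return the non-administered item with the largest information integral."""
    candidates = [i for i in range(len(items)) if i not in administered]
    if not candidates:
        return None
    return max(candidates, key=lambda i: interval_info(est_theta, delta, *items[i]))


items = [(1.0, 0.0, 0.0, 1.0), (2.5, 1.5, 0.0, 1.0)]
# At theta = 0 item 0 is more informative, but over [-2, 2] item 1 wins.
print(max(range(2), key=lambda i: item_info(0.0, *items[i])))  # 0
print(select_by_interval(items, [], est_theta=0.0, delta=2.0))  # 1
```

Integrating over an interval rewards items that are informative near, but not exactly at, the current estimate, which can help while \(\hat\theta\) is still uncertain.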

property interval: float

Get the size of the interval under which the integral of the information function will be computed.

select(index: int | None = None, item_bank: ItemBank | None = None, administered_items: list[int] | None = None, est_theta: float | None = None, **kwargs: Any) → int | None

Return the index of the next item to be administered.

Parameters:
index : int or None, optional

The index of the current examinee in the simulator. Default is None.

item_bank : ItemBank or None, optional

An ItemBank containing item parameters (see: ItemBank.generate_item_bank()). Default is None.

administered_items : list[int] or None, optional

A list containing the indexes of items that were already administered. Default is None.

est_theta : float or None, optional

A float containing the current estimated ability. Default is None.

**kwargs

Additional keyword arguments.

Returns:
int or None

Index of the next item to be applied or None if there are no more items in the item bank.

class catsim.selection.LinearSelector(indexes: list[int])

Bases: FiniteSelector

Selector that returns item indexes in a linear order, simulating a standard (non-adaptive) test.

This selector is useful for baseline comparisons or for administering a fixed set of items in a predetermined order.

Parameters:
indexes : list[int]

The indexes of the items that will be returned in order.

property current: int

The index of the current item.

property indexes: list[int]

The indexes of the items that will be returned in order.

select(index: int | None = None, administered_items: list[int] | None = None, **kwargs: Any) → int | None

Return the index of the next item to be administered.

Parameters:
index : int or None, optional

The index of the current examinee in the simulator. Default is None.

administered_items : list[int] or None, optional

A list containing the indexes of items that were already administered. Default is None.

**kwargs

Additional keyword arguments.

Returns:
int or None

Index of the next item to be applied or None if there are no more items in the item bank.

class catsim.selection.MaxInfoBBlockSelector(test_size: int)

Bases: MaxInfoStratSelector

Implementation of the maximum information stratification with \(b\) blocking (MIS-B) selector [Bar06].

In this selector, the item bank is sorted in ascending order according to the items’ difficulty parameter and then separated into \(M\) strata, each stratum containing gradually higher average difficulty.

Each of the \(M\) strata is then separated again into \(K\) sub-strata (\(K\) being the test size), according to the items’ maximum information. The final item bank is then ordered such that the first sub-stratum of each stratum forms the first stratum of the new ordered item bank, and so on. This method tries to balance the distribution of both parameters across all strata. It works better than the \(a\)-stratified with \(b\) blocking method by [Chang2001] for the three-parameter logistic model of IRT, since item difficulty and maximum information are not positioned in the same place in the ability scale in 3PL. This may also apply to the 4PL model, although the authors do not mention it.

_images/mis-b.svg
Parameters:
test_size : int

The number of items the test contains. The selector uses this parameter to create the correct number of strata.

presort_items(item_bank: ItemBank) → ndarray[tuple[Any, ...], dtype[floating]]

Presort the item matrix according to the information of each item at their maximum.

Parameters:
item_bank : ItemBank

An ItemBank containing item parameters.

Returns:
numpy.ndarray

The sorted item matrix.

class catsim.selection.MaxInfoSelector(r_max: float = 1)

Bases: Selector

Selector that returns the first non-administered item with maximum information for the current theta estimate.

This is one of the most common item selection methods in CAT, choosing the item that provides the most information at the examinee’s current estimated ability level.

Parameters:
r_max : float, optional

Maximum exposure rate for items. Items with exposure rates >= r_max will not be selected unless no other items are available. Default is 1 (no restriction).
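The selection rule, including the r_max fallback, can be sketched as follows. This is a standalone illustration rather than catsim's implementation: select_max_info and its exposure_rates argument are hypothetical, and items are assumed to be (a, b, c, d) tuples.

```python
import math


def item_info(theta, a, b, c=0.0, d=1.0):
    """Item information under the 4PL logistic model."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return a**2 * (p - c)**2 * (d - p)**2 / ((d - c)**2 * p * (1.0 - p))


def select_max_info(items, administered, est_theta, exposure_rates=None, r_max=1.0):
    """Return the most informative non-administered item at est_theta."""
    candidates = [i for i in range(len(items)) if i not in administered]
    if not candidates:
        return None  # item bank exhausted
    if exposure_rates is not None:
        # Prefer items below the exposure ceiling; fall back to all
        # candidates when every remaining item is overexposed.
        allowed = [i for i in candidates if exposure_rates[i] < r_max]
        if allowed:
            candidates = allowed
    return max(candidates, key=lambda i: item_info(est_theta, *items[i]))


items = [(1.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0), (1.5, 0.5, 0.0, 1.0)]
print(select_max_info(items, [], 0.0))                         # 1
print(select_max_info(items, [], 0.0, [0.1, 0.9, 0.2], 0.5))   # 2
```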

property r_max: float

Return the maximum exposure rate for items the selector accepts.

Returns:
float

Maximum exposure rate for items the selector accepts.

select(index: int | None = None, item_bank: ItemBank | None = None, administered_items: list[int] | None = None, est_theta: float | None = None, **kwargs: Any) → int | None

Return the index of the next item to be administered.

Parameters:
index : int or None, optional

The index of the current examinee in the simulator. Default is None.

item_bank : ItemBank or None, optional

An ItemBank containing item parameters (see: ItemBank.generate_item_bank()). Default is None.

administered_items : list[int] or None, optional

A list containing the indexes of items that were already administered. Default is None.

est_theta : float or None, optional

A float containing the current estimated ability. Default is None.

**kwargs

Additional keyword arguments.

Returns:
int or None

Index of the next item to be applied or None if there are no more items in the item bank.

class catsim.selection.MaxInfoStratSelector(test_size: int)

Bases: StratifiedSelector

Implementation of the maximum information stratification (MIS) selector proposed by [Bar06].

In this selector, the item bank is sorted in ascending order according to the items’ maximum information and then separated into \(K\) strata (\(K\) being the test size), each stratum containing items with gradually higher maximum information. The MIS selector then selects the first non-administered item from stratum \(k\), where \(k\) represents the position in the test of the current item the examinee is being presented.

_images/mis.svg

Its authors claim that this method works better than the \(a\)-stratified method by [Chang99] for the three-parameter logistic model of IRT, since item difficulty and maximum information are not positioned in the same place in the ability scale in 3PL.
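The gap between difficulty and the point of maximum information in the 3PL model can be checked numerically. The sketch below assumes the logistic model without the 1.7 scaling constant and uses the well-known closed form for the 3PL maximum, \(\theta_{max} = b + \frac{1}{a}\ln\frac{1 + \sqrt{1 + 8c}}{2}\); the function names are hypothetical and not part of catsim.

```python
import math


def item_info(theta, a, b, c=0.0, d=1.0):
    """Item information under the 4PL logistic model."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return a**2 * (p - c)**2 * (d - p)**2 / ((d - c)**2 * p * (1.0 - p))


def theta_at_max_info(a, b, c):
    """Ability value at which a 3PL item is most informative."""
    return b + math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / a


# Without guessing (c = 0) the maximum sits exactly at b.
print(theta_at_max_info(1.0, 0.5, 0.0))  # 0.5

# With guessing (c > 0) the maximum moves above b.
a, b, c = 1.2, 0.3, 0.25
peak = theta_at_max_info(a, b, c)
print(peak > b)  # True
print(item_info(peak, a, b, c) > item_info(b, a, b, c))  # True
```

Sorting by maximum information therefore ranks 3PL items differently from sorting by discrimination alone, which is the motivation the description gives for MIS.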

Parameters:
test_size : int

The number of items the test contains. The selector uses this parameter to create the correct number of strata.

postsort_items(item_bank: ItemBank, using_simulator_props: bool, est_theta: float) → ndarray[tuple[Any, ...], dtype[floating]]

Divide the item bank into strata and sort each one in descending order of information for the current theta.

Parameters:
item_bank : ItemBank

An ItemBank containing item parameters.

using_simulator_props : bool

Whether the selector is being executed inside a Simulator.

est_theta : float

The current estimate of the examinee’s ability.

Returns:
numpy.ndarray

The sorted item matrix.

presort_items(item_bank: ItemBank) → ndarray[tuple[Any, ...], dtype[floating]]

Presort items in ascending order of maximum information.

Parameters:
item_bank : ItemBank

An ItemBank containing item parameters.

Returns:
numpy.ndarray

The sorted item matrix.

class catsim.selection.RandomSelector(replace: bool = False)

Bases: Selector

Selector that randomly selects items for application.

This selector is useful for baseline comparisons or for studying the impact of item selection strategies.

Parameters:
replace : bool, optional

Whether an item that was already administered to this examinee may be selected again. Default is False.
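The effect of replace can be shown with a small standalone sketch. It uses the standard library's random.Random as a stand-in for the numpy.random.Generator that catsim accepts; the function name select_random is hypothetical.

```python
import random


def select_random(n_items, administered, replace=False, rng=None):
    """Pick a random item index, optionally allowing repeats."""
    rng = rng or random.Random()
    if replace:
        candidates = list(range(n_items))
    else:
        candidates = [i for i in range(n_items) if i not in administered]
    if not candidates:
        return None  # nothing left to administer
    return rng.choice(candidates)


rng = random.Random(42)  # a seeded generator makes runs reproducible
print(select_random(3, [0, 1], rng=rng))                # 2, the only item left
print(select_random(2, [0, 1], rng=rng))                # None
print(select_random(2, [0, 1], replace=True, rng=rng) in (0, 1))  # True
```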

select(index: int | None = None, item_bank: ItemBank | None = None, administered_items: list[int] | None = None, **kwargs: Any) → int | None

Return the index of the next item to be administered.

Parameters:
index : int or None, optional

The index of the current examinee in the simulator. Default is None.

item_bank : ItemBank or None, optional

An ItemBank containing item parameters (see: ItemBank.generate_item_bank()). Default is None.

administered_items : list[int] or None, optional

A list containing the indexes of items that were already administered. Default is None.

**kwargs

Additional keyword arguments. Notably:

  • rng (numpy.random.Generator) – Random number generator used by the object, which guarantees reproducibility of outputs.

Returns:
int or None

Index of the next item to be applied or None if there are no more items in the item bank.

class catsim.selection.RandomesqueSelector(bin_size: int)

Bases: Selector

Implementation of the randomesque selector proposed by [Kingsbury89].

In this selector, at each step of the test, an item is randomly chosen from the \(n\) most informative items in the item bank, \(n\) being a predefined value (originally 5, but user-defined in this implementation).
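A minimal sketch of the randomesque rule, assuming (a, b, c, d) tuples and the standard library's random.Random in place of a NumPy generator. The function name select_randomesque is hypothetical and the code is not taken from catsim.

```python
import math
import random


def item_info(theta, a, b, c=0.0, d=1.0):
    """Item information under the 4PL logistic model."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return a**2 * (p - c)**2 * (d - p)**2 / ((d - c)**2 * p * (1.0 - p))


def select_randomesque(items, administered, est_theta, bin_size, rng=None):
    """Pick at random among the bin_size most informative remaining items."""
    rng = rng or random.Random()
    candidates = [i for i in range(len(items)) if i not in administered]
    if not candidates:
        return None
    ranked = sorted(candidates,
                    key=lambda i: item_info(est_theta, *items[i]),
                    reverse=True)
    return rng.choice(ranked[:bin_size])


items = [(1.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0), (1.5, 0.5, 0.0, 1.0)]
rng = random.Random(0)
print(select_randomesque(items, [], 0.0, bin_size=1, rng=rng))  # 1
print(select_randomesque(items, [], 0.0, bin_size=2, rng=rng) in (1, 2))  # True
```

With a bin size of 1 the rule reduces to plain maximum information selection; larger bins trade some precision for lower exposure of the best items.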

property bin_size: int

Get the bin size.

select(index: int | None = None, item_bank: ItemBank | None = None, administered_items: list[int] | None = None, est_theta: float | None = None, **kwargs: Any) → int | None

Return the index of the next item to be administered.

Parameters:
index : int or None, optional

The index of the current examinee in the simulator. Default is None.

item_bank : ItemBank or None, optional

An ItemBank containing item parameters (see: ItemBank.generate_item_bank()). Default is None.

administered_items : list[int] or None, optional

A list containing the indexes of items that were already administered. Default is None.

est_theta : float or None, optional

A float containing the current estimated ability. Default is None.

**kwargs

Additional keyword arguments. Notably:

  • rng (numpy.random.Generator) – Random number generator used by the object, which guarantees reproducibility of outputs.

Returns:
int or None

Index of the next item to be applied or None if there are no more items in the item bank.

class catsim.selection.StratifiedSelector(test_size: int, sort_once: bool)

Bases: FiniteSelector

Abstract class for stratified finite item selection strategies.

Stratified selectors divide the item bank into strata and select items from different strata as the test progresses, helping to balance item exposure and test characteristics.

Parameters:
test_size : int

Number of items in the test.

sort_once : bool

Whether the strategy allows for the item matrix to be presorted once at the beginning of the simulation (True) or requires resorting during the test (False).

postsort_items(item_bank: ItemBank, using_simulator_props: bool, **kwargs: Any) → ndarray[tuple[Any, ...], dtype[floating]]

Sort the item matrix before selecting each new item.

This default implementation simply returns the presorted items, or sorts them using the presort_items() method and returns them.

Parameters:
item_bank : ItemBank

An ItemBank containing item parameters.

using_simulator_props : bool

Whether the selector is being executed inside a Simulator.

**kwargs : dict

Additional keyword arguments.

Returns:
numpy.ndarray

Array of item indices sorted according to the strategy.

preprocess() → None

Override this method to perform any initialization the Simulable might need for the simulation.

preprocess is called after a value is set for the simulator property. If a new value is attributed to simulator, this method is called again, guaranteeing that internal properties of the Simulable are re-initialized as necessary.

Notes

The default implementation does nothing. Subclasses should override this method if they need to perform setup operations that require access to the simulator.

abstractmethod presort_items(item_bank: ItemBank) → ndarray[tuple[Any, ...], dtype[floating]]

Presort the item matrix according to the strategy employed by this selector.

Parameters:
item_bank : ItemBank

An ItemBank containing item parameters.

Returns:
numpy.ndarray

Array of item indices sorted according to the strategy.

select(index: int | None = None, item_bank: ItemBank | None = None, administered_items: list[int] | None = None, **kwargs: Any) → int | None

Return the index of the next item to be administered.

Parameters:
index : int or None, optional

The index of the current examinee in the simulator. Default is None.

item_bank : ItemBank or None, optional

An ItemBank containing item parameters. Default is None.

administered_items : list[int] or None, optional

A list containing the indexes of items that were already administered. Default is None.

**kwargs

Additional keyword arguments.

Returns:
int or None

Index of the next item to be applied or None if there are no more strata to get items from.

class catsim.selection.The54321Selector(test_size: int)

Bases: FiniteSelector

Implementation of the 5-4-3-2-1 selector proposed by [McBride83].

In this selector, at each step \(k\) of a test of size \(K\), an item is chosen from a bin containing the \(K-k\) most informative items in the bank, given the current \(\hat\theta\). As the test progresses, the bin gets smaller and more informative items have a higher probability of being chosen by the end of the test, when the estimation of \(\hat\theta\) is more precise. The 5-4-3-2-1 selector can be viewed as a specialization of the catsim.selection.RandomesqueSelector, in which the bin size of most informative items gets smaller as the test progresses.
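The shrinking bin can be sketched as follows. The code is a standalone illustration, not catsim's implementation: select_54321 is a hypothetical name, items are assumed to be (a, b, c, d) tuples, and the step \(k\) is assumed to be the number of items already administered, so the first bin holds \(K\) items and the last holds one.

```python
import math
import random


def item_info(theta, a, b, c=0.0, d=1.0):
    """Item information under the 4PL logistic model."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return a**2 * (p - c)**2 * (d - p)**2 / ((d - c)**2 * p * (1.0 - p))


def select_54321(items, administered, est_theta, test_size, rng=None):
    """Pick at random from a bin of the K - k most informative remaining items."""
    rng = rng or random.Random()
    bin_size = test_size - len(administered)  # K - k, shrinks every step
    candidates = [i for i in range(len(items)) if i not in administered]
    if bin_size <= 0 or not candidates:
        return None
    ranked = sorted(candidates,
                    key=lambda i: item_info(est_theta, *items[i]),
                    reverse=True)
    return rng.choice(ranked[:bin_size])


items = [(1.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0),
         (1.5, 0.5, 0.0, 1.0), (0.8, -0.5, 0.0, 1.0)]
rng = random.Random(0)
print(select_54321(items, [], 0.0, test_size=3, rng=rng) in (0, 1, 2))  # True
print(select_54321(items, [1, 2], 0.0, test_size=3, rng=rng))  # 0
```

On the last step the bin has a single item, so the choice is deterministic and equals the maximum information pick among the remaining items.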

select(index: int | None = None, item_bank: ItemBank | None = None, administered_items: list[int] | None = None, est_theta: float | None = None, **kwargs: Any) → int | None

Return the index of the next item to be administered.

Parameters:
index : int or None, optional

The index of the current examinee in the simulator. Default is None.

item_bank : ItemBank or None, optional

An ItemBank containing item parameters (see: ItemBank.generate_item_bank()). Default is None.

administered_items : list[int] or None, optional

A list containing the indexes of items that were already administered. Default is None.

est_theta : float or None, optional

A float containing the current estimated ability. Default is None.

**kwargs

Additional keyword arguments. Notably:

  • rng (numpy.random.Generator) – Random number generator used by the object, which guarantees reproducibility of outputs.

Returns:
int or None

Index of the next item to be applied or None if there are no more items in the item bank.

class catsim.selection.UrrySelector

Bases: Selector

Selector that returns the item whose difficulty parameter is closest to the examinee’s ability.

This method, known as Urry’s method, selects items based on the proximity of their difficulty (b parameter) to the current ability estimate, which is particularly effective for 1PL and 2PL models where information is maximized when b = theta.
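Urry's rule reduces to a nearest-difficulty search, as in the sketch below. The function name select_urry is hypothetical, and difficulties are passed as a plain list of b values rather than an ItemBank.

```python
def select_urry(difficulties, administered, est_theta):
    """Return the non-administered item whose b is closest to est_theta."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    if not candidates:
        return None
    return min(candidates, key=lambda i: abs(difficulties[i] - est_theta))


b_values = [-1.0, 0.2, 1.5]
print(select_urry(b_values, [], est_theta=0.0))    # 1
print(select_urry(b_values, [1], est_theta=0.0))   # 0
```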

select(index: int | None = None, item_bank: ItemBank | None = None, administered_items: list[int] | None = None, est_theta: float | None = None, **kwargs: Any) → int | None

Return the index of the next item to be administered.

Parameters:
index : int or None, optional

The index of the current examinee in the simulator. Default is None.

item_bank : ItemBank or None, optional

An ItemBank containing item parameters (see: ItemBank.generate_item_bank()). Default is None.

administered_items : list[int] or None, optional

A list containing the indexes of items that were already administered. Default is None.

est_theta : float or None, optional

A float containing the current estimated ability. Default is None.

**kwargs

Additional keyword arguments.

Returns:
int or None

Index of the next item to be applied or None if there are no more items in the item bank.