Test sensitivity in assessing competencies in nursing education

Wittmann, Eveline; Weyland, Ulrike; Seeber, Susan; Warwas, Julia; Striković, Aldin; Krebs, Philine; Pohley, Monja; Wilczek, Larissa

doi:10.1186/s40461-022-00131-9

Research
Open access
Published: 27 April 2022

Eveline Wittmann ORCID: orcid.org/0000-0003-3985-3177¹,
Ulrike Weyland²,
Susan Seeber³,
Julia Warwas⁴,
Aldin Striković ORCID: orcid.org/0000-0002-3743-0501¹,
Philine Krebs³,
Monja Pohley¹ &
Larissa Wilczek²

Empirical Research in Vocational Education and Training volume 14, Article number: 3 (2022)

Abstract

The identification of effects of vocational education and training conditions on competence development in nursing education requires longitudinal studies. An important precondition is the availability of a test of nursing competence which is economical in use, measures a homogeneous construct throughout years of nursing education and across nursing specializations, and can detect increases in the required competence, hence allowing for sensitive testing. This article describes a cross-sectional study that aimed to optimize a computer-based test measuring nursing competence in care for the elderly—the TEMA test—through the selection of items on the basis of measurement error, differential item functioning, and item difficulty. Evidence of the test sensitivity of the optimized TEMA-L instrument is presented for the second and third year of nursing education. The total sample consisted of n = 133 German nursing students from clinical and geriatric nursing. The resulting instrument includes two test booklets consisting of 36 (WLE = 0.72) and 35 items (WLE = 0.70) respectively for the second and third year of training. The cross-sectional data indicate that the test likely has good properties for sensitive testing of nursing competence in a future longitudinal study. Hence, it might be used to study factors contributing to increases in nursing competence in German VET and serve as an example for similar studies in other countries. Limitations of the current study and related subjects of future research are discussed.

Introduction

It is commonly assumed that vocational education and training (VET) improves the competencies of apprentices. However, little empirical knowledge is available about the effects that lead to such improvements in dual VET (Deutscher and Winther 2018), where outcomes may be affected by school-based instruction and in-company training, as well as by complex relations between instruction and learning in both venues. An important question in the use of competence tests for longitudinal studies examining such effects is whether the measurement instruments used are instructionally sensitive, meaning that they can detect improvement related to the quality of instruction (Naumann et al. 2019) in both the theoretical and the practical sphere (Deutscher and Winther 2018). Naumann et al. (2017) consider test sensitivity, understood as the overall variation of test scores across time points or groups, to be a prerequisite for identifying instructional sensitivity. This concept, which is the focus of our paper, implies that the test measures growth on a homogeneous construct over time. In contrast, item sensitivity, understood as a relative measure, can be defined as the degree to which the sensitivity of the respective item deviates from overall test sensitivity; it is usually measured through differential item functioning (DIF; Naumann et al. 2019) and should be low in a sensitive test.
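To make the distinction between test sensitivity and item sensitivity concrete, the following minimal sketch may help. It is a simplification and not the procedure used in this study: it assumes that item easiness can be approximated by the logit of the proportion of correct responses in each cohort, whereas DIF is normally estimated within an item response model. The function name `sensitivity_summary` and all proportions are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def sensitivity_summary(p_early, p_late):
    """Return (overall shift, per-item deviation from that shift).

    p_early, p_late: proportion of correct responses per item in an
    earlier and a later cohort (hypothetical input format).
    """
    # Gain of each item on the logit scale between the two cohorts
    shifts = [logit(late) - logit(early) for early, late in zip(p_early, p_late)]
    # Crude stand-in for test sensitivity: the average gain across items
    test_sensitivity = sum(shifts) / len(shifts)
    # Crude stand-in for item sensitivity (DIF): deviation from the average gain
    item_deviation = [s - test_sensitivity for s in shifts]
    return test_sensitivity, item_deviation


# Hypothetical proportions correct for three items in year 2 and year 3
p_year2 = [0.5, 0.5, 0.5]
p_year3 = [0.6, 0.6, 0.8]

overall_shift, deviations = sensitivity_summary(p_year2, p_year3)
```

In this toy example the third item gains more than the test as a whole, so its deviation is positive and it would be a candidate for closer DIF inspection; the deviations sum to zero by construction because they are measured relative to the mean gain.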

Few domain-specific competence tests for VET are suitable for larger samples (Abele et al. 2021) and allow for longitudinal application (e.g., Deutscher and Winther 2018). This is particularly true of nursing education, which in Germany is mostly conducted in non-academic settings and, while not officially part of the dual system of VET, is also an example of a dualistic non-academic form of VET with school-based instruction on the one hand and practical on-site training in care institutions on the other (Bals and Wittmann 2009; Lehmann et al. 2014). In this field, most of the internationally available measurement instruments for competencies have for many years consisted of either self-reports (Wu et al. 2015; Yanhua and Watson 2011) or clinical evaluations in real-world settings (e.g., objective structured clinical examinations; see Solà-Pola et al. 2020). There has been a lack of systematically and consistently developed, valid and reliable assessment instruments in clinical practice (Immonen et al. 2019). Although examining nursing competence in real-world situations is preferable to self-reporting in terms of validity (Kajander-Unkuri et al. 2016), it is not only inefficient with larger samples but also deficient in terms of standardization and reliability, particularly in the case of repeated long-term testing. This is likely one reason why longitudinal studies are rare (e.g., Fan et al. 2015). One way to address these issues is through computerized testing, which the National Council Licensure Examination (NCLEX) requires for nursing licensure in the United States. To address standardization issues in these admission examinations for vocational nursing practice, Woo and Dragan (2012) carried out item sensitivity analyses for content relevance to subgroups based on DIF analyses. However, we could not find any study of nursing competence in the international and national literature conducted with the purpose of testing this construct across years of nursing education or even preparing for its sensitive and economical longitudinal testing. We aim to lay the foundation for such testing in the study presented in this paper.

To address issues of valid and reliable testing of nursing competencies in larger samples, we developed a computer-based test on nursing competence in care for the elderly using a video-based situational judgment approach. We reported in Kaspar et al. (2016) on the measurement quality of the TEMA test in a calibration study, using empirical evidence from a cross-sectional large-scale assessment with 402 geriatric nursing students at the end of nursing education. The test construction supports its curricular and content validity to test nursing competence across geriatric and clinical nursing. However, we were not able to examine its suitability for testing across years of nursing education. Hence, the TEMA test could be used reliably to determine and compare the results of apprentices in geriatric nursing at the end of VET across the expected capability range for students but not (yet) to determine progress throughout VET. In addition, the TEMA test comprises 77 items, requiring almost two hours of testing time, which restricts its economical application in combination with other instruments, such as measures of the quality of VET.

With the cross-sectional study presented in this paper, we therefore aim to further optimize this computerized instrument in two ways. This involves, first, improved test economy through a reduction in the number of items and, second, the design of an instrument that allows for tracing progress on a homogeneous core construct of nursing competence in care for the elderly. In preparation for a future longitudinal study, we present evidence of the intended test sensitivity of the TEMA test for the second and third year of nursing education and across nursing specializations (clinical and geriatric nursing). Hence, our research questions in this study are whether it is possible (1) to create an economical short form of the TEMA test providing for acceptable reliability, (2) to maximize test sensitivity by reducing the number of items whose relative item sensitivity deviates substantially from overall test sensitivity, using differential item functioning analyses (see Naumann et al. 2016, 2017), and (3) to create a test enabling us to account for increases in achievement according to years of education, specifically to avoid floor effects. We pursue these targets while at the same time maintaining curricular and content validity and being fair across years of nursing education and nursing specializations. The purpose is to create an economical, reliable, and homogeneous test in which item difficulties balance out across the test for these subgroups, that is, preconditions other than increasing overall achievement on the core construct. With the resulting instrument, it should be possible to examine its aptitude for longitudinal analysis of competence development or to establish instructional sensitivity in a future study, for example by linking test results to the quality of VET (Wittmann et al. 2022; see Naumann et al. 2019).
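Two of these targets, acceptable reliability and the avoidance of floor effects, can be illustrated with a short sketch. It assumes one common definition of WLE reliability, namely one minus the ratio of the mean squared standard error to the variance of the ability estimates, and it treats a raw score of zero as the floor. The function names and all data are hypothetical and do not stem from the TEMA data.

```python
from statistics import mean, variance


def wle_reliability(theta, se):
    """WLE reliability under the assumed definition.

    theta: ability estimates per test-taker
    se:    standard errors of those estimates
    """
    error_variance = mean(s ** 2 for s in se)      # average squared standard error
    observed_variance = variance(theta)            # sample variance of the estimates
    return 1 - error_variance / observed_variance


def floor_share(raw_scores, floor=0):
    """Share of test-takers whose raw score is at or below the assumed floor."""
    at_floor = sum(1 for score in raw_scores if score <= floor)
    return at_floor / len(raw_scores)


# Hypothetical ability estimates and standard errors for three test-takers
theta_hat = [-1.0, 0.0, 1.0]
se_hat = [0.5, 0.5, 0.5]
reliability = wle_reliability(theta_hat, se_hat)   # 1 - 0.25 / 1.0 = 0.75

# Hypothetical raw scores; two of four test-takers sit at the floor
share_at_floor = floor_share([0, 3, 5, 0])          # 0.5
```

A large share of test-takers at the floor in the earlier year of training would suggest that the item set is too difficult to register growth from that starting point.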

The TEMA test

Against the background of the increasing relevance of care work for the elderly, implying specific foci such as multimorbidity or cognitive decline, we developed the TEMA test in order to evaluate the learning outcomes for nursing students regarding care for the elderly. To achieve this goal, we proposed a conceptual model of geriatric care competencies to guide the selection of a set of care situations and specific nursing behaviors for competence testing and to define a statistical model for estimating proficiency on the basis of test data. The TEMA test refers to competent action and interaction with care recipients and family members.^{Footnote 1} The instrument is intended to acknowledge care as a continuing mutual relationship with the care recipient and to align with the central elements of the care process, including diagnosis, intervention, and reflection (Kaspar et al. 2016).

The test is provided in the form of a video-based situational judgment test. Since competence assessment relies critically on the adequate representation of situations calling for the required behavior, we defined and validated a sampling space of everyday demands and challenges in care for elderly persons by means of systematic curricular analysis and expert interviews and refined it on the basis of Hundenborn’s (2007) concept of care situations. The test environment provides a set of care situations from three institutional fields of practice covering three major incidents of care for the elderly (dementia, chronic diseases, end of life): (1) long-term group care (LTC) for patients with dementia (dementia hostel), (2) outpatient care (OTC) with a focus on chronic diseases and multimorbidity, and (3) institutional palliative care (PAL); they include five hypothetical care recipients with multiple care needs as cases. Within the fields of practice, we developed an overall set of twelve situations referring to care affordances identified as typical on the basis of curriculum analyses and expert interviews, such as wound and pain management, care planning, nutrition counseling, and emergency measures, among others, providing for item prompts. The situations were transformed into short video sequences of about 1 or 2 min each, with the filming monitored by trained nurses to enhance authenticity of the settings and the acting (Kaspar et al. 2016). Curricular and content validation of the test comprised the breadth of nursing education relevant to care for the elderly in Germany, meaning geriatric and clinical nursing, as well as a generalized program curriculum comprising both specializations since 2020 (see Wittmann et al. 2022). Table 1 provides an overview of the institutional fields of practice, major incidents, and situations.

Table 1 Overview of the institutional fields of practice, major care incidents, and situations in the TEMA assessment
