
The challenge of assessing financial literacy: alternative data analysis methods within the Italian context



Assessing individuals’ financial literacy levels is now widely recognized as necessary both to design effective financial education programs and to evaluate their actual impact. To address the lack of consensus on an appropriate instrument to measure financial literacy, the OECD and its International Network on Financial Education (INFE) developed a core questionnaire in 2011, to be administered across a wide range of countries. Italy participated in the study with a survey promoted by the financial consortium ABI–PattiChiari. A tailored version of the OECD/INFE questionnaire was used in the survey, with three indicators of financial literacy taken from the OECD survey (financial behavior index, financial attitude index, financial knowledge index) and two new indicators (financial familiarity index and financial planning index).


The present paper focuses on data analysis methods used to evaluate financial literacy among the Italian adult population. It reviews data analysis approaches used to evaluate financial literacy and proposes a new method to gauge this latent construct in order to obtain a valid and reliable index that is able to capture educational needs in a manner that is as accurate and targeted as possible.


The sample used for the survey consisted of 1247 Italian residents of at least 18 years of age who were reached via CATI. The sample was obtained by appropriate stratification across several dimensions (gender, age, geographical area, and municipality size). We propose alternative data analysis methods to treat the survey data: item response theory (IRT) and classification and regression tree analysis.


The analysis highlighted the crucial role that data analysis methods play in assessing financial literacy. Comparing the results for classical test theory and IRT, this paper suggests that financial literacy research should be open to alternative and multiple approaches to obtain reliable measures of financial literacy that are able to capture the educational needs of different population groups and can help to design effective financial education programs.


Buying a home or ensuring an income for one’s retirement are just a couple of examples of the many situations that individuals will face during their lives and which require basic financial knowledge to make sound decisions. Currently, the ability to make informed, aware, and efficient financial decisions seems to be particularly important. Some recent trends converge to demonstrate a real need to promote and improve individuals’ financial literacy, especially in some countries, such as Italy (Coppola et al. 2017). As pointed out by analyses conducted in Italy and elsewhere (see The European House-Ambrosetti, Consorzio PattiChiari 2007; Grifoni and Messy 2012; Lusardi and Mitchell 2014), compared to past generations, people live longer (and live longer in retirement), which clearly suggests that there is a need to effectively manage money to achieve lifelong financial security. In this regard, another very recent trend is the greater personal responsibility that individuals have over pension planning, and the anxiety that they associate with it (Nicolini 2017). In Italy, the willingness to handle financial information is less pronounced for women, seniors and the unemployed (sometimes considered as “more vulnerable groups” in terms of poverty risk). Additionally, a low level of financial knowledge, financial anxiety and a lack of interest in financial issues show a negative correlation (Linciano et al. 2017). Taken together, these factors create challenges that might be more easily faced with the aid of financial education.

It is currently acknowledged that the design of effective educational interventions needs to be preceded by a thorough assessment of the initial or baseline level of financial literacy. This initial step is crucial for two reasons. First, a sound assessment approach helps in correctly identifying the educational gaps or biases to be addressed by the education programs. Second, this initial assessment represents an indispensable prerequisite for evaluating the success and impact of the defined interventions. Thus, assessing financial literacy is a challenge that needs to be undertaken by any organization, either public or private, that wants to engage in financial education at any level.

To date, the issue of assessing financial literacy has concentrated on the conceptual definition of the latent variable called “financial literacy”. Huston (2010) and Remund (2010) provide a thorough literature review that helps frame the issue of defining the concept of what financial literacy is or should be. Indeed, financial literacy has been variably defined as specifically referring to a form of knowledge (e.g., Hilgert et al. 2003), the ability to apply that knowledge (e.g., Mandell 2007), and good financial behavior (e.g., Moore 2003). The methods used to measure financial literacy vary quite substantially according to the different conceptual definitions adopted. In fact, without a consensual definition, financial literacy has been measured dissimilarly across studies. The construct either focuses on a few financial issues or covers a wide variety of financial topics, including debt, insurance, spending, budgeting, inflation, investments, and saving for retirement. Analogously, the number of questions used to assess financial literacy levels also varies widely, ranging from 3 to 45 items. Across studies, both performance tests (usually multiple-choice questionnaires) and self-report methods have been employed to measure financial literacy. Performance tests are mainly knowledge-based (e.g., Mandell 2007), while self-reports tend to assess perceived knowledge. More recently, tests have been designed to gauge both objective knowledge and perceived knowledge. In general, considerable progress has been achieved in the design of surveys aimed at identifying individual levels of financial literacy through the effort made by the OECD and its International Network on Financial Education (INFE) to develop and promote a common questionnaire based on the experience of a large number of previous rigorous national and international surveys. 
The OECD/INFE questionnaire and the underlying approach were described in OECD INFE (2011) and discussed in great detail by Kempson (2009). The questionnaire has been used in many countries (Atkinson and Messy 2012) to collect comparable data, including Italy (ABI-PattiChiari 2014).

In contrast, until very recently, the process of data analysis (i.e., of analyzing the information obtained through the questionnaires) has been less explored. Both bivariate and multivariate techniques are usually applied. In general, the responses to the proposed questions are simply summed to generate a score of financial literacy, which typically ranges between zero and the maximum number of correct answers. More recent studies have applied factor analysis (van Rooij et al. 2011). It is widely acknowledged, however, that more work is needed to develop rigorous psychometric analysis (Knoll and Houts 2012). Leveraging the Italian experience in assessing financial literacy at the national level, this paper critically reviews data analysis approaches used to evaluate financial literacy, and proposes a new method to gauge this latent construct in order to obtain a valid and reliable index that is able to capture educational needs in a manner that is as accurate and targeted as possible.

The remainder of this paper is organized as follows. The second section provides an overview of the OECD–INFE international survey on financial literacy and the main results obtained when it was run in Italy in 2013. Section three describes the sample used in the current study and outlines our approach to data analysis. Section four presents the empirical results. The final section summarizes the main results and draws some conclusions.

Background: the OECD–INFE international survey on financial literacy

In 2011 the OECD promoted a financial literacy survey within the framework of the INFE (OECD INFE 2011). The aim of the project was to collect information on the level of financial literacy among member countries, fully consistent with OECD recommendations to guarantee the cross-country comparability of financial literacy indicators. The findings of the first pilot study based on this approach are illustrated in Atkinson and Messy (2012).

Italy joined the survey in 2013 by means of a public–private partnership led by “PattiChiari,” a consortium of Italian banks committed to promoting market transparency and financial education. The questionnaire used to assess the respondents’ level of financial literacy closely followed the OECD INFE guidelines to measure financial literacy across countries. It included both a core questionnaire and a set of supplementary questions aimed at investigating issues such as the ability to properly access and use financial information and to plan for retirement, which were considered to be of interest for a better description of the Italian population’s level of financial literacy.

Following the OECD approach detailed in Atkinson and Messy (2012), the pieces of information provided in the core sections of the questionnaire were used to define three indicators of financial literacy: a financial behavior index (FBI), financial attitude index (FAI), and financial knowledge index (FKI). Furthermore, by exploiting the supplementary questions of the survey, two further indicators were created: a financial familiarity index (FFI; measuring the knowledge and usage of financial products and services), and financial planning index (FPI; focusing on the respondents’ ability to plan for retirement). The exact definitions and statistical descriptions of all indicators are reported in ABI-PattiChiari (2014) and Baglioni et al. (2018).

The FBI was obtained as an additive indicator, ranging from zero to nine, based on the answers to questions focusing on the financial decisions of the respondent, and assigning a score to each answer that increased with the quality of each decision (i.e., “savvy” financial behavior). The questions covered the consistency of the respondent’s purchases with respect to budget constraints, the ability to meet payment deadlines and to maintain an adequate financial budget, the quality of savings and the choices of financial products, and the ability to commit to long-term financial planning.

The FAI provided a measure of the respondents’ propensity to save. The index was built by assessing the individual’s attitude toward saving for the future, as well as the perception of the tradeoff between current and future spending. The index was obtained by means of categorical scores (between 1 and 5) with higher values indicating a higher propensity to save.

Following Lusardi and Mitchell (2011), the FKI was based on the number of correct answers given by the respondent to questions addressing simple financial concepts such as the role of inflation, the ability to compute simple and compound interest rates, the relationship between risk and return, and the notion of portfolio diversification.

The two supplementary indicators provided a closer look into the respondent’s familiarity with financial products and the ability to plan for retirement. The FFI was based on the respondent’s knowledge and usage of fifteen financial instruments, ranging from bank accounts and credit cards to mutual funds, stocks and shares, and insurance products.

The FPI aimed to establish the respondents’ awareness of the necessity to plan savings in advance in order to smooth consumption over the entire life cycle. The index was based on three questions assessing the existence of a financial budget at the household level, familiarity with supplementary pension funds, and familiarity with other forms of long-term savings to support retirement income.

To obtain a comprehensive measurement of financial literacy, the indexes described above were then aggregated, building on the approach detailed by Atkinson and Messy (2012). Indeed, the authors highlight that “financial literacy is a combination of knowledge, attitude and behavior, and so it makes sense to explore these three components in combination […] by adding the scores together” (Atkinson and Messy 2012, p. 39). The financial literacy index was therefore built as a simple average of the three indexes describing an individual’s financial behavior, financial attitude and financial knowledge. Along with this first financial literacy index, a second and more comprehensive financial literacy index was computed. This included all five elementary indexes depicted above; i.e., the three indexes suggested in the OECD guidelines plus the two indexes obtained from the supplementary questions.
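The aggregation step described above can be sketched in a few lines. This is only an illustration: the score ranges for the FBI (0 to 9) and FAI (1 to 5) come from the text, the FKI range (0 to 6) follows from its six items, and the min-max rescaling to [0, 1] before averaging is an assumption made here so that the three indexes are comparable, not a procedure stated by the authors. The names `RANGES`, `rescale` and `literacy_index` are hypothetical.

```python
from statistics import mean

# Score ranges: FBI 0-9 and FAI 1-5 as stated in the text; FKI 0-6 (six items).
RANGES = {"FBI": (0, 9), "FAI": (1, 5), "FKI": (0, 6)}


def rescale(value, low, high):
    """Map a raw index score onto [0, 1].

    ASSUMPTION: rescaling is added here for comparability; the text only
    says that the indexes are combined by a simple average.
    """
    return (value - low) / (high - low)


def literacy_index(scores):
    """Simple average of the (rescaled) elementary indexes for one respondent."""
    return mean(rescale(scores[name], *RANGES[name]) for name in RANGES)


# Hypothetical respondent, not taken from the survey data.
respondent = {"FBI": 6, "FAI": 3, "FKI": 4}
index = literacy_index(respondent)
print(round(index, 3))
```

The more comprehensive five-index version would extend `RANGES` with the FFI and FPI; their score ranges are not given in this section, so they are left out of the sketch.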

Once the elementary and comprehensive indexes had been computed, they were subsequently used to analyze their relationship with the usual set of sociodemographic and economic characteristics of the respondents to obtain a description of the determinants of the level of financial literacy of the Italian population.

Two main methods were applied: ordered probit and ordinary least squares (OLS) regressions, and classification and regression tree (CART) analysis. The former are traditional forms of analysis that estimate the relationships between a dependent variable (which could be a categorical and ordinal variable or a continuous variable) and a set of independent variables. The latter is a non-parametric regression and classification method originally introduced by Breiman et al. (1984). It allows the simultaneous identification of significant covariates impacting on the dependent variable of interest (in our case, the financial literacy indexes) and significant clusters (in our case, of individuals) that exhibit relevant differences with respect to the dependent variable, and homogeneous characteristics with respect to the explanatory variables considered.

In other words, using probit or OLS regressions, the researcher estimates the relationship (significance and relevance of impact) between the level of financial literacy and the sociodemographic explanatory variables under investigation. With CART analysis, the researcher is also able to split the sample into relevant and homogeneous clusters that exhibit differences in their sociodemographic characteristics with respect to their (similar) level of financial literacy.
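To make the contrast with regression concrete, the following minimal sketch shows the core step of a regression tree: searching for the single binary split that most reduces the within-node sum of squared errors. It is a simplified illustration of the CART idea (one split, categorical covariates only), not the procedure or software used in the study; the function names and the toy data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (impurity of a regression node)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, target, features):
    """Return (feature, category, impurity_reduction) for the best binary split.

    Each candidate split sends rows with rows[feature] == category to one
    child node and all remaining rows to the other child node.
    """
    parent = sse([r[target] for r in rows])
    best = (None, None, 0.0)
    for feature in features:
        for category in sorted({r[feature] for r in rows}):
            left = [r[target] for r in rows if r[feature] == category]
            right = [r[target] for r in rows if r[feature] != category]
            if not left or not right:
                continue  # a split must create two non-empty children
            gain = parent - sse(left) - sse(right)
            if gain > best[2]:
                best = (feature, category, gain)
    return best


# Hypothetical toy data: a literacy score and two covariates per respondent.
rows = [
    {"education": "tertiary", "gender": "F", "score": 8},
    {"education": "tertiary", "gender": "M", "score": 7},
    {"education": "secondary", "gender": "F", "score": 4},
    {"education": "secondary", "gender": "M", "score": 5},
    {"education": "primary", "gender": "F", "score": 3},
    {"education": "primary", "gender": "M", "score": 3},
]
feature, category, gain = best_split(rows, "score", ["education", "gender"])
print(feature, category, gain)
```

A full tree applies the same search recursively inside each child node, which is what produces clusters defined by different covariates on different branches.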

These two diverse approaches to data analysis produce different results concerning the main determinants of the aggregate level of financial literacy as well as its elementary factors (knowledge, behavior, attitude).

Tables 5 and 6 and Figs. 3, 4 and 5 in Appendix 1 present these differences. The tables include the results of ordered probit and OLS regressions applied to each elementary index and to the aggregate indicators of financial literacy. Figures 3, 4 and 5 illustrate the results of applying CART analysis. The differences are immediately apparent in two ways. First, the influencing covariates are not necessarily the same. Second, with CART analysis it is possible to identify different clusters of individuals with respect to the same variable that regression analysis highlighted as being relevant in influencing their level of financial literacy.

For instance, with respect to the FFI (Fig. 3), the CART analysis resulted in ten different clusters of individuals, initially identified with respect to their participation in the labor force. In this case, what was then relevant in explaining their familiarity with financial products was their level of income, followed by the area of residence for low-income individuals. For inactive individuals (those in search of a job, as well as retirees and students), the second most important explanatory variable was the area of residence, followed by education for those living in southern Italy and by marital status for those living in northern Italy. By comparison, the probit regression applied to the FFI resulted in a larger number of influencing factors beyond those highlighted by the CART analysis; i.e., gender, age, and direct involvement in the financial decisions of the household. In addition, regression analysis suggested to policy makers that all individuals sharing the same characteristics (for instance, living in southern Italy) are potentially identical targets for the same education program, showing the same deficit in financial literacy. In contrast, CART analysis showed that there are at least three very different clusters among individuals residing in southern Italy (those active in the job market, those not working with low education, and those inactive with a higher level of education), suggesting that individuals not working and with a low level of education comprise the target group most in need of financial education programs.

Similarly, considering the financial knowledge of the respondents (Fig. 4), CART analysis identified gender as the first discriminant factor in the sample of respondents, followed by education and income. As before, the regression analysis identified, with no scale of priority, a larger number of significant covariates, including age and having an active role in financial decision making within the household, in addition to gender, level of income and level of education (as in the CART analysis). In the specific case of the FKI, it is important to underline that targeting women as a single homogenous cluster in need of financial education is, again, a choice with potentially limited effects. Indeed, women with a higher level of educational attainment (university degree), who are married or cohabitating, and who are in the labor force earning medium- to high-level salaries, show, on average, degrees of financial knowledge that are similar to those attained by men. Those who are in greater need of receiving educational support on basic financial issues are women with limited education or those not in the labor force.

Considering the global financial literacy index proposed by the OECD INFE guidelines (see Fig. 5), the OLS regression identified gender, age, involvement in the financial decisions of the household, marital status, level of education, income and area of residence as relevant determinants of the individual’s level of financial literacy. CART analysis restricted the number of relevant covariates, highlighting that the level of education was the first discriminatory variable to define individuals with lower and higher levels of financial literacy. Next, among individuals who had attained tertiary education, the level of income was the second most important discriminatory variable, which helped to identify a specific cluster with a high educational level but a low-income level. Once the level of income was considered, the area of residence was important for medium-income individuals, whereas gender became relevant for high-income individuals. On the other side of the “tree”—individuals with educational attainment up to the secondary level—the geographical area of residence, first, and age, second, were found to be discriminating factors. In summary, a financial education program targeting women as a homogenous (and vulnerable) cluster in need of educational support would not take into account the fact that only highly educated, high-income women are in need of such a specific program, whereas all other women can be addressed by finance programs targeting men with, for example, a low income and lower educational attainments.

So far, by critically reviewing the outcomes of one of the most comprehensive surveys conducted in Italy, we have shown that, according to the diverse statistical methods used to analyze the same financial literacy indexes, different insights about the level of financial literacy of individuals can be revealed, which have important implications for policies (including the design of education programs) aimed at improving this literacy.

A further step toward understanding a latent variable such as financial literacy could come from the adoption of statistical approaches that are able to provide information on the reliability and validity of the measures used. In the next section, we apply a well-known psychometric technique, item response theory (IRT), in an area where such techniques are not often applied; i.e., financial literacy measurement. To the best of our knowledge, only a few studies have explored the viability of these models for assessing financial literacy (Bongini et al. 2012, 2015; Knoll and Houts 2012; Despard and Chowa 2014).



The proposed procedure for data handling was applied to a sample comprising 1247 Italian residents of at least 18 years of age who were reached via CATI. The sample was obtained by appropriate stratification across several dimensions (gender, age, geographical area, and municipality size).

Table 1 shows the distribution of the sample relative to the main sociodemographic variables. The respondents’ average age is approximately 50 years. Almost 60% are married or cohabitants; and 42.3% are employed. The median family income declared by the respondents is approximately 1900 euros. Regarding the educational level of the respondents, approximately 21% received only primary education, 29% secondary education (lower level), 31% secondary education (upper level), and 11% tertiary education. In terms of the geographic composition of the sample, 46.5% of the respondents live in the northern region of the country; approximately 31.01% reside in small municipalities (up to 10,000 inhabitants); and 23.5% in large cities (above 100,000 inhabitants).

Table 1 Sample distribution

Statistical analysis: item response theory

The issue of the most appropriate way to measure literacy has attracted increasing attention in educational research over the last two decades. One important aim in measurement is to build tests with high validity and reliability. The two most popular frameworks in educational measurement are classical test theory (CTT) and item response theory (IRT) (Hambleton and Jones 1993). In general, CTT has dominated the area of standardized testing because of its weak assumptions and its easy interpretation. Indeed, the indexes proposed by the OECD approach and discussed above rely on CTT. Despite these features, CTT has been criticized since the score on a test is not an absolute characteristic of the respondent. In fact, it depends on the content of the test. Moreover, the difficulty of the items may vary depending on the sample of respondents who take a specific test. It is therefore difficult to compare the data of respondents between different tests. For these reasons, IRT was originally developed to overcome the problems with CTT.

The specific feature that makes IRT models increasingly popular in many areas of research is the presence of a metric that considers both the test’s difficulty and the respondent’s specific abilities. IRT aims to measure one or more ordinal/quantitative latent variables on a metric level of measurement, and it is fit to quantify aspects such as ability and personal traits. For these reasons, it has been widely adopted in educational research and psychometrics, where researchers develop and design exams, maintain banks of items for exams, and measure the items’ difficulties for successive versions of exams by the use of IRT (Bond and Fox 2007; Goldstein 1979). For example, in computerized adaptive testing (CAT), the respondents respond to items that are optimally selected to assess their attitude or abilities. The respondents may receive no common items. IRT helps to select the items for a respondent and to measure the scores across different subsets of items. For instance, several aptitude tests need IRT to estimate the abilities of the respondents, such as the Armed Services Vocational Aptitude Battery, the Scholastic Aptitude Test (SAT), and the Graduate Record Examination (GRE). Several individual intelligence tests adopt IRT to manage the tests, such as the Woodcock–Johnson Psycho-Educational Battery, the Differential Ability Scales, and the Stanford-Binet test (Embretson and Reise 2013). Furthermore, several researchers have applied IRT to personality trait measurements (Reise and Waller 1990), as well as to attitude measurements and behavioral ratings (Engelhard and Wilson 1996).

The Program for International Student Assessment (PISA) surveys have adopted IRT models since 2000 (Liu et al. 2008). Moreover, personal properties or item characteristics can be included in IRT models to explain person or item effects, obtaining explanatory item-response models (De Boeck and Wilson 2004). Until very recently, the analysis of financial literacy has relied only on CTT. To the best of our knowledge, only a few studies have used IRT in this domain (Knoll and Houts 2012; Bongini et al. 2012, 2015; Despard and Chowa 2014).

In general, IRT models convert raw scores into linear and reproducible measurements. An IRT model has two properties that must be checked to ensure the model’s validity: unidimensionality and local independence. The unidimensionality property requires that the items of a questionnaire share a common primary construct (i.e., that they all measure financial literacy), while the local independence property requires that the items are statistically independent within each subpopulation of respondents whose members are homogeneous with respect to the latent trait measured (for instance, gender or race).

According to IRT models, an individual’s response to an item is determined by his/her level of knowledge (alternatively, ability or trait) of the latent variable under investigation (e.g., financial literacy), and by the level of difficulty of the given item. IRT models define the score (number of items answered correctly) of a particular respondent as a probability function of his/her ability and item difficulty. One way of expressing IRT models is in terms of the probability that an individual with a particular trait will correctly answer an item that has a particular level of difficulty, as expressed in the following formula:

$$P\left(X_{pik} = 1 \mid \theta_{p}, \beta_{ik}\right) = \frac{e^{\theta_{p} - \beta_{ik}}}{1 + e^{\theta_{p} - \beta_{ik}}} \tag{1}$$

In Formula (1), $X_{pik}$ refers to the response $X$ made by the p-th individual to the i-th item (k refers to the possible level of the i-th item); $\theta_{p}$ refers to the level of financial literacy (ability) of the p-th individual; and $\beta_{ik}$ is the level of financial literacy (difficulty) required to reach level k of the i-th item. In addition, we let $\beta_{i}$ denote the average level of financial literacy for the i-th item.
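As a quick numerical illustration of Formula (1), the sketch below evaluates the response probability for a dichotomous item. The function name `rasch_probability` and the ability and difficulty values are hypothetical and chosen only to show how the probability behaves; they are not estimates from the survey.

```python
import math


def rasch_probability(theta, beta):
    """P(X = 1 | theta, beta) as in Formula (1): exp(theta - beta) / (1 + exp(theta - beta))."""
    return math.exp(theta - beta) / (1.0 + math.exp(theta - beta))


# When ability equals difficulty, a correct answer is as likely as an incorrect one.
p_equal = rasch_probability(0.0, 0.0)

# An ability one unit above the item difficulty raises the probability above 0.5.
p_above = rasch_probability(1.0, 0.0)

# An ability one unit below the item difficulty lowers it symmetrically.
p_below = rasch_probability(-1.0, 0.0)

print(p_equal, round(p_above, 3), round(p_below, 3))
```

Because only the difference between ability and difficulty enters the formula, persons and items can be placed on one common scale, which is what the item-person map exploits.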

A typical representation of IRT is an “item map” where the item difficulties can be placed like points along a line and the person’s ability as a point along the same line. In Fig. 1, we apply this method to the data underlying the FKI described in the previous section.

Fig. 1 Item-person map for the FKI (CART procedure)

To answer our central research question—i.e., whether survey outcomes are sensitive to the data handling method employed—we applied IRT analysis to the survey data, checking the two properties of unidimensionality and local independence. Our aims are, firstly, to test whether the selected items were indeed measuring the same latent construct (i.e., an individual’s financial knowledge, financial attitude, financial behavior, and level of financial literacy); and, secondly, to ensure the local independence property by assessing whether the instrument is measuring the specific object. Our third aim is to analyze the attributes of the items and the respondents on the same scale, via the item-person map, to convey easy-to-read information about the distribution of the respondents and the chosen items.


Table 2 displays the misfit indexes (Wright 1999; Bond and Fox 2007) for our three elementary indexes (financial knowledge, financial attitude, and financial behavior) and for the overall latent variable of financial literacy. As the term implies, a misfit is an observation that does not fit the overall structure of the questionnaire; the misfit index thus indicates how well the data conform to the IRT model parameters. In this work, we used the index based on the average value of the squared residuals (MNSQ). Two types of fit statistics are addressed by the MNSQ: infit (the weighted average of the squared residuals) and outfit (the unweighted average of the squared residuals) (Bond and Fox 2007). Guidelines vary according to test, item and respondent characteristics, but for general purposes, an MNSQ value in the interval [0.5–1.5] means that the item is “productive” for the measurement. In contrast, for values greater than 2.0, the item is considered degrading for the measurement (Linacre 2006). Almost all the items used for computing the three sub-indexes are “productive for measurement”, which means that they do not distort the measure under investigation; i.e., they all measure the same latent construct. In the case of the FBI and the overall financial literacy index, the test confirmed that all but three items are coherent and suited to measuring the specific latent variable. However, these three items did not degrade the measurement system; thus, they can be maintained in the questionnaire. In summary, we can confirm that the items proposed in the OECD/INFE questionnaire are indeed good measures of one latent variable and can be used together.
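The two mean-square statistics can be sketched for a single dichotomous item as follows. This is a minimal illustration using the standard Rasch definitions (outfit as the plain mean of squared standardized residuals, infit as squared residuals weighted by the model variance); it assumes that ability and difficulty estimates are already available and is not the software routine used to produce Table 2. The function names and the toy responses are hypothetical.

```python
import math


def rasch_probability(theta, beta):
    """Model probability of a correct answer for ability theta and difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def item_fit(responses, thetas, beta):
    """Return (infit, outfit) mean-square statistics for one dichotomous item."""
    sq_resid, variances, z_sq = [], [], []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, beta)
        var = p * (1.0 - p)               # model variance of the response
        sq_resid.append((x - p) ** 2)     # squared raw residual
        variances.append(var)
        z_sq.append((x - p) ** 2 / var)   # squared standardized residual
    infit = sum(sq_resid) / sum(variances)  # information-weighted mean square
    outfit = sum(z_sq) / len(z_sq)          # unweighted mean square
    return infit, outfit


def classify_mnsq(mnsq):
    """Apply the thresholds quoted in the text (Linacre 2006)."""
    if 0.5 <= mnsq <= 1.5:
        return "productive"
    if mnsq > 2.0:
        return "degrading"
    return "not classified in the text"


# Hypothetical data: four respondents whose ability equals the item difficulty.
infit, outfit = item_fit([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0)
print(infit, outfit, classify_mnsq(infit))
```

Values close to 1 indicate that the observed residuals are about as large as the model expects, which is why the interval around 1 is read as "productive".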

Table 2 The measure of the items for FBI, FAI, FKI, and financial literacy index

A second relevant property that needs to be assessed is local independence, which ensures that the instrument is measuring the specific object. For this purpose, we apply principal component analysis to the standardized residuals (Smith 2002). Table 3 shows the standardized residual variance decomposition for our set of indexes.

Table 3 Standardized residual variance of the indexes

The raw variances of the empirical model explained by each index closely match the expected raw variances (modeled). Moreover, because the modeled values for the three indexes are in the interval [50–60%], the measurement scales can be considered fairly good. Regarding the overall financial literacy index, which exhibits a value greater than 80%, the measurement scale is considered excellent. Furthermore, for the three indexes, the unexplained variances in the 1st contrast demonstrate that the instrument is good (Fisher 2007) since they fall within the required interval [5–10%]. Given that the value of unexplained variances for the global financial literacy index is less than 3%, the measurement instrument is excellent. In summary, we confirm, on solid statistical grounds, that the items used to build the four indexes meet the required unidimensionality and local independence properties and are appropriate to define the level of financial literacy of an individual.

The third aim of our analysis was to assess the attributes of the items and respondents at the same time via the item-person map. Figure 1 presents the item-person map for the FKI. Maps produced by IRT models can be used to quickly communicate complex information and do so in a presentational format that can be easily understood. Indeed, had we not used an IRT metric, we would have been unable to measure the respondents’ ability and the questions’ difficulty on the very same scale. With raw scores, the difficulty of each financial knowledge item ranges between 0 and 1247 (i.e., the whole sample): 0 applies to a question that no respondent answered correctly, and 1247 applies to a question that the whole sample answered correctly. Conversely, a person’s ability is measured on a scale ranging from 0 to 6: 0 corresponds to a person who was unable to answer any item correctly, and 6 applies to a person who answered the whole set of questions correctly. Therefore, the two metrics are not directly comparable.

The IRT item-person map shown in Fig. 1 orders the respondents by their level of financial knowledge (left-hand side) and the multiple-choice questions by their difficulty (right-hand side). The questions at the top of the scale were more difficult to answer; the items become easier further down the scale. The individuals with the least financial ability (at the bottom of the scale) had difficulty even with the easiest concepts (e.g., the relationship between risk and return), whereas the individuals with the most financial literacy (at the top of the scale) had no difficulty performing any of the activities implied by the questions. In particular, the respondents on the upper left-hand side can be said to be “better” or “smarter” than the items on the lower right-hand side, which means that these easier items were not difficult enough to challenge highly proficient individuals. On the other hand, the items on the upper right-hand side outsmarted the individuals on the lower left-hand side, which implies that these difficult items were beyond the level of ability of those respondents. Items 4 and 6 were the easiest and the most difficult to answer, respectively. This conclusion is also supported by the frequency distribution of the answers given to the six items concerning the construct of financial knowledge. Table 4 lists the percentage of correct answers for the six items.

Table 4 Percentage of correct answers for the six items of the FKI

The relevant contribution of IRT lies in the fact that the map directly reproduces the frequency distribution of the respondents with respect to their financial knowledge (ability) and the position of the items with respect to their difficulty in the financial knowledge construct. Difficulty and ability share the same unit of measurement. For instance, item 4, with a difficulty equal to −0.99, was correctly answered by 85.3% of the respondents. This is equivalent to saying that 85.3% is the proportion of respondents who have an ability greater than −0.99.
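Why a shared unit matters can be sketched with the dichotomous Rasch (one-parameter logistic) model. This is an assumption made for illustration, since the passage above does not spell out the exact model specification. Only the difficulty of item 4 comes from the text; the ability values are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """P(correct) under a dichotomous Rasch model.

    theta: person ability in logits; b: item difficulty in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


ITEM_4_DIFFICULTY = -0.99  # difficulty reported in the text for item 4

# Hypothetical ability values, chosen only for illustration.
for theta in (-2.0, -0.99, 0.0, 1.5):
    p = rasch_probability(theta, ITEM_4_DIFFICULTY)
    print(f"theta={theta:+.2f}  P(correct on item 4)={p:.3f}")
```

Under this model, a respondent whose ability equals the item difficulty has a 50% chance of answering correctly, and the chance rises as ability exceeds difficulty. The comparison is meaningful only because both quantities are expressed in logits.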

Having confirmed that the items were correctly chosen, and having investigated the relationship between difficulty and ability, a researcher can then draw on a number of statistical methods to further investigate the socioeconomic characteristics of the respondents in relation to the IRT measure. For instance, it might be useful to evaluate whether a specific subgroup (defined by age, gender, or education) is disadvantaged or advantaged with respect to single items (numeracy problems, behavioral aspects, or attitude issues) and to the whole issue under investigation (financial literacy). Differential item functioning (DIF) is a method that can uncover such differences, as explored by Bongini et al. (2012, 2015), who found a gender gap among university students on a single item, but not on the construct of financial literacy as a whole. Alternatively, one can include IRT measures and the socioeconomic characteristics of the respondents in a latent regression model, which provides a powerful framework to detect and analyze group differences while considering the characteristics of both items and individuals simultaneously (De Boeck and Wilson 2004).
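DIF can be checked in several ways. The sketch below uses the Mantel-Haenszel common odds ratio, one widely used option, purely as an illustration; it is not necessarily the procedure adopted by Bongini et al. (2012, 2015). The group labels, function names, and counts are all hypothetical.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item across total-score strata.

    records: iterable of (group, total_score, correct) tuples, where group is
    "reference" or "focal" and correct is 0 or 1 for the studied item.
    A value near 1 suggests no DIF; values above 1 favor the reference group.
    """
    # Per stratum: [ref correct, ref incorrect, focal correct, focal incorrect]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for group, total_score, correct in records:
        offset = 0 if group == "reference" else 2
        strata[total_score][offset + (0 if correct else 1)] += 1

    numerator = denominator = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined for these counts")
    return numerator / denominator


def make_records(group, total_score, n_correct, n_incorrect):
    """Expand cell counts into individual (group, score, correct) records."""
    return ([(group, total_score, 1)] * n_correct
            + [(group, total_score, 0)] * n_incorrect)


# Invented counts for two ability strata (total scores 2 and 4).
toy = (
    make_records("reference", 2, 3, 1) + make_records("focal", 2, 1, 3)
    + make_records("reference", 4, 4, 1) + make_records("focal", 4, 2, 2)
)
print(round(mantel_haenszel_odds_ratio(toy), 2))
```

Respondents are matched on total score before the groups are compared, so a ratio far from 1 points to an item that behaves differently for equally able members of the two groups, rather than to a difference in overall ability.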

Finally, CART analysis can be applied to the IRT measure. In this study, we applied CART analysis to the overall financial literacy index to compare the results when applied to the same construct (financial literacy) but measured through two different methods, CTT (Fig. 5) and IRT (Fig. 2). It is immediately apparent that the same approach applied to two different ways of constructing the same latent variable delivers different results with respect to relevant clusters differing in their level of financial literacy. In other words, depending on how we handle financial literacy data, through CTT or IRT models, we end up with dissimilar outcomes about who needs more financial education.
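How the choice of score can change a CART segmentation can be sketched with the core step of a regression tree: finding the binary split that minimizes the summed squared error of the two child nodes. The sketch performs a single split only, without recursion or pruning. The respondents, covariates, and both score vectors are invented and have no connection to the survey results.

```python
def best_binary_split(rows, target):
    """Return (error, feature, threshold) for the best single split.

    rows: list of dicts mapping feature name -> numeric value
    target: list of numeric outcomes, same length as rows
    """
    def sse(values):
        # Sum of squared deviations from the node mean.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for feature in rows[0]:
        levels = sorted({row[feature] for row in rows})
        # Candidate thresholds lie halfway between consecutive observed values.
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, target) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, target) if row[feature] > threshold]
            error = sse(left) + sse(right)
            if best is None or error < best[0]:
                best = (error, feature, threshold)
    return best


# Hypothetical respondents described by two covariates.
rows = [
    {"female": 0, "education_years": 8},
    {"female": 0, "education_years": 13},
    {"female": 0, "education_years": 18},
    {"female": 1, "education_years": 8},
    {"female": 1, "education_years": 13},
    {"female": 1, "education_years": 18},
]
raw_score = [4, 4, 5, 2, 2, 3]                    # invented raw (CTT-style) scores
logit_measure = [-1.0, 0.1, 1.2, -1.2, 0.0, 1.0]  # invented IRT-style measures

print("raw score split:    ", best_binary_split(rows, raw_score)[1:])
print("logit measure split:", best_binary_split(rows, logit_measure)[1:])
```

With these toy numbers the first split falls on a different covariate depending on which score is used as the outcome, which is the mechanism behind the diverging segmentations.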

Fig. 2 Segmentation of the Italian population with respect to the aggregate indicator of financial literacy, as defined by the IRT model (CART procedure)


The present paper aimed to provide insight in order to improve the procedures for analyzing data that describe a latent variable such as financial literacy, leveraging the recent Italian national survey based on the approach proposed by the OECD through its INFE. As underlined in the introduction, assessing the baseline level of financial literacy represents an indispensable prerequisite to the design of effective education programs; that is, interventions that successfully address specific target groups with particular educational needs. The evidence provided in this paper shows, firstly, that different methods of analysis applied to the same measure of financial literacy deliver different results; and, secondly, that the same method of analysis applied to different measures of financial literacy also delivers different results. Consequently, we can state that the method of data analysis is crucial for the subsequent step of devising successful education programs in the field of financial literacy among different target groups. In particular, our findings show that adopting a specific method of data analysis delivers results that would not be obtained by adopting an alternative method, thus indicating that different approaches cannot be considered interchangeable. These findings suggest further improvements to the process of financial literacy evaluation which we summarize here.

First, CTT has long been shown to be outdated for defining people’s level of financial literacy. A basic test should be integrated into more sophisticated models that consider both the difficulty of the items and the ability of the respondents. From this perspective, IRT helps to define, for every possible level of item difficulty, a weighted score that corresponds to the respondents’ level of ability, opinion, or feeling. Moreover, when the assumptions of IRT hold, its estimates of the item parameters are independent of the sample. A respondent should show the same ability regardless of the set of items adopted; conversely, a given item should have the same difficulty regardless of the respondents.

Second, applying alternative and more sophisticated methods of data analysis to financial literacy data enables researchers to target specific population groups. Sharing the same personal characteristics (e.g., gender) does not necessarily mean sharing the same financial literacy needs: the results of our CART analysis suggest that women should not be considered a homogeneous group in terms of their level of financial literacy. Consequently, policy makers cannot treat “women” as a single, uniform target of the same education program; rather, they should differentiate and develop specific programs depending on the cluster to which women belong (e.g., educational level, residential area). In this regard, with the goal of targeting people (especially the more financially vulnerable ones) in an ever more detailed and precise way, the data analysis methods used in this study might also make it possible to include other individual non-cognitive characteristics such as personality traits, which have recently been shown to be a fundamental aspect of financial behavior. For example, research has investigated conscientiousness (Roa et al. 2017) and impulsivity (Baldi et al. 2013; Iannello et al. 2015; Bongini et al. 2015), as well as the social roles of respondents, such as homemakers vs. financial workers (Croson and Gneezy 2009; Dwivedi et al. 2015).

Many of the studies carried out in Italy to date have focused on financial knowledge (the cognitive aspect of financial literacy). However, future research should perhaps carry out more in-depth analysis of soft skills, such as the confidence to be proactive and a willingness to take investment risks, rather than content knowledge. For example, in a meta-analysis carried out by Fernandes et al. (2014), measured knowledge of financial facts had a weak relationship to financial behavior in econometric studies that controlled for omitted variable bias. As pointed out by many authors (e.g., Worthington 2006; Nicolini 2017), financial literacy should be tested against an individual’s needs and the context in which they live, not against a large set of available financial products and services, since consumers will never need or use most of these products and services. The assumption here is that an individual’s financial literacy should not be measured in “a linear sense” but, rather, with respect to the set of knowledge that is necessary to deal with the specific financial needs, desires, expectations and fears of specific groups of consumers. From this perspective, “financial literacy becomes a multidimensional construct, with an individual being knowledgeable in certain domains (e.g., investing) while showing a deep lack of knowledge in others (e.g., borrowing)” (Nicolini 2017, p. 35). However, a lack of knowledge in a specific area is not considered a very critical gap if the individual is not called upon to make financial decisions in that area.

In line with this view, we are aware that the questionnaire used in our study involves some aspects that warrant critique. The questionnaire was a version of the OECD/INFE questionnaire tailored to a national survey and, as pointed out by Robson and Splinter (2015), one problem with national surveys is that there is no clear way to assess individual responses and micro-level changes in behavior over time. For studies that use the questionnaire to provide reliable information on what people do in the financial domain, another limitation is that the questionnaire tests behavior with self-assessed questions that deal with financial problems and tasks which may not be realistic for every respondent. In fact, it focuses primarily on one aspect of an individual’s financial capability (see footnote 5), attaching less importance to the context. Further research should take into account social and contextual issues, as suggested by some institutions that promote financial inclusion and financial wellbeing (e.g., CYFI 2012; CFPB 2015), and by authors who are critical of mainstream approaches to financial literacy and work with people living on low incomes (e.g., Landvogt 2006; Rinaldi 2016).

To conclude, our research suggests that financial literacy research should be open to new and alternative approaches to measurement, while being aware that different data analysis methods can produce different results. Therefore, different types of analysis are called for. Additionally, researchers should be clear about why one method is to be preferred to another, and why one set of results is more useful than another. This sort of information would be useful to policy makers who are keen to design more efficient and more effective financial education programs for target groups.


  1. See Chapter 7, ABI-PattiChiari (2014) and Baglioni et al. (2018).

  2. Chapter 3, ABI-PattiChiari (2014).

  3. See Chapter 7, ABI-PattiChiari (2014) and Baglioni et al. (2018) for the whole set of specifications estimated. For the sake of brevity, we report the two most comprehensive models, including all possible covariates and their meaningful interactions. In particular, these specifications control for whether the respondent plays an active role in making the household’s financial decisions and check for the robustness of the gender effect (including a dummy for the marital status of the respondent and its interaction with gender).

  4. The items have ordered categories 1,2…k,…K, and K could vary among items.

  5. A broader concept that can be defined as the internal capacity to act in one’s best financial interest, given socioeconomic environmental conditions—The World Bank 2013.



Abbreviations

CAT: computerized adaptive testing
CART: classification and regression tree
CTT: classical test theory
CFPB: Consumer Financial Protection Bureau
CYFI: Child and Youth Finance International
DIF: differential item functioning
FAI: financial attitude index
FBI: financial behavior index
FFI: financial familiarity index
FKI: financial knowledge index
FPI: financial planning index
GRE: Graduate Record Examination
INFE: International Network on Financial Education
IRT: item response theory
MNSQ: average value of the squared residuals (mean square)
OLS: ordinary least squares
PISA: Programme for International Student Assessment
SAT: Scholastic Aptitude Test


References

  • ABI-PattiChiari (2014) Le competenze economico-finanziarie degli italiani. Bancaria Editrice, Roma

  • Atkinson A, Messy F (2012) Measuring financial literacy: results of the OECD/International Network on Financial Education (INFE) Pilot Study. OECD Working Papers on Finance, Insurance and Private Pensions

  • Baglioni A, Colombo L, Piccirilli G (2018) On the anatomy of financial literacy in Italy. Econ Notes 47(2–3):245–304

  • Baldi PL, Iannello P, Riva S, Antonietti A (2013) Cognitive reflection and socially biased decisions. Stud Psychol 55:265–271

  • Bond TG, Fox CM (2007) Applying the Rasch model: fundamental measurement in the human sciences. Psychology Press, London

  • Bongini P, Trivellato P, Zenga M (2012) Measuring financial literacy among students: an application of Rasch analysis. Elect J Appl Stat Anal 5(3):425–430

  • Bongini P, Trivellato P, Zenga M (2015) Business students and financial literacy: when will the gender gap fade away? J Financ Manag Markets Instit 3(1):13–19

  • Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. Chapman & Hall/CRC, Boca Raton

  • CFPB (Consumer Financial Protection Bureau) (2015) Financial well-being: the goal of financial education. Report

  • Coppola M, Langley G, Sabatini M, Wolf R (2017) When will the penny drop? Money, financial literacy and risk in the digital age. Allianz International Pension Papers 1, Munich, Germany

  • Croson R, Gneezy U (2009) Gender differences in preferences. J Econ Lit 47(2):1–27

  • Child and Youth Finance International (2012) Children and youth as economic citizens: review of research on financial capability, financial inclusion, and financial education. Research Working Group Report. CYFI, Amsterdam

  • De Boeck P, Wilson M (2004) Explanatory item response models: a generalized linear and non linear approach. Springer, New York

  • Despard MR, Chowa GS (2014) Testing a measurement model of financial capability among youth in Ghana. J Consum Aff 48(2):301–322

  • Dwivedi M, Purohit H, Mehta D (2015) Improving financial literacy among women: the role of universities. Economic Challenger, October–December 2015. Retrieved from

  • Embretson SE, Reise SP (2013) Item response theory. Psychology Press, Hove

  • Engelhard G, Wilson M (1996) Objective measurement: theory into practice, vol 3. Ablex, Norwood

  • Fernandes D, Lynch JG Jr, Netemeyer RG (2014) Financial literacy, financial education and downstream financial behaviors. Manag Sci 60(8):1861–1883

  • Fisher WP (2007) Rating scale instrument quality criteria. Rasch Meas Trans 21:1095

  • Goldstein H (1979) Consequences of using the Rasch model for educational assessment. Br Educ Res J 5(2):211–220

  • Grifoni A, Messy F (2012) Current status of national strategies for financial education: a comparative analysis and relevant practices. OECD Working Papers on Finance, Insurance and Private Pensions

  • Hambleton RK, Jones RW (1993) Comparison of classical test theory and item response theory and their applications to test development. Educ Meas Issue Pract 12(3):38–47

  • Hilgert M, Hogarth J, Beverley S (2003) Household financial management: the connection between knowledge and behavior. Technical Report 309–322. Fed Res Bull

  • Huston SJ (2010) Measuring financial literacy. J Consum Aff 44:296–316

  • Iannello P, Biassoni F, Nelli B, Zugno E, Colombo B (2015) The influence of menstrual cycle and impulsivity on risk-taking behaviour. Neuropsychol Trends 17:47–52

  • Kempson E (2009) Framework for the development of financial literacy baseline surveys. OECD Working Papers on Finance, Insurance and Private Pensions, No. 1. OECD Publishing, Paris

  • Knoll MAZ, Houts CR (2012) The financial knowledge scale: an application of item response theory to the assessment of financial literacy. J Consum Aff 46(3):381–410

  • Landvogt K (2006) Critical financial capability. In: Paper presented at the financial literacy, banking and identity conference, RMIT University, 25–26. Accessed 20 Sept 2015

  • Linacre JM (2006) Data variance explained by measures. Rasch Meas Trans 20:1045–1047

  • Linciano N, Gentile M, Soccorso P (2017) Report on financial investments of Italian households. Commissione Nazionale per le Società e la Borsa (Consob)

  • Liu OL, Wilson M, Paek I (2008) A multidimensional Rasch analysis of gender differences in PISA mathematics. J App Meas 9(1):18–35

  • Lusardi A, Mitchell OS (2011) Financial literacy around the world: an overview. J Pension Econ Financ 10(4):497–508

  • Lusardi A, Mitchell OS (2014) The economic importance of financial literacy: theory and evidence. J Econ Lit 52(1):5–44

  • Mandell L (2007) Financial literacy of high school students. In: Xiao JJ (ed) Handbook of consumer finance research. Springer, New York, NY, pp 163–183

  • Moore D (2003) Survey of financial literacy in Washington State: knowledge, behavior, attitudes, and experiences. Technical Report n. 03–39, Social and Economic Sciences Research Center, Washington State University

  • Nicolini G (2017) The assessment methodologies of financial literacy. In: Linciano N, Soccorso P (eds) Challenges in ensuring financial competencies. Essays on how to measure financial knowledge, target beneficiaries and deliver educational programme, Quaderni di finanza CONSOB, n.84, ottobre. Tiburtini s.r.l., Roma, pp 34–43

  • OECD INFE (2011) Measuring financial literacy: questionnaire and guidance notes for conducting an internationally comparable survey of financial literacy. OECD, Paris

  • Reise SP, Waller NG (1990) Fitting the two-parameter model to personality data. App Psychol Meas 14:45–58

  • Remund DL (2010) Financial literacy explication: the case for a clearer definition in an increasingly complex economy. J Consum Aff 44(2):276–295

  • Rinaldi E (2016) The relationship between financial education and society: a sociological perspective. Italian J Sociol Educ 8(3):126–148

  • Roa MJ, Garron I, Barboza J (2017) The importance of numerical abilities, conscientiousness and financial literacy in financial decision-making: an empirical analysis in the Andean Region. In: Paper presented at the 3rd Cherry Blossom financial education institute, April 6–7. Washington, DC

  • Robson J, Splinter J (2015) A new (and better) way to measure individual financial capability. Research report, prepared under contract to Vancity Credit Union. Carleton University, Ottawa

  • Smith EV Jr (2002) Detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals. J App Meas 3(2):205–231

  • The European House-Ambrosetti, Consorzio PattiChiari (2007) L’Educazione Finanziaria in Italia. Riflessioni e proposte per migliorare l’educazione finanziaria del Paese

  • The World Bank (2013) Making sense of financial capability surveys around the world. A review of existing financial capability and literacy measurement instruments. Working paper, World Bank, Washington, DC

  • Van Rooij M, Lusardi A, Alessie R (2011) Financial literacy and stock market participation. J Financ Econ 101(2):449–472

  • Worthington A (2006) Predicting financial literacy in Australia. Financ Serv Rev 15:59–79

  • Wright BD (1999) Fundamental measurement for psychology. In: Embretson SE, Hershberger SL (eds) The new rules of measurement: what every psychologist and educator should know. Erlbaum, Mahwah, pp 65–104

Authors’ contributions

Conceived and designed the research: PB, MZ, ER. Formal analysis: PB, MZ. Analyzed the data: PB, MZ. Methodology: PB, MZ. Data interpretation: PB, PI, MZ, ER. Writing—original draft preparation: PB, PI, MZ, ER. Writing—review & editing PB, AA. Supervision: PB, AA. All authors read and approved the final manuscript.


Not applicable.

Competing interests

The authors declare that they have no competing interests.

Availability of data and materials

Data will not be shared, but the dataset supporting the conclusions of this article can be requested at Pattichiari (now Fondazione per l’Educazione Finanziaria e al Risparmio -

Consent for publication

Not applicable.

Ethics approval and consent to participate

The authors did not collect the data themselves; they used the dataset “Rilevazione ICF: Indice di Cultura Economico-Finanziaria” (™Consorzio PattiChiari 2012).


Funding

Consorzio Pattichiari (now Fondazione per l’Educazione Finanziaria e al Risparmio, FEDUF) funded the collection of the data. No funds were received for the design, analysis and interpretation of the data of the present paper.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author information


Corresponding author

Correspondence to Paola Iannello.

Appendix 1

See Tables 5, 6 and Figs. 3, 4, 5.

Table 5 Results of the ordered probit regressions applied to each elementary i.
Table 6 Results of an OLS regression applied to the aggregate indicators of financial literacy.
Fig. 3 Segmentation of the Italian population with respect to the FFI (CART procedure)

Fig. 4 Segmentation of the Italian population with respect to the FKI (CART procedure)

Fig. 5 Segmentation of the Italian population with respect to the aggregate indicator of financial literacy, as defined by the OECD (CART procedure)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


Cite this article

Bongini, P., Iannello, P., Rinaldi, E.E. et al. The challenge of assessing financial literacy: alternative data analysis methods within the Italian context. Empirical Res Voc Ed Train 10, 12 (2018).
