Colorado Learning Attitudes about Science Survey (CLASS)

Updated 2 April 2026

The Colorado Learning Attitudes about Science Survey (CLASS) is a formative assessment tool that probes students’ attitudes and epistemologies in physics.
It employs 42 Likert-scale items organized into eight categories, capturing dimensions from personal interest to conceptual understanding.
Analytical metrics such as percent favorable shifts and effect sizes are used to gauge instructional impact and address epistemological disparities.

The Colorado Learning Attitudes about Science Survey (CLASS) is a research-validated, formative assessment instrument designed to probe and characterize students’ beliefs, attitudes, and epistemologies regarding physics and the process of learning physics. Developed at the University of Colorado Boulder, CLASS distinguishes itself by focusing on multi-dimensional constructs associated with “expert-like thinking,” utilizing cognitive-science methods alongside statistical analyses. It has become a central tool in discipline-based physics education research for evaluating instructional impact, supporting curriculum development, and diagnosing epistemological barriers to learning.

1. Instrument Structure and Theoretical Basis

CLASS comprises 42 Likert-scale statements that elicit student agreement or disagreement regarding aspects of physics and its learning process. Each statement is anchored to a five-point Likert scale (Strongly Agree to Strongly Disagree) and is scored against consensus expert responses established through interviews and validation with physics faculty. Statements are grouped into eight psychometrically and cognitively validated categories:

Category	Example Construct
Personal Interest	Value placed on learning physics
Real-World Connection	Relevance of physics to everyday experience
Problem Solving – General	Systematic approaches to solving problems
Problem Solving – Confidence	Self-efficacy in physics problem solving
Problem Solving – Sophistication	Depth and adaptability in problem solving
Sense-Making / Effort	Emphasis on understanding and persistence
Conceptual Understanding	Value of conceptual knowledge
Applied Conceptual Understanding	Connecting equations to underlying concepts

The underlying construct is not a unitary trait but a profile of “expert-like” thinking dimensions, informed by systematic expert–novice interviews and validated through both qualitative and quantitative means (Wieman et al., 2015).

2. Development and Validation Methodology

The development of CLASS was grounded in a mixed-methods strategy, emphasizing cognitive validation and iterative refinement:

Student and Faculty Interviews: Item selection, phrasing, and category construction were driven by extensive interviews to capture genuine issues of student concern and to differentiate consistent expert responses. Each item’s interpretation by students was confirmed through think-aloud protocols.
Reduced-Basis Factor Analysis: Instead of imposing a single-factor, orthogonal structure, the original validation used factor analyses on subsets of items with direct oblimin (oblique) rotation, recognizing that dimensions of student thinking are inherently correlated. Eight robust categories resulted, each with well-defined loadings and supported by a composite “robustness rating” (Wieman et al., 2015).
Iterative Revision and Ongoing Validation: Multiple drafts, further guided by interviews and faculty consensus, iteratively converged on a set of items resistant to misinterpretation across diverse populations. Psychometric properties, such as internal consistency (Cronbach’s α ≈ 0.7–0.8), were repeatedly confirmed (Madsen et al., 2014, Suárez et al., 2021).
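The internal-consistency figure quoted above can be illustrated with a minimal sketch of Cronbach’s α, computed as $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$ for $k$ items, item variances $s_i^2$, and total-score variance $s_T^2$. The function name `cronbach_alpha` and the input layout (one list of item scores per respondent) are assumptions made for illustration; the cited studies may have used different software or conventions.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one per respondent, each a list of k numeric
    item scores (for example 1 = favorable, 0 = not favorable).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer the same k items")

    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Because the categories are treated as correlated but distinct dimensions, such a coefficient would more naturally be computed per category than across all items at once.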

Critically, the design rejects standard psychometric unidimensionality in favor of a multidimensional, formative framework that aligns with National Research Council recommendations (Wieman et al., 2015).

3. Scoring, Statistical Techniques, and Interpretation

CLASS responses are dichotomized as “expert-like” (favorable) or “novice-like” (unfavorable) per the expert key. Scoring algorithms include:

Percent Favorable: $(\# \text{ expert-like responses} / \text{total items}) \times 100\%$
Shift Assessment: The primary measure for instructional impact is the group-level shift in percent favorable: $\Delta = M_\text{post} - M_\text{pre}$.
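Normalized Gain: A gain metric analogous to the one used with conceptual inventories such as the FCI is sometimes reported: $g = (\text{post} - \text{pre}) / (100 - \text{pre})$, where pre and post are percent-favorable scores (Mason et al., 2016, Robinson et al., 2020).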
Cohen’s d: Effect size for pre–post shifts or between-group comparisons (Brewe et al., 2013, Robinson et al., 2020).
Inferential Statistics: t-tests, ANOVA, and hierarchical linear modeling are applied to assess significance and model population structure, e.g., for analysis of intersectional “educational debts” (Nissen et al., 2021, Suarez et al., 2022).
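The descriptive metrics listed above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the official CLASS scoring procedure: the numeric coding (5 = Strongly Agree, 1 = Strongly Disagree), the treatment of a neutral response as not favorable, the `"agree"`/`"disagree"` key format, and all function names are hypothetical. Items absent from the key are simply left unscored.

```python
from math import sqrt
from statistics import mean, stdev

# Assumed coding: 5 = Strongly Agree ... 1 = Strongly Disagree.
AGREE = {4, 5}
DISAGREE = {1, 2}


def is_favorable(response, expert):
    """True when a Likert response falls on the expert side of the scale."""
    if expert == "agree":
        return response in AGREE
    if expert == "disagree":
        return response in DISAGREE
    raise ValueError(f"unknown expert key entry: {expert!r}")


def percent_favorable(responses, key):
    """Percent of keyed items answered in the expert-like direction.

    responses: dict item_id -> Likert value (1..5)
    key:       dict item_id -> "agree" or "disagree"
    """
    scored = [item for item in key if item in responses]
    if not scored:
        raise ValueError("no keyed items were answered")
    favorable = sum(is_favorable(responses[i], key[i]) for i in scored)
    return 100.0 * favorable / len(scored)


def shift(pre_scores, post_scores):
    """Group-level shift: mean post minus mean pre percent favorable."""
    return mean(post_scores) - mean(pre_scores)


def normalized_gain(pre, post):
    """g = (post - pre) / (100 - pre), with pre and post in percent."""
    if pre >= 100:
        raise ValueError("normalized gain is undefined when pre is 100")
    return (post - pre) / (100.0 - pre)


def cohens_d(group_a, group_b):
    """Cohen's d (group_b minus group_a) using the pooled sample SD."""
    na, nb = len(group_a), len(group_b)
    pooled = sqrt(((na - 1) * stdev(group_a) ** 2
                   + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2))
    return (mean(group_b) - mean(group_a)) / pooled
```

For a pre–post comparison, `cohens_d(pre_scores, post_scores)` treats the two administrations as independent groups; a paired-samples effect size is a reasonable alternative when the same students are matched across both.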

CLASS supports both overall and category-level reporting, offering nuanced profiles of students’ epistemological orientations and instructional effects.
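Category-level reporting amounts to computing percent favorable over each category’s subset of items. The sketch below assumes responses have already been dichotomized into favorable/not-favorable flags; the function name `category_profile` and the item identifiers in the usage example are hypothetical placeholders, not the published item-to-category assignment.

```python
def category_profile(favorable, categories):
    """Percent favorable per category for one respondent.

    favorable:  dict item_id -> bool (True = expert-like response)
    categories: dict category name -> list of item_ids in that category
    Returns a dict category name -> percent favorable, or None when the
    respondent answered none of that category's items.
    """
    profile = {}
    for name, items in categories.items():
        answered = [i for i in items if i in favorable]
        if answered:
            hits = sum(favorable[i] for i in answered)
            profile[name] = 100.0 * hits / len(answered)
        else:
            profile[name] = None
    return profile


# Usage with made-up item identifiers (illustration only).
example_flags = {"a1": True, "a2": False, "b1": True, "b2": True}
example_categories = {
    "Personal Interest": ["a1", "a2"],
    "Real-World Connection": ["b1", "b2"],
}
example_profile = category_profile(example_flags, example_categories)
```

Returning `None` rather than zero for an unanswered category keeps missing data from being mistaken for a novice-like score.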

4. Empirical Findings and Applications

Meta-analyses and multi-institutional studies confirm the following trends:

Typical Instructional Impacts: In large, traditional or reformed introductory physics courses with no explicit epistemological focus, CLASS shifts are negative ($\Delta \approx -3.7\%$) or neutral (Madsen et al., 2014, Nissen et al., 2021). In contrast, small classes, elementary/non-science majors, or courses with explicit model-building/attitude interventions can yield positive shifts ($\Delta \approx +8\%$ to $+9\%$) (Madsen et al., 2014, Brewe et al., 2013).
Instructional Design: Positive (expert-like) shifts are associated with course designs that explicitly foreground epistemology, modeling, and reflective practices (e.g., Modeling Instruction, Physics by Inquiry) (Brewe et al., 2013). Factors such as small class size, active engagement, and frequent formative feedback appear important. Ongoing instructor professional support enhances results.
Subpopulation Variability: Physics majors generally enter with more expert-like attitudes, and such attitudes remain stable barring targeted interventions. Life-science majors, health-science students, and future teachers tend to exhibit more novice-like attitudes, particularly in problem-solving sophistication and applied conceptual understanding (Suarez et al., 2022, Suárez et al., 2021, Mason et al., 2016).
Demographic Disparities: Large-scale hierarchical linear models reveal persistent and statistically robust gaps in expert-like attitudes across intersections of race and gender. White men in calculus-based courses hold the most expert-like attitudes, while women of color have the lowest scores. Instruction does not erase these “educational debts” and may exacerbate them (Nissen et al., 2021).

5. Correlation with Conceptual Gains

Meta-analytic and course-level studies have identified small to moderate positive correlations ($r \approx 0.2$ to $0.4$) between incoming CLASS beliefs and subsequent gains on conceptual tests such as the FCI or FMCE. The strength of these correlations varies by gender; for women, post-instruction CLASS scores and conceptual gains are more closely linked ($r = 0.45$, $p < 0.005$) than for men, suggesting belief development may be a stronger lever for supporting women's conceptual mastery (Robinson et al., 2020, Madsen et al., 2014).

A plausible implication is that cultivating expert-like epistemologies may be supportive, though not strictly causal, for robust conceptual learning, and may inform strategies for narrowing performance gaps (Robinson et al., 2020).
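As a rough illustration of how a coefficient like those reported in this section could be obtained, the sketch below computes a Pearson correlation between two paired lists, for example each student’s pre-instruction percent favorable and that student’s conceptual gain. The choice of the Pearson coefficient and the function name `pearson_r` are assumptions; the cited studies may have used other estimators or software.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")

    return sxy / sqrt(sxx * syy)
```

Computing the coefficient separately for each subgroup, as in the gender comparison above, only requires calling the function on each subgroup’s paired data.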

6. Extensions, Derivatives, and International Usage

The CLASS family has been extended to new domains. E-CLASS, for example, adapts the cognitive–statistical framework to experimental physics and lab courses, expanding what is probed to include perceptions of grade incentives and practice frequency as well as personal and expert beliefs (Zwickl et al., 2012). International studies, e.g., in Uruguay and Thailand, confirm CLASS’s cross-national utility and highlight areas of epistemological vulnerability for both students and pre-service teachers, most notably in problem-solving sophistication (Suárez et al., 2021, Suarez et al., 2022).

CLASS has been translated into eight languages and used to benchmark physics attitudes globally, further cementing its status as a standard for epistemological assessment in physics education research.

7. Methodological Controversies and Best Practices

Debate persists over the application of standard psychometric analyses. Attempts to reduce the instrument to a single (or a few) factors via confirmatory factor analysis and rigid criteria (e.g., eliminating items with low communality or cross-loadings) substantially narrow CLASS’s construct coverage and utility. Such reductions threaten construct validity by masking key expert-novice distinctions, conflating categories with distinct instructional sensitivity, and diluting formative value (Wieman et al., 2015).

Best practice, as outlined by Wieman and Adams and the NRC, dictates that:

Interview-validated items should be preserved, even when orthogonal or cross-loading items make the factor structure psychometrically messy.
Reduced-basis, oblique factor analysis is superior for this multi-dimensional, formative assessment.
Ongoing validity checks (including interviews) must accompany implementation in novel contexts.
Reporting should favor category-level insights and instructional guidance over high-stakes ranking or instrument simplification (Wieman et al., 2015).

These methodological guardrails ensure CLASS remains a sensitive, reliable instrument for probing the richness of students’ epistemologies and the efficacy of instruction in catalyzing expert-like thinking in physics.