AI Assessment Scale (AIAS)

Updated 8 March 2026

The AI Assessment Scale (AIAS) is a structured framework that defines graduated levels of GenAI integration in educational assessment while emphasizing academic integrity and digital literacy.
It delineates five ordinal levels—from zero AI use to full AI integration—that map permitted GenAI activities to learning outcomes and student responsibilities.
Empirical validations report improved digital literacy, enhanced pass rates, and reduced misconduct, with adaptations extending to EFL contexts and broader AI literacy benchmarks.

The AI Assessment Scale (AIAS) denotes a set of distinct but convergent frameworks and tools for the systematic evaluation and integration of artificial intelligence—particularly Generative AI (GenAI)—into educational assessment and broader AI literacy contexts. Originating in educational theory and subsequently adapted for measurement and benchmarking, the AIAS offers ordinal or multi-dimensional metrics for mapping the degree, nature, and quality of AI involvement, providing guidance for policy, pedagogy, and empirical evaluation (Perkins et al., 2023, Kılınç, 2024, Furze et al., 2024, Perkins et al., 2024, Roe et al., 1 Jan 2025, Roe et al., 2024, Liu et al., 2015, Liu et al., 2017, Markus et al., 17 Mar 2025, Carolus et al., 2023).

1. Conceptual Origins and Rationale

The initial motivation for the AI Assessment Scale emerged from the abrupt proliferation of GenAI models such as ChatGPT within higher education and professional training. Institutions faced dilemmas involving academic integrity, skill formation, digital access, and unclear policies as students began leveraging AI for content generation, editing, and research tasks (Perkins et al., 2023). Simple binary approaches (allow/ban) proved insufficient to address nuanced pedagogical and ethical concerns, necessitating a scale that would clarify and standardize the boundaries of permissible AI use aligned with targeted learning outcomes.

Key underpinning principles of the AIAS include:

Constructive alignment: permitted AI use maps directly to the intended learning and assessment outcomes.
Academic integrity: honesty, transparency, fairness, and responsibility.
Digital and data literacy: treated as a core competency.
Equity of access: tool standardization and support for under-resourced learners (Perkins et al., 2023, Roe et al., 1 Jan 2025, Roe et al., 2024).

2. Formal Structure and Levels of the Educational AIAS

The canonical AI Assessment Scale is an ordinal, scaffolded framework typically comprising five levels of AI integration, with each level precisely delineating permitted GenAI activities and expected student controls. Levels are interpreted as mappings from assessment design (task objectives) to permitted AI use (Perkins et al., 2023, Furze et al., 2024, Perkins et al., 2024, Roe et al., 1 Jan 2025).

Level	Short Name	Permitted AI Usage	Student Responsibility
1	No AI	Zero GenAI usage	Exclusive unaided performance
2	AI-Assisted Planning	Idea generation, outlines, research leads	AI material not present in final work
3	AI-Assisted Editing	Grammar, style, clarity refinement	Submit annotated edits/original draft
4	AI Task Completion + Review	AI-generated content for specific prompts	Critical commentary, explicit citation
5	Full AI Integration	Unrestricted, co-creative AI use	Vouch for integrity of final product

Formally, given an assessment design function $f$, the scale level $S \in \{1,2,3,4,5\}$ is assigned as $S := f(\text{AssessmentDesign})$, where $f$ is informed by the required learning outcome (e.g., critical thinking $\rightarrow$ Level 4; language fluency $\rightarrow$ Level 3) (Perkins et al., 2023, Furze et al., 2024, Perkins et al., 2024). The canonical version defines no closed-form scoring or weighting function.
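
The mapping $S := f(\text{AssessmentDesign})$ can be illustrated with a small lookup. This is a minimal sketch, not part of the published scale: the names `AIASLevel`, `OUTCOME_TO_LEVEL`, and `assign_level` are hypothetical, and only the two outcome-to-level pairs quoted above come from the text. In practice $f$ is a human design judgment rather than a table.

```python
from enum import IntEnum


class AIASLevel(IntEnum):
    """The five ordinal AIAS levels, numbered as in the table above."""
    NO_AI = 1
    AI_ASSISTED_PLANNING = 2
    AI_ASSISTED_EDITING = 3
    AI_TASK_COMPLETION_REVIEW = 4
    FULL_AI_INTEGRATION = 5


# Hypothetical lookup standing in for f. The first two entries are the
# examples given in the text; the third is an illustrative assumption.
OUTCOME_TO_LEVEL = {
    "critical thinking": AIASLevel.AI_TASK_COMPLETION_REVIEW,
    "language fluency": AIASLevel.AI_ASSISTED_EDITING,
    "unaided recall": AIASLevel.NO_AI,
}


def assign_level(learning_outcome: str) -> AIASLevel:
    """Toy version of f: map a required learning outcome to a scale level."""
    key = learning_outcome.strip().lower()
    if key not in OUTCOME_TO_LEVEL:
        raise KeyError(f"No level configured for outcome: {learning_outcome!r}")
    return OUTCOME_TO_LEVEL[key]
```

Using an `IntEnum` keeps the levels ordinal, so comparisons such as `AIASLevel.NO_AI < AIASLevel.FULL_AI_INTEGRATION` behave as expected, while an unknown outcome raises an error instead of silently defaulting to a level.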

3. Domain-Specific Adaptations and Extensions

Practical implementations have resulted in discipline-specific and population-specific variants:

EAP-AIAS / EFL Adaptations: English for Academic Purposes (EAP) and English as a Foreign Language (EFL) settings adapt the AIAS to language learning, emphasizing transparency, formative feedback, and sequenced integration (Levels 2–4). Here, GenAI supports planning, drafting, or revision, with explicit metalinguistic reflection and critical AI-literacy components (Roe et al., 2024, Roe et al., 1 Jan 2025).
CAIAF: The Comprehensive AI Assessment Framework (CAIAF) extends AIAS to six levels, incorporates stringent ethical requirements, real-time interaction, personalized assistance features, and a color-gradient interface, supporting fine-grained control and explicit compliance checklists (Kılınç, 2024).
Implementation Workflows: Institutional guidance includes decision trees, sample rubrics, policy documentation, and iterative staff/student training for context-specific calibration. Empirical pilots have reported reduced academic misconduct and improved student attainment following AIAS adoption (Furze et al., 2024).

4. Methodological and Empirical Validation

Empirical studies provide evidence for the reliability and validity of AIAS applications. Methods include:

Pre/post surveys (Likert-type) to measure confidence and digital/AI literacy (Perkins et al., 2023).
Mixed-methods analysis: rubric-based scoring, inter-rater reliability (ICC up to 0.82), and Cronbach’s $\alpha$ for level descriptors (e.g., $\alpha=0.88$) (Roe et al., 1 Jan 2025).
Statistical analyses of institutional impact: significant decreases in AI-related misconduct and measurable increases in attainment and pass rates (5.9% and 33.3%, respectively, in one pilot) (Furze et al., 2024).
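
For readers unfamiliar with the internal-consistency statistic mentioned above, Cronbach's $\alpha$ for $k$ items is $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)$, where $\sigma_i^2$ is the variance of item $i$ and $\sigma_T^2$ the variance of respondents' total scores. The sketch below applies the standard formula; the function name and the sample data are illustrative assumptions and are not taken from the cited studies.

```python
from statistics import variance


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix.

    item_scores[r][i] is respondent r's rating on item i.
    """
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Made-up ratings: three respondents, two items.
sample = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(sample)
```

For this toy matrix each item has variance 1 and the totals (3, 3, 6) have variance 3, so $\alpha = 2 \cdot (1 - 2/3) = 2/3$.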

The AIAS does not prescribe numerically weighted scoring for composite assessment, but it provides a sample formula for rubric normalization:

$\text{Score}_\text{total} = \frac{\sum_i w_i \cdot r_i }{ \sum_i w_i }$

where $w_i$ is the weight for criterion $i$ and $r_i$ is the rating, which may include AI engagement rubrics (Perkins et al., 2023).
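
As a worked instance of the normalization formula, the sketch below computes the weighted mean for a small rubric. The function name `total_score`, the criterion names, and all weights and ratings are hypothetical values chosen for illustration.

```python
def total_score(ratings: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of criterion ratings: sum(w_i * r_i) / sum(w_i)."""
    if ratings.keys() != weights.keys():
        raise ValueError("ratings and weights must cover the same criteria")
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[c] * ratings[c] for c in ratings) / weight_sum


# Hypothetical rubric: names, weights, and ratings are illustrative only.
weights = {"argument": 3.0, "evidence": 2.0, "ai_engagement": 1.0}
ratings = {"argument": 4.0, "evidence": 3.0, "ai_engagement": 5.0}
print(total_score(ratings, weights))
```

Here the numerator is $3 \cdot 4 + 2 \cdot 3 + 1 \cdot 5 = 23$ and the weights sum to 6, giving $23/6 \approx 3.83$ on the same scale as the individual ratings.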

5. Ethical, Equity, and Implementation Considerations

The scale explicitly addresses ethical and inclusivity requirements:

Tool standardization and provision of institutional access to minimize disparities.
Requirement for transparent citation of AI input, especially in critical reflection and evaluation stages.
Safeguards against fault modes: e.g., supervised assessments, clear guidelines for misconduct, instruction on AI bias/hallucination detection (Perkins et al., 2023, Kılınç, 2024, Roe et al., 2024).
Continuous working groups to iteratively review scale effectiveness and evolve descriptors in line with technological development (Perkins et al., 2023).

In the CAIAF, five explicit ethical principles are encoded—transparency, equity, pedagogical alignment, accountability, and data privacy—with all assignments required to report compliance on a checklist (Kılınç, 2024).

6. Variants of the AI Assessment Scale in AI Benchmarking and Literacy

AIAS has also been independently formalized in the context of general AI system benchmarking:

Functional Model: A standard intelligent system is specified as a tuple $M = \{ K, K_s, K_M, K_N, Q, Q_I, Q_O, I, O, C, N \}$, representing components for knowledge acquisition, storage, innovation, and feedback (Liu et al., 2015, Liu et al., 2017).
AI IQ / Intelligence Grade: AI IQ (absolute and deviation) is scored via weighted subtests for acquisition, mastery, innovation, and feedback, permitting direct comparison to human baselines (Liu et al., 2015, Liu et al., 2017).
Autonomous AI Assessment Scale: Extends the ordinal ladder to a multi-axis, operational metric that rates autonomous agents along ten normalized axes (e.g., autonomy, generality, planning, memory, self-revision) and aggregates the ratings via a weighted geometric mean, with discrete gates for automation, self-improvement, and AGI thresholds (Chojecki, 17 Nov 2025).
AI Literacy Scales (AICOS, MAILS): Psychometrically validated instruments such as the AI Competency Objective Scale (AICOS) and Meta AI Literacy Scale (MAILS) offer multidimensional, IRT-calibrated indices for measuring AI literacy across cognitive, ethical, and creativity subdomains (Markus et al., 17 Mar 2025, Carolus et al., 2023).
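
The weighted geometric mean used to aggregate axis ratings can be sketched as follows. This is an illustration under stated assumptions rather than the published scoring procedure: the function name is hypothetical, only the five axes named above are included, the equal weights and the scores are invented, and the discrete gates are omitted because their thresholds are not given here.

```python
import math


def weighted_geometric_mean(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Aggregate normalized axis scores as exp(sum(w * ln(s)) / sum(w))."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same axes")
    if any(not 0.0 <= s <= 1.0 for s in scores.values()):
        raise ValueError("scores must be normalized to [0, 1]")
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    # A zero on any axis drives the whole aggregate to zero.
    if any(s == 0.0 for s in scores.values()):
        return 0.0
    weight_sum = sum(weights.values())
    log_sum = sum(weights[a] * math.log(scores[a]) for a in scores)
    return math.exp(log_sum / weight_sum)


# Illustrative values only: five of the ten axes, equal weights (assumed).
axes = ["autonomy", "generality", "planning", "memory", "self-revision"]
weights = {a: 1.0 for a in axes}
scores = {"autonomy": 0.6, "generality": 0.4, "planning": 0.7,
          "memory": 0.5, "self-revision": 0.2}
aggregate = weighted_geometric_mean(scores, weights)
```

Unlike an arithmetic mean, the geometric mean penalizes imbalance: a single weak axis pulls the aggregate down sharply, and a zero on any axis yields an aggregate of zero.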

7. Limitations and Future Directions

Key constraints and directions noted across the literature include:

Context sensitivity: Granularity, level descriptors, and rubrics may require discipline- and age-specific adaptation (Perkins et al., 2023, Perkins et al., 2024).
Technology evolution: Ongoing review is mandatory as AI capabilities and modalities expand (multimodal, real-time, etc.).
Equity challenges and access gaps persist, especially in remote or digitally underserved populations (Perkins et al., 2024).
Empirical validation remains an open research area; large-scale, mixed-methods, and cross-context studies are needed to establish generalizability, learning impact, and long-term effects (Perkins et al., 2023, Perkins et al., 2024).
Integration with institutional policy and global best practices (e.g., COPE, UNESCO) is necessary to maintain alignment with evolving academic integrity and ethical norms (Roe et al., 2024).

Recommended initiatives include working groups, data-driven policy reviews, open-access toolkits, and progressive, empirically informed refinement of both the scale and its supporting resources (Perkins et al., 2023, Perkins et al., 2024, Roe et al., 1 Jan 2025).

Key references:

(Perkins et al., 2023) The AI Assessment Scale (AIAS): A Framework for Ethical Integration of Generative AI in Educational Assessment
(Kılınç, 2024) Comprehensive AI Assessment Framework: Enhancing Educational Evaluation with Ethical AI Integration
(Furze et al., 2024) The AI Assessment Scale (AIAS) in action: A pilot implementation of GenAI supported assessment
(Perkins et al., 2024) The AI Assessment Scale Revisited: A Framework for Educational Assessment
(Roe et al., 1 Jan 2025) From Assessment to Practice: Implementing the AIAS Framework in EFL Teaching and Learning
(Roe et al., 2024) The EAP-AIAS: Adapting the AI Assessment Scale for English for Academic Purposes
(Liu et al., 2015) A Study on Artificial Intelligence IQ and Standard Intelligent Model
(Liu et al., 2017) Intelligence Quotient and Intelligence Grade of Artificial Intelligence
(Markus et al., 17 Mar 2025) Objective Measurement of AI Literacy: Development and Validation of the AI Competency Objective Scale (AICOS)
(Carolus et al., 2023) MAILS -- Meta AI Literacy Scale: Development and Testing of an AI Literacy Questionnaire