PRISMA Guidelines for Systematic Reviews
- PRISMA Guidelines are a standardized framework defining clear objectives, systematic literature searches, and explicit inclusion/exclusion criteria.
- They streamline methodology with structured screening, data extraction, and visual reporting using flow diagrams and checklists.
- Recent updates integrate AI techniques and domain-specific adaptations, enhancing scalability and reproducibility in systematic reviews.
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines constitute a methodological standard developed to improve the transparency, rigor, and reproducibility of systematic reviews and meta-analyses. PRISMA is widely adopted across biomedical, engineering, social, and computational sciences to structure literature searches, screening, inclusion/exclusion criteria, data extraction, and synthesis procedures. The approach is recognized for its explicit reporting standards, standardized flow diagrams, and comprehensive checklists that facilitate critical appraisal and replication by other researchers.
1. PRISMA Guideline Structure and Evolution
PRISMA originated to address shortcomings in prior reporting systems for systematic reviews and meta-analyses by providing detailed instructions for every stage of the review process. The core PRISMA protocol consists of three stages: planning, review, and reporting. The planning phase covers formulating clear objectives, selecting relevant databases, defining search keywords, and establishing inclusion and exclusion criteria. The review phase covers systematic literature identification, filtering, data extraction, and synthesis. The reporting phase entails explicit visual and textual documentation, most notably flow diagrams and structured checklists that enumerate the sequence of paper selection, duplicate removal, eligibility assessment, and final paper inclusion (see the schematics in Nor et al., 2021, Alsofiani, 27 May 2024, Shahandashti et al., 2023). The PRISMA 2020 update augmented prior versions with new standards for transparency, iterative screening, multiple reviewer protocols, and digital object identifiers.
2. Literature Identification and Screening Methodologies
Implementing PRISMA requires exhaustive literature searches across multiple digital libraries. For example, studies have identified papers from ScienceDirect, IEEE Xplore, SpringerLink, ACM Digital Library, Scopus, Web of Science, PubMed, Engineering Village, and Google Scholar, each queried with search syntax tailored by the review team (Nor et al., 2021, Mauro et al., 2022, Shahandashti et al., 2023, Alsofiani, 27 May 2024, Tsirka et al., 16 Jul 2024). Keyword selection is adapted to the target domain, using Boolean strings and wildcards across title, abstract, and keyword fields. Screening involves multi-round duplicate removal (typically via tools like Zotero or EndNote), sequential title-abstract-keyword reviews, and periodic checks for context relevance during full-text assessment. PRISMA requires explicit inclusion and exclusion criteria, with examples including restrictions to peer-reviewed articles, language filters, exclusion of specific domains, and time-window constraints (e.g., only articles published between 2012 and 2023) (Nor et al., 2021, Shahandashti et al., 2023, Alsofiani, 27 May 2024).
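The search and screening steps described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any cited review: the `Record` fields, the function names, the sample records, and the DOIs are all hypothetical, and the duplicate rule (DOI first, normalized title otherwise) is one common heuristic rather than what Zotero or EndNote do internally. The default time window of 2012 to 2023 follows the example given in the text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical bibliographic record; the field names are illustrative."""
    title: str
    year: int
    language: str
    peer_reviewed: bool
    doi: str = ""


def boolean_query(concept_groups):
    """Join synonyms with OR inside a group, then join the groups with AND."""
    groups = ["(" + " OR ".join(f'"{term}"' for term in terms) + ")"
              for terms in concept_groups]
    return " AND ".join(groups)


def dedup_key(record):
    """Use the DOI when present; otherwise fall back to a normalized title."""
    if record.doi:
        return "doi:" + record.doi.strip().lower()
    return "title:" + re.sub(r"[^a-z0-9]+", " ", record.title.lower()).strip()


def remove_duplicates(records):
    """Keep the first occurrence of each key, preserving input order."""
    seen, unique = set(), []
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def meets_criteria(record, start_year=2012, end_year=2023, language="en"):
    """Example inclusion rule: peer-reviewed, in language, inside the window."""
    return (record.peer_reviewed
            and record.language == language
            and start_year <= record.year <= end_year)


# Made-up sample data, for illustration only.
records = [
    Record("Explainable AI for PHM", 2021, "en", True, "10.1000/abc"),
    Record("Explainable AI for PHM", 2021, "en", True, "10.1000/ABC"),  # same DOI
    Record("Touch in social robots", 2010, "en", True),      # outside the window
    Record("BIM adoption barriers", 2022, "en", False),      # not peer-reviewed
    Record("Assurance weakeners: a mapping", 2023, "en", True),
]

query = boolean_query([["explainable AI", "XAI"], ["prognostics"]])
unique = remove_duplicates(records)
included = [record for record in unique if meets_criteria(record)]
```

A record that carries a DOI in one database export and none in another would not be matched by this rule, so a manual pass over near-identical titles is still advisable.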
Systematic mapping and snowballing techniques supplement database queries to capture studies otherwise omitted by syntactic limitations, as demonstrated in system assurance mappings (Shahandashti et al., 2023). The rigorous multi-phase approach minimizes bias and ensures reproducibility; dual independent reviewers and random sampling verification checks are standard protocol components.
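When two reviewers screen the same records independently, their agreement is often summarized with Cohen's kappa before conflicts are resolved. The cited reviews do not state which agreement statistic they used, so the choice of kappa here is an assumption, and the function name and sample decisions below are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up screening decisions for eight records.
reviewer_a = ["include", "include", "exclude", "exclude",
              "exclude", "include", "exclude", "exclude"]
reviewer_b = ["include", "exclude", "exclude", "exclude",
              "exclude", "include", "exclude", "include"]

kappa = cohens_kappa(reviewer_a, reviewer_b)
# Indices where the reviewers disagree and a consensus step is needed.
conflicts = [i for i, (a, b) in enumerate(zip(reviewer_a, reviewer_b)) if a != b]
```

In this toy example the reviewers agree on six of eight records, but because both exclude most records the chance-corrected agreement is noticeably lower than the raw 75%.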
3. Data Extraction, Synthesis, and Visual Reporting
The PRISMA protocol mandates structured extraction of bibliometric and analytic variables, commonly managed using spreadsheet or workspace software (e.g., Microsoft Excel, Notion) or analysis libraries and tools (pandas, Matplotlib, VOSviewer) (Nor et al., 2021, Shahandashti et al., 2023). Extracted data often include publication year, research focus, methodological attributes, performance ratings, evaluation metrics, uncertainty quantification, and annotation of paper types (real-case vs. simulation). The extracted data are synthesized either quantitatively (e.g., frequency of reported barriers, distribution of paper types) or qualitatively (e.g., content analysis, inductive coding procedures).
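A quantitative synthesis of this kind can be as simple as a frequency table over the extraction sheet. The sketch below uses only the Python standard library; the rows, the field names, and the `frequency_table` helper are hypothetical and stand in for whatever extraction form a review team actually uses.

```python
from collections import Counter

# Made-up extraction rows; a real sheet would be loaded from a file.
extracted = [
    {"study_id": "S1", "year": 2019, "paper_type": "real-case"},
    {"study_id": "S2", "year": 2020, "paper_type": "simulation"},
    {"study_id": "S3", "year": 2020, "paper_type": "real-case"},
    {"study_id": "S4", "year": 2021, "paper_type": "real-case"},
    {"study_id": "S5", "year": 2021, "paper_type": "simulation"},
]


def frequency_table(rows, field):
    """Map each value of `field` to (count, percentage of all rows)."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {value: (count, round(100 * count / total, 1))
            for value, count in counts.most_common()}


by_type = frequency_table(extracted, "paper_type")
by_year = frequency_table(extracted, "year")
```

The same helper works for any categorical field, so the distribution of paper types and the publication-year histogram come from one code path.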
Transparent synthesis is visually anchored by standardized PRISMA flow diagrams. These diagrams illustrate the progression of literature identification: record counts before and after duplicate removal, sequential exclusions, and final paper inclusion. A typical LaTeX array summarizes the process compactly:
\[
\begin{array}{rcl}
\text{Records Identified} & \rightarrow & 3048 \\
\text{Records After Duplicate Removal} & \rightarrow & 2760 \\
\text{Records Screened} & \rightarrow & 2760 \\
\text{Records Excluded (title/abstract)} & \rightarrow & 2690 \\
\text{Full-text Articles Assessed} & \rightarrow & 70 \\
\text{Final Studies Included} & \rightarrow & 35
\end{array}
\]
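The counts in a flow diagram should reconcile from one stage to the next, and a short check makes that explicit. The sketch below reads the 2,760 figure as the number of records remaining after duplicate removal, which is an interpretation of the array rather than something the source states outright; the `FlowCounts` class and `derive_and_check` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowCounts:
    """Stage counts of a PRISMA-style flow diagram (illustrative fields)."""
    identified: int
    after_deduplication: int
    excluded_on_title_abstract: int
    full_text_assessed: int
    included: int


def derive_and_check(flow):
    """Derive the implied counts and list any stage-to-stage inconsistencies."""
    duplicates_removed = flow.identified - flow.after_deduplication
    expected_full_text = flow.after_deduplication - flow.excluded_on_title_abstract
    problems = []
    if duplicates_removed < 0:
        problems.append("more records after deduplication than identified")
    if expected_full_text != flow.full_text_assessed:
        problems.append("screened minus excluded does not equal full-text count")
    if flow.included > flow.full_text_assessed:
        problems.append("more studies included than assessed in full text")
    return {
        "duplicates_removed": duplicates_removed,
        "excluded_at_full_text": flow.full_text_assessed - flow.included,
        "problems": problems,
    }


flow = FlowCounts(identified=3048, after_deduplication=2760,
                  excluded_on_title_abstract=2690,
                  full_text_assessed=70, included=35)
report = derive_and_check(flow)
```

Under that reading the example reconciles: 288 duplicates are removed, 2,760 minus 2,690 leaves the 70 full-text articles, and 35 of those are excluded at the full-text stage.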
Additional LaTeX figures may display thematic clustering of findings (Alsofiani, 27 May 2024) and mapping studies of categorized barriers or assurance weakeners (Shahandashti et al., 2023).
4. PRISMA Extensions, AI Integration, and Domain Adaptation
Recent developments expand the PRISMA protocol's scope to accommodate domain-specific systematic reviews and AI-enhanced approaches. The PRISMA-DFLLM framework (Susnjak, 2023) introduces reporting requirements for reviews conducted using finetuned LLMs. This extension maintains core PRISMA elements but supplements them with new items for the documentation of dataset preprocessing (Item 16), LLM finetuning technical details (Item 17), model evaluation metrics (Item 18), and legal/ethical compliance information (Item 31). Finetuned LLMs automate data extraction, classification, summarization, and synthesis, enabling incremental "living reviews" that track ongoing literature emergence.
Technical implementation leverages parameter-efficient finetuning strategies (LoRA, QLoRA), hyperparameter tuning, dropout, weight decay, and early stopping; model evaluation incorporates both quantitative (accuracy, F1, ROUGE) and qualitative analyses. The extended checklist ensures reproducibility and explicit reporting of both AI-enabled and classical review stages.
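The quantitative metrics named above can be computed directly once a model's screening decisions are compared with human reference labels. The following is a minimal sketch for the binary include/exclude case only; the function name and the sample labels are hypothetical, and ROUGE is omitted because it applies to generated text rather than to classification.

```python
def screening_metrics(gold, predicted, positive="include"):
    """Accuracy, precision, recall, and F1 for binary screening decisions."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal length")
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    correct = sum(g == p for g, p in pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: human decisions versus a model's decisions on eight records.
gold = ["include", "include", "include", "exclude",
        "exclude", "exclude", "exclude", "include"]
predicted = ["include", "exclude", "include", "exclude",
             "include", "exclude", "exclude", "include"]

metrics = screening_metrics(gold, predicted)
```

For screening tasks recall on the "include" class usually matters most, since a missed relevant study cannot be recovered at later stages, whereas a false inclusion is caught during full-text assessment.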
5. Application Contexts and Domain-Specific Synthesis
PRISMA-guided reviews span a wide range of disciplinary domains. Notable applications include:
- XAI in Prognostic and Health Management (PHM): Systematic synthesis reveals that explainability mechanisms do not degrade diagnostic or anomaly detection accuracy in industrial PHM. Human evaluator involvement and robust uncertainty quantification are underrepresented, indicating gaps for future research (Nor et al., 2021).
- Population Category Guidelines in Genetics: Meta-analyses uncover broad consensus over transparency, disclaimer usage, and diverse sampling, but highlight deep disagreement about the definitions and research use of "race," "ethnicity," and "ancestry." Thematic clustering reveals that while "race" attracts substantial critique, "ancestry" is rarely problematized, though statistical estimation refinements are advocated (Mauro et al., 2022).
- Assurance Weakeners in Engineering: Mapping studies systematically categorize assurance deficits, logical fallacies, and uncertainty, with iterative selection ensuring robust taxonomy formation. The SACM specification by OMG emerges as optimal for categorizing assurance arguments and weakeners (Shahandashti et al., 2023).
- BIM Adoption Barriers in Infrastructure: Reviews systematically identify and cluster 74 distinct barriers to Building Information Modeling, reducing them to seven principal categories covering resistance to change, training gaps, cost perceptions, lack of standards, absent mandates, initiative scarcity, and unclear business value (Alsofiani, 27 May 2024).
- Touch in Human–Social Robot Interaction: The application of PRISMA delivers multifaceted synthesis covering sensor modalities, material choices, interaction types, and haptic emotional quantification. The lack of standardized protocols for hardware and measurement, and insufficient integration of multimodal sensing, are prominent research gaps (Tsirka et al., 16 Jul 2024).
6. Advantages, Challenges, and Future Directions
PRISMA’s rigor ensures transparent, replicable systematic reviews and meta-analyses, supporting robust evidence synthesis, inter-domain comparability, and knowledge accumulation. Recent AI-enabled frameworks (e.g., PRISMA-DFLLM) magnify review scalability and efficiency, enable living reviews, and facilitate collaborative dissemination of finetuned models.
Nevertheless, challenges persist. These include reconciling comprehensive inclusion with precision in literature searches, balancing automation and manual oversight, mitigating subjectivity during multi-phase selection, managing legal and ethical constraints (copyright, privacy), and addressing gaps in standardized evaluation metrics and uncertainty management (Nor et al., 2021, Susnjak, 2023, Shahandashti et al., 2023, Alsofiani, 27 May 2024, Tsirka et al., 16 Jul 2024).
A plausible implication is that future research should prioritize the development of domain-adapted protocols, improved evaluation frameworks, and ethically aligned AI integration, particularly in areas lacking consensus or standardization. Automation tools for extracting heterogeneous content, multi-modal evidence synthesis, real-time updating, and advanced finetuning approaches are promising directions outlined by current literature (Susnjak, 2023).
7. Controversies and Consensus
PRISMA's standardized approach generates areas of both convergence and contention in systematic review practice. Reviews of population descriptors in genetics illustrate widespread consensus in seven thematic areas and fundamental discord regarding population category definitions (Mauro et al., 2022). The ongoing dispute between treating "race" as a social construct versus a biological category directly impacts methodological recommendations and analytic interpretations.
Across technical disciplines, difficulties in formalizing measurement protocols, integrating human oversight, and harmonizing evaluation standards represent persistent controversies. The multiplicity of opinions and identified research gaps underscore the necessity for continual protocol refinement and community engagement.
In summary, the PRISMA guidelines and their recent extensions constitute the gold standard for systematic review methodology, encompassing explicit protocols for literature search, screening, extraction, synthesis, and reporting. Domain-specific adaptations and AI integration are enhancing review scalability, reproducibility, and methodological rigor, with continuing evolution informed by cross-disciplinary challenges and controversies.