Modeling Public Perceptions of Science in Media (2506.16622v1)

Published 19 Jun 2025 in cs.CL, cs.AI, cs.CY, and cs.HC

Abstract: Effectively engaging the public with science is vital for fostering trust and understanding in our scientific community. Yet, with an ever-growing volume of information, science communicators struggle to anticipate how audiences will perceive and interact with scientific news. In this paper, we introduce a computational framework that models public perception across twelve dimensions, such as newsworthiness, importance, and surprisingness. Using this framework, we create a large-scale science news perception dataset with 10,489 annotations from 2,101 participants from diverse US and UK populations, providing valuable insights into public responses to scientific information across domains. We further develop NLP models that predict public perception scores with strong performance. Leveraging the dataset and model, we examine public perception of science from two perspectives: (1) Perception as an outcome: What factors affect the public perception of scientific information? (2) Perception as a predictor: Can we use the estimated perceptions to predict public engagement with science? We find that individuals' frequency of science news consumption is the primary driver of perception, whereas demographic factors exert minimal influence. More importantly, through a large-scale analysis and a carefully designed natural experiment on Reddit, we demonstrate that the estimated public perception of scientific information has direct connections with the final engagement pattern. Posts with more positive perception scores receive significantly more comments and upvotes, a pattern that holds both across different scientific information and for the same science framed in different ways. Overall, this research underscores the importance of nuanced perception modeling in science communication, offering new pathways to predict public interest and engagement with scientific content.

Summary

  • The paper introduces a multidimensional model combining 12 distinct perception factors to quantify audience responses to science news.
  • It leverages over 10,000 crowd-sourced labels and a RoBERTa-Large multi-task regression model to predict public perception scores, which are in turn used to anticipate engagement.
  • Findings reveal that individual science news consumption, rather than demographics, is the key driver of both perception and online engagement.

Modeling Public Perceptions of Science in Media

The paper “Modeling Public Perceptions of Science in Media” (2506.16622) presents a computational framework for systematically modeling and predicting how the general public perceives scientific information disseminated via media channels. The authors introduce a multidimensional approach that extends far beyond conventional sentiment or newsworthiness analysis, assembling a 12-factor framework to capture the nuanced spectrum of audience reactions to science news stories.

Multidimensional Framework for Perception

A central contribution of the work is the explicit operationalization of twelve distinct perception dimensions: Newsworthiness, Understandability, Expertise, Interestingness, Fun, Importance, Benefit, Sharing, Reading Willingness, Exaggeration, Surprisingness, and Controversy. The framework draws from and systematizes concepts in communication studies, sociology of science, and journalism, providing a granular instrument to evaluate public response to scientific media content.
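
For concreteness, the sketch below encodes the twelve dimensions and a single annotation as plain Python structures; the field names, the 1-5 rating scale, and the validation logic are illustrative assumptions rather than the paper's exact survey instrument.

```python
from dataclasses import dataclass

# The twelve perception dimensions named in the paper.
PERCEPTION_DIMENSIONS = [
    "newsworthiness", "understandability", "expertise", "interestingness",
    "fun", "importance", "benefit", "sharing", "reading_willingness",
    "exaggeration", "surprisingness", "controversy",
]

@dataclass
class PerceptionAnnotation:
    """One participant's ratings of one article on all twelve dimensions.

    The 1-5 rating scale is assumed here for illustration; the paper's exact
    response format may differ.
    """
    article_id: str
    participant_id: str
    scores: dict  # dimension name -> rating, e.g. 1.0-5.0

    def __post_init__(self):
        missing = set(PERCEPTION_DIMENSIONS) - set(self.scores)
        if missing:
            raise ValueError(f"missing dimensions: {sorted(missing)}")
```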

The dimensions were annotated by over 2,100 US and UK participants on 1,500+ science news articles, resulting in more than 10,000 perception labels. The diverse and representative sample supports granular population-level inferences.

Modeling and Predicting Perceptions

To automate perception estimation, the authors fine-tune a RoBERTa-Large-based multi-task regression model, predicting the twelve perceptual scores from raw article text. The model is trained and evaluated on the new dataset, enabling scalable perception analysis at the article level. Multi-task learning is employed to leverage dependencies between perception dimensions and improve predictive power.
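
A minimal sketch of such a multi-task regression model, built on the Hugging Face transformers and PyTorch APIs, is shown below; the pooling strategy, loss, example text, and hyperparameters are assumptions for illustration, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

N_DIMENSIONS = 12  # the twelve perception dimensions

class PerceptionRegressor(nn.Module):
    """RoBERTa-Large encoder with a shared 12-output regression head."""

    def __init__(self, encoder_name: str = "roberta-large"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.head = nn.Linear(self.encoder.config.hidden_size, N_DIMENSIONS)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # <s> token as the article representation
        return self.head(cls)              # shape (batch, 12): one score per dimension

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = PerceptionRegressor()

# Scoring one invented article snippet; training would minimise a mean-squared
# error summed over all twelve outputs so correlated dimensions share the encoder.
batch = tokenizer(
    ["A new trial suggests a common supplement modestly lowers blood pressure."],
    truncation=True, padding=True, return_tensors="pt",
)
with torch.no_grad():
    scores = model(batch["input_ids"], batch["attention_mask"])
```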

This modeling pipeline affords two analytical approaches: (1) examining determinants of perception (perception as outcome), and (2) using predicted perceptions to anticipate audience engagement (perception as predictor).
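
One way to read the two approaches is as two regression specifications, sketched below on synthetic data with statsmodels; the variable names, model forms, and data are illustrative assumptions, not the paper's actual analyses.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data for illustration only.
df = pd.DataFrame({
    "importance": rng.uniform(1, 5, n),        # perception score (annotated or predicted)
    "news_frequency": rng.integers(0, 7, n),   # days per week consuming science news
    "age": rng.integers(18, 80, n),
    "comments": rng.poisson(20, n),            # engagement outcome
})

# (1) Perception as outcome: which reader factors explain perception scores?
outcome_model = smf.ols("importance ~ news_frequency + age", data=df).fit()

# (2) Perception as predictor: do estimated perceptions anticipate engagement?
predictor_model = smf.ols("np.log1p(comments) ~ importance", data=df).fit()

print(outcome_model.params)
print(predictor_model.params)
```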

Key Empirical Findings

The regression analyses report several non-obvious, robust results:

  • Individual frequency of science news consumption is the strongest predictor of positive perception across almost all dimensions. For instance, those who consume science news daily rate the same articles as significantly more important, interesting, and shareable than those who rarely consume such news.
  • Demographic and political factors exert minimal influence on perception when controlling for content and domain. Notably, gender, age, education, and political orientation explain little variation in perception scores, a claim that diverges from much of the existing literature focusing on broad public attitudes.
  • Content domain and media format are substantial drivers of perception. Health and medicine articles are rated as more important and newsworthy than humanities articles, which are seen as fun and understandable but judged less societally valuable.
  • Perceptual dimensions strongly predict engagement metrics: In a large-scale analysis of 95,000 Reddit science posts, posts estimated to be higher in importance, fun, and surprisingness receive significantly more upvotes and comments, whereas posts perceived as requiring specialized expertise (i.e., harder to follow) see decreased engagement. The effect size is pronounced: a one-point rise in perceived importance translates to a 68% increase in post score (see the worked example after this list).
  • Framing effects are empirically validated: The same scientific content, reframed to alter public perceptions, yields divergent engagement patterns. The natural experiment within Reddit data demonstrates a strong relationship between controllable perception variables and behavioral outcomes, suggesting a possible causal link.
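
To make the 68% figure concrete: if engagement is modeled log-linearly (an assumption about the analysis form, not a detail confirmed by this summary), the reported effect corresponds to a coefficient of roughly ln(1.68) ≈ 0.52 on log post score, as the short check below illustrates.

```python
import math

# Assuming a log-linear engagement model (illustrative, not necessarily the
# paper's exact specification): log(score) = ... + beta * importance + ...
beta = math.log(1.68)            # implied coefficient for a 68% increase
print(f"beta ~ {beta:.3f}")      # ~0.519
print(f"multiplier per +1 importance: {math.exp(beta):.2f}x")  # 1.68x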

Implications

This work has substantial practical implications for science communication and computational modeling:

  • Actionable prediction for communicators: Automated perception prediction allows editors and communicators to simulate public response before publishing, enabling message tailoring for maximal engagement or accessibility.
  • Audience segmentation: The null findings regarding demographics challenge conventional wisdom and suggest targeting non-consumers of science news may require fundamentally different strategies than those used for existing engaged audiences.
  • Automated content evaluation: The model and dataset facilitate the assessment of large archives of scientific news for public resonance, supporting both retrospective and prospective media analyses.
  • Methodology for human-centered NLP: By directly integrating psychological theories, communication research, and practical NLP modeling, this work sets a template for future multi-dimensional, purpose-driven natural language evaluation tasks.
  • Experimental evidence for framing interventions: The natural experiment design bridges computational social science and causal inference, clarifying the mechanisms by which media framing affects observable engagement.

Future Directions

Several avenues for further research are evident:

  • Cross-lingual and cross-cultural expansion: Extending the framework to non-English contexts and additional populations will test the universality of the findings and reveal potential cultural moderators.
  • Fine-grained linguistic analysis: Investigation of which linguistic, discursive, or paratextual features drive perception dimension scores could inform the design of computational writing assistants or automated science journalists.
  • Domain adaptation and explainability: Considering domain-aware LLMs and incorporating explainable AI methods may advance interpretability, especially in high-stakes or sensitive domains (e.g., health).
  • Integration with social impact modeling: Linking predicted perceptions to real-world outcomes beyond online engagement (e.g., policy change, public understanding, or behavior modification) remains a vital open challenge.

Conclusion

“Modeling Public Perceptions of Science in Media” offers a robust empirical and computational foundation for the nuanced study of audience response to scientific communication. Its multidimensional, population-scaled, and model-driven approach marks a notable advance in both social informatics and applied NLP for policy-relevant domains, with direct applications for optimizing science engagement strategies and understanding the social mechanics of information diffusion.