The Right to Be Remembered: Preserving Maximally Truthful Digital Memory in the Age of AI (2510.16206v2)
Abstract: Since the rapid expansion of LLMs, people have begun to rely on them for information retrieval. While traditional search engines display ranked lists of sources shaped by search engine optimization (SEO), advertising, and personalization, LLMs typically provide a synthesized response that feels singular and authoritative. While both approaches carry risks of bias and omission, LLMs may amplify the effect by collapsing multiple perspectives into one answer, reducing users' ability or inclination to compare alternatives. This concentrates power over information in a few LLM vendors whose systems effectively shape what is remembered and what is overlooked. As a result, certain narratives, individuals, or groups may be disproportionately suppressed, while others are disproportionately elevated. Over time, this creates a new threat: the gradual erasure of those with limited digital presence, and the amplification of those already prominent, reshaping collective memory. To address these concerns, this paper presents the concept of the Right To Be Remembered (RTBR), which encompasses minimizing the risk of AI-driven information omission, embracing the right to fair treatment, and ensuring that generated content is maximally truthful.
Explain it Like I'm 14
Overview
This paper introduces a new idea called the Right To Be Remembered (RTBR). It argues that as artificial intelligence (especially large language models, or LLMs) becomes the main way people find information, we need to make sure our digital memories are kept truthful, complete, and fair. The authors worry that AI systems can accidentally leave out important voices and facts—especially from people or places with smaller online footprints—so they propose RTBR to help protect what humanity knows and remembers.
Key questions the paper asks
To make this clear, here are the main questions the authors explore:
- How do AI systems (like chatbots) shape what we remember and forget online?
- Why do some people, communities, and scientific work disappear from digital memory more than others?
- What does “maximally truthful” AI look like, and how can we build it?
- How should the “Right to Be Remembered” fit with the “Right to Be Forgotten” (the legal right to erase personal data)?
- What design and policy changes are needed so AI keeps memory honest, inclusive, and useful?
Methods and approach (explained simply)
This isn’t a lab experiment paper—it’s a careful, big-picture review and proposal. The authors:
- Explain how LLMs work: An LLM is like a super-fast writer trained by reading huge amounts of text. It predicts the next word in a sentence. Most are built using a “transformer,” a kind of model that pays attention to many parts of a text at once (like a reader who can remember both the start and end of a book while reading the middle).
- Describe Retrieval-Augmented Generation (RAG): This is like asking a librarian to fetch sources while the AI writes its answer, so it can quote evidence instead of guessing (a minimal retrieve-then-generate sketch appears after this list).
- Point out real-world problems:
- Link rot: Web pages disappear or move, so sources vanish over time (like books going missing from a library).
- Bias in training data: If the internet has fewer works from some countries, languages, or groups, the AI will know less about them.
- Vendor choices: Companies decide what data to include, how to filter answers, and whether to show multiple viewpoints or just one. These choices shape what gets remembered.
- Summarize research about “truth signals” inside models:
- “Truth direction”: Think of it as a kind of internal truth compass—some studies suggest models have activation patterns that tend to point toward correct answers.
- “Local intrinsic dimension”: True answers seem to be “simpler shapes” inside the model’s brain, while made-up answers look more tangled.
- Compare RTBR with the “Right to Erasure” (GDPR’s “Right to Be Forgotten”): Erasing data from an AI is very hard because knowledge is blended into the model (like trying to remove one drop of dye from a swimming pool without changing the water). They discuss “machine unlearning,” which tries to remove specific information but can damage the model’s overall usefulness.
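To make the RAG idea above concrete, here is a minimal, self-contained sketch of the retrieve-then-generate loop. Everything in it is illustrative: the toy corpus, the bag-of-words "embedding", and the stubbed generator stand in for a real embedding model and LLM call, and none of it comes from the paper itself.

```python
# Minimal retrieval-augmented generation (RAG) loop: retrieve evidence first,
# then hand it to the generator so the answer can cite sources instead of guessing.
# The embedding and generation steps are stubs; a real system would call an
# embedding model and an LLM here.

from collections import Counter
import math

CORPUS = {
    "doc-1": "Link rot causes a substantial portion of cited web pages to disappear within a few years.",
    "doc-2": "Retrieval-augmented generation grounds model answers in retrieved evidence.",
    "doc-3": "Transformers use self-attention to capture short- and long-range dependencies.",
}

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real encoder)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2):
    q = embed(query)
    ranked = sorted(CORPUS.items(), key=lambda kv: cosine(q, embed(kv[1])), reverse=True)
    return ranked[:k]

def generate_with_citations(query: str) -> str:
    evidence = retrieve(query)
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in evidence)
    # A real system would send `context` plus `query` to an LLM; here we just
    # return the grounding prompt to show what the generator would see.
    return f"Question: {query}\nEvidence:\n{context}\nAnswer (cite [doc-id]):"

if __name__ == "__main__":
    print(generate_with_citations("Why do cited web pages vanish over time?"))
```

A production pipeline would swap in dense embeddings, a vector index, and an actual model call, but the shape of the loop is the same: retrieve evidence first, then generate an answer grounded in it.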
Main findings and why they matter
Here are the paper’s main points, explained in everyday language:
- AI answers feel authoritative but can hide what’s missing. When a chatbot gives a single, smooth answer, users may not realize some viewpoints or facts were left out. This makes popular narratives stronger and weakens less-visible ones.
- Digital memory is fragile. Over time, links break, pages get taken down, older formats become unreadable, and content gets de-ranked. If AI relies on this unstable web, important parts of knowledge can fade away.
- Visibility is unequal. Work from certain regions, languages, or communities is underrepresented online. That means AI systems are more likely to overlook it, further reducing its presence in the future.
- Vendors have power over memory. Companies choose training data, moderation rules, and interface designs (do you see one answer or several?). These decisions quietly decide who gets remembered.
- Truthfulness needs more than facts. The authors say “maximally truthful” AI should include:
- Accuracy: Being factually correct.
- Honesty: Saying what the model “really believes” based on its training, not just pleasing the prompt.
- Provenance: Showing where information comes from (citations and credit).
- Uncertainty: Saying “I don’t know” when evidence is weak.
- RTBR vs Right to Erasure: For foundational AI systems (big general-purpose models), the authors argue society’s need to preserve a complete and accurate record should usually outweigh individual requests to erase truthful information—especially for historical and scientific knowledge, and particularly after a person’s death.
- Practical design ideas:
- Layered provenance: Give quick answers up front, but let users open a deeper trail of sources and credits underneath (like expanding footnotes); a sketch of one possible layered answer structure appears after this list.
- Multiple perspectives and calibrated confidence: Show uncertainty and different viewpoints when the topic is complex or debated.
- Retrieval and preservation: Strengthen tools that fetch and protect sources, including work in less dominant languages and older formats.
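As a rough illustration of the "layered provenance" and "calibrated confidence" ideas above, the snippet below shows one possible answer structure: a short summary up front, with perspectives, sources, a confidence estimate, and stated unknowns underneath. The field names and values are invented for illustration; they are not a standard proposed by the paper.

```python
# One way a "layered" answer could be represented: a short answer up front, with
# provenance, alternative perspectives, and an uncertainty estimate available
# underneath. All field names and values here are illustrative placeholders.

import json

answer = {
    "summary": "Most cited web pages stay reachable for a few years, but a substantial share decay.",
    "confidence": 0.62,  # calibrated probability, surfaced to the user
    "perspectives": [
        {"viewpoint": "archival studies", "claim": "link rot is pervasive and accelerating"},
        {"viewpoint": "platform operators", "claim": "redirects and mirrors recover much lost content"},
    ],
    "provenance": [
        {"source_id": "doi:10.0000/example", "snippet": "(placeholder excerpt)", "retrieved": "2024-05-01"},
    ],
    "unknowns": ["decay rates for non-English and paywalled sources are poorly measured"],
}

print(json.dumps(answer, indent=2, ensure_ascii=False))
```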
What this could mean for the future
If we adopt RTBR as a guiding principle:
- AI systems would aim to protect our shared memory, not just deliver convenient answers. That means actively preserving diverse voices, crediting contributors, and signaling uncertainty.
- Designers and policymakers would treat digital memory as a public good—like clean water or public libraries—something we must maintain for everyone.
- Laws (like GDPR) and AI standards might evolve to balance privacy with the need to keep history and knowledge complete and accessible.
- Future generations would inherit a richer, more truthful record of human experience, helping science, culture, and community stories continue and grow.
In short, the paper argues that remembering well—fairly, fully, and truthfully—is essential for AI to help humanity learn, correct mistakes, and make better decisions. The Right To Be Remembered is a call to build AI that not only answers questions today but also protects the foundations of knowledge for tomorrow.
Knowledge Gaps
Unresolved gaps and open questions
Below is a concise, actionable list of knowledge gaps, limitations, and open questions the paper leaves unresolved.
Conceptual and definitional gaps
- Precise operational definition of “Right to Be Remembered (RTBR)” and its scope (who/what is covered, thresholds for inclusion, duration, and mechanisms of enforcement).
- Formalization of “maximal truthfulness” (e.g., a measurable objective function, metrics, and benchmarks that combine accuracy, honesty, provenance, inclusivity, and uncertainty).
- Clear criteria for resolving conflicts between RTBR, safety policies, and the suppression of harmful, illegal, or defamatory content.
- Framework to distinguish remembrance of contributions from preservation of misinformation, propaganda, or manipulated content.
Empirical evidence and measurement
- Longitudinal, quantitative evidence that LLMs cause or accelerate erasure compared to search engines (user behavior, answer diversity, citation persistence, narrative coverage).
- Metrics to quantify “visibility inequality” in AI outputs (e.g., a silencing index across languages, geographies, institutions, and communities).
- Robust methods to measure the impact of link rot on LLM/RAG answers over time and to evaluate mitigation strategies.
- Controlled studies on whether single-answer interfaces reduce users’ comparison of alternatives and increase omission bias.
- Cross-lingual and cross-cultural evaluations of remembrance (coverage and fidelity for non-English sources and marginalized communities).
- Empirical tests that internal “truth directions” and local intrinsic dimension (LID) signals generalize across models, domains, and adversarial prompts.
Technical architecture and implementation
- A concrete system design for layered provenance in LLMs that preserves fast answers while carrying forward scholarly citation chains and credit.
- Methods to embed machine-readable epistemic provenance (researchers, labs, communities) into model outputs beyond content authenticity (e.g., C2PA-like for text).
- Retrieval/indexing strategies that systematically surface underrepresented, non-digitized, or poorly indexed sources (including multilingual and “grey literature”).
- Standardized metadata, ontologies, and APIs to make archives, institutional repositories, and community memory banks machine-actionable for RAG.
- Algorithms for calibrated abstention that reliably detect “unknowns” without disproportionately suppressing minority or rare perspectives.
- Integration of internal truthfulness signals (truth direction, LID) into decoding and ranking pipelines for real-time hallucination suppression.
- Methods to track and preserve “citation chains” during synthesis (e.g., lineage graphs maintained through prompt, retrieval, and generation).
- Tooling to version model outputs and their supporting evidence over time for auditability and historical reconstruction.
Governance, policy, and legal reconciliation
- Concrete proposals to reconcile RTBR with GDPR Article 17 (Right to Erasure) beyond high-level prioritization (e.g., scoped exceptions, balancing tests, adjudication processes).
- A due-process mechanism for contested remembrance requests (who decides inclusion/exclusion, appeal pathways, transparency obligations).
- Post-mortem privacy and consent protocols (family/estate rights, cultural norms, jurisdictional differences) to guide ingestion of deceased individuals’ digital legacies.
- Licensing and copyright guidance for integrating archival and proprietary materials (including consent models for non-public or community-held data).
- Vendor accountability frameworks and audit standards to ensure RTBR-compliant training, retrieval, and moderation (including third-party oversight).
- International harmonization challenges (divergent legal regimes, data sovereignty, and cross-border data flows).
Equity, ethics, and community participation
- Mechanisms to prevent RTBR from platforming harmful actors or strategic “visibility gaming” while still preserving legitimate minority narratives.
- Community co-governance models (Global South, indigenous, disability, and other groups) to set inclusion rules, correct misrepresentation, and steward their archives.
- Ethical guidelines for balancing remembrance with contextual integrity (e.g., sensitive histories, stigmatizing data, and shifting norms over time).
- Procedures for soliciting and maintaining corrections, retractions, and counter-narratives to avoid freezing historical errors.
- Safeguards against adversarial flooding, censorship, and information operations that exploit RTBR to distort collective memory.
Operational feasibility and sustainability
- Cost, compute, and storage assessments for large-scale remembrance infrastructure (archives, multilingual ingestion, ongoing provenance maintenance).
- Environmental impact analysis (carbon footprint of persistent archiving, continuous re-indexing, and retraining for remembrance-compliant models).
- Incentive design for vendors to adopt remembrance-friendly architectures (business models, regulatory incentives, public procurement standards).
- Maintenance and decay handling (backups, mirroring, format migration, and resilience against domain lapses) to counter long-term link rot.
Machine unlearning and model maintenance
- Technical pathways to accommodate legally mandated erasures without degrading model integrity (granular unlearning, modular knowledge compartments).
- Trade-off analysis between RTBR and unlearning accuracy (utility loss, bias shifts, and emergent gaps) and criteria for proportionality.
- Strategies to prevent catastrophic forgetting of marginalized content during routine fine-tuning and model updates.
User experience and behavior
- Interface designs that present multiple perspectives and uncertainty signals without overwhelming users or harming utility.
- Experiments to determine how provenance depth, credit visibility, and uncertainty cues affect trust, learning, and knowledge retention.
- Personalization safeguards to ensure RTBR does not devolve into echo chambers or reification of a user’s prior exposures.
Security and integrity of digital memory
- Verification pipelines for authenticity of archived text and multimedia (e.g., extensions of C2PA/EKILA to scholarly and social content).
- Robustness to deepfakes, synthetic text laundering, and metadata tampering in provenance chains.
- Detection and mitigation of coordinated campaigns that aim to manipulate “what is remembered” for political or commercial gain.
Practical Applications
Immediate Applications
These applications can be deployed today using existing methods, standards, and workflows that the paper synthesizes or recommends.
- Bold, layered provenance in AI outputs
- Sectors: software, education, publishing, journalism
- What: Ship LLM UX that defaults to concise answers with a one-click “Show Sources/Attribution” panel listing citations, contributor IDs (e.g., ORCID), and evidence snippets
- Tools/Products/Workflows: C2PA-style metadata embeddings; JSON-LD provenance; DOI/ORCID linking; EKILA-like attribution concepts adapted to text; expandable citations and “citation chain explorer”
- Assumptions/Dependencies: Vendor UX adoption; reliable citation resolution (Crossref/DataCite APIs); legal review for citation display
- Diversity-aware answer mode (multi-perspective synthesis)
- Sectors: search, education, media platforms
- What: Add a toggle to present multiple perspectives, including minority/underrepresented sources, rather than a single authoritative answer—especially for contested topics
- Tools/Products/Workflows: Diversity-aware IR re-ranking (a toy re-ranking sketch appears after this list); fairness constraints in retrieval; source clustering by viewpoint; user-controllable “diversity slider”
- Assumptions/Dependencies: Editorial policy; acceptance of trade-offs between brevity and pluralism; labeled or inferred diversity signals
- Calibrated “I don’t know” and abstention in production assistants
- Sectors: healthcare, legal, finance, enterprise support
- What: Enable selective answering with uncertainty thresholds so models abstain when evidence is insufficient and suggest next steps (e.g., consult a human, retrieve primary sources)
- Tools/Products/Workflows: Confidence calibration; selective prediction; “know-what-you-know” probes; guardrails and human-in-the-loop escalation (an abstention sketch appears after this list)
- Assumptions/Dependencies: Regulatory tolerance for abstention; business KPIs that value safety over coverage; monitoring to prevent over-abstention
- Link rot mitigation in retrieval and training pipelines
- Sectors: LLM vendors, publishers, libraries, archives
- What: Snapshot cited pages at ingestion; replace dead links with archival mirrors; maintain durable perma-links
- Tools/Products/Workflows: Internet Archive/Memento APIs; Perma.cc; CI link checkers; canonicalization policies (a snapshot-lookup sketch appears after this list)
- Assumptions/Dependencies: License compatibility; archive availability; storage budgets
- RTBR-aware RAG: long-tail and non-English coverage by design
- Sectors: software, education, scientific tools
- What: Expand retrieval indices to include non-English, older, and under-digitized materials; weight recall of long-tail sources; expose language/source diversity in results
- Tools/Products/Workflows: Multilingual embedding models; dedicated corpora from institutional repositories; language-aware rerankers
- Assumptions/Dependencies: Access to multilingual/legacy collections; OCR and normalization quality; compute overhead
- Attribution-preserving content creation workflows
- Sectors: publishing, newsrooms, marketing, academia
- What: Require AI-assisted content to embed traceable provenance and credit; maintain scholarly-style references in web articles and PDFs
- Tools/Products/Workflows: CMS plugins for C2PA/JSON-LD; template policies; automatic bib generation from DOIs; provenance lints in CI
- Assumptions/Dependencies: Editorial buy-in; reader UX considerations; training for staff
- Bias and “memory equity” dashboards for model/data governance
- Sectors: AI governance, MLOps, compliance
- What: Track representation coverage by language, geography, and institution; monitor link rot rates; flag topic areas with sparse evidence
- Tools/Products/Workflows: Data profiling; sampling coverage analytics; evaluation sets across cultures; periodic audits
- Assumptions/Dependencies: Access to data lineage; agreement on equity metrics; privacy-preserving reporting
- Internal truthfulness monitors to flag likely hallucinations
- Sectors: enterprise platforms, safety teams
- What: Use model-internal signals (e.g., “truth direction,” local intrinsic dimension) to flag outputs for extra verification before delivery
- Tools/Products/Workflows: Activation probes; LID estimators; secondary verification pass; risk labeling in logs (a probe-scoring sketch appears after this list)
- Assumptions/Dependencies: Access to model internals or surrogate probes; validation against ground truth; performance impact assessment
- Institutional archiving and DOI adoption drives
- Sectors: academia, NGOs, government agencies
- What: Ensure outputs receive DOIs; deposit artifacts in stable repositories; add multilingual abstracts; schema.org markup for discovery
- Tools/Products/Workflows: Crossref/DataCite registration; LOCKSS/Portico; institutional repositories; Save Page Now automations
- Assumptions/Dependencies: Funding and staffing; policy mandates; coordination with libraries/archives
- Procurement checklists for AI systems with RTBR requirements
- Sectors: public sector, regulated industries
- What: Require provenance, abstention capabilities, multi-perspective mode, and archiving strategies in RFPs and vendor assessments
- Tools/Products/Workflows: Model cards with RTBR sections; data statements; acceptance tests covering inclusivity and uncertainty
- Assumptions/Dependencies: Policy adoption; vendor ecosystem readiness; auditing capacity
- End-user prompt practices for epistemic hygiene
- Sectors: daily life, journalism, education
- What: Provide prompt templates/extensions that request sources, ask for multiple perspectives, and ask the model to state confidence and unknowns
- Tools/Products/Workflows: Browser extensions; prompt libraries; LMS and newsroom playbooks
- Assumptions/Dependencies: User training; compatibility across LLM providers
- Data donation and post-mortem digital legacy programs
- Sectors: civic tech, memorial services, libraries
- What: Offer consent-based programs to preserve personal archives for historical research and AI training, with clear governance
- Tools/Products/Workflows: Consent management portals; personal data stores; standardized deposit agreements
- Assumptions/Dependencies: Legal clarity; trust frameworks; ethical review boards
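For the "diversity-aware answer mode" item above, the following toy sketch shows one way a greedy re-ranker could trade relevance against repetition of languages and regions among already-selected sources. The scores, metadata fields, and the 0.3 diversity weight are made up for illustration; a real pipeline would take retriever scores and catalogued source metadata.

```python
# Greedy diversity-aware re-ranking: pick the next source by relevance minus a
# penalty for repeating languages/regions already selected. All values are
# synthetic placeholders.

sources = [
    {"id": "s1", "lang": "en", "region": "US", "relevance": 0.92},
    {"id": "s2", "lang": "en", "region": "US", "relevance": 0.90},
    {"id": "s3", "lang": "es", "region": "MX", "relevance": 0.81},
    {"id": "s4", "lang": "sw", "region": "KE", "relevance": 0.74},
]

def rerank(sources, k=3, diversity_weight=0.3):
    selected = []
    remaining = list(sources)
    while remaining and len(selected) < k:
        def score(s):
            # Penalize attributes (language, region) already represented in the selection.
            overlap = sum(1 for t in selected if t["lang"] == s["lang"] or t["region"] == s["region"])
            return s["relevance"] - diversity_weight * overlap
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

print([s["id"] for s in rerank(sources)])  # e.g. ['s1', 's3', 's4']
```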
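For the calibrated abstention item, here is a minimal sketch of selective answering: answer only when a calibrated confidence clears a threshold, otherwise abstain and point to escalation. The threshold and confidence values are placeholders; real systems would calibrate confidence first (for example via temperature scaling) and tune the threshold per domain.

```python
# Selective answering: respond only when calibrated confidence clears a threshold;
# otherwise abstain and suggest escalation. Values below are illustrative.

ABSTAIN_THRESHOLD = 0.75

def answer_or_abstain(question: str, draft_answer: str, confidence: float) -> str:
    if confidence >= ABSTAIN_THRESHOLD:
        return f"{draft_answer} (confidence {confidence:.2f})"
    return (
        "I don't know enough to answer reliably "
        f"(confidence {confidence:.2f}). Consider consulting primary sources or a human expert."
    )

print(answer_or_abstain("Is drug X safe in pregnancy?", "Evidence is mixed.", 0.41))
print(answer_or_abstain("What does GDPR Article 17 cover?", "The right to erasure.", 0.93))
```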
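For the link rot mitigation item, the sketch below checks whether a cited URL already has an archived snapshot via the Internet Archive's public Wayback availability endpoint, so a citation can carry an archival fallback. The endpoint and response shape reflect its publicly documented behavior at the time of writing; network availability, error handling, and the Save Page Now follow-up request are assumptions left out of the sketch.

```python
# At citation time, look up an archived snapshot for a cited URL and record it so
# the citation can fall back to the archive if the live page rots. Uses the
# Internet Archive's public Wayback "availability" endpoint (response shape assumed
# from current public documentation).

import json
import urllib.parse
import urllib.request
from typing import Optional

def closest_snapshot(url: str) -> Optional[str]:
    api = "https://archive.org/wayback/available?url=" + urllib.parse.quote(url, safe="")
    with urllib.request.urlopen(api, timeout=10) as resp:
        data = json.load(resp)
    closest = data.get("archived_snapshots", {}).get("closest")
    return closest["url"] if closest and closest.get("available") else None

if __name__ == "__main__":
    cited = "https://example.com/"
    snapshot = closest_snapshot(cited)
    print(snapshot or f"No snapshot found; consider requesting one (e.g., Save Page Now) for {cited}")
```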
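For the internal truthfulness monitors item, this toy sketch scores an output by projecting a hidden-state vector onto a pre-fitted linear "truth direction" probe and flags low scores for verification. The probe weights, bias, threshold, and activations are synthetic; fitting a real probe requires white-box access and labeled true/false statements, and the cited research does not guarantee that such probes generalize across models and domains.

```python
# Flagging likely hallucinations with a pre-fitted linear "truth direction" probe:
# project a hidden-state vector onto the probe direction and route low-scoring
# outputs to extra verification. All numbers below are synthetic stand-ins.

import math

TRUTH_DIRECTION = [0.8, -0.2, 0.5, 0.1]  # learned probe weights (illustrative)
BIAS = -0.1
VERIFY_BELOW = 0.5

def truth_score(hidden_state) -> float:
    logit = sum(w * h for w, h in zip(TRUTH_DIRECTION, hidden_state)) + BIAS
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> pseudo-probability of "truthful"

def route(output_text: str, hidden_state) -> str:
    score = truth_score(hidden_state)
    if score < VERIFY_BELOW:
        return f"FLAG for verification (truth score {score:.2f}): {output_text}"
    return f"PASS (truth score {score:.2f}): {output_text}"

print(route("The treaty was signed in 1854.", [0.9, 0.1, 0.4, 0.0]))
print(route("The treaty was signed on the Moon.", [-0.7, 0.9, -0.3, 0.2]))
```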
Long-Term Applications
These applications likely require further research, standardization, scaling, or regulatory change before widespread deployment.
- RTBR-aligned foundation model training
- Sectors: AI research, vendors
- What: Co-train models on token prediction plus objectives for provenance fidelity, calibrated abstention, and memory equity constraints
- Tools/Products/Workflows: Multi-objective loss functions; representation regularizers tied to “truth direction”; retrieval-grounded training loops (a toy combined-loss sketch appears after this list)
- Assumptions/Dependencies: Access to high-quality, provenance-rich corpora; scalable training methods; open evaluation benchmarks
- Global, federated digital memory infrastructure
- Sectors: libraries, archives, standards bodies, AI labs
- What: Build a tamper-evident, open, multilingual repository of human contributions used as a canonical training and retrieval backbone
- Tools/Products/Workflows: Content authenticity (C2PA); persistent identifiers (DOI/ORCID/ISNI); interoperable knowledge graphs; decentralized storage
- Assumptions/Dependencies: International coordination; sustainable funding; governance to prevent capture
- Legal reconciliation of RTBR and Right to Erasure
- Sectors: policymakers, regulators, civil society
- What: Define exceptions, safe harbors, and post-mortem norms for preservation; clarify standards for “machine unlearning” obligations
- Tools/Products/Workflows: Model law templates; regulatory guidance; privacy-by-design with archival exceptions
- Assumptions/Dependencies: Political consensus; cross-jurisdiction harmonization; stakeholder engagement
- Surgical, provenance-aware machine unlearning
- Sectors: AI safety, research
- What: Develop methods to remove specific personal data without degrading general knowledge or inducing distortions
- Tools/Products/Workflows: Parameter editing; targeted forgetting with constraint satisfaction; inference-time masking with verified provenance checks
- Assumptions/Dependencies: New theory and benchmarks; compute budgets; risk mitigation for collateral forgetting
- Memory equity standards and certification
- Sectors: standards bodies, auditors, enterprises
- What: Establish measurable inclusivity/coverage criteria and third-party audits; certify “RTBR-compliant” systems
- Tools/Products/Workflows: ISO-style standards; public scorecards; red-team evaluations across cultures and languages
- Assumptions/Dependencies: Agreement on metrics; independent auditors; incentives for compliance
- Education: epistemic literacy by default
- Sectors: K–12, higher ed, professional training
- What: Integrate curricula and tools that teach multi-perspective analysis, source tracing, uncertainty interpretation, and AI limitations
- Tools/Products/Workflows: Classroom assistants that surface primary sources; “source density meters”; debate-mode AI tutors
- Assumptions/Dependencies: Curriculum approvals; teacher training; equitable tech access
- Healthcare evidence assistants with archival resilience
- Sectors: healthcare
- What: Clinical AI that always surfaces original trials (including non-English/older studies), registers null results, and abstains when evidence is weak
- Tools/Products/Workflows: RAG tied to clinical registries; multilingual EBM corpora; uncertainty-calibrated recommendations
- Assumptions/Dependencies: Regulatory approval; liability frameworks; integration with EHRs
- Finance and legal AI with audit-grade provenance
- Sectors: finance, legal, compliance
- What: Advisory systems that deliver decisions with verifiable document trails and abstain under ambiguity, enabling regulator-ready audits
- Tools/Products/Workflows: Immutable evidence ledgers; per-decision provenance packets; continuous verification pipelines
- Assumptions/Dependencies: Standardized audit formats; regulator endorsement; secure data handling
- Consumer-grade Personal Memory Vaults
- Sectors: consumer software, privacy tech
- What: Personal data stores with user-governed licensing to contribute to collective memory and AI training, including post-mortem directives
- Tools/Products/Workflows: Secure personal data pods; granular consent; micro-licensing and revenue-sharing
- Assumptions/Dependencies: Trustworthy identity and consent infrastructure; market demand; privacy guarantees
- Data cooperatives to uplift underrepresented corpora
- Sectors: NGOs, cultural institutions, philanthropic funders
- What: Community-led digitization and corpus curation for underrepresented languages and regions with equitable licensing
- Tools/Products/Workflows: Participatory data governance; localized OCR/ASR; multilingual annotation programs
- Assumptions/Dependencies: Funding; community leadership; data sovereignty agreements
- Open SDKs for “citation-first” LLM development
- Sectors: developer tooling, open source
- What: Provide libraries that make provenance embedding, abstention, diversity re-ranking, and archival mirroring first-class primitives
- Tools/Products/Workflows: Open-source packages; reference UIs; evaluation harnesses for RTBR metrics
- Assumptions/Dependencies: Community stewardship; compatibility across providers; maintainability
- Independent RTBR auditors and marketplaces
- Sectors: assurance, marketplaces, enterprises
- What: Create a market of third-party RTBR audits and continuous monitoring services for LLMs and retrieval systems
- Tools/Products/Workflows: Black-box and white-box audit suites; bias/coverage probes; SLA-backed monitoring
- Assumptions/Dependencies: Clear demand from buyers; standardized reports; access to systems under test
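As a very rough illustration of the "RTBR-aligned foundation model training" idea, the toy function below combines a next-token loss with penalties for missing provenance and poor calibration. The component terms, their ranges, and the weights are invented for illustration; the paper proposes the goal, not this formula.

```python
# A toy multi-objective training signal in the spirit of "RTBR-aligned" training:
# combine the usual next-token loss with penalties for missing provenance and for
# miscalibrated confidence. All terms and weights are illustrative placeholders.

def rtbr_loss(next_token_loss: float,
              provenance_recall: float,   # fraction of claims with resolvable citations (0..1)
              calibration_error: float,   # e.g., expected calibration error (0..1)
              w_prov: float = 0.5,
              w_cal: float = 0.5) -> float:
    provenance_penalty = 1.0 - provenance_recall
    return next_token_loss + w_prov * provenance_penalty + w_cal * calibration_error

# Example: a fluent batch that cites poorly is penalized relative to one that cites well.
print(rtbr_loss(next_token_loss=2.1, provenance_recall=0.3, calibration_error=0.12))  # higher
print(rtbr_loss(next_token_loss=2.1, provenance_recall=0.9, calibration_error=0.05))  # lower
```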
Notes on cross-cutting assumptions and dependencies:
- Data rights and licensing must permit preservation and attribution while respecting privacy and cultural data sovereignty.
- Many capabilities (e.g., truthfulness probes) are easier with white-box access; black-box alternatives may need proxies and carry higher uncertainty.
- Performance, latency, and cost trade-offs are real—provenance and multi-perspective features add overhead that must be engineered carefully.
- Organizational incentives must align with safety and inclusivity (e.g., KPIs valuing accuracy and equity, not just speed or engagement).
- Standardization (C2PA extensions for text, provenance schemas, memory equity metrics) will accelerate interoperability and adoption.
Glossary
- Abstention mechanisms: Design strategies that enable models to withhold answers when evidence is insufficient. "Such abstention mechanisms are essential to a conception of maximal truthfulness that prioritizes honesty over fluency."
- Activation space: The high-dimensional space of a neural network’s internal activations used to represent information during processing. "a 'truth direction' in activation space"
- Algorithmic de-ranking: Automated lowering of content visibility or ranking by platform algorithms. "algorithmic de-ranking"
- Algorithmic omission: Systematic exclusion of certain information caused by algorithmic processes or design. "algorithmic omission"
- Attribution trails: Embedded metadata chains that record credit and sourcing for generated content. "metadata and attribution trails can be embedded directly into outputs."
- C2PA content authenticity standard: An industry standard for embedding and verifying provenance and authenticity of digital content. "the C2PA content authenticity standard"
- Calibrated probabilities: Probabilities adjusted to reflect true confidence levels, often used to signal “knowing” versus “not knowing.” "calibrated probabilities of 'knowing' versus 'not knowing'"
- Content moderation takedowns: Removals of online material by platforms to enforce policies or regulations. "content moderation takedowns"
- Data controllers: Entities that determine purposes and means of processing personal data under data protection law. "mandates that data controllers erase personal data"
- Data lineage: Documentation of the origins, transformations, and flow of data through systems. "Current solutions largely capture data lineage and content authenticity"
- Digital Object Identifiers (DOIs): Persistent identifiers used to uniquely reference digital scholarly works. "Digital Object Identifiers (DOIs)"
- EKILA: An initiative for synthetic media provenance and attribution in generative art. "initiatives such as EKILA for digital art"
- Epistemic humility: A norm of acknowledging uncertainty and limits in knowledge claims. "cultivating norms of epistemic humility"
- Epistemic integrity: The alignment of system design and outputs with reliable, verifiable knowledge. "optimize for epistemic integrity"
- Epistemic justice: Fair representation and recognition within knowledge systems, avoiding unjust exclusion or bias. "touching on recognition, fairness and epistemic justice in the digital age."
- Foundational AI: Base-level AI systems or models that encode broad knowledge and underpin many downstream applications. "in the context of foundational AI, the collective RTBR including ensuring maximal truthfulness and historical accuracy must take precedence"
- Foundational models: Large pretrained models that serve as general-purpose bases for varied tasks. "foundational models increasingly shape the epistemic foundations of future generations"
- General Data Protection Regulation (GDPR): The EU’s comprehensive data protection law governing personal data processing. "General Data Protection Regulation (GDPR), officially termed the 'Right to Erasure'"
- Hallucination: Confident but incorrect or fabricated content generated by AI models. "ideally reducing hallucination and providing verifiable citations."
- Jurisprudence: The theory and case law framework guiding legal interpretation and rights. "Emerging from European jurisprudence"
- Link rot: The phenomenon of URLs becoming inaccessible over time due to web content decay. "Empirical studies of 'link rot' show that a substantial portion of online content disappears within a few years"
- Local intrinsic dimension analysis: A method for characterizing the dimensionality of local regions in model representations. "Using local intrinsic dimension analysis, investigators found that truthful responses are encoded in more compact, lower-dimensional activation patterns"
- Machine unlearning: Techniques for removing specific learned information from model parameters post-training. "a new phenomenon: 'machine unlearning'"
- Manifolds: Mathematical spaces that locally resemble Euclidean space; used to describe structure in high-dimensional representations. "hallucinations are scattered across higher-dimensional manifolds"
- MASK framework: A benchmark to assess whether models state what they “believe,” distinguishing honesty from accuracy. "the MASK framework, which attempts to measure whether models state what they 'believe,'"
- Parameter space: The space of all model parameters where learned patterns are stored and represented. "distributed patterns in parameter space"
- Probing studies: Analytical techniques that use auxiliary models or tasks to investigate internal representations in neural networks. "Probing studies have identified what has been called a 'truth direction'"
- Provenance: The documented origin and history of information or content, enabling traceability and credit. "Provenance can be layered so that immediate answers remain accessible"
- Reinforcement learning with human feedback (RLHF): A fine-tuning method that aligns model outputs with human preferences via feedback-driven rewards. "fine-tune models through reinforcement learning with human feedback (RLHF)"
- Retrieval-augmented generation (RAG): A method that augments generative models with retrieved evidence to improve factuality. "Retrieval-augmented generation (RAG) was developed to address the limitations"
- Right To Be Remembered (RTBR): A proposed right asserting the preservation and fair representation of contributions in digital memory. "this paper presents a concept of the Right To Be Remembered (RTBR)"
- Right to be forgotten: A legal concept allowing individuals to have personal data delisted or removed from search results. "commonly known as the 'right to be forgotten'."
- Right to Erasure: GDPR Article 17 right requiring deletion of personal data under specified conditions. '"Right to Erasure", which mandates that data controllers erase personal data'
- Search engine optimization (SEO): Techniques aimed at improving the visibility and ranking of content in search engines. "search engine optimization (SEO)"
- Self-attention mechanism: The transformer component that relates tokens to one another to capture dependencies. "whose self-attention mechanism allows the network to capture both short- and long-range dependencies"
- Self-supervised learning objective: A training setup where models learn from unlabeled data by predicting parts of the input (e.g., next tokens). "using a self-supervised learning objective: given a sequence of tokens, the model predicts the probability of the next token"
- Transformer architecture: A neural network architecture based on attention mechanisms, widely used for LLMs. "built on the transformer architecture"
- Truth direction: A geometric direction in model representation space correlated with factual correctness. '"truth direction" in activation space'