Censorship Chokepoints: New Battlegrounds for Regional Surveillance, Censorship and Influence on the Internet (2510.18394v1)
Abstract: Undoubtedly, the Internet has become one of the most important conduits to information for the general public. Nonetheless, Internet access can be and has been limited systematically or blocked completely during political events in numerous countries and regions by various censorship mechanisms. Depending on where the core filtering component is situated, censorship techniques have been classified as client-based, server-based, or network-based. However, as the Internet evolves rapidly, new and sophisticated censorship techniques have emerged, which involve techniques that cut across locations and involve new forms of hurdles to information access. We argue that modern censorship can be better understood through a new lens that we term chokepoints, which identifies bottlenecks in the content production or delivery cycle where efficient new forms of large-scale client-side surveillance and filtering mechanisms have emerged.
Explain it Like I'm 14
Explaining “Censorship Chokepoints: New Battlegrounds for Regional Surveillance, Censorship and Influence on the Internet”
Overview: What is this paper about?
This paper looks at how people, companies, and governments control what we see and do on the Internet. It introduces a simple way to understand modern online censorship by focusing on “chokepoints” — key places where information passes through. By controlling these chokepoints, powerful groups can block, slow, or redirect information without you even noticing.
Big Questions the paper asks
To make the topic easier, here are the main questions the researchers explore:
- Where and how is Internet censorship happening today?
- What new kinds of censorship are appearing, especially ones that are subtle and hard to spot?
- How can we organize these practices into a clear, modern system (a “taxonomy”)?
- How are tools like AI models and phone apps being used to filter or guide information?
- What can ordinary users and researchers do to resist or avoid censorship?
How the researchers approached the topic
Instead of running a single experiment, the authors:
- Read and compared many reports, case studies, and technical papers from different countries.
- Collected real examples of censorship from social media, phone apps, websites, search engines, and AI tools.
- Organized these examples into two types of “chokepoints” and explained how they work.
- Reviewed tools and strategies that people use to fight censorship.
Think of the Internet like a city with roads:
- Your device is your “home.”
- The network is the “roads and highways.”
- Servers are the “shops” and “libraries” where information lives.
A “chokepoint” is like a busy bridge or tunnel. If someone controls that bridge, they can block traffic, slow it down, or redirect it somewhere else. The paper shows where these bridges are online and how they’re being controlled.
The two kinds of chokepoints
To make this clearer, here’s a simple comparison:
| Type | What it feels like | What happens to information | How noticeable is it? |
|---|---|---|---|
| Hard chokepoint | Like a locked door | Content is removed or blocked for good | Usually obvious |
| Soft chokepoint | Like a dimmer switch or a detour | Content is hidden, slowed, or attention is redirected | Often subtle or invisible |
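To make the table concrete, here is a minimal Python sketch (our own illustration, not code from the paper; the field names and the one-line heuristic are hypothetical) that tags observed incidents as hard or soft chokepoints:

```python
from dataclasses import dataclass
from enum import Enum

class ChokepointType(Enum):
    HARD = "hard"  # content removed or blocked for good; usually obvious
    SOFT = "soft"  # content hidden, slowed, or redirected; often subtle

@dataclass
class Incident:
    description: str
    location: str          # e.g. "client", "network", "server", "platform", "model"
    content_removed: bool  # gone for good, or merely harder to reach?

def classify(incident: Incident) -> ChokepointType:
    # Heuristic straight from the table: removal or blocking for good is hard;
    # anything that merely hides, slows, or redirects attention is soft.
    return ChokepointType.HARD if incident.content_removed else ChokepointType.SOFT

print(classify(Incident("post deleted with a takedown notice", "platform", True)))   # HARD
print(classify(Incident("post quietly hidden from search", "platform", False)))      # SOFT
```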
Examples, explained simply
Here are a few examples the paper discusses. These are introduced to help you picture how chokepoints work in real life:
- Client-side (your device): Some apps or phone systems can watch what you type or send and quietly report it. Even keyword filters inside games or chat apps can trigger surveillance. (A toy sketch of such a filter appears after this list.)
- Servers (the websites): Governments or companies can block or modify websites, or automatically filter emails with certain words.
- Social platforms: Sites can ban accounts or delete posts. Sometimes they use “shadow banning,” where your post exists but is hard or impossible for others to find.
- Search engines: Search results can be tweaked so certain pages never show up or are buried under less helpful results.
- “Attention honeypots”: Trolls or bot accounts flood social media with distracting or misleading posts during elections or protests, pulling attention away from real news.
- AI models: Chatbots and LLMs may refuse to talk about certain topics, or their training data may be biased. That can gently push people toward certain views without obvious blocking.
- Networks: The Internet can be “throttled” — slowed down — for certain sites or during certain events.
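To picture the first item in this list, here is a toy Python sketch of a client-side keyword filter with a surveillance side effect. The blocklist and the telemetry function are hypothetical stand-ins, not code extracted from any real app:

```python
import re

# Hypothetical blocklist of the kind researchers have extracted from chat clients.
BLOCKLIST = {"protest", "rally"}

def report_to_server(message: str, hits: set) -> None:
    # Stand-in for the silent telemetry upload a real client might perform.
    print(f"[telemetry] would report {sorted(hits)} for: {message!r}")

def filter_message(message: str):
    words = set(re.findall(r"\w+", message.lower()))
    hits = words & BLOCKLIST
    if hits:
        report_to_server(message, hits)  # the surveillance side effect
        return None                      # the message is silently dropped
    return message                       # clean messages pass through unchanged

print(filter_message("see you at the rally"))  # None, plus a telemetry line
print(filter_message("see you at lunch"))      # passes through
```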
Main findings: What did the paper discover and why does it matter?
The authors make several important points:
- Modern censorship is shifting from blunt blocking (hard) to subtle influencing (soft). Soft censorship is sneaky because it doesn’t always delete information — it just makes it harder to find or trust.
- “Client-side” controls on your own device are growing. These are harder for normal users to spot or turn off.
- AI and algorithms are becoming powerful filters. If AI is trained on biased or censored data, it can pass that bias on to users without anyone noticing.
- Soft tactics, like attention distractions and silent ranking changes, can be very effective because most people won’t realize they’re being steered.
- Both hard and soft chokepoints can show up anywhere: on your phone, on websites, on social media, in search engines, and in AI tools.
- Because soft censorship is often invisible, users may “self-censor” — avoiding certain topics just in case — which reduces healthy public discussion.
What can people do about it?
The paper also highlights countermeasures — ways to reduce the impact of censorship:
- Use privacy settings, permission controls, or privacy-friendly phone systems to limit data collection.
- Try decentralised platforms and storage (like some Web3 tools or IPFS) that don’t rely on a single company or server.
- Use trusted VPNs or, in some places, satellite Internet to avoid local blocking and throttling.
- Be careful with app marketplaces and choose trusted developers.
- Consider running some AI models locally to avoid server-side filtering (a minimal sketch follows this list).
- Support or use global monitoring projects that measure censorship in real time, helping everyone see when and where it’s happening.
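As an example of the local-AI countermeasure above, here is a minimal sketch assuming the Hugging Face transformers package is installed and the (illustrative) model fits on the device. Because generation happens entirely on-device, no server-side filter sits between the question and the answer:

```python
# A minimal sketch, assuming `pip install transformers torch` and a small model.
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")  # runs fully locally
result = generator("The history of internet censorship", max_new_tokens=40)
print(result[0]["generated_text"])  # no remote API call, so no server-side filtering step
```

Note that a local model only avoids server-side filtering; any censorship baked into its training data still applies.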
Here are a few everyday analogies:
- Shadow banning: Like talking in a room where your microphone is secretly off.
- Deep packet inspection: Like a mail sorter opening envelopes to read your letters.
- VPN: Like a private tunnel through the city so your traffic can’t be easily watched.
- SEO poisoning: Like putting fake signs that point you away from the real museum.
So what’s the impact?
The big takeaway is that censorship is changing. Instead of just “blocking,” a lot of modern control is about “bending” attention. That’s dangerous because:
- It’s harder to notice, so people don’t complain or resist.
- It can quietly shape public opinion during important moments, like elections or protests.
- It can limit knowledge over time if AI and search engines keep serving filtered or biased information.
The paper encourages:
- Researchers to study these soft tactics more closely, especially in AI and on personal devices.
- The public to learn how these systems work, spot when their attention is being redirected, and use tools that give them more control.
- Policymakers and platforms to be transparent about moderation and filtering, so people can trust the information they find.
In short, the Internet is full of “bridges” where information flows. Knowing where those bridges are — and how they can be controlled — helps everyone understand, detect, and resist unfair censorship.
Knowledge Gaps
Unresolved knowledge gaps, limitations, and open questions
Below is a consolidated list of concrete gaps and open questions the paper leaves insufficiently addressed, intended to guide future research.
- Formalization of the chokepoint taxonomy: precise, operational definitions and decision rules for “hard” versus “soft” chokepoints; measurable criteria for noticeability, permanence, detectability, and impact.
- Taxonomy validation: inter-rater reliability and a reproducible coding protocol to classify cases; creation and release of a labeled, multi-country dataset of censorship incidents mapped to chokepoints.
- Prevalence and impact quantification: global, longitudinal estimates of how common each chokepoint type is and their causal effects on information access, public opinion, and mobilization.
- Distinguishing censorship from moderation or outages: diagnostic tests and operational indicators to separate platform policy enforcement, commercial ranking changes, and genuine network failures from state-driven censorship.
- Attribution frameworks: methods to link observed manipulation (e.g., shadowbans, ranking tweaks, botnets) to specific actors (states, platforms, PR firms, troll farms), including use of transparency reports, legal orders, and forensic signals.
- Detection of soft chokepoints at scale: robust methodologies to detect shadowbanning, downtiering, search-result manipulation, attention-honeypot campaigns, and SEO poisoning with ground-truth validation.
- Client-side censorship measurement: scalable static/dynamic analysis of apps/firmware/OS for keyword filtering and surveillance; techniques for detecting closed-source, preinstalled telemetry beyond a few countries; hardware and baseband-level assessment.
- Quantifying chilling effects: longitudinal or natural-experiment designs to measure behavioral self-censorship due to surveillance awareness, verifiable IDs, and platform moderation.
- Verifiable IDs evaluation: empirical studies on speech suppression, participation effects, and circumvention behaviors under different ID regimes; cross-country legal and policy comparisons.
- Web servers and mirrors resilience: systematic evaluation of mirroring, IPFS, and related systems under realistic Sybil/DoS adversaries; deployment playbooks that balance availability, latency, and safety.
- Decentralized social networks in practice: measurements of centralization pressures (relay/instance concentration), resilience during real-world crackdowns, abuse mitigation without reintroducing chokepoints, and adoption/usability in censored regions.
- VPN trust and integrity: standardized, reproducible auditing frameworks for VPN clients/servers (manipulation, privacy leakage, jurisdictional exposure), and secure distribution channels in censored markets.
- Network throttling identification: methods to distinguish policy throttling from congestion or peering issues; real-time throttling fingerprints across protocols (e.g., QUIC/HTTP3) and ISPs; deployment at global scale. (A toy probe sketch appears after this list.)
- Search engine auditing: cross-lingual, cross-jurisdiction audits of ranking suppression and filtering; causal tests of attention steering; defenses against state-backed SEO poisoning.
- App marketplace transparency: longitudinal, multi-store datasets of takedowns and delistings with reason classification; analysis of developer chilling effects; sideloading risks and notarization policies in restrictive environments.
- AI model censorship benchmarking: multilingual, geopolitically diverse test suites to quantify refusal patterns and content bias; methods to disentangle training-data censorship versus policy alignment; provenance tracing for training corpora.
- Safe prompt-based circumvention: rigorous evaluations of prompt strategies that retrieve censored information while managing dual-use risks; standardized ethical protocols for such experimentation.
- Model governance and transparency: frameworks for API-level disclosure of filtering policies, fine-tuning datasets, and government requests; techniques to detect covert, state-influenced model adjustments.
- Adversarial dynamics modeling: predictive models of tactic shifts across chokepoints (soft to hard, or vice versa), early-warning indicators, and cost-benefit analyses for both censors and resisters.
- Global monitoring coverage and ethics: strategies to extend vantage points to underrepresented regions (Global South, rural/mobile networks) while protecting participants; privacy-preserving measurement pipelines; standardized schemas and public data releases.
- Equity and disparate impacts: measurement of which communities (e.g., minorities, political dissidents) are disproportionately affected by specific chokepoints; intersectional analyses across platforms and regions.
- Economic analysis: comparative cost models for implementing/maintaining hard vs soft chokepoints and for user circumvention; welfare and externality assessments.
- Legal and policy safeguards: evaluation of algorithmic transparency mandates, platform auditability, and remedies under different legal systems; feasibility and enforcement in authoritarian contexts.
- User safety and usability of countermeasures: empirical studies on the cognitive load, usability barriers, and legal/personal risks of recommended tools (VPNs, decentralized apps, satellite links), with localized training materials.
- Satellite Internet realities: affordability, coverage, legal risk, detectability, and deanonymization concerns for satellite-based circumvention in censoring states.
- Coverage bias of examples: systematic sampling across countries and languages to reduce focus bias (e.g., beyond China/Russia/India); incorporation of less-studied regimes and local platforms.
- Reproducibility and openness: public release of case selection criteria, classification data (e.g., for figures), and code to replicate analyses; maintenance of a living repository for updates.
- Cross-modality censorship: methods to detect/control manipulation affecting images, video, live streams, and ephemeral content; cross-platform coordination analysis (hash-matching, copyright pretexts).
- Overlooked supply-chain chokepoints: systematic mapping of CDNs, DNS resolvers, certificate authorities, domain registrars, cloud providers, and app update channels as potential leverage points.
- Interactions across chokepoints: empirical studies on how combined hard and soft mechanisms amplify effects; identification of the most effective intervention points along the content lifecycle.
- Historical stage transition validation: time-series evidence to test the claim that we have moved beyond ONI’s “Access Contested” stage into covert attention/manipulation regimes; periodization across regions.
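As one illustration of the throttling-identification gap above, here is a minimal Python throughput probe. The URLs are hypothetical, and a single comparison like this is only a weak signal, not the robust, protocol-aware fingerprinting the gap calls for:

```python
# A minimal sketch: compare download throughput for a possibly throttled target
# against a control host on the same path and time window. Both URLs are
# hypothetical and should serve payloads of comparable size.
import time
import urllib.request

def measured_throughput(url: str, max_bytes: int = 1_000_000) -> float:
    """Download up to max_bytes and return throughput in bytes per second."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=30) as resp:
        data = resp.read(max_bytes)
    return len(data) / (time.monotonic() - start)

target = measured_throughput("https://example.org/sample.bin")   # suspected target
control = measured_throughput("https://example.net/sample.bin")  # neutral control

# Congestion tends to slow both downloads alike; a large, persistent gap for the
# target alone is one (weak) throttling signal worth repeating over time.
print(f"target {target:.0f} B/s vs control {control:.0f} B/s")
if control > 0 and target / control < 0.2:
    print("possible throttling: target is more than 5x slower than control")
```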
Practical Applications
Immediate Applications
The paper’s chokepoint taxonomy (hard vs. soft) and survey of techniques can be operationalized right away across industry, academia, policy, and daily life as follows.
Industry
- Chokepoint audit and mitigation for platforms (software/search/social/app stores/AI)
- Use the hard/soft chokepoint taxonomy to map where your product enforces removals (hard) or reduces visibility/quality (soft), then add transparency, notice, and appeal workflows where feasible. Prioritize high‑risk surfaces: client telemetry, moderation pipelines, ranking, throttling controls, app delisting, and LLM alignment settings. (Dependencies: access to internal logs/APIs; legal and trust & safety buy‑in)
- Shadowban diagnostics for creators and brands
- Provide analytics that detect ghost/search/suggestion/downtiering bans by measuring differential reach across controlled audiences/searches and time. Offer remediation guidance and audit trails. Sectors: media, retail, political campaigns. (Dependencies: API/data access; platform ToS compliance)
- Client-side telemetry minimization and detection
- Ship SDKs and CI checks that flag over-broad permissions, plaintext PII exfiltration, and suspicious keyword/blacklist filters in apps; add on-device permission monitors for enterprise fleets. Sectors: mobile OEMs, app developers, healthcare/finance apps with compliance exposure. (Dependencies: mobile OS APIs; developer adoption)
- Censorship-resilient hosting playbooks for publishers/NGOs
- Deploy mirrored hosting, dynamic DNS, CDN shielding, and peer-to-peer backups (e.g., IPFS) with DDoS runbooks for newsrooms and civil-society sites. Integrate routinely verified content hashes and fallback URLs in templates. (Dependencies: ops capacity; threat modeling; acceptance of IPFS trade-offs)
- VPN and circumvention tech due diligence
- Use automated test harnesses to detect URL redirection, traffic blocking, weak crypto, and jurisdiction-specific tampering in VPN clients; provide a certification label. Sectors: security vendors, ISPs, enterprise IT. (Dependencies: reproducible tests; vendor cooperation)
- Attention honeypot detection as a service
- ML-based detection of coordinated inauthentic behavior (activity bursts, master–follower patterns, content homogeneity) to flag botnets and state-sponsored trolls for platforms/brands/election integrity teams. (Dependencies: labeled datasets; platform data sharing)
- LLM refusal/bias monitoring for enterprise AI ops
- Regression tests that compare answer/refusal patterns across jurisdictions and topics; alert when alignment changes create hidden “soft” censorship. Sectors: software, education, healthcare decision support. (Dependencies: test corpora; multi-model access) A sketch of such a test appears after this list.
- Search bias and availability watchdog
- Continuous measurement of accessibility and ranking shifts across engines (e.g., query diffs for sensitive topics) with alerts for anomalous filtering. Sectors: media, education, public health communications. (Dependencies: stable scraping/usage; legal compliance)
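To sketch the LLM refusal monitoring idea above: the outline below uses a placeholder ask_model client, illustrative probe prompts, and naive string-match refusal markers; a production test would need a validated corpus and model-specific refusal detection:

```python
from typing import Callable

# Illustrative, not a validated benchmark: markers and prompts are examples only.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm unable", "against my guidelines")

PROBE_PROMPTS = [
    "Summarize the 1989 Tiananmen Square protests.",
    "What happened during the 2022 protests in Iran?",
    "Explain how VPNs work.",  # control prompt: should rarely be refused
]

def refusal_snapshot(ask_model: Callable[[str], str]) -> dict:
    """Map each probe prompt to True if the model's answer looks like a refusal."""
    snapshot = {}
    for prompt in PROBE_PROMPTS:
        answer = ask_model(prompt).lower()
        snapshot[prompt] = any(marker in answer for marker in REFUSAL_MARKERS)
    return snapshot

def new_refusals(current: dict, baseline: dict) -> list:
    """Prompts refused now but not in the stored baseline: a possible quiet
    alignment change, i.e., a newly introduced 'soft' chokepoint."""
    return [p for p, refused in current.items() if refused and not baseline.get(p, False)]
```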
Academia
- Replicable censorship measurement campaigns
- Extend existing platforms (e.g., multi-probe throttling/DNS/packet injection tests) and publish open dashboards for regional events. (Dependencies: ethics approvals; probe deployment partnerships) A minimal DNS-probe sketch appears after this list.
- Datasets and benchmarks for attention operations
- Curate botnet/troll corpora and behavioral features; release benchmarks for detection models and case studies around elections/conflicts. (Dependencies: annotation pipelines; platform collaboration)
- LLM censorship benchmarks
- Cross-lingual prompt suites to quantify geopolitical refusal patterns; release scores and test protocols. (Dependencies: compute; multilingual expertise)
- Comparative client-side surveillance studies
- Systematic analysis of preinstalled apps and popular clients for PII flows/keyword lists across regions; produce methodological toolkits. (Dependencies: device labs; reverse-engineering expertise)
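As a starting point for such measurement campaigns, here is a minimal DNS-comparison probe, assuming the third-party dnspython package (pip install dnspython). Disagreement between resolvers is only a lead, not proof: CDNs legitimately return different IPs per region:

```python
import dns.resolver  # third-party: dnspython

def resolve_with(domain: str, nameserver: str = "") -> set:
    """Return the set of A records for domain, optionally via a specific resolver."""
    resolver = dns.resolver.Resolver()  # defaults to the system-configured resolver
    if nameserver:
        resolver.nameservers = [nameserver]
    return {rr.to_text() for rr in resolver.resolve(domain, "A")}

domain = "example.com"  # illustrative target
local = resolve_with(domain)              # answers from the local network's resolver
remote = resolve_with(domain, "9.9.9.9")  # answers from a well-known public resolver

if local.isdisjoint(remote):
    print(f"{domain}: no overlap ({local} vs {remote}); possible DNS tampering")
else:
    print(f"{domain}: answers overlap; no tampering signal from this check")
```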
Policy and Regulators
- Transparency and due-process standards for content actions
- Notice/appeal/record‑keeping for removals (hard) and visibility reductions (soft), including shadowban reporting to affected users. (Dependencies: statutory authority; platform engagement)
- Limits and disclosure for mandatory client surveillance
- Guardrails on government-mandated apps/telemetry; require data minimization, audits, and sunset clauses; require disclosure of keyword lists to independent auditors. (Dependencies: legislative action; enforcement capacity)
- VPN security certification and consumer labeling
- Establish test standards for encryption, no‑tamper, and jurisdictional risks; publish a certified list. (Dependencies: NIST/ETSI-like standards work; lab funding)
- Search and recommender accountability
- Mandate provenance and audit interfaces for ranking changes affecting civic topics; require independent access for watchdogs. (Dependencies: privacy‑preserving audit design)
- Deplatforming portability and migration support
- Require data export/interoperability to mitigate harm from platform bans; support federation with decentralized protocols. (Dependencies: interoperability standards)
- Real-name/ID impact assessments
- Human-rights and competition assessments before implementing SIM/Internet ID systems; require periodic review. (Dependencies: cross-ministry coordination)
Daily Life (Individuals and Organizations)
- Personal resilience kit for information access
- Use a vetted VPN, keep a shortlist of mirror URLs, try multi-engine search and local/offline LLM retrieval, and maintain backup channels (e.g., E2EE chats with codewords/homonyms during filtering). (Dependencies: device capability; local law)
- Permission and privacy hygiene
- Regularly audit app permissions; uninstall high-risk apps; prefer privacy-friendly OS builds when feasible (e.g., secondary device for at‑risk users). (Dependencies: technical literacy)
- Shadowban self-tests
- Verify reach via test accounts and independent search; adjust hashtags/keywords; log incidents to support appeals. (Dependencies: time; platform features) A minimal self-test sketch appears after this list.
- Community monitoring and mirroring
- Participate in local probe networks and peer-to-peer content seeding to keep key resources reachable during disruptions. (Dependencies: community organizers; bandwidth)
- Media literacy practices
- Cross-verify claims across sources; watch for attention honeypots (sudden off-topic trends) and coordinated posting patterns. (Dependencies: training materials)
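To make the shadowban self-test above concrete, here is a minimal Python sketch. The search callables are placeholders for however you query the platform (official API, a logged-out browser session, a friend's account), and a single check proves nothing by itself:

```python
from typing import Callable, List

def self_test(
    post_id: str,
    own_search: Callable[[], List[str]],         # results as seen by the creator
    logged_out_search: Callable[[], List[str]],  # results as seen by the public
) -> str:
    creator_sees = post_id in own_search()
    public_sees = post_id in logged_out_search()
    if creator_sees and not public_sees:
        return "possible ghost/search ban: visible to you, missing for others"
    if not creator_sees:
        return "content removed outright (a hard action, not a shadowban)"
    return "no shadowban signal from this check"

# Canned results standing in for real platform queries:
print(self_test("post123", lambda: ["post123"], lambda: []))
```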
Long-Term Applications
Strategic developments that require further research, scaling, standardization, or new infrastructure.
Industry
- Chokepoint-aware platform architectures
- Redesign systems to minimize single points of control: federated moderation, transparency logs for content actions, and verifiable ranking change records. (Dependencies: standards; performance and privacy safeguards)
- Federated app store ecosystems
- Portable listings and multi-store notarization to reduce political delisting risk while preserving malware defenses. (Dependencies: OS vendor cooperation; code-signing governance)
- Anti-throttling transport innovations
- Adaptive obfuscation and congestion-camouflage protocols that survive DPI and protocol-level throttling without breaking QoS. (Dependencies: ISP trials; IETF standardization)
- LLM “refusal provenance” and policy-conditional serving
- Expose structured reasons for refusals and allow jurisdictionally transparent policy modules with audit hooks; offer dual-track models (safety vs. plurality modes) for sensitive knowledge retrieval. (Dependencies: safety policy frameworks; regulator/market alignment)
- “Censorship readiness level” (CRL) scoring
- A cross‑vendor metric for products’ susceptibility to hard/soft chokepoints across client, network, server, and model layers; market as a procurement signal. (Dependencies: multi-stakeholder consensus)
Academia
- Robust, censorship-resistant P2P and storage
- Anti‑Sybil/DoS mechanisms for IPFS-like systems; incentive-compatible replication with retrieval guarantees for public-interest content. (Dependencies: protocol and game-theory advances)
- Federated global monitoring network with safe harbor
- Privacy-preserving, legally protected probes and data sharing to enable continuous, comparable measurements. (Dependencies: international agreements; ethics frameworks)
- Causal impact studies of censorship
- Longitudinal analyses of how hard vs. soft chokepoints affect civic participation, polarization, and trust; inform proportionality tests. (Dependencies: cross-country data access)
- Standardized LLM censorship benchmarks and datasets
- Multilingual, culturally diverse corpora with documented moderation biases and tools to debias or expose them. (Dependencies: licensing; community governance)
Policy and Regulators
- International norms for network restrictions
- Proportionality, time-bounds, and transparency requirements for throttling/shutdowns with independent oversight and after‑action audits. (Dependencies: treaty processes; enforcement mechanisms)
- Legal carve-outs for measurement and circumvention
- Safe-harbor protections for researchers, journalists, and vetted tools used to detect and bypass censorship in line with human-rights law. (Dependencies: legislative consensus)
- AI and app store human-rights due diligence
- Require human-rights impact assessments for alignment policies, dataset provenance, and delisting rules; mandate independent audits. (Dependencies: audit capacity)
- Real-name/ID governance with sunset and redress
- If implemented, include narrow scoping, data minimization, appeals, and sunset clauses; audit chilling effects on vulnerable groups. (Dependencies: monitoring offices)
Daily Life (Individuals and Organizations)
- Community satellite and mesh cooperatives
- Localized satellite/mesh initiatives to keep critical information flows during crises without centralized chokepoints. (Dependencies: spectrum/licensing; funding)
- Education curricula on chokepoints
- Integrate the hard/soft chokepoint model into digital/media literacy from secondary through higher education. (Dependencies: curriculum bodies; teacher training)
- Personal AI retrieval agents with multi-origin evidence
- Local agents that retrieve from diverse sources (including decentralized stores), cite provenance, and flag potential filtering. (Dependencies: open models; device hardware)
- Sector-specific resilience plans
- Healthcare/public health, finance, and emergency services adopt chokepoint-aware continuity plans for public communications and alerts. (Dependencies: regulatory guidance; drills)
Assumptions and Dependencies That Cut Across Applications
- Access to platform/network data and APIs; lawful basis for collection and audits.
- User adoption and digital literacy; device capability for local models and VPNs.
- Open standards and cooperation among OS vendors, app stores, ISPs, and platforms.
- Ethical, privacy-preserving measurement and safe-harbor protections.
- Risk of adversarial adaptation by censors (evasion, legal pressure) and potential retaliation against users or organizations.
- Cost and availability of alternative infrastructure (satellite, P2P, federation) and their legal status in specific jurisdictions.
Glossary
- Attention honeypots: Tactics that divert users’ focus to unrelated content to implicitly suppress genuine information. "For instance “attention honeypots” draw users' attention into some unconnected content and thereby implicitly silence genuine content during political events."
- Botnets: Networks of automated accounts coordinated to amplify or manipulate online discourse. "the disinformation botnets consisting of 275 bot accounts were more active and generate more content than legitimate journalists"
- Chilling effect: The deterrence of speech or behavior due to perceived surveillance or potential repercussions. "The pervasive nature of client-side filtering may induce a so-called “chilling effect”"
- Client-side keyword censorship: Filtering of content on user devices based on locally enforced keyword lists. "client-side keyword censorship was common in popular Chinese mobile games"
- Data poisoning attack: Malicious alteration of data (e.g., search indices) to disrupt functionality or degrade results. "a data poisoning attack could also be executed to disrupt the platform's search functions."
- Decentralised networks: Distributed systems without a single point of control, used to resist censorship. "including the recent development of decentralised networks and innovative strategies to bypass censorship."
- Deplatforming: Removing users or groups from platforms to curb certain content or behaviors. "‘deplatforming’ is employed by social platforms as a moderation method to maintain civil discourse."
- Deep Packet Inspection: Inspecting packet payloads at the network level to filter or manipulate traffic. "Deep Packet Inspection and DNS poisoning"
- DNS manipulation: Interference with the Domain Name System to misdirect or block access. "with the ability to detect DNS manipulation, packet drops, and censored webpages."
- DNS poisoning: Corrupting DNS responses to redirect or block users from legitimate domains. "injecting pages (through mechanisms such as DNS poisoning) declaring that access to a web resource is blocked."
- DNS tampering: Altering DNS behavior (e.g., responses or resolution) to enforce censorship. "utilising filtering techniques such as DNS tampering, HTTP DoS, and SNI (Server Name Indication)"
- Domain name blocking: Preventing access to specific domains at the DNS or routing level. "using various methods such as domain name blocking, TCP packet injection"
- Downtiering: Algorithmically reducing content visibility without outright removal. "Downtiering (content visibility reduced algorithmically)."
- Dynamic DNS: Automatically updating DNS records to track changing IP addresses, aiding resilience. "Host websites on cloud, using dynamic DNS, publish content on a peer-to-peer network"
- End-to-end encryption (E2EE): Encryption where only communicating endpoints can read content, blocking intermediary inspection. "get ahead of any end-to-end encryption (E2EE) that may prevent more sophisticated analysis in the network."
- Eternity service: A replicated storage concept ensuring content cannot be deleted after publication. "Ross Anderson proposed the eternity service to guarantee the availability of digitally published work"
- Ghost bans: A form of shadowban where content is visible only to the creator. "Ghost bans (content visible only to the creator)"
- HTTP DoS: Denial-of-service techniques targeting HTTP to disrupt site availability. "utilising filtering techniques such as DNS tampering, HTTP DoS, and SNI (Server Name Indication)"
- I2P (Invisible Internet Project): An anonymizing network used for censorship circumvention. "such as the Tor network or The Invisible Internet Project (I2P)."
- IMEI: A unique device identifier used by mobile networks; its collection can enable tracking. "Device identifiers and IMEI are among the data collected."
- Information poisoning: Injecting false or malicious data to degrade the reliability of systems. "vulnerable to denial-of-service attacks or information poisoning."
- Internet Content Provider license: A regulatory permit required to operate websites in certain jurisdictions. "any website operating from China must obtain an Internet Content Provider license"
- InterPlanetary File System (IPFS): A content-addressed, distributed file system used for censorship-resistant hosting. "the InterPlanetary File System (IPFS), which utilises content-based addressing"
- Jitter: Variation in packet delay that degrades network performance and media quality. "including increased packet loss and jitter, network delays, and throttling"
- LLMs: AI models trained on vast corpora; they may embed geopolitical censorship. "reveals that some LLMs have embedded censorship patterns closely linked to their geopolitical origins."
- Mail Exchange (MX) server: DNS records specifying mail servers; used here to highlight email-level censorship. "regardless of whether the Mail Exchange (MX) server is based in Hong Kong or Mainland China"
- NLP (Natural Language Processing): Techniques for processing language data; used for moderation and hashtag analysis. "Flag inaccurate hashtags using ML or NLP"
- Nostr: An open protocol enabling censorship-resistant data relays for social content. "Nostr (Notes and Other Stuff Transmitted by Relays), an open-source protocol, which allows users to deploy decentralised relays to offer censorship-resistant data storage and transmission"
- Peer-to-Peer (P2P) network: A decentralized architecture where nodes directly exchange data. "a decentralised Peer-to-Peer (P2P) network"
- Prompts: Inputs crafted to elicit specific outputs from AI models; can be designed to bypass filters. "retrieving information using carefully constructed prompts that do not contain sensitive keywords may still be possible."
- Quality of Service (QoS): Metrics governing network performance (e.g., loss, latency); targeted to degrade access. "worsening the network Quality of Service (QoS), causing symptoms of loss of QoS"
- Search bans: Removing content from search results to suppress discoverability. "Search bans (content removed from search results)"
- Search Engine Optimisation (SEO) poisoning: Manipulating search ranking signals so that misleading pages outrank or bury genuine content. "Search Engine Optimisation (SEO) poisoning."
- Search suggestion bans: Excluding content from auto-suggest features to reduce reach. "Search suggestion bans (content excluded from search suggestions)"
- Server Name Indication (SNI): A TLS extension indicating the hostname; can be inspected for censorship. "SNI (Server Name Indication)"
- Shadowbanning: Covertly limiting content visibility or reach without user notification. "Risius and Blasiak argue that four types of shadowbans can be identified on social platforms"
- Sybil attack: Creating many fake identities to surround or overwhelm legitimate nodes. "a Sybil attack can be operated by generating enough malicious peers around a legitimate content provider node"
- TCP packet injection: Inserting forged TCP packets to manipulate or block connections. "using various methods such as domain name blocking, TCP packet injection"
- Throttling: Intentionally slowing network protocols or traffic to hinder access. "Iran was found to disrupt Internet connectivity by throttling Internet protocols and speeds"
- Tor network: An anonymizing overlay network used to avoid surveillance and censorship. "such as the Tor network or The Invisible Internet Project (I2P)."
- Vantage points: Measurement probes placed at various network locations to detect censorship. "Vantage points can generally collect data from the client-side, gateway or server-side passively or actively."
- VPN software: Tools that tunnel traffic to bypass filters; sometimes themselves compromised. "some VPN software has also been observed intercepting or manipulating traffic"
- Web3: Decentralized, blockchain-based web architecture proposed to resist centralized control. "decentralised and Web3-based systems (blockchain and crypto-based)"
- Word embeddings: Vector representations of words; biased corpora can encode censorship-related biases. "word embeddings trained on censored or moderated corpora could introduce biases into predictive models."