Freeing the Law with LOCUS: A Local Ordinance Corpus for the United States

This presentation introduces LOCUS, the first comprehensive, machine-readable corpus of U.S. local ordinances. Covering 9,239 cities and counties, LOCUS transforms fragmented local laws into a harmonized research substrate that supports systematic legal retrieval, comparative policy analysis, and evaluation of legal AI systems. The talk walks through the technical pipeline behind this 80-gigabyte corpus, presents dimensional analyses of local law, and shows how LOCUS advances both empirical legal research and computational legal applications.
Script
Local laws governing zoning, housing, and business licensing are scattered across thousands of vendor platforms, locked away from systematic analysis. The authors built LOCUS to free this hidden legal landscape, harmonizing codes from over 9,000 cities and counties into a single machine-readable corpus.
Transforming 7 million pages of raw PDFs required sophisticated engineering. The researchers deployed a 1-billion-parameter vision-language model to handle diverse layouts, from scanned documents to born-digital text, then segmented laws into precise sections and removed structural noise.
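The segmentation and noise-removal step can be illustrated with a minimal sketch. It assumes section headings look like "Sec. 12-1. Title" and that structural noise consists of standalone page-number lines; both patterns, and the function name `split_sections`, are hypothetical and are not taken from the LOCUS pipeline itself.

```python
import re

# Hypothetical heading pattern: "Sec. 12-1. Title", "Section 4.2 Title", "§ 7-3 Title".
HEADING = re.compile(
    r"^(?:Sec\.|Section|§)[ \t]*(\d+(?:[-.]\d+)*)\.?[ \t]*(.*)$",
    re.MULTILINE,
)
# Hypothetical structural noise: lines holding only a page marker or a bare number.
PAGE_NOISE = re.compile(
    r"^[ \t]*(?:Page \d+(?: of \d+)?|\d+)[ \t]*$",
    re.MULTILINE,
)


def split_sections(text):
    """Drop page-marker lines, then cut the text at each section heading."""
    cleaned = PAGE_NOISE.sub("", text)
    matches = list(HEADING.finditer(cleaned))
    sections = []
    for i, match in enumerate(matches):
        # A section body runs from the end of its heading to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(cleaned)
        sections.append({
            "number": match.group(1),
            "title": match.group(2).strip(),
            "body": cleaned[match.end():end].strip(),
        })
    return sections


sample = """Sec. 12-1. Definitions.
Words used in this chapter have their ordinary meaning.
Page 3 of 40
Sec. 12-2. Noise limits.
No person shall make unreasonably loud noise after 10 p.m.
"""

for section in split_sections(sample):
    print(section["number"], "|", section["title"], "|", section["body"])
```

Real municipal codes use many more heading conventions than this, so a production pipeline would likely need per-vendor rules or a learned segmenter rather than a single regular expression.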
LOCUS scores every ordinance on four continuous dimensions, two of which feature in this talk. Opacity measures how difficult a law is for a layperson to understand, while paternalism distinguishes self-regarding rules from public-oriented ones. These axes reveal what local laws do and how they communicate.
Dimensional analysis uncovered striking patterns. Florida laws stand out as highly opaque yet not paternalistic. Across the corpus, opacity and paternalism correlate only weakly, with a coefficient of 0.11. County codes are generally more opaque than city codes, and zoning dominates in counties while nuisance rules cluster in cities.
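For readers who want to see how a figure like 0.11 could be computed, here is a minimal sketch. It assumes the reported value is a Pearson-style correlation between per-ordinance opacity and paternalism scores, which the talk does not state explicitly. The scores below are made-up toy values, not LOCUS data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy scores for illustration only (hypothetical, not drawn from the corpus).
opacity = [0.2, 0.5, 0.9, 0.4, 0.7]
paternalism = [0.1, 0.8, 0.3, 0.6, 0.2]

print(f"r = {pearson(opacity, paternalism):.2f}")
```

A coefficient near zero means that knowing how hard a law is to read says little about how paternalistic it is, which is consistent with treating the two as separate axes.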
LOCUS does not resolve which government truly controls a given law. Instead, it selects the most substantial code per county, creating a pragmatic harmonized layer that enables reproducible retrieval and connects legal text to population and policy data.
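The per-county selection rule can be sketched as follows. The talk does not define "most substantial", so this sketch assumes it means the code with the most text. The `Code` record, the `county_fips` field, and the sample entries are hypothetical illustrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Code:
    county_fips: str   # hypothetical county identifier used as the grouping key
    jurisdiction: str  # name of the city or county that published the code
    text: str          # full text of the code


def select_per_county(codes):
    """Keep, for each county, the code with the most text (assumed proxy)."""
    best = {}
    for code in codes:
        current = best.get(code.county_fips)
        if current is None or len(code.text) > len(current.text):
            best[code.county_fips] = code
    return best


# Illustrative records with placeholder text of different lengths.
codes = [
    Code("12086", "Miami-Dade County", "x" * 500),
    Code("12086", "City of Miami", "x" * 300),
    Code("06037", "City of Los Angeles", "x" * 900),
]

selected = select_per_county(codes)
for fips, code in sorted(selected.items()):
    print(fips, code.jurisdiction)
```

Keying on a county identifier is what would let the selected text join cleanly against population and policy tables, though the actual join keys used by LOCUS are not described in the talk.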
By transforming fragmented ordinances into a scalable substrate, LOCUS enables empirical legal studies, regulatory extraction, and robust evaluation of legal language models.