This paper addresses the problem of automatic human trafficking detection from web-based escort advertisements. The authors present a trafficking detection pipeline developed over three years within the DARPA Memex program. The primary focus is on post hoc bias analysis and mitigation strategies, highlighting the challenges of deploying predictive machine learning algorithms in this sensitive domain. The work led to the integration of an interpretable solution into a search system used by over 200 law enforcement agencies.
The paper identifies that human trafficking activities have a significant presence on the web, often hidden within a large volume of escort advertisements. The goal is to assign a normalized risk score to a set of advertisements, indicating the likelihood of trafficking-related activities. The approach involves analyzing sets of ads (case studies) for subtle clues indicative of trafficking, such as underage escorts, risky services, or movement between cities. The ultimate aim is to aid law enforcement by prioritizing investigations and conserving limited resources.
Key contributions include:
- Defining the trafficking detection problem and motivating automated detection methods.
- Presenting an architectural overview of the trafficking detection pipeline, including data collection, feature extraction, and machine learning stages.
- Describing a bias mitigation plan developed and implemented based on insights from real-world usage and feedback.
- Detailing lessons learned over three years of research, focusing on the importance of interpretability and the evolution of the problem definition.
The paper discusses related work in intelligent systems for counter-human trafficking and the broader field of trust and bias in AI. It references systems like the Domain-specific Insight Graph (DIG) used by law enforcement and studies highlighting biases in standard algorithms.
The problem definition involves a domain-specific collection C of webpages containing escort ads and reviews. The aim is to define an assignment function f : C′ → [0, 1] that assigns a trafficking risk score to a subset C′ ⊂ C of ads. The authors note that the outputs of f cannot be validated except through a real-world investigation.
The challenges addressed include:
- Sparsity: Trafficking-related ads are sparsely distributed among a large number of escort ads.
- Scale: The volume of online sex advertisements is vast, making manual classification infeasible. The Memex repository indexes over 100 million documents.
- Adversarial nature: Traffickers constantly evolve their methods, making it difficult to identify indicative signals.
The approach involves several stages, as illustrated in Figure 1 of the paper:
- Crawling and Data Collection: Web crawlers collect sex-related advertisements from online marketplaces.
- Extraction: Information Extraction (IE) algorithms extract domain-specific attributes such as phone numbers, dates, locations, and text descriptions.
- Labeling: Trafficking experts label escort data, with positive labels obtained from law enforcement partners and negative labels sampled from the corpus.
- Sampling and Clustering: Sampling addresses the scarcity of negatively-labeled ads. Correlation clustering, using algorithms like KWIKCLUSTER, groups related ads based on multi-modal similarity functions.
- Featurization and Binary Classification: Text is converted into numerical feature vectors using bag-of-words approaches and word embedding algorithms. Linear-kernel SVMs, ensemble models, and penalized logistic regression are used for classification.
- Evaluation: The approach is evaluated using the area under the Receiver Operating Characteristic (ROC) curve. Post hoc evaluation studies involve detailed analysis of feature importance.
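The clustering stage above uses KWIKCLUSTER, the classic pivot-based correlation clustering algorithm. A minimal sketch of it is below; the `similar` oracle stands in for the paper's multi-modal similarity functions (which combine signals such as phone numbers, text, and images) and is an assumption of this illustration:

```python
import random

def kwik_cluster(items, similar, seed=None):
    """Pivot-based correlation clustering (KwikCluster).

    `items` is a list of ad identifiers; `similar(a, b)` is a boolean
    similarity oracle (a placeholder for the paper's multi-modal
    similarity functions). Repeatedly pick a random pivot, group all
    remaining items similar to it into one cluster, and recurse on
    the rest.
    """
    rng = random.Random(seed)
    remaining = list(items)
    clusters = []
    while remaining:
        pivot = remaining.pop(rng.randrange(len(remaining)))
        cluster = [pivot]
        rest = []
        for item in remaining:
            (cluster if similar(pivot, item) else rest).append(item)
        clusters.append(cluster)
        remaining = rest
    return clusters
```

For instance, with a similarity oracle that links ads sharing a phone number, each cluster approximates one vendor's set of ads.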
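The featurization and evaluation stages can also be sketched compactly. The snippet below is an illustrative, dependency-free version: a bag-of-words featurizer over whitespace tokens and ROC AUC computed via its Mann-Whitney interpretation (the probability that a random positive outscores a random negative). The paper's actual pipeline uses richer featurizers, word embeddings, and trained classifiers such as linear-kernel SVMs:

```python
def bag_of_words(texts):
    """Map each text to a count vector over the corpus vocabulary."""
    vocab = sorted({tok for t in texts for tok in t.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for t in texts:
        v = [0] * len(vocab)
        for tok in t.lower().split():
            v[index[tok]] += 1
        vectors.append(v)
    return vocab, vectors

def roc_auc(labels, scores):
    """AUC as the Mann-Whitney U statistic: the probability that a
    randomly chosen positive is scored above a randomly chosen
    negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 1.0 means the classifier ranks every positive above every negative; 0.5 is chance performance.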
The bias mitigation plan addresses three main types of bias:
- Labeling Bias: Labeled data is biased toward certain locations, web domains, and the positive class. Mitigation involves sampling additional negative examples conditioned on biases found in positive class data and removing biased features.
- Domain-Specific Bias: Cluster sizes vary, and escort ad content is often duplicated. Mitigation involves sampling negative clusters to resemble positive cluster sizes and using multi-objective clustering to prevent duplicated content.
- Estimation Bias: Training data is limited, and careless partitioning can cause overfitting. Mitigation involves conditioning cross-validation folds to have matching distributions of unwanted features and maintaining model interpretability.
The authors describe a method for diagnosing and evaluating bias using statistical significance tests, such as Pearson's chi-squared test, to evaluate the independence of potentially-biased features relative to the class label. For example, they test the null hypothesis that an ad's web domain is distributed independently of its class label.
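The chi-squared diagnostic can be sketched as follows. The statistic is computed over a contingency table whose rows are class labels and whose columns are values of the suspect feature (e.g., web domain); this simplified version omits the p-value, which in practice would come from the chi-squared distribution with (rows−1)(cols−1) degrees of freedom, e.g., via `scipy.stats.chi2_contingency`:

```python
def chi_squared(table):
    """Pearson's chi-squared statistic for a contingency table.

    Rows are class labels (positive/negative); columns are values of
    a potentially biased feature. A large statistic suggests the
    feature is not independent of the label, i.e., a biased feature.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat
```

A uniform table yields a statistic of 0 (no dependence), while a table where one domain dominates the positive class yields a large value, flagging that domain as a labeling-bias risk.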
Mitigation strategies include:
- Information Removal: Removing tokens in the ad text that refer to locations.
- Conditioned Sampling: Adding or removing data points to ensure that the distribution of the biased feature is similar across positive and negative data points.
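Conditioned sampling can be sketched as a downsampling step: keep only as many negatives per feature value as the positive class's distribution supports. This is a simplified illustration, not the paper's exact procedure; `feature` extracts the potentially biased attribute (e.g., the ad's web domain):

```python
import random
from collections import Counter

def conditioned_sample(positives, negatives, feature, seed=None):
    """Subsample `negatives` so that the distribution of `feature`
    matches its distribution in `positives` (simplified sketch).

    Per feature value, keep negatives in proportion to the positive
    class, scaled by the largest multiple the negative pool supports.
    """
    rng = random.Random(seed)
    pos_counts = Counter(feature(x) for x in positives)
    by_value = {}
    for x in negatives:
        by_value.setdefault(feature(x), []).append(x)
    # Largest multiple of the positive distribution that the
    # negative pool can supply for every feature value.
    scale = min(len(by_value.get(v, [])) / c for v, c in pos_counts.items())
    sample = []
    for v, c in pos_counts.items():
        sample.extend(rng.sample(by_value.get(v, []), int(scale * c)))
    return sample
```

After sampling, the biased feature is (approximately) equally distributed across classes, so the classifier cannot exploit it as a shortcut for the label.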
The paper emphasizes the importance of interpretability in the model, achieved through indicator mining and integration. Indicators are expert-elicited rules and unsupervised text embeddings that supplement ads with clues suggestive of potential trafficking. These indicators are being integrated into the DIG search system.
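Expert-elicited indicator rules of the kind described above can be as simple as pattern matches over ad text. The rules below are purely hypothetical illustrations (the paper does not publish its actual indicators, for obvious operational reasons):

```python
import re

# Hypothetical indicator rules mapping a clue name to a text pattern.
# These patterns are illustrative placeholders, not the paper's
# actual expert-elicited indicators.
INDICATORS = {
    "multi_city": re.compile(r"\b(in town|visiting|new in)\b", re.I),
    "risky_service": re.compile(r"\bno restrictions\b", re.I),
}

def mine_indicators(ad_text):
    """Return the names of all indicator rules that fire on an ad."""
    return [name for name, pat in INDICATORS.items() if pat.search(ad_text)]
```

Because each fired rule has a human-readable name, an investigator using a search system like DIG can see *why* an ad was flagged, which is precisely the interpretability property the authors emphasize.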
The Memex trafficking detection systems are being transitioned to the office of the District Attorney of New York, and non-trafficking versions have been released as open-source software. The DIG system, along with other trafficking detection tools, has contributed to trafficking prosecutions, including a case in San Francisco where a man was sentenced to 97 years to life.