Co-Matching: Towards Human-Machine Collaborative Legal Case Matching (2405.10248v1)
Abstract: Recent efforts have aimed to improve AI machines in legal case matching by integrating legal domain knowledge. However, successful legal case matching requires the tacit knowledge of legal practitioners, which is difficult to verbalize and encode into machines. This emphasizes the crucial role of involving legal practitioners in high-stakes legal case matching. To address this, we propose a collaborative matching framework called Co-Matching, which encourages both the machine and the legal practitioner to participate in the matching process, integrating tacit knowledge. Unlike existing methods that rely solely on the machine, Co-Matching allows both the legal practitioner and the machine to determine key sentences and then combine them probabilistically. Co-Matching introduces a method called ProtoEM to estimate human decision uncertainty, facilitating the probabilistic combination. Experimental results demonstrate that Co-Matching consistently outperforms existing legal case matching methods, delivering significant performance improvements over human- and machine-based matching in isolation (on average, +5.51% and +8.71%, respectively). Further analysis shows that Co-Matching also ensures better human-machine collaboration effectiveness. Our study represents a pioneering effort in human-machine collaboration for the matching task, marking a milestone for future collaborative matching studies.
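The abstract does not spell out how the human and machine key-sentence decisions are combined probabilistically, so the following is only a minimal sketch under stated assumptions, not the paper's ProtoEM or its actual combination rule. It swaps in a generic naive-Bayes style fusion: the machine supplies a probability that a sentence is key, the practitioner supplies a yes/no pick, and a hypothetical `human_accuracy` value stands in for the estimated reliability of the human decision. The function name and all parameters are illustrative.

```python
def combine(machine_prob: float, human_pick: bool, human_accuracy: float) -> float:
    """Posterior probability that a sentence is 'key', given both signals.

    Assumption (not from the paper): the human pick and the machine score are
    conditionally independent given the true label, and the human is correct
    with probability `human_accuracy` whichever the true label is.
    """
    # Likelihood of the observed human decision under each hypothesis.
    like_key = human_accuracy if human_pick else 1.0 - human_accuracy
    like_not = 1.0 - human_accuracy if human_pick else human_accuracy

    # Bayes' rule, using the machine probability as the prior.
    numerator = like_key * machine_prob
    denominator = numerator + like_not * (1.0 - machine_prob)

    # Fall back to the machine score if the two signals are contradictory
    # with certainty (denominator of zero).
    return numerator / denominator if denominator > 0 else machine_prob


if __name__ == "__main__":
    # An undecided machine (0.5) defers to a fairly reliable human (0.8).
    print(combine(0.5, True, 0.8))
    print(combine(0.5, False, 0.8))
    # A human no better than chance (0.5) leaves the machine score unchanged.
    print(combine(0.9, True, 0.5))
```

Under these assumptions, a higher estimated human reliability gives the practitioner's pick more weight, and a lower one gives more weight to the machine. This is why a combination of this kind needs some estimate of human decision uncertainty.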