Explainable machine learning multi-label classification of Spanish legal judgements (2405.17610v1)
Abstract: Artificial Intelligence techniques such as Machine Learning (ML) have not been exploited to their full potential in the legal domain, partly because they provide insufficient explanations of their decisions. Automatic expert systems with explanatory capabilities can be especially useful when legal practitioners search jurisprudence to gather contextual knowledge for their cases. We therefore propose a hybrid system that applies ML to the multi-label classification of judgements (sentences) and produces visual and natural language descriptions for explanation purposes, supported by Natural Language Processing techniques and deep legal reasoning to identify the entities involved, such as the parties. We are not aware of any prior work on automatic multi-label classification of legal judgements that also provides natural language explanations to end users with comparable overall quality. Our solution achieves over 85% micro precision on a labelled data set annotated by legal experts. This result supports its use to relieve human experts of monotonous, labour-intensive legal classification tasks.
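For readers unfamiliar with the reported metric, micro precision in a multi-label setting pools the counts over all labels and documents before dividing: the total number of correctly predicted labels over the total number of predicted labels. The following is a minimal sketch of that definition only, not the authors' evaluation code; the function name `micro_precision` and the label names are hypothetical and chosen for illustration.

```python
from typing import List, Set


def micro_precision(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Micro-averaged precision: pooled true positives / pooled predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # True positives: predicted labels that also appear in the gold annotation.
    true_positives = sum(len(t & p) for t, p in zip(y_true, y_pred))
    # Every predicted label is either a true positive or a false positive.
    predicted = sum(len(p) for p in y_pred)
    return true_positives / predicted if predicted else 0.0


# Hypothetical gold and predicted label sets for three judgements.
y_true = [{"civil", "contract"}, {"criminal"}, {"labour", "civil"}]
y_pred = [{"civil", "contract"}, {"criminal", "civil"}, {"labour"}]

score = micro_precision(y_true, y_pred)
print(f"micro precision = {score:.2f}")  # 4 correct out of 5 predicted -> 0.80
```

Because the counts are pooled, frequent labels weigh more heavily in this score than rare ones, which is worth keeping in mind when the label distribution is imbalanced.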
- Francisco de Arriba-Pérez
- Silvia García-Méndez
- Francisco J. González-Castaño
- Jaime González-González