Identification of unknown parameters and prediction with hierarchical matrices
Abstract: Statistical analysis of massive datasets very often implies expensive linear algebra operations with large dense matrices. Typical tasks are an estimation of unknown parameters of the underlying statistical model and prediction of missing values. We developed the H-MLE procedure, which solves these tasks. The unknown parameters can be estimated by maximizing the joint Gaussian log-likelihood function, which depends on a covariance matrix. To decrease high computational cost, we approximate the covariance matrix in the hierarchical (H-) matrix format. The H-matrix technique allows us to work with inhomogeneous covariance matrices and almost arbitrary locations. Especially, H-matrices can be applied in cases when the matrices under consideration are dense and unstructured. For validation purposes, we implemented three machine learning methods: the k-nearest neighbors (kNN), random forest, and deep neural network. The best results (for the given datasets) were obtained by the kNN method with three or seven neighbors depending on the dataset. The results computed with the H-MLE method were compared with the results obtained by the kNN method. The developed H-matrix code and all datasets are freely available online.
Sponsor
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.