
Online Updating Huber Robust Regression for Big Data Streams (2209.01718v2)

Published 5 Sep 2022 in stat.ME

Abstract: Big data streams are attracting increasing attention with the development of modern science and information technology. Because limited computer memory cannot accommodate high-volume streaming data, real-time methods that do not store historical data are worth investigating. Moreover, outliers may occur as high-velocity data streams are generated, calling for more robust analysis. Motivated by these concerns, this paper proposes a novel Online Updating Huber Robust Regression algorithm. By extracting key features of new data subsets, it obtains a computationally efficient online updating estimator without storing historical data. Meanwhile, by integrating Huber regression into the framework, the estimator is robust to contaminated data streams, such as heavy-tailed or heterogeneously distributed ones, as well as to cases with outliers. Moreover, the proposed online updating estimator is asymptotically equivalent to the Oracle estimator obtained from the entire data set and has lower computational complexity. Extensive numerical simulations and a real data analysis are also conducted to evaluate the estimation and computational efficiency of the proposed method.
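
To make the idea of updating from batch summaries concrete, here is a minimal Python sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes a generic scheme: each incoming batch is fit with Huber regression by iteratively reweighted least squares, only a weighted Gram matrix and the batch estimate are kept, and the running estimate is the matrix-weighted average of the batch estimates. The names `huber_irls` and `OnlineHuberSketch`, the threshold `delta=1.345`, and the assumption of unit-scale noise are all illustrative choices, not taken from the source.

```python
import numpy as np


def huber_irls(X, y, delta=1.345, n_iter=50, tol=1e-8):
    """Huber regression on a single batch via iteratively reweighted least squares.

    Returns the batch estimate and the weighted Gram matrix X^T W X,
    which is the only batch summary the online update keeps.
    Assumes the noise scale is roughly 1 (no scale estimation is done).
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares start
    for _ in range(n_iter):
        r = y - X @ beta
        # Huber weights: 1 for small residuals, delta/|r| for large ones.
        w = np.minimum(1.0, delta / np.maximum(np.abs(r), 1e-12))
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    r = y - X @ beta
    w = np.minimum(1.0, delta / np.maximum(np.abs(r), 1e-12))
    return beta, (X.T * w) @ X


class OnlineHuberSketch:
    """Hypothetical online updater: stores two summaries, never the raw data."""

    def __init__(self, p, delta=1.345):
        self.A = np.zeros((p, p))  # running sum of weighted Gram matrices
        self.b = np.zeros(p)       # running sum of A_k @ beta_k
        self.delta = delta

    def update(self, X, y):
        beta_k, A_k = huber_irls(X, y, self.delta)
        self.A += A_k
        self.b += A_k @ beta_k
        return self.coef

    @property
    def coef(self):
        # Matrix-weighted average of all batch estimates seen so far.
        return np.linalg.solve(self.A, self.b)
```

Each call to `update` costs time and memory that depend only on the batch size and the number of covariates `p`, which is the property the abstract emphasizes. The sketch needs every batch to have at least `p` linearly independent rows so the per-batch fit is well defined.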