Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
184 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

LiteMORT: A memory efficient gradient boosting tree system on adaptive compact distributions (2001.09419v1)

Published 26 Jan 2020 in cs.LG and stat.ML

Abstract: Gradient boosted decision trees (GBDT) is the leading algorithm for many commercial and academic data applications. We give a deep analysis of this algorithm, especially the histogram technique, which is a basis for the regulized distribution with compact support. We present three new modifications. 1) Share memory technique to reduce memory usage. In many cases, it only need the data source itself and no extra memory. 2) Implicit merging for "merge overflow problem"."merge overflow" means that merge some small datasets to huge datasets, which are too huge to be solved. By implicit merging, we just need the original small datasets to train the GBDT model. 3) Adaptive resize algorithm of histogram bins to improve accuracy. Experiments on two large Kaggle competitions verified our methods. They use much less memory than LightGBM and have higher accuracy. We have implemented these algorithms in an open-source package LiteMORT. The source codes are available at https://github.com/closest-git/LiteMORT

Citations (2)

Summary

We haven't generated a summary for this paper yet.