BAGSFit: 3D Fitting & GOF Assessment
- BAGSFit refers to two separate frameworks: a deep learning model for precise 3D primitive segmentation and fitting, and a data-driven goodness-of-fit test for binary classifiers.
- The 3D primitive fitting component uses a two-stage process with a deep FCN for boundary-aware segmentation followed by RANSAC-based geometric verification to reliably identify shapes like planes, spheres, cylinders, and cones.
- The statistical goodness-of-fit test leverages data splitting and nonparametric regression to adaptively bin residuals, ensuring rigorous error control and enhanced detection of model misspecifications.
BAGSFit encompasses two distinct frameworks: (1) a deep learning–based system for primitive fitting in 3D point clouds, and (2) an adaptive, data-driven statistical test for assessing the goodness of fit of classifiers. The two appear in separate literatures under closely related acronyms and were developed independently (Li et al., 2018; Zhang et al., 2019). This article describes each framework in turn, covering its theoretical basis, technical architecture, training regime, and empirical performance.
1. Boundary-Aware Geometric Segmentation Framework for Primitive Fitting
The BAGSFit framework for primitive fitting formulates the segmentation and analytic fitting of 3D geometric primitives in noisy and cluttered sensor data as a two-stage process. It is designed to robustly detect, segment, and model multiple primitive classes—including planes, spheres, cylinders, and cones—within complex range images or point clouds (Li et al., 2018).
The pipeline proceeds as follows:
- Stage 1: Deep boundary-aware segmentation via a fully convolutional network (FCN), labeling each point with a primitive class and producing a boundary map, all without explicit geometric model fitting.
- Stage 2: Geometric verification and model fitting in which class-specific point clusters are refined via RANSAC-style optimization and analytic model fitting, with outlier rejection and parameter estimation.
Let $p$ denote a point (or pixel) and $K$ the number of primitive classes. The network outputs class-probability maps $P_k(p)$, $k = 1, \dots, K$, and a boundary-probability map $B(p)$. The argmax segmentation assigns each point the label $\hat{k}(p) = \arg\max_k P_k(p)$.
Hypotheses are then geometrically verified and analytically fit using efficient RANSAC [Schnabel et al. 2007].
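The argmax labeling step can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function name `argmax_labels` and the nested-list layout `class_probs[k][r][c]` are assumptions made for the example, not part of the original implementation.

```python
from typing import List


def argmax_labels(class_probs: List[List[List[float]]]) -> List[List[int]]:
    """Assign each pixel the class with the highest predicted probability.

    class_probs[k][r][c] is the probability of class k at pixel (r, c)
    (hypothetical layout chosen for this sketch).
    """
    num_classes = len(class_probs)
    rows, cols = len(class_probs[0]), len(class_probs[0][0])
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Index of the class map with the largest value at this pixel.
            labels[r][c] = max(range(num_classes),
                               key=lambda k: class_probs[k][r][c])
    return labels
```

In practice this would be a single vectorized argmax over the class axis of the network output; the loop form is used here only to keep the example dependency-free.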
2. Network Design, Losses, and Instance Detection
The segmentation front end uses a DeepLabv2-inspired FCN with a ResNet-101 backbone and dilated (atrous) convolutions to produce dense per-point classification and boundary detection. The input is either a normal map (3 channels) or a concatenation of point coordinates and normals (6 channels) at VGA resolution. Atrous convolutions and skip connections are used to preserve spatial resolution and boundary accuracy.
Output heads consist of:
- A class head (multinomial softmax for mutually exclusive classes, or multi-binomial sigmoids for independent labeling) yielding the class-probability maps $P_k$.
- A binary boundary head yielding the boundary-probability map $B$ via a convolution and sigmoid activation.
The loss function combines:
- Either multinomial cross-entropy for segmentation,
- Or (optionally) multi-binomial cross-entropies with self-balancing weights $\beta_k \propto 1/(\text{\# training pixels in class } k)$ to address class imbalance,
- Plus an analogous binary cross-entropy loss for the boundary map.
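The self-balancing weights and the multi-binomial loss can be illustrated as follows. This is a minimal sketch under stated assumptions: the weights are normalized to sum to one, one weight is applied per class head, and the names `balancing_weights` and `multi_binomial_loss` are hypothetical. The source only states that the weights are inversely proportional to the per-class pixel count.

```python
import math
from collections import Counter
from typing import List, Sequence


def balancing_weights(labels: Sequence[int], num_classes: int) -> List[float]:
    """Weights proportional to 1 / (number of training pixels in class k).

    Assumes `labels` is non-empty. Normalizing to sum to 1 is an
    assumption of this sketch, not something the source specifies.
    """
    counts = Counter(labels)
    inv = [1.0 / counts[k] if counts[k] > 0 else 0.0
           for k in range(num_classes)]
    total = sum(inv)
    return [w / total for w in inv]


def multi_binomial_loss(probs: Sequence[Sequence[float]],
                        labels: Sequence[int],
                        weights: Sequence[float],
                        eps: float = 1e-12) -> float:
    """Weighted sum of per-class binary cross-entropies, averaged over pixels.

    probs[i][k] is the sigmoid output of class head k at pixel i.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        for k, w in enumerate(weights):
            target = 1.0 if y == k else 0.0
            q = min(max(p[k], eps), 1.0 - eps)  # clip to avoid log(0)
            total -= w * (target * math.log(q)
                          + (1.0 - target) * math.log(1.0 - q))
    return total / len(labels)
```

With three pixels of class 0 and one of class 1, the rarer class receives three times the weight of the common one, which is the intended effect of the balancing scheme.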
Instance-level segmentation exploits joint boundary prediction, with the ground-truth boundary label set to 1 on pixels whose neighborhood contains a different instance ID. At inference, thresholding the predicted boundary map removes boundary pixels, and the connected components that remain inside each class become separate instances, which splits adjacent or intersecting instances.
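The boundary-carving idea can be sketched as a flood fill over non-boundary pixels. The threshold value of 0.5, the 4-connectivity, and the function name `split_instances` are all assumptions made for this example; the source does not state which settings the original implementation uses.

```python
from collections import deque
from typing import List


def split_instances(labels: List[List[int]],
                    boundary_prob: List[List[float]],
                    threshold: float = 0.5) -> List[List[int]]:
    """Label connected components of same-class, non-boundary pixels.

    Returns a map of instance IDs starting at 1; boundary pixels keep ID 0.
    The threshold and 4-connectivity are assumptions of this sketch.
    """
    rows, cols = len(labels), len(labels[0])
    instance = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if instance[r0][c0] or boundary_prob[r0][c0] >= threshold:
                continue  # already assigned, or carved out as boundary
            next_id += 1
            instance[r0][c0] = next_id
            queue = deque([(r0, c0)])
            while queue:  # breadth-first flood fill
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and instance[nr][nc] == 0
                            and boundary_prob[nr][nc] < threshold
                            and labels[nr][nc] == labels[r0][c0]):
                        instance[nr][nc] = next_id
                        queue.append((nr, nc))
    return instance
```

Two touching regions of the same class end up with different instance IDs as long as the predicted boundary separates them, which is what lets each region be passed to the fitting stage as its own hypothesis.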
3. Primitive Hypothesis and Geometric Verification
Following segmentation and boundary carving, each connected component within a predicted class forms a primitive hypothesis. Robust fitting of analytic models (plane, sphere, cylinder, cone) is performed using efficient RANSAC with preset hyperparameters: a minimum inlier count, a point-to-model distance threshold of $0.03$ m, two angular thresholds (one for inlier scoring, one for normal-based re-expansion), and a RANSAC skip probability.
Primitive-specific residuals are used for model scoring:
- Plane: Perpendicular (point-to-plane) distance.
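- Sphere: $\bigl|\,\lVert x - c \rVert - r\,\bigr|$, for center $c$ and radius $r$.
- Cylinder: $\bigl|\,\lVert (x - a) \times d \rVert - r\,\bigr|$, for a point $a$ on the axis, unit axis direction $d$, and radius $r$.
- Cone: Joint angular and radial residual relative to the cone axis.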
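For the plane, sphere, and cylinder, the residuals can be sketched with the standard point-to-surface distances. This is an assumption about the exact form: the source names the residual types but the parameterizations used here (unit normal plus offset, center plus radius, axis point plus unit direction plus radius) and all function names are illustrative. The cone is omitted because the source does not give enough detail to reconstruct it.

```python
import math
from typing import Sequence

Vec = Sequence[float]


def _sub(a: Vec, b: Vec):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a: Vec) -> float:
    return math.sqrt(_dot(a, a))


def plane_residual(x: Vec, normal: Vec, offset: float) -> float:
    """Perpendicular distance to the plane normal . x + offset = 0 (unit normal)."""
    return abs(_dot(normal, x) + offset)


def sphere_residual(x: Vec, center: Vec, radius: float) -> float:
    """Distance from x to the sphere surface."""
    return abs(_norm(_sub(x, center)) - radius)


def cylinder_residual(x: Vec, axis_point: Vec, axis_dir: Vec,
                      radius: float) -> float:
    """Distance from x to the cylinder surface (axis_dir must be unit length)."""
    dist_to_axis = _norm(_cross(_sub(x, axis_point), axis_dir))
    return abs(dist_to_axis - radius)
```

A point counts as an inlier of a hypothesis when its residual falls below the distance threshold, for example `sphere_residual(x, c, r) <= 0.03` for the 0.03 m setting mentioned earlier.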
After RANSAC identification and parameter refinement (via least squares on inliers), detected primitives are marked final, and their inlier points are removed from subsequent hypotheses to avoid duplicate detections.
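To make the verification step concrete, the following sketch runs a basic RANSAC loop for a single plane. It is plain RANSAC, not the efficient RANSAC of Schnabel et al., and it omits the least-squares refinement and the normal-based checks. The function names, the iteration count, and the fixed seed are assumptions for the example; only the 0.03 m distance threshold comes from the text.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Plane = Tuple[Point, float]  # (unit normal n, offset d) with n . x + d = 0


def plane_from_points(p: Point, q: Point, r: Point) -> Optional[Plane]:
    """Plane through three points, or None if they are (nearly) collinear."""
    u = (q[0] - p[0], q[1] - p[1], q[2] - p[2])
    v = (r[0] - p[0], r[1] - p[1], r[2] - p[2])
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    if length < 1e-12:
        return None
    n = (n[0] / length, n[1] / length, n[2] / length)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points: Sequence[Point], dist_thresh: float = 0.03,
                 iterations: int = 200,
                 seed: int = 0) -> Tuple[Optional[Plane], List[int]]:
    """Return the plane with the most inliers and the inlier indices."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    pts = list(points)
    best_model: Optional[Plane] = None
    best_inliers: List[int] = []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(pts, 3))
        if model is None:
            continue
        n, d = model
        inliers = [i for i, x in enumerate(pts)
                   if abs(n[0] * x[0] + n[1] * x[1] + n[2] * x[2] + d)
                   <= dist_thresh]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

In the pipeline described above, the points passed in would be those of one segmented component, and the returned inliers would be removed before the next hypothesis is processed.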
4. Data Regime, Optimization, and Empirical Metrics
Synthetic data for training and evaluation is generated using Blender plus Blensor, simulating Kinect acquisition over 20 room scenes. Eighteen are used for training (3,456 scans), two for validation (384 scans), and 20 fresh scenes (360 scans) for testing. Kinect-style depth maps and ground-truth instance IDs are provided per scan; point normals are estimated using a PCA scheme. Training runs for 50 epochs in Caffe with the DeepLabv2 codebase, using extensive augmentation (random crops) and standard settings (SGD with momentum, weight decay, linear learning rate decay).
Inference is performed at FPS on an NVIDIA Titan X GPU.
Key results on synthetic test data:
- Segmentation: Pixel-wise accuracy up to , mean IoU up to , F1 score up to .
- Primitive fitting (over 9,609 instances in 720 scans): Primitive Average Precision (PAP) up to (cf. for pure ERANSAC), Primitive Average Recall (PAR) up to (cf. ERANSAC), mean fitting error cm ($0.81$ cm ERANSAC).
- On real-world Kinect scans, simulated-trained BAGSFit recovers precise geometry with minimal false positives.
5. Binary Adaptive Goodness-of-Fit Test for Classification (BAGofT/BAGSFit)
In classification, BAGSFit (also presented as “BAGofT”) denotes a flexible, asymptotically valid, data-driven goodness-of-fit (GOF) assessment for general binary classifiers (Zhang et al., 2019). It assesses whether a classifier's estimated success probability $\hat{p}(x)$ aligns with the true label probability $p(x) = P(Y = 1 \mid X = x)$, beyond mere classification accuracy.
Core framework:
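- Two-stage split: Randomly partition the data into a training set ($D_{\text{train}}$) and a validation set ($D_{\text{val}}$). A general learner is fit on the training set, yielding the estimated probability $\hat{p}(x)$.
- Compute residuals and fit a nonparametric regression (default: Random Forest) mapping the covariates to the residuals.
- For each candidate number of groups $K$ up to a maximum $K_{\max}$, partition the range of predicted residuals into $K$ quantile bins, compute a chi-squared-style statistic, and select $K$ adaptively to maximize fit discrepancy detection.
- On validation, compute a test statistic
$$T = \sum_{k=1}^{K} \frac{R_k^2}{V_k},$$
with $R_k$ and $V_k$ aggregating residuals and predicted variances within each group $k$ (see original for formal definitions). Under $H_0$ (the classifier's probabilities are correctly specified), $T$ is asymptotically $\chi^2_K$-distributed. The procedure controls Type I error and is consistent under alternatives.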
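The grouping and the validation statistic can be sketched as follows. This is a hedged illustration, not the reference implementation: it assumes the statistic is the sum over groups of the squared residual sum divided by the summed predicted Bernoulli variance, and that groups are equal-size rank bins of the predicted residuals. The names `quantile_groups` and `gof_statistic` are hypothetical, and the adaptive choice of the number of groups is not shown.

```python
from typing import List, Sequence


def quantile_groups(scores: Sequence[float], num_groups: int) -> List[int]:
    """Assign each observation to one of num_groups rank-based bins."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        groups[i] = min(rank * num_groups // n, num_groups - 1)
    return groups


def gof_statistic(y: Sequence[int], p_hat: Sequence[float],
                  groups: Sequence[int], num_groups: int) -> float:
    """Sum over groups of (sum of residuals)^2 / (sum of p_hat * (1 - p_hat)).

    y holds 0/1 labels on the validation set, p_hat the fitted probabilities.
    The exact form is an assumption of this sketch.
    """
    resid_sum = [0.0] * num_groups
    var_sum = [0.0] * num_groups
    for yi, pi, g in zip(y, p_hat, groups):
        resid_sum[g] += yi - pi
        var_sum[g] += pi * (1.0 - pi)
    # Groups with zero predicted variance are skipped to avoid division by zero.
    return sum(r * r / v for r, v in zip(resid_sum, var_sum) if v > 0)
```

The resulting value would then be compared against a chi-squared distribution whose degrees of freedom equal the number of groups; a large value suggests that the fitted probabilities deviate systematically from the labels in at least one group.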
6. Empirical Validation and Best Practices
For primitive fitting, BAGSFit's advantage lies in integrating learned segmentation—including instance boundaries—with robust geometric verification, yielding significant improvements in precision and recall compared to prior RANSAC-only workflows.
For the GOF test, simulations and real-world tasks (logistic regression misspecification, neural network underfitting, genomics, image, and medical data) demonstrate both rigorous Type I error control and high power—often outperforming established GOF tests such as Hosmer–Lemeshow or le Cessie–van Houwelingen in detecting subtle or spatially-localized misspecification. Splitting ratios and binning strategies are adaptively selected; repeated random splits and aggregation of p-values (optionally with bootstrap calibration) are recommended for stability.
7. Software and Availability
For the primitive fitting framework, code is based on Caffe and DeepLabv2 environments with standard deep learning infrastructure. For the GOF methodology, a dedicated R package “BAGofT” (also documented as “BAGSFit”) is available, implementing all procedural steps: data splitting, model fitting, residual regression, binning, test statistic calculation, and p-value reporting.
References:
- Primitive fitting: "Primitive Fitting Using Deep Boundary Aware Geometric Segmentation" (Li et al., 2018)
- Classification goodness-of-fit: "Is a Classification Procedure Good Enough? A Goodness-of-Fit Assessment Tool for Classification Learning" (Zhang et al., 2019)