- The paper introduces computationally efficient methods that transform data-intensive voxel-based grain maps into compact, accurate diagram representations using anisotropic power diagrams.
- Using coresets and interior removal, the s-GBPD approach reduces data from 69 million to 86,000 points while maintaining a classification accuracy above 0.93.
- A direct linear programming model (DiLPM) optimizes all parameters, reaching an accuracy of about 0.956 and improved topological fidelity, albeit at a significantly higher computational cost.
This paper addresses the challenge of representing complex 3D grain structures in polycrystalline materials using computationally efficient and accurate mathematical models called diagrams or tessellations. Traditional voxel-based grain maps are data-intensive and slow to process, hindering the study of dynamic processes like grain growth. The paper focuses on improving and extending the Generalized Balanced Power Diagram (GBPD) method [Alpers2015], which uses Anisotropic Power Diagrams (APDs) to represent grains. APDs are defined by parameters controlling grain shape (matrices A), position (sites S), and size (weights Γ).
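To make the role of the three parameter sets concrete, the following is a minimal sketch of how an APD assigns points to grains. It assumes the usual reading of the anisotropic norm, ‖v‖²_A = vᵀAv, and assigns each point to the grain i that minimizes h_i(x) = ‖x − s_i‖²_{A_i} + γ_i. The function names `apd_labels` and `accuracy` are illustrative, not taken from the paper, and the plain fraction of correctly labeled points is only a stand-in for the paper's fit measure.

```python
import numpy as np

def apd_labels(X, A, S, gamma):
    """Label each point with the index i that minimises
    h_i(x) = (x - s_i)^T A_i (x - s_i) + gamma_i.

    X: (n, d) points, A: (k, d, d) matrices, S: (k, d) sites, gamma: (k,) weights.
    """
    diff = X[:, None, :] - S[None, :, :]                # (n, k, d)
    quad = np.einsum("nkd,kde,nke->nk", diff, A, diff)  # (n, k) quadratic forms
    return np.argmin(quad + gamma[None, :], axis=1)

def accuracy(pred, Y):
    """Fraction of points whose predicted grain matches the scan label (assumed measure)."""
    return float(np.mean(pred == Y))

# Toy example: two sites on the x-axis with isotropic shape matrices.
X = np.array([[1.0, 0.0], [3.0, 0.0], [3.9, 0.0]])
S = np.array([[0.0, 0.0], [4.0, 0.0]])
A = np.stack([np.eye(2), np.eye(2)])
print(apd_labels(X, A, S, np.zeros(2)))            # equal weights
print(apd_labels(X, A, S, np.array([0.0, 10.0])))  # larger gamma_1 shrinks cell 1
```

With the sign convention used here, increasing γ_i shrinks cell i, which is how the weights control grain size while A and S stay fixed.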
The core problem is finding the optimal APD parameters (A,S,Γ) that best fit a given grain scan (a labeled dataset (X,Y) where X are points and Y are grain labels), minimizing the classification error. The paper introduces two main approaches:
- Sparse-GBPD (s-GBPD): This approach significantly accelerates the original GBPD method, which optimizes only the size parameters Γ while keeping A (typically inverse covariance matrices) and S (centroids) fixed. The acceleration comes from using coresets – small, weighted subsets of the original data points (X′,Ω′) that approximate the clustering cost on the full dataset X. Solving the underlying weight-constrained anisotropic clustering problem (formulated as a linear program, LP) on the much smaller coreset drastically reduces computation time. Two coreset construction methods are discussed and evaluated:
- Pencil Coresets: Points are projected onto rays emanating from grain sites, and points along each ray are grouped into batches represented by weighted centroids. The coreset size depends on the number of grains (k) and desired precision (ϵ), not the original data size n.
- Resolution Coresets: A coarser grid resolution is used, effectively reducing the number of points while maintaining the grid structure. The required resolution also depends only on k and ϵ.
- Interior Removal: An additional technique, usable when the full grain scan is known, removes points deep inside a grain (far from boundaries according to a grid graph distance), further sparsifying the dataset used for LP solving.
- Direct Linear Programming Model (DiLPM): This is a new, direct approach that optimizes all APD parameters (A,S,Γ) simultaneously to minimize classification error. It reformulates the problem in a way similar to a multi-class Support Vector Machine (SVM). Separation constraints enforce that every point $x$ belonging to grain $G_i$ satisfies $h_i(x) \le h_l(x)$ for every other grain $l$, where $h_i(x) = \|x - s_i\|_{A_i}^2 + \gamma_i$. To handle imperfect fits, slack variables ($\zeta_j$) are introduced for points near boundaries, while points deep inside grains (identified using the interior removal concept) are required to be strictly separated. This results in a large linear program. A mechanism is included to ensure that the resulting matrices $A_i$ are positive definite.
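The sparsification ideas above (a coarser grid plus interior removal) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's construction: the coarse grid is obtained by plain subsampling with a uniform weight per kept voxel, "deep inside" is taken to mean that all voxels within a face-neighbor grid-graph distance `depth` share the same label, and the names `interior_mask` and `sparsify` are hypothetical.

```python
import numpy as np
from scipy import ndimage

def interior_mask(labels, depth):
    """True where every voxel within grid-graph distance `depth` (>= 1) has the same label."""
    # Face-neighbour structuring element, grown into an L1 ball of radius `depth`.
    cross = ndimage.generate_binary_structure(labels.ndim, 1)
    ball = ndimage.iterate_structure(cross, depth)
    # mode="nearest" repeats edge values, so the scan border is not treated as a grain boundary.
    lo = ndimage.minimum_filter(labels, footprint=ball, mode="nearest")
    hi = ndimage.maximum_filter(labels, footprint=ball, mode="nearest")
    return lo == hi

def sparsify(labels, stride, depth):
    """Subsample the grid by `stride`, then drop interior voxels of the coarse grid.

    Returns coordinates (in original voxel units), grain labels, and weights.
    """
    coarse = labels[(slice(None, None, stride),) * labels.ndim]
    keep = ~interior_mask(coarse, depth)
    points = np.argwhere(keep) * stride
    # Assumption: each kept voxel stands in for a stride^d block of the original grid.
    weights = np.full(len(points), float(stride) ** labels.ndim)
    return points, coarse[keep], weights

# Toy scan: two grains separated by a flat boundary.
labels = np.zeros((8, 4, 4), dtype=int)
labels[4:] = 1
points, kept_labels, weights = sparsify(labels, stride=2, depth=1)
print(len(points), "of", labels.size, "voxels kept")
```

Only the voxels next to the boundary survive, which is the property both s-GBPD and DiLPM rely on: points far from any boundary contribute little to deciding where the boundaries lie.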
The methods were evaluated on a real-world 3D dataset of a beta titanium alloy (339×339×599 voxels, k=591 grains, n≈69 million points).
Key Findings:
- s-GBPD is effective: Using coresets (especially resolution coresets combined with interior removal) reduced the number of points needed for the LP from ~69 million to ~86,000 while achieving high accuracy (ΦG>0.93). This made the computation feasible on standard hardware in under 3 minutes. Resolution coresets slightly outperformed pencil coresets.
- DiLPM yields higher accuracy: Optimizing all parameters with DiLPM resulted in better overall fit (ΦG≈0.956) and significantly better topological accuracy (e.g., 63.5% correct neighborhoods vs 35.7% for s-GBPD).
- Trade-off: DiLPM's higher accuracy comes at a much higher computational cost (~627 minutes vs. ~2.6 minutes for s-GBPD on the reduced dataset).
- Practical Implication: s-GBPD offers a practical, fast method for generating high-quality diagram representations of large 3D polycrystals, suitable for many applications. DiLPM serves as a benchmark for maximum achievable accuracy when computation time is less critical.
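As a quick check of the figures quoted above, the short calculation below recomputes the voxel count from the stated grid dimensions and derives the approximate reduction and runtime ratios. The inputs 86,000, 627, and 2.6 are the rounded values reported in this summary, so the resulting ratios are approximate as well.

```python
voxels = 339 * 339 * 599           # grid dimensions of the titanium alloy scan
reduced_points = 86_000            # approximate number of points after sparsification
dilpm_minutes, sgbpd_minutes = 627, 2.6

print(voxels)                                  # 68837679, i.e. about 69 million
print(round(voxels / reduced_points))          # roughly 800-fold fewer points
print(round(dilpm_minutes / sgbpd_minutes))    # DiLPM is roughly 241 times slower
```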
The paper concludes that s-GBPD provides a viable tool for converting grain maps into compact diagram representations. The approach can potentially be adapted for scenarios where only grain measurements (centroids, volumes, moments) are available, not the full scan, by using statistical assumptions to guide coreset construction and interior removal.