On the Selection of Tuning Parameters for Patch-Stitching Embedding Methods (2207.07218v2)
Abstract: While classical scaling, just like principal component analysis, is parameter-free, other methods for embedding multivariate data require the selection of one or several tuning parameters. This tuning can be difficult due to the unsupervised nature of the situation. We propose a simple, almost obvious, approach to supervise the choice of tuning parameter(s): minimize a notion of stress. We apply this approach to the selection of the patch size in a prototypical patch-stitching embedding method, both in the multidimensional scaling (aka network localization) setting and in the dimensionality reduction (aka manifold learning) setting. In our study, we uncover a new bias–variance tradeoff phenomenon.
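The selection rule stated in the abstract (choose the tuning parameter whose embedding minimizes a notion of stress) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the raw stress (sum of squared differences between given dissimilarities and embedded distances) is only one common notion of stress, and `toy_embed` is a hypothetical Isomap-style stand-in with neighborhood size `k` as its tuning parameter, not the patch-stitching method studied in the paper.

```python
import numpy as np

def pairwise_distances(Y):
    """Euclidean distance matrix of the rows of Y."""
    diff = Y[:, None, :] - Y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def raw_stress(D, Y):
    """Sum of squared errors between dissimilarities D and embedded distances.
    Assumption: raw stress over all pairs; other notions of stress exist."""
    iu = np.triu_indices_from(D, k=1)
    return float(((D[iu] - pairwise_distances(Y)[iu]) ** 2).sum())

def classical_scaling(D, dim):
    """Classical scaling: double-center squared distances, keep top eigenpairs."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    w, V = np.linalg.eigh(B)              # eigenvalues in ascending order
    w, V = w[-dim:], V[:, -dim:]
    return V * np.sqrt(np.clip(w, 0.0, None))

def toy_embed(D, k, dim=2):
    """Hypothetical stand-in embedder (Isomap-style), tuned by k:
    k-nearest-neighbor graph, shortest-path distances, classical scaling."""
    n = D.shape[0]
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]  # skip the point itself
        G[i, nbrs] = D[i, nbrs]
        G[nbrs, i] = D[i, nbrs]           # symmetrize the graph
    for m in range(n):                    # Floyd-Warshall shortest paths
        G = np.minimum(G, G[:, [m]] + G[[m], :])
    return classical_scaling(G, dim)

def select_parameter(embed, D, candidates):
    """Return the candidate with the smallest stress, plus all stresses."""
    stresses = {p: raw_stress(D, embed(D, p)) for p in candidates}
    return min(stresses, key=stresses.get), stresses

# Toy data: a 5 x 5 planar grid, with exact Euclidean dissimilarities.
grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
D = pairwise_distances(grid)
best, stresses = select_parameter(toy_embed, D, candidates=[4, 8, 24])
print(best, stresses)
```

In this noiseless toy example the graph distances only approach the true distances as `k` grows, so the largest neighborhood attains the smallest stress. With noisy dissimilarities the minimizer need not sit at an extreme, which is the kind of tradeoff the abstract alludes to.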