Sublinear Time Algorithm for Online Weighted Bipartite Matching (2208.03367v2)
Abstract: Online bipartite matching is a fundamental problem in online algorithms. The goal is to match two sets of vertices so as to maximize the sum of the edge weights, where the vertices of one set arrive one at a time in a sequence, each together with its edge weights. In practical recommendation systems and search engines, the weights are determined by the inner product between the deep representation of a user and the deep representation of an item. A standard online matching algorithm spends $nd$ time on a linear scan over all $n$ items to compute the weights (assuming each representation vector has length $d$) and then decides the matching based on those weights. In practice, however, $n$ can be very large, for example on online e-commerce platforms, so reducing the time needed to compute the weights is a problem of practical significance. In this work, we provide a theoretical foundation for computing the weights approximately. We show that, with our proposed randomized data structures, the weights can be computed in sublinear time while still preserving the competitive ratio of the matching algorithm.
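To make the baseline cost concrete, the following is a minimal sketch of the linear-scan procedure the abstract describes, assuming a simple greedy rule (match each arriving user to the heaviest still-free item). The greedy rule, the function names `inner` and `greedy_online_matching`, and the `weight_fn` parameter are illustrative assumptions, not the paper's algorithm or data structures. The `weight_fn` hook only marks the place where an approximate weight oracle could be substituted for the exact inner product.

```python
def inner(u, v):
    """Exact inner product of two length-d vectors: O(d) time."""
    return sum(a * b for a, b in zip(u, v))


def greedy_online_matching(items, users, weight_fn=inner):
    """Greedy online matching by linear scan (illustrative baseline).

    items: list of n item vectors, known in advance.
    users: sequence of user vectors, revealed one at a time.
    weight_fn: hypothetical hook; defaults to the exact inner product.
    Returns (matching, total_weight), where matching maps the arrival
    index of a user to the index of the item it was matched to.
    """
    free = set(range(len(items)))
    matching = {}
    total = 0.0
    for t, user in enumerate(users):
        if not free:
            break
        # Linear scan over the free items: up to n weight evaluations,
        # each costing O(d), hence O(n d) work per arrival.
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(free), key=lambda i: weight_fn(items[i], user))
        w = weight_fn(items[best], user)
        if w > 0:  # assumption: skip non-positive weights
            matching[t] = best
            free.remove(best)
            total += w
    return matching, total


# Small usage example with n = 3 items and d = 2.
items = [[1, 0], [0, 1], [1, 1]]
users = [[1, 1], [1, 0], [0, 1]]
matching, total = greedy_online_matching(items, users)
# matching == {0: 2, 1: 0, 2: 1}, total == 4.0
```

Each arrival triggers a scan whose cost grows with both $n$ and $d$, which is the bottleneck the abstract targets. Replacing `weight_fn` with an approximate oracle changes only the cost of evaluating the weights, not the structure of the matching loop.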