Precise characterization of Node2Vec embeddings under softmax training

Characterize the embeddings learned by a 1-layer, 1-hop Node2Vec model trained with the full softmax cross-entropy objective by deriving a precise, general description of the learned embedding directions as functions of the graph structure, without relying on low-rank bottlenecks, explicit regularization, or multi-hop objectives.
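To make the setup concrete, here is a minimal sketch of the 1-layer, 1-hop Node2Vec objective with the full softmax, assuming the 1-hop objective reduces to predicting a uniformly random neighbor of each node (so the target over contexts is the row-normalized adjacency). The function name, hyperparameters, and this reduction are illustrative assumptions, not taken from the paper.

```python
# Sketch: 1-layer, 1-hop Node2Vec trained with full softmax cross-entropy.
# A is an (n, n) float tensor holding an unweighted adjacency matrix.
import torch

def train_node2vec_softmax(A, dim=16, steps=2000, lr=0.1, seed=0):
    torch.manual_seed(seed)
    n = A.shape[0]
    U = torch.nn.Parameter(0.01 * torch.randn(n, dim))  # "word" embeddings
    V = torch.nn.Parameter(0.01 * torch.randn(n, dim))  # "context" embeddings
    opt = torch.optim.Adam([U, V], lr=lr)
    # 1-hop target distribution: each node predicts a uniformly random neighbor,
    # i.e. the row-normalized adjacency matrix.
    P = A / A.sum(dim=1, keepdim=True)
    for _ in range(steps):
        logits = U @ V.T                       # scores over all context nodes
        logp = torch.log_softmax(logits, dim=1)
        loss = -(P * logp).sum(dim=1).mean()   # full softmax cross-entropy
        opt.zero_grad()
        loss.backward()
        opt.step()
    return U.detach()
```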

Background

The authors analyze a simple Node2Vec setup and observe an emergent spectral bias: the learned embeddings align with the top eigenvectors of the (negative) graph Laplacian, even without the usual architectural or regularization pressures. However, a complete, rigorous characterization of what the softmax-trained Node2Vec model learns is currently lacking.
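One way to probe the reported spectral bias empirically is to compare the learned embedding subspace with the top eigenvectors of the negative Laplacian -L = A - D. The sketch below (assumed setup, not the authors' evaluation code) computes the principal angles between the two subspaces; cosines near 1 indicate alignment.

```python
# Sketch: alignment between learned embeddings U and the top-k eigenvectors
# of the negative graph Laplacian -(D - A).
import numpy as np

def spectral_alignment(U, A, k=None):
    A = np.asarray(A, dtype=float)
    U = np.asarray(U, dtype=float)
    k = U.shape[1] if k is None else k
    L = np.diag(A.sum(axis=1)) - A       # graph Laplacian D - A
    evals, evecs = np.linalg.eigh(-L)    # eigenvalues in ascending order
    top = evecs[:, -k:]                  # top-k eigenvectors of -L
    Qu, _ = np.linalg.qr(U)
    Qt, _ = np.linalg.qr(top)
    # Singular values of Qu^T Qt are the cosines of the principal angles.
    return np.linalg.svd(Qu.T @ Qt, compute_uv=False)
```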

This gap prevents directly transferring the spectral-bias explanation to deep sequence models and limits the theoretical understanding of when and why geometric memory emerges from local supervision.

References

A precise characterization of what embeddings are learned even in such simple models is an open question, but a rich line of work (albeit with key assumptions about various pressures outlined shortly) points to a spectral bias: the learned embeddings often align with the top (non-degenerate) eigenvectors of the negative graph Laplacian.

Deep sequence models tend to memorize geometrically; it is unclear why (2510.26745 - Noroozizadeh et al., 30 Oct 2025) in Section 4 (Geometry arises from naturally-occurring spectral bias, without pressures)