Probing BERT in Hyperbolic Spaces

Published 8 Apr 2021 in cs.CL | (2104.03869v1)

Abstract: Recently, a variety of probing tasks are proposed to discover linguistic properties learned in contextualized word embeddings. Many of these works implicitly assume these embeddings lay in certain metric spaces, typically the Euclidean space. This work considers a family of geometrically special spaces, the hyperbolic spaces, that exhibit better inductive biases for hierarchical structures and may better reveal linguistic hierarchies encoded in contextualized representations. We introduce a Poincare probe, a structural probe projecting these embeddings into a Poincare subspace with explicitly defined hierarchies. We focus on two probing objectives: (a) dependency trees where the hierarchy is defined as head-dependent structures; (b) lexical sentiments where the hierarchy is defined as the polarity of words (positivity and negativity). We argue that a key desideratum of a probe is its sensitivity to the existence of linguistic structures. We apply our probes on BERT, a typical contextualized embedding model. In a syntactic subspace, our probe better recovers tree structures than Euclidean probes, revealing the possibility that the geometry of BERT syntax may not necessarily be Euclidean. In a sentiment subspace, we reveal two possible meta-embeddings for positive and negative sentiments and show how lexically-controlled contextualization would change the geometric localization of embeddings. We demonstrate the findings with our Poincare probe via extensive experiments and visualization. Our results can be reproduced at https://github.com/FranxYao/PoincareProbe.


Authors (7)

Citations (51)


Summary

The paper introduces a Poincaré probe that maps BERT embeddings into hyperbolic space to uncover deeper syntactic hierarchies.
The study demonstrates that hyperbolic projections can better recover dependency trees, improving the detection of long-range syntactic dependencies.
The probe also identifies two meta-embeddings for positive and negative sentiment and localizes word-level sentiment relative to them in hyperbolic space.

"Probing BERT in Hyperbolic Spaces" Detailed Summary

This paper probes contextualized embeddings, specifically those of BERT, by projecting them into hyperbolic space, modeled as a Poincaré ball. The aim is to exploit the tree-like geometry of hyperbolic space to better reveal the linguistic hierarchies implicitly encoded in BERT embeddings.

Hyperbolic Space Adaptation for Probing

Contextualized word embeddings, such as those generated by BERT, capture rich syntactic and semantic information. The paper extends traditional probing approaches to hyperbolic space, positing that its properties align more naturally with linguistic hierarchies like syntax trees.

A Poincaré probe is introduced as the key tool for projecting embeddings into a hyperbolic space. It combines a simple linear mapping with the exponential map at the origin of the Poincaré ball. The main components are:

A linear transform that projects word embeddings into a lower-dimensional space, treated as the tangent space at the origin of the Poincaré ball.
An exponential map that carries these tangent vectors from the tangent space onto the Poincaré ball itself.
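
The two components above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the names `matvec`, `exp_map_origin`, and `poincare_probe`, the curvature parameter `c`, and the toy numbers are all assumptions made for the example. The formula used is the standard exponential map at the origin of a Poincaré ball of curvature -c, exp_0(v) = tanh(sqrt(c) * ||v||) * v / (sqrt(c) * ||v||).

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def exp_map_origin(v, c=1.0):
    """Map a tangent vector at the origin onto the Poincare ball of curvature -c."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)  # the origin maps to itself
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]


def poincare_probe(embedding, projection, c=1.0):
    """Linear projection to the tangent space, then exponential map into the ball."""
    tangent = matvec(projection, embedding)
    return exp_map_origin(tangent, c)


# Hypothetical 4-dimensional embedding and 2x4 projection (illustrative values only).
embedding = [0.5, -1.0, 2.0, 0.25]
projection = [
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, -0.5],
]
point = poincare_probe(embedding, projection)
point_norm = math.sqrt(sum(x * x for x in point))
```

Because tanh is bounded by 1, the resulting point always lies strictly inside the unit ball when c = 1, however large the projected vector is. In a trained probe the projection would be learned rather than fixed by hand.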

Probing Syntax in Hyperbolic Spaces

The paper argues that hyperbolic spaces, whose geometry naturally accommodates tree structures, may encode syntax more faithfully than Euclidean spaces, and supports the argument empirically. The Poincaré probe is used to recover dependency trees from BERT embeddings, and the results are compared against Euclidean probes.

Key findings include:

Tree Structure Recovery: Hyperbolic probes recover syntactic structures with better sensitivity to deeper syntactic dependencies, outperforming their Euclidean counterparts.
Edge Length Distribution: Hyperbolic probes recover syntactic edges with distributions closer to ground truth, particularly excelling at identifying longer dependencies.
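
To make tree recovery from hyperbolic distances concrete, here is a minimal sketch in plain Python. It uses the standard Poincaré ball distance for curvature -1, d(x, y) = arcosh(1 + 2||x - y||^2 / ((1 - ||x||^2)(1 - ||y||^2))), and turns pairwise distances into an undirected tree with a minimum spanning tree. The spanning-tree step is an assumption about how a tree can be read off the distances, not a statement of the paper's exact evaluation procedure, and the function names and toy points are hypothetical.

```python
import math


def squared_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(a * a for a in v)


def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit Poincare ball."""
    diff = [a - b for a, b in zip(x, y)]
    arg = 1.0 + 2.0 * squared_norm(diff) / ((1.0 - squared_norm(x)) * (1.0 - squared_norm(y)))
    return math.acosh(arg)


def minimum_spanning_tree(points):
    """Prim's algorithm over pairwise Poincare distances; returns (parent, child) index pairs."""
    n = len(points)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge that connects a new point to the current tree.
        _, i, j = min(
            (poincare_distance(points[i], points[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Hypothetical projected words: a root near the origin, two children, one grandchild.
points = [
    [0.0, 0.0],   # 0: root
    [0.5, 0.0],   # 1: child of root
    [-0.5, 0.0],  # 2: child of root
    [0.8, 0.0],   # 3: child of word 1
]
edges = minimum_spanning_tree(points)
```

In this toy layout, distances grow quickly toward the boundary of the ball, so the point at 0.8 attaches to its neighbor at 0.5 rather than to the root. That growth is the property that makes the ball a natural host for trees.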

Sentiment Probing with Poincaré Probes

The paper then extends the probe from syntax to a sentiment subspace, where the hierarchy is defined by word polarity:

Meta-Embedding Initialization: Two distinct meta-embeddings for positive and negative sentiments are introduced, allowing the alignment of individual word sentiment with these meta points.
Word Sentiment Localization: The localization of words in the sentiment space reflects fine-grained sentiment nuances captured within BERT representations.
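
The idea of aligning words with two meta points can be illustrated with a short sketch in plain Python. It assumes that a word's polarity is scored by comparing its Poincaré distance to a positive and a negative meta-embedding, and that a sentence score is the sum of its word scores. Both the scoring rule and every value below (`POSITIVE_META`, `NEGATIVE_META`, the word coordinates) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def squared_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(a * a for a in v)


def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit Poincare ball."""
    diff = [a - b for a, b in zip(x, y)]
    arg = 1.0 + 2.0 * squared_norm(diff) / ((1.0 - squared_norm(x)) * (1.0 - squared_norm(y)))
    return math.acosh(arg)


# Hypothetical meta-embeddings; in a real probe these would be learned.
POSITIVE_META = [0.6, 0.0]
NEGATIVE_META = [-0.6, 0.0]


def word_score(point):
    """Positive when the word lies closer to the positive meta-embedding."""
    return poincare_distance(point, NEGATIVE_META) - poincare_distance(point, POSITIVE_META)


def sentence_score(word_points):
    """Sum of word scores, standing in for a sentence-level logit."""
    return sum(word_score(p) for p in word_points)


# Hypothetical projected words (illustrative coordinates only).
words = {
    "great": [0.5, 0.1],
    "the": [0.0, 0.05],
    "awful": [-0.55, -0.1],
}
scores = {word: word_score(point) for word, point in words.items()}
positive_sentence = sentence_score([words["the"], words["great"]])
negative_sentence = sentence_score([words["the"], words["awful"]])
```

Under this scoring rule, a function word placed midway between the two meta points contributes nothing to the sentence score, while strongly polar words dominate it.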

The sentiment probe surfaces semantic information in two ways:

Sentiment Accuracy: Despite training with binary sentence-level labels, the model captures word-level sentiments accurately.
Lexical Changes in Context: The model reflects nuanced shifts in word meaning based on context changes, evident in lexically-controlled contextualization scenarios.

Conclusion

The study presents evidence that hyperbolic probes can recover linguistic hierarchies from contextualized embeddings more faithfully than Euclidean probes, which suggests that the geometry of BERT's syntactic subspace may not be Euclidean. The Poincaré probe opens the door to using hyperbolic geometry for finer-grained syntactic and semantic analysis, with potential applications in syntax-based NLP tasks. The work is exploratory and leaves room for further investigation of hyperbolic deep learning techniques in natural language understanding.
