Hitting "Probe"rty with Non-Linearity, and More (2402.16168v1)

Published 25 Feb 2024 in cs.CL and cs.AI

Abstract: Structural probes learn a linear transformation to find how dependency trees are embedded in the hidden states of LLMs. This simple design may not allow for full exploitation of the structure of the encoded information. Hence, to investigate the structure of the encoded information to its full extent, we incorporate non-linear structural probes. We reformulate the design of the non-linear structural probes introduced by White et al., making it simpler yet effective. We also design a visualization framework that lets us qualitatively assess how strongly two words in a sentence are connected in the predicted dependency tree. We use this technique to understand which non-linear probe variant is good at encoding syntactic information. Additionally, we use it to qualitatively investigate the structure of the dependency trees that BERT encodes in each of its layers. We find that the radial basis function (RBF) yields a more effective non-linear probe for the BERT model than the linear probe.
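
The two probe families the abstract contrasts can be sketched concretely. Below is a minimal PyTorch illustration, not the paper's implementation: the linear probe follows the standard Hewitt and Manning (2019) distance-probe formulation, while the RBF variant (the class names, the `gamma` bandwidth, and the use of `1 - k(i, j)` as a distance-like score) is our own illustrative reading of an RBF-kernel probe, not the authors' exact design.

```python
import torch
import torch.nn as nn


class LinearStructuralProbe(nn.Module):
    """Linear distance probe in the style of Hewitt and Manning (2019):
    squared L2 distance after a learned linear projection is trained to
    approximate the distance between two words in the dependency tree."""

    def __init__(self, hidden_dim: int, probe_rank: int):
        super().__init__()
        self.proj = nn.Parameter(torch.randn(hidden_dim, probe_rank) * 0.02)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (seq_len, hidden_dim) hidden states for one sentence
        z = h @ self.proj                        # (seq_len, probe_rank)
        diff = z.unsqueeze(1) - z.unsqueeze(0)   # all pairwise differences
        return (diff ** 2).sum(-1)               # (seq_len, seq_len) squared distances


class RBFStructuralProbe(nn.Module):
    """Illustrative non-linear variant (an assumption of this sketch):
    an RBF kernel over the projected states, with 1 - k(i, j) serving as
    a bounded distance-like score. `gamma` is a free bandwidth choice."""

    def __init__(self, hidden_dim: int, probe_rank: int, gamma: float = 1.0):
        super().__init__()
        self.proj = nn.Parameter(torch.randn(hidden_dim, probe_rank) * 0.02)
        self.gamma = gamma

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        z = h @ self.proj
        sq = ((z.unsqueeze(1) - z.unsqueeze(0)) ** 2).sum(-1)
        # k(i, j) = exp(-gamma * ||z_i - z_j||^2)
        return 1.0 - torch.exp(-self.gamma * sq)
```

In the Hewitt and Manning setup, such a probe is trained with an L1 loss between the predicted pairwise distances and the gold parse-tree distances, averaged per sentence; the same objective would apply unchanged to the non-linear variant above.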

References (12)
  1. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
  2. Sabine Buchholz and Erwin Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X), pages 149–164, New York City. Association for Computational Linguistics.
  3. Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, Fernanda Viégas, and Martin Wattenberg. 2019. Visualizing and measuring the geometry of BERT. ArXiv, abs/1906.02715.
  4. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
  5. Richard Futrell, Ethan Wilcox, Takashi Morita, Peng Qian, Miguel Ballesteros, and Roger Levy. 2019. Neural language models as psycholinguistic subjects: Representations of syntactic state. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 32–42, Minneapolis, Minnesota. Association for Computational Linguistics.
  6. Kristina Gulordava, Piotr Bojanowski, Edouard Grave, Tal Linzen, and Marco Baroni. 2018. Colorless green recurrent networks dream hierarchically. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1195–1205, New Orleans, Louisiana. Association for Computational Linguistics.
  7. John Hewitt and Christopher D. Manning. 2019. A structural probe for finding syntax in word representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4129–4138, Minneapolis, Minnesota. Association for Computational Linguistics.
  8. Adhiguna Kuncoro, Chris Dyer, John Hale, Dani Yogatama, Stephen Clark, and Phil Blunsom. 2018. LSTMs can learn syntax-sensitive dependencies well, but modeling structure makes them better. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1426–1436, Melbourne, Australia. Association for Computational Linguistics.
  9. Tal Linzen and Brian Leonard. 2018. Distinct patterns of syntactic agreement errors in recurrent networks and humans. In Proceedings of the 40th Annual Meeting of the Cognitive Science Society, CogSci 2018, Madison, WI, USA, July 25-28, 2018. cognitivesciencesociety.org.
  10. Anna Rogers, Olga Kovaleva, and Anna Rumshisky. 2020. A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8:842–866.
  11. Natalia Silveira, Timothy Dozat, Marie-Catherine de Marneffe, Samuel R. Bowman, Miriam Connor, John Bauer, and Christopher D. Manning. 2014. A gold standard dependency corpus for English. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014).
  12. Jennifer C. White, Tiago Pimentel, Naomi Saphra, and Ryan Cotterell. 2021. A non-linear structural probe. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online, June 6-11, 2021, pages 132–138. Association for Computational Linguistics.
Authors (2)
  1. Avik Pal (16 papers)
  2. Madhura Pawar (1 paper)
