What Lives? A meta-analysis of diverse opinions on the definition of life
(2505.15849v1)
Published 19 May 2025 in q-bio.OT, cs.AI, cs.CY, q-bio.BM, q-bio.CB, q-bio.SC, and stat.AP
Abstract: The question of "what is life?" has challenged scientists and philosophers for centuries, producing an array of definitions that reflect both the mystery of its emergence and the diversity of disciplinary perspectives brought to bear on the question. Despite significant progress in our understanding of biological systems, psychology, computation, and information theory, no single definition for life has yet achieved universal acceptance. This challenge becomes increasingly urgent as advances in synthetic biology, artificial intelligence, and astrobiology challenge our traditional conceptions of what it means to be alive. We undertook a methodological approach that leverages LLMs to analyze a set of definitions of life provided by a curated set of cross-disciplinary experts. We used a novel pairwise correlation analysis to map the definitions into distinct feature vectors, followed by agglomerative clustering, intra-cluster semantic analysis, and t-SNE projection to reveal underlying conceptual archetypes. This methodology revealed a continuous landscape of the themes relating to the definition of life, suggesting that what has historically been approached as a binary taxonomic problem should be instead conceived as differentiated perspectives within a unified conceptual latent space. We offer a new methodological bridge between reductionist and holistic approaches to fundamental questions in science and philosophy, demonstrating how computational semantic analysis can reveal conceptual patterns across disciplinary boundaries, and opening similar pathways for addressing other contested definitional territories across the sciences.
Summary
The paper introduces a novel computational methodology using LLMs to quantify semantic relationships in 68 expert definitions of life.
It employs pairwise correlation, hierarchical clustering, and t-SNE to reveal a continuous semantic landscape organized into eight distinct clusters.
The approach offers actionable insights for mapping contested concepts in complex fields like consciousness, intelligence, and sustainability.
The paper "What Lives? A meta-analysis of diverse opinions on the definition of life" (2505.15849) explores the long-standing challenge of defining "life" by analyzing definitions provided by 68 cross-disciplinary experts. Recognizing that traditional approaches struggle to reconcile diverse perspectives from biology, physics, computer science, and philosophy, the authors propose a novel methodology leveraging LLMs to analyze these definitions and map the conceptual landscape.
The core methodology involves a five-step computational approach:
Expert Curation and Definition Collection: The authors hand-selected 68 experts from various fields based on their peer-reviewed work related to the definition of life. These experts were asked, via email, to provide their definition of life (or argue against defining it) in three sentences or fewer. A manual categorization of these definitions revealed recurring themes such as thermodynamics, information, dynamics, autonomy, cognition, structure, functionality, replication, and material composition (Table 1). The authors also analyzed general properties, finding that most definitions were objective (57%), framed life as a continuous property (54%), and were actionable (88%) (Figure 1).
Quantitative Pairwise Correlation Analysis: To quantify the semantic relationships between all pairs of definitions, LLMs (Claude 3.7 Sonnet, GPT-4o, Llama 3.3 70B Instruct) were used to score the agreement between each pair on a scale of -1.0 (complete disagreement) to 1.0 (complete agreement). The prompt provided specific score ranges to guide the LLM. Multiple inferences (n=3) per pair were averaged for robustness, and results from the three different LLMs were also averaged, producing a final symmetrized correlation matrix encoding the conceptual relationships between definitions (Supplemental Figure 2).
Analyze the following two definitions of life, where:
-1.0 = Fundamentally opposing or incompatible primary frameworks;
-0.5 to -0.9 = Significantly different emphasis with some contradiction;
0.0 = Independent or orthogonal frameworks;
0.1 to 0.4 = Slight overlap in secondary elements;
0.5 to 0.9 = Significant overlap with some differences;
1.0 = Aligned core frameworks and secondary elements.
Definition 1: {{definition_1}}
Definition 2: {{definition_2}}
What is the correlation metric between -1.0 and 1.0 for these two definitions?
Respond with ONLY a single number!
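Even with the "ONLY a single number" instruction, LLM replies occasionally include extra text, so a tolerant parser is useful before averaging scores. The helper below is a hypothetical sketch (not from the paper's supplemental code): it extracts the first signed decimal from a reply and clamps it to the prompt's [-1.0, 1.0] range.

```python
import re

def parse_score(text: str) -> float:
    """Extract the first signed decimal from an LLM reply, clamped to [-1, 1].

    Hypothetical helper -- the paper's supplemental code may parse replies
    differently.
    """
    match = re.search(r"-?\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no number found in reply: {text!r}")
    return max(-1.0, min(1.0, float(match.group())))
```

For example, `parse_score("The correlation is -0.5.")` recovers -0.5, and an out-of-range reply such as "1.3" is clamped to 1.0.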
Agglomerative Clustering: Hierarchical agglomerative clustering was applied to the correlation matrix, first transformed into a distance matrix via d = √(2(1 − r)). Complete linkage was used as the merging criterion to produce compact clusters. An "elbow" analysis on the linkage matrix identified the natural number of clusters, yielding 8 clusters for the multi-model average analysis (Figure 3). This process groups definitions by their semantic proximity in the high-dimensional correlation space.
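One common way to apply the elbow criterion to a linkage matrix is to look for the largest jump between successive merge heights. The sketch below illustrates this on a synthetic stand-in correlation matrix (the paper's 68×68 matrix is not reproduced here, and its exact elbow procedure may differ).

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
n = 20
# Stand-in symmetric correlation matrix with unit diagonal
r = rng.uniform(-0.2, 0.9, size=(n, n))
r = (r + r.T) / 2
np.fill_diagonal(r, 1.0)

# Correlation -> distance, as in the pipeline, then complete-linkage clustering
dist = np.sqrt(2 * (1 - r))
linked = linkage(squareform(dist, checks=False), method="complete")

# Merge heights grow as clusters combine; the largest jump between successive
# merges ("elbow") suggests where to cut the dendrogram
merge_heights = linked[:, 2]
jumps = np.diff(merge_heights)
k = n - (np.argmax(jumps) + 1)  # number of clusters remaining at the biggest jump
```

The index arithmetic follows from the fact that after the i-th merge (0-indexed) there are n − (i + 1) clusters left, so cutting just below the largest jump leaves that many clusters.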
LLM Cluster Semantic Analysis: For each identified cluster, an LLM (Claude 3.7 Sonnet) was used for further analysis.
Intra-cluster thematic analysis: A structured prompt (available in Supplemental Code) guided the LLM to identify core ideas, count concept frequencies, group similar concepts, map conceptual connections, and identify underlying frameworks within each cluster.
Consensus definition generation: A structured prompt (available in Supplemental Code) instructed the LLM to synthesize a single representative definition for the cluster, using only majority-shared concepts (>50% frequency) and technical terms from the source definitions, adhering to length constraints.
Cluster Title Generation: A final prompt (available in Supplemental Code) generated a short, descriptive title for each cluster based on its thematic analysis and consensus definition.
Table 2 presents the resulting cluster titles and their consensus definitions, such as "Cognitive Autonomy" and "Dissipative Self-Organizing Systems."
t-SNE Dimensionality Reduction: t-distributed Stochastic Neighbor Embedding (t-SNE) was applied to project the high-dimensional correlation features onto a two-dimensional plane (Figure 2). t-SNE was chosen for its ability to preserve local neighborhood structures, creating a spatially interpretable map of the definitional landscape. The 2D map visualizes the relationships between individual definitions (colored by cluster membership) and reveals the overall structure of conceptual space.
The results demonstrate that expert definitions of life form a continuous semantic landscape rather than discrete, irreconcilable categories. The t-SNE projection revealed two primary dimensions: an axis distinguishing observer-dependent/perceptual frameworks from objective/material-structural ones, and an axis separating process-based/teleological views from entity-based/structural ones. These dimensions echo historical philosophical debates about life.
The analysis identified 8 distinct conceptual clusters, with the largest concentration of definitions (66%) residing in overlapping "Cognitive Autonomy" and "Dissipative Self-Organizing Systems" clusters. This suggests an emerging convergence around integrated frameworks that combine physical principles, information processing, and agential properties. Peripheral clusters, like "Perceptual Categorization" and "Pragmatic Definitional Skepticism," highlight alternative or skeptical viewpoints.
The high consistency of correlation and clustering patterns across different LLMs (correlation coefficients >0.7) supports the robustness of the identified semantic structure. Definitions with low clustering consistency were often found in transitional zones between clusters, acting as conceptual bridges.
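Cross-model consistency of the kind reported (correlation coefficients > 0.7) can be checked by correlating the unique entries of two models' pairwise matrices. The helper below is a minimal sketch under that assumption; `matrix_agreement` is a hypothetical name, not from the paper's code.

```python
import numpy as np

def matrix_agreement(m1: np.ndarray, m2: np.ndarray) -> float:
    # Pearson correlation between the unique (strict upper-triangle) entries
    # of two symmetric pairwise correlation matrices
    iu = np.triu_indices_from(m1, k=1)
    return float(np.corrcoef(m1[iu], m2[iu])[0, 1])
```

Identical matrices give 1.0; values above roughly 0.7 across model pairs would indicate the kind of robust shared structure the paper describes.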
Practical Implications for Implementation:
This research provides a concrete example of using computational semantic analysis with LLMs to map and understand complex, multi-faceted concepts debated across disciplines. The methodology could be implemented to analyze other domains where definitions are contested or ambiguous, such as "consciousness," "intelligence," "health," or "sustainability."
Translating Concepts to Systems: The workflow of collecting definitions, quantifying relationships using LLMs, clustering, and performing semantic analysis can be generalized. Developers can adapt the prompting strategies for pairwise comparison and thematic analysis to other types of textual data (e.g., policy documents, scientific abstracts, user feedback) to reveal underlying conceptual structures or areas of agreement/disagreement.
Code Implementation: The methodology relies on standard natural language processing libraries and tools for clustering and dimensionality reduction (e.g., scikit-learn, SciPy, TensorFlow/PyTorch for LLM integration). The core LLM interaction involves API calls structured by specific prompts. The project's code is provided open-source (Supplemental Code), offering a template for implementing this approach.
# Pseudocode for pairwise correlation
import numpy as np
import openai  # Example for the GPT-4o API

def get_llm_correlation(def1, def2, model="gpt-4o"):
    prompt = f"""Analyze the following two definitions of life, where:
-1.0 = Fundamentally opposing or incompatible primary frameworks;
... (rest of the scoring guide) ...
Definition 1: {def1}
Definition 2: {def2}
What is the correlation metric between -1.0 and 1.0 for these two definitions?
Respond with ONLY a single number!"""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "system", "content": "You are an expert analyst."},
                  {"role": "user", "content": prompt}]
    )
    try:
        # Attempt to parse the single-number response
        return float(response.choices[0].message.content.strip())
    except ValueError:
        # Handle cases where the LLM does not return a single number
        print(f"Warning: LLM did not return a single number for pair:\n{def1}\n{def2}")
        return np.nan  # Or handle the error appropriately
definitions = [...] # List of expert definitions
n = len(definitions)
correlation_matrix = np.zeros((n, n))
# Compute pairwise correlations (pseudocode - a real implementation would
# handle errors, retries, and parallelization)
for i in range(n):
    for j in range(n):
        if i == j:
            correlation_matrix[i, j] = 1.0
        elif i < j:
            # Average results from multiple LLMs/replicates (example model names)
            avg_corr = np.mean(
                [get_llm_correlation(definitions[i], definitions[j], model="claude-3-sonnet-20240229") for _ in range(3)] +
                [get_llm_correlation(definitions[i], definitions[j], model="gpt-4o") for _ in range(3)] +
                [get_llm_correlation(definitions[i], definitions[j], model="llama-3-70b-chat-hf") for _ in range(3)]
            )
            correlation_matrix[i, j] = avg_corr
            correlation_matrix[j, i] = avg_corr  # Symmetrize

# Handle potential NaNs from LLM errors before clustering
correlation_matrix = np.nan_to_num(correlation_matrix, nan=0.0)  # Simple handling; more sophisticated imputation may be needed

# Clustering (using scipy)
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
from scipy.spatial.distance import squareform

# Convert correlation to distance
distance_matrix = np.sqrt(2 * (1 - correlation_matrix))
# Ensure distance matrix is valid for scipy linkage (condensed format)
condensed_distance_matrix = squareform(distance_matrix, checks=False)
linked = linkage(condensed_distance_matrix, method='complete')
# Determine clusters (example: fcluster with a distance threshold or a number of clusters).
# The paper used an elbow analysis on the linkage matrix to find the number of clusters.
# For 8 clusters:
num_clusters = 8
cluster_labels = fcluster(linked, num_clusters, criterion='maxclust')
# Pseudocode for LLM thematic analysis and consensus generation per cluster
clusters = {}
for i, label in enumerate(cluster_labels):
    if label not in clusters:
        clusters[label] = []
    clusters[label].append(definitions[i])

for cluster_id, cluster_defs in clusters.items():
    cluster_text = "\n".join(cluster_defs)
    # Call the LLM for thematic analysis
    thematic_analysis_prompt = f"""Perform a thematic analysis on the following definitions:\n{cluster_text}\n1. WHAT Are The Core Ideas? ... (rest of prompt)"""
    # ... LLM call ...
    cluster_analysis_result = ...  # Get result
    # Call the LLM for a consensus definition
    consensus_prompt = f"""Synthesize a consensus definition from these experts:\n{cluster_text}\nBased on this analysis:\n{cluster_analysis_result}\n- Start: "Life is ..."... (rest of prompt)"""
    # ... LLM call ...
    consensus_definition = ...  # Get result
    print(f"Cluster {cluster_id}: {consensus_definition}")
# t-SNE visualization
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
# Use the distance matrix for t-SNE
tsne = TSNE(n_components=2, metric='precomputed', init='random', random_state=42)  # init='random' is required with a precomputed distance matrix in recent scikit-learn
tsne_results = tsne.fit_transform(distance_matrix)
plt.figure(figsize=(10, 8))
scatter = plt.scatter(tsne_results[:, 0], tsne_results[:, 1], c=cluster_labels, cmap='tab10')
plt.title("t-SNE Projection of Life Definitions")
plt.xlabel("t-SNE Dimension 1")
plt.ylabel("t-SNE Dimension 2")
plt.colorbar(scatter, label='Cluster ID')
plt.show()
Computational Requirements: The pairwise correlation step scales quadratically, O(n²), with the number of definitions n. For 68 definitions this is manageable (68×67/2 = 2,278 pairs × 3 models × 3 replicates ≈ 20,500 LLM calls). Scaling to thousands of definitions would require significant computational resources, and potentially self-hosted LLMs or highly optimized API usage to manage cost and rate limits. Clustering and t-SNE are comparatively fast once the distance matrix is computed.
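The call-count arithmetic can be checked directly from n(n − 1)/2 unique pairs times models times replicates:

```python
# Back-of-envelope LLM call count for n = 68 definitions
n, models, replicates = 68, 3, 3
pairs = n * (n - 1) // 2             # 2278 unique unordered pairs
calls = pairs * models * replicates  # 20502 LLM calls in total
print(pairs, calls)
```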
Limitations and Trade-offs: The analysis is sensitive to the capabilities and potential biases of the LLMs used for correlation and semantic analysis. Averaging across models helps mitigate this, but subtle biases might remain. The choice of clustering algorithm and parameters (e.g., number of clusters) can influence the results. t-SNE visualization is non-deterministic and can be sensitive to parameters (though the paper states they validated against other methods). The 2D projection simplifies the high-dimensional space, losing some nuance.
Deployment: For analyzing static sets of definitions or documents, the process can be run offline. For dynamic analysis (e.g., tracking how definitions evolve over time or with new input), the pipeline would need to be automated. Deployment requires access to LLM APIs or self-hosted models, computational infrastructure for matrix computations and clustering, and potentially a visualization layer.
Overall, the paper offers a valuable framework for applying LLMs and quantitative methods to qualitative data in scientific and philosophical discourse. It translates the abstract problem of defining life into a concrete computational analysis of expert opinions, revealing a structured conceptual space that bridges disparate perspectives. This approach has direct practical applications in mapping understanding in complex, multidisciplinary fields.