
Stop Misusing t-SNE and UMAP for Visual Analytics (2506.08725v1)

Published 10 Jun 2025 in cs.HC and cs.LG

Abstract: Misuses of t-SNE and UMAP in visual analytics have become increasingly common. For example, although t-SNE and UMAP projections often do not faithfully reflect true distances between clusters, practitioners frequently use them to investigate inter-cluster relationships. In this paper, we bring this issue to the surface and comprehensively investigate why such misuse occurs and how to prevent it. We conduct a literature review of 114 papers to verify the prevalence of the misuse and analyze the reasoning behind it. We then execute an interview study to uncover practitioners' implicit motivations for using these techniques -- rationales often undisclosed in the literature. Our findings indicate that misuse of t-SNE and UMAP primarily stems from limited discourse on their appropriate use in visual analytics. We conclude by proposing future directions and concrete action items to promote more reasonable use of DR.

Summary

  • The paper reveals that t-SNE and UMAP are widely misused beyond local structure preservation, which can mislead analytical outcomes.
  • It systematically reviews literature and interviews to identify gaps in user understanding, hyperparameter tuning, and DR methodology.
  • The study advocates for task-specific DR workflows and principled parameter management to enhance the reliability of visual analytics.

This paper, "Stop Misusing t-SNE and UMAP for Visual Analytics" (2506.08725), critically examines the prevalent misuse of t-SNE and UMAP in visual analytics applications. Unlike papers introducing new algorithms, this work focuses on understanding how and why these popular dimensionality reduction (DR) techniques are often applied incorrectly in practice, and it proposes actions to address the issue.

The core problem highlighted is that while t-SNE and UMAP excel at tasks that depend on local neighborhood structure, such as identifying clusters or outliers, they are frequently applied to tasks they are not suited for: investigating distances between clusters, comparing cluster densities, or judging global distances between points. The paper argues that using t-SNE and UMAP for these purposes can lead to misleading conclusions and undermine the reliability of visual analytics systems.
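A toy, stdlib-only illustration of this failure mode (the data and the distance-compressing "projection" are made up for demonstration; no actual t-SNE or UMAP run is involved): an embedding can preserve every point's nearest neighbors perfectly while badly distorting inter-cluster distances.

```python
# Toy 1-D data: three tight clusters with centers 0, 10, and 100.
original = [0.0, 0.1, 0.2,        # cluster A
            10.0, 10.1, 10.2,     # cluster B
            100.0, 100.1, 100.2]  # cluster C

# A contrived "projection" that keeps every cluster intact but drags
# cluster C much closer to the others (new centers: 0, 10, 20).
embedded = [x if x < 50 else x - 80 for x in original]

def knn(points, i, k=2):
    """Indices of the k nearest neighbors of point i."""
    order = sorted((j for j in range(len(points)) if j != i),
                   key=lambda j: abs(points[j] - points[i]))
    return set(order[:k])

# Local structure survives: every point keeps its k nearest neighbors.
n = len(original)
overlap = sum(len(knn(original, i) & knn(embedded, i)) / 2
              for i in range(n)) / n
print(f"k-NN overlap: {overlap:.2f}")  # 1.00 -> local neighborhoods intact

# Global structure does not: the A-to-C gap shrinks from 10x the A-to-B
# gap to 2x, even though no point's neighborhood changed.
print((original[6] - original[0]) / (original[3] - original[0]))  # 10.0
print((embedded[6] - embedded[0]) / (embedded[3] - embedded[0]))  # 2.0
```

Real t-SNE and UMAP projections distort global geometry in subtler ways, but the same asymmetry holds: high neighbor preservation says nothing about the reliability of inter-cluster distances.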

Through a literature review of 114 visual analytics papers and interviews with 12 practitioners, the authors verify that this misuse is widespread. They find that t-SNE and UMAP dominate DR usage and are often applied across tasks without clear justification. When justifications are provided, they frequently rest on questionable grounds: perceived "faithfulness" (even for unsuitable tasks), popularity, interpretability (i.e., producing visually distinct clusters), aesthetics, outdated scalability claims, stability claims (despite evidence of instability), and even a sense of "immunity" to criticism because the techniques are so widely used.

The root causes identified for this misuse are:

  1. Practitioners' Limited Understanding: Many users lack deep knowledge of the strengths and weaknesses of different DR techniques and how hyperparameters impact the resulting projections. This often leads to unsystematic hyperparameter tuning ("cherry-picking") to achieve visually pleasing results rather than faithful representations for the task.
  2. Lack of Standardized Principles: There is a community-wide gap in clear, easily accessible guidelines for selecting DR techniques based on specific analytic tasks and for properly evaluating projections.
  3. Lack of Discourse and Motivation: The importance of using DR properly is often overlooked in academic publishing and practitioner communities, contributing to a cycle where misuse is perpetuated through examples in publications and recommendations from peers or even LLMs trained on potentially biased data.

Practical Implications for Developers and Practitioners

The paper's findings have significant practical implications for anyone building or using visual analytics tools that incorporate DR:

  • Choosing the Right Tool for the Task: Developers should implement and make available a range of DR techniques beyond just t-SNE and UMAP. Critically, visual analytics systems should guide users toward selecting the appropriate DR method based on the analytical question they are trying to answer.
    • For Cluster Identification, Outlier Identification, Neighborhood Identification: t-SNE and UMAP are generally suitable choices due to their strength in preserving local structure.
    • For Cluster Distance Investigation, Class Separability Investigation, Cluster Density Investigation, Point Distance Investigation: Global methods like PCA or MDS are often more appropriate. Newer techniques like PaCMAP [2] or densMAP [54] might offer better compromises or specific improvements for density/distance preservation.
  • Implementing Task-Specific DR Pipelines: Instead of a generic "apply DR" function, consider workflows that implicitly or explicitly tie the DR application to a defined analytic task. For instance, a system could offer distinct "Explore Clusters" (using t-SNE/UMAP) and "Compare Groups" (using PCA/MDS or alternatives) views, potentially using different DR techniques optimized for those goals.
  • Principled Hyperparameter Management: Directly exposing raw hyperparameters without guidance is problematic. Implement features that help users understand hyperparameter impact or provide automated hyperparameter tuning mechanisms aimed at maximizing objective evaluation metrics relevant to the selected task.
    • Implementation: This could involve integrating hyperparameter optimization libraries (e.g., using scikit-optimize or similar) with DR evaluation metrics (e.g., Trustworthiness, Continuity, or global preservation metrics like correlation between high-dimensional and low-dimensional distances). The optimization process might run in the background or be offered as an advanced feature.
    • Evaluation Metrics: Use libraries or implement calculations for metrics like Trustworthiness & Continuity [68] for local structure, or correlation metrics (e.g., Spearman correlation) between original high-dimensional distances and projected distances for global structure. Visualize these metrics alongside the projection to give users objective feedback.
  • Communicating DR Limitations: Visualizations should explicitly communicate the limitations of the chosen DR technique. For example, when displaying a t-SNE or UMAP projection, labels or tooltips could warn users that "distances between clusters may not reflect true similarity in the original data" or "this view primarily preserves local structure."
  • Educational Integration: Incorporate brief, interactive explanations within the visual analytics tool itself or as accompanying documentation to educate users on the appropriate use cases and common pitfalls of t-SNE and UMAP. This could involve showing side-by-side comparisons of how different DR methods represent the same data for different tasks.
  • Transparency and Documentation: Encourage or enforce documenting the specific DR technique, hyperparameters, and the intended analytic task when results are shared or published using the system. This builds good practice and allows for reproducibility and critical review.
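The task-aware selection described above could be sketched as a simple dispatch layer (the task names and the `recommend_dr` helper are hypothetical, not an API from the paper; the method mapping follows the paper's guidance of local methods for neighborhood-level tasks and global methods for distance- and density-level tasks):

```python
# Hypothetical task-to-technique dispatch for a visual analytics tool.
LOCAL_TASKS = {"cluster_identification", "outlier_identification",
               "neighborhood_identification"}
GLOBAL_TASKS = {"cluster_distance", "class_separability",
                "cluster_density", "point_distance"}

def recommend_dr(task: str) -> dict:
    """Return suggested DR techniques plus a caveat to surface in the UI."""
    if task in LOCAL_TASKS:
        return {"methods": ["t-SNE", "UMAP"],
                "caveat": "Inter-cluster distances in this view may be distorted."}
    if task in GLOBAL_TASKS:
        # PaCMAP or densMAP could be offered here as well.
        return {"methods": ["PCA", "MDS"],
                "caveat": "Fine-grained local neighborhoods may be less faithful."}
    raise ValueError(f"Unknown analytic task: {task}")

print(recommend_dr("cluster_distance")["methods"])  # ['PCA', 'MDS']
```

Attaching the caveat string to every recommendation makes the limitation-communication step above automatic rather than optional.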

Implementation Considerations

Implementing these recommendations requires careful consideration:

  • Multiple DR Implementations: Including multiple DR algorithms increases code complexity and maintenance. Choose a set of algorithms covering local and global preservation strengths (e.g., t-SNE, UMAP, PCA, MDS, potentially one or two promising newer techniques like PaCMAP). Utilize existing libraries (e.g., scikit-learn, umap-learn, pacmap) where possible.
  • Computational Cost: Running multiple DR techniques and hyperparameter optimization can be computationally expensive, especially for large datasets. Consider efficient implementations (e.g., GPU-accelerated versions where available), progressive DR algorithms [38, 41], or sampling strategies for initial exploration. Balance the desire for optimal projections with performance requirements for interactive visual analytics.
  • User Interface Design: Designing a user interface that effectively guides users through task-appropriate DR selection and potentially hyperparameter tuning without overwhelming them is crucial. Provide clear defaults based on common tasks but allow expert users control.
  • Deployment: The chosen DR algorithms and optimization workflows must be deployable within the target environment (desktop application, web application, cloud service). Consider the memory footprint and processing power required.
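The sampling strategy mentioned above can be sketched in a few lines, assuming the dataset is a plain list of rows (`preview_sample` is an illustrative helper, not a library function): subsample deterministically for a responsive first render, then compute the full projection in the background.

```python
import random

def preview_sample(data, max_points=2000, seed=0):
    """Deterministically subsample a dataset before running an expensive
    DR technique, so the initial interactive preview stays responsive.
    A full-resolution projection can be computed in the background later."""
    if len(data) <= max_points:
        return list(data)
    rng = random.Random(seed)  # fixed seed keeps previews reproducible
    return rng.sample(list(data), max_points)

rows = [[i, i * 0.5] for i in range(10_000)]
preview = preview_sample(rows)
print(len(preview))  # 2000
```

Uniform sampling is the simplest option; density-aware or class-stratified sampling would better preserve small clusters in the preview.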

In essence, the paper serves as a call for the visual analytics community to move beyond treating t-SNE and UMAP as universal solutions. For developers, this translates into building more sophisticated DR workflows into tools – workflows that are task-aware, provide principled means for parameter tuning, and clearly communicate the characteristics and limitations of the resulting visualizations. It's about enabling users to perform reliable visual analytics, not just generating visually separated clusters.
