Draco 2 (Yang et al., 2023) is presented as a significant evolution of the original Draco system, aiming to provide a more robust, flexible, and user-friendly platform for modeling and applying visualization design knowledge. The core idea remains the same: visualization design guidelines are encoded as logical rules (constraints) that an Answer Set Programming (ASP) solver such as Clingo processes to assess chart validity and generate recommended designs. Draco 2 addresses key limitations of Draco 1, particularly its dependence on Vega-Lite, its limited tooling, and the difficulty of extending it.
The paper highlights three main areas of improvement in Draco 2:
- Flexible Visualization Specification Format: Draco 2 introduces a new, renderer-agnostic format for describing visualizations. Unlike Draco 1, which was closely tied to Vega-Lite, the new format uses logical facts based on "entities" (objects like views, marks, encodings, scales) and "attributes" (properties of entities like mark type, field name, scale type). This structure can be represented as a nested dictionary or as a flattened list of logical facts suitable for the ASP solver. A key enhancement is making scales first-class entities, which makes it easier to specify scales shared across multiple marks. The format supports complex designs like multi-layer and multi-view visualizations and can be extended with new entities and attributes. It also handles both complete and partial specifications, enabling recommendation from incomplete user queries.
  - Implementation Detail: The conversion between the nested dictionary format (easier for users/APIs) and the flat logical fact list (required by Clingo) is handled internally by functions like \mintinline{python}{dict_to_facts} and \mintinline{python}{answer_set_to_dict}.
  - Example: A bar mark on a view is specified using facts like \mintinline{prolog}{entity(mark,v0,m1).} and \mintinline{prolog}{attribute((mark,type),m1,bar).}, where \mintinline{prolog}{v0} and \mintinline{prolog}{m1} are unique entity identifiers. The dictionary format abstracts these IDs away and represents the same design structurally, e.g., \mintinline{json}{"view":[{"mark":[{"type":"bar"}]}]}.
- Comprehensive Tooling and Documentation: Draco 2 significantly improves usability and extensibility by providing thorough documentation, a comprehensive test suite with 100% unit test coverage, and a pure Python implementation (removing the need for a mixed Python/JavaScript setup like Draco 1). This makes it easier for researchers and practitioners to adopt Draco 2, integrate it into their systems, or build upon it.
  - Implementation Detail: The system runs entirely in Python. For web integration, it offers a REST API (\mintinline{python}{server} module) or a WebAssembly package (\mintinline{latex}{draco-pyodide} npm package) to run the Python API directly in the browser.
  - Practical Implication: Developers can install and use Draco 2 as a standard Python library or deploy it in various application architectures (backend service, frontend WebAssembly). The extensive testing supports reliability when integrating it into production systems.
- Flexible and Convenient APIs: Draco 2 provides well-documented Python APIs for interacting with the knowledge base and performing core functions. These include validating specifications, recommending optimal completions for partial specifications (\mintinline{python}{Draco.complete_spec}), and tools for debugging and adapting the knowledge base.
  - Implementation Detail: The knowledge base is loaded as Answer Set Programs organized into functional blocks (definitions, constraints, generator, hard, soft). This modular structure allows users to filter or modify parts of the knowledge base programmatically. Soft constraint weights can be loaded from files or assigned manually via the API when instantiating a \mintinline{python}{Draco} object, enabling customization of preferences.
  - Debugging Tools: \mintinline{python}{debug.DracoDebug} generates a Pandas DataFrame detailing constraint violations for a set of visualizations. \mintinline{python}{debug.DracoDebugPlotter} visualizes these violations (e.g., as a heatmap, see Figure 2 in the paper) to show why certain recommendations were generated or ranked as they were. This is crucial for debugging constraints and tuning weights.
  - Example Usage (Conceptual): To get recommendations from a partial spec, one would load data schema facts, add facts for desired entities (e.g., a view and a mark), and pass this partial fact list to \mintinline{python}{draco.complete_spec()}. The result is a ranked list of complete specifications (as dictionaries).
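To make the relationship between the nested dictionary and the flat fact list concrete, the following is a minimal sketch of how such a flattening could work. It is not the library's \mintinline{python}{dict_to_facts}; the function name \mintinline{python}{flatten_spec} is hypothetical, and the sketch assumes numeric entity identifiers (instead of names like \mintinline{prolog}{v0} and \mintinline{prolog}{m1}) and that every list of dictionaries denotes child entities.

```python
from itertools import count


def flatten_spec(spec, parent="root", entity_type=None, ids=None):
    """Flatten a nested spec dict into entity/attribute fact strings.

    Illustrative only: a simplified stand-in, not Draco 2's own converter.
    """
    ids = count() if ids is None else ids
    facts = []
    for key, value in spec.items():
        is_entity_list = (
            isinstance(value, list)
            and value
            and all(isinstance(v, dict) for v in value)
        )
        if is_entity_list:
            # Each dict in the list becomes a child entity of type `key`.
            for child in value:
                child_id = next(ids)
                facts.append(f"entity({key},{parent},{child_id}).")
                facts.extend(flatten_spec(child, child_id, key, ids))
        else:
            # Everything else is an attribute of the enclosing entity.
            name = key if entity_type is None else f"({entity_type},{key})"
            facts.append(f"attribute({name},{parent},{value}).")
    return facts


spec = {"view": [{"mark": [{"type": "bar"}]}]}
facts = flatten_spec(spec)
for fact in facts:
    print(fact)
```

The identifiers are generated during traversal, which is why the dictionary form never needs to mention them.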
Knowledge Base Implementation:
The design guidelines are encoded as hard and soft constraints in ASP. Hard constraints filter out invalid designs (e.g., using a 'shape' channel with a 'bar' mark). Soft constraints assign costs to undesirable design choices, and the solver finds designs minimizing the total cost. Weights for soft constraints determine their relative importance and can be manually set or learned from data (similar to Draco 1's learning algorithm [Moritz2018formalizing]). The default knowledge base is based on rules from CompassQL [Wongsuphasawat2016towards].
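The interplay of hard and soft constraints can be sketched in plain Python. In Draco 2 the filtering and cost minimization are performed by the ASP solver itself; the code below only mimics that logic after the fact, and the constraint names, weights, and candidate designs are hypothetical values chosen for illustration.

```python
# Hypothetical soft-constraint weights (a higher weight is a stronger penalty).
WEIGHTS = {"aggregate": 1, "bin": 2, "continuous_not_zero": 1}

# Hypothetical candidates: the hard constraints each one violates, and how
# often it violates each soft constraint.
CANDIDATES = {
    "bar_chart": {"hard": [], "soft": {"aggregate": 1, "bin": 1}},
    "shape_bar": {"hard": ["shape_without_point"], "soft": {}},
    "scatter": {"hard": [], "soft": {"continuous_not_zero": 2}},
}


def total_cost(soft_violations, weights):
    """Weighted sum of soft-constraint violation counts."""
    return sum(weights[name] * n for name, n in soft_violations.items())


def rank(candidates, weights):
    """Drop designs violating any hard constraint, then sort by cost."""
    valid = {name: c for name, c in candidates.items() if not c["hard"]}
    return sorted(
        (total_cost(c["soft"], weights), name) for name, c in valid.items()
    )


ranking = rank(CANDIDATES, WEIGHTS)
print(ranking)
```

Changing a single weight can reorder the ranking without touching the constraints, which is what makes manual weight tuning and learned weights interchangeable.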
Practical Application: Exploring and Adapting the Knowledge Base
The paper demonstrates Draco 2's utility using the Seattle weather dataset. Starting with a minimal specification (just data schema, one view, one mark), Draco 2 recommends basic charts like count-based visualizations. By iteratively adding constraints to the partial specification (e.g., targeting specific fields, preferring faceting), users can guide the recommendation process. The debugging tools (\mintinline{python}{DracoDebug}, \mintinline{python}{DracoDebugPlotter}) allow users to inspect why certain charts were recommended (which constraints were violated, their costs) and compare violation patterns across different designs. This process helps users understand the knowledge base's behavior, identify potential issues (e.g., non-expressive designs with low cost), and inform adjustments to constraints or weights. This iterative cycle of querying, inspecting results/violations, and adjusting the knowledge base is a core workflow enabled by Draco 2's practical tooling.
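Comparing violation patterns across designs amounts to asking which soft constraints two charts violate differently. The sketch below shows that comparison on plain dictionaries; it does not use \mintinline{python}{DracoDebug}, and the function name \mintinline{python}{violation_diff} as well as the constraint names and counts are hypothetical.

```python
def violation_diff(a, b):
    """Return {constraint: count_a - count_b} for constraints where a and b differ."""
    names = sorted(set(a) | set(b))
    return {
        name: a.get(name, 0) - b.get(name, 0)
        for name in names
        if a.get(name, 0) != b.get(name, 0)
    }


# Hypothetical violation counts for two recommended charts.
chart_a = {"aggregate": 1, "bin": 1}
chart_b = {"aggregate": 1, "continuous_not_zero": 2}

diff = violation_diff(chart_a, chart_b)
print(diff)
```

Constraints that both charts violate equally drop out of the result, so what remains are the candidates for weight adjustments when a ranking looks wrong.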
Comparison with Draco 1:
Draco 2 is shown to be easier to integrate into the Python ecosystem and offers more features via its API. The visualization specification format is a key difference, with Draco 2's entity-attribute model being more generic and extensible than Draco 1's Vega-Lite-centric format. While the underlying knowledge base and ranking behavior are highly similar (86% agreement on a test set), Draco 2's design allows for greater flexibility and future expansion, including support for multi-layer and multi-view designs and for explicit scales.
Future Work and Vision:
The authors envision Draco 2 as a foundational platform for future visualization research. Potential applications include integration with LLMs for natural language visualization generation (where LLMs generate partial specs and Draco 2 completes them) and using the constraint violation vectors as features in machine learning models.
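As a rough illustration of the second idea, violation counts can be turned into a fixed-length feature vector by fixing an order over the soft constraints. The sketch below assumes a hypothetical constraint order and weights; it also shows that the solver's cost is then a linear function (a dot product) of that vector, which is one reason such vectors are a natural input for a learned model.

```python
# Hypothetical, fixed ordering of soft constraints and their weights.
ORDER = ["aggregate", "bin", "continuous_not_zero"]
WEIGHTS = [1, 2, 1]


def to_vector(violations, order):
    """Map {constraint: count} to a fixed-length list of counts."""
    return [violations.get(name, 0) for name in order]


def linear_score(vector, weights):
    """Dot product of violation counts and weights."""
    return sum(w * x for w, x in zip(weights, vector))


x = to_vector({"aggregate": 1, "bin": 1}, ORDER)
score = linear_score(x, WEIGHTS)
print(x, score)
```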
In summary, Draco 2 is a refined, more accessible, and extensible platform for declarative modeling of visualization design knowledge using constraint solving. Its improved specification format, comprehensive tooling, and convenient APIs significantly lower the barrier to entry for researchers and practitioners wanting to build or experiment with automated visualization tools based on design principles.