Herculean: An Agentic Benchmark for Financial Intelligence
Abstract: As AI agents improve, the central question is no longer whether they can solve isolated well-defined financial tasks, but whether they can reliably carry out financial professional work. Existing financial benchmarks offer only a partial view of this ability, as they primarily evaluate static competencies such as question answering, retrieval, summarization, and classification. We introduce Herculean, the first skilled benchmark for agentic financial intelligence spanning four representative workflows, including Trading, Hedging, Market Insights, and Auditing. Each workflow is instantiated as a standardized MCP-based skill environment with its own tools, interaction dynamics, constraints, and success criteria, enabling consistent end-to-end assessment of heterogeneous agent systems. Across frontier agents, we find agents perform relatively well on Trading and Market Insights, but struggle substantially on Hedging and Auditing, where long-horizon coordination, state consistency, and structured verification are critical. Overall, our results point to a key gap in current agents in turning financial reasoning into dependable workflow execution in high-stakes financial workflows.
Explain it Like I'm 14
Explaining “HERCULEAN: An Agentic Benchmark for Financial Intelligence”
Overview: What’s this paper about?
This paper introduces HERCULEAN, a new way to test how well AI “agents” can do real financial work from start to finish. Instead of just answering questions or summarizing documents, the agents have to carry out full workflows, like a human finance professional would. The benchmark includes four kinds of tasks: Trading, Hedging, Market Insights, and Auditing.
Think of it like a set of realistic “levels” in a finance game. Each level has rules, tools, and goals, and the AI must plan, act, and check its work over days or weeks—just like in the real world.
Key questions the researchers asked
The paper focuses on simple but important questions:
- Can current AI agents handle complete financial workflows, not just isolated tasks?
- Which kinds of financial tasks are easier or harder for AI agents?
- Does the way an agent is designed (its “framework”) matter as much as the LLM it uses?
- Do bigger, more advanced LLMs guarantee success in these realistic settings?
How they did the research
The team built HERCULEAN as four “skill environments,” each mirroring a real financial job. Agents interact with these environments through a standardized interface called the Model Context Protocol (MCP). MCP is like a shared controller: it defines what the agent can see, what tools it can use, and how it must report results, so every agent is tested fairly.
Here’s what each workflow looks like:
- Trading: The agent makes a daily choice (BUY, SELL, or HOLD) on one stock over three months. It can look at past prices, company news, and official filings, but not the future. Success is measured by:
  - Cumulative return (how much you made or lost overall),
  - Sharpe ratio (return relative to volatility/risk),
  - Maximum drawdown (the biggest drop from a peak).
- Hedging (pairs trading): The agent picks two related stocks (like MSFT and GOOG) and bets on their relationship, not the market’s direction. Each day, it chooses positions like LONG_SHORT (long one, short the other), HOLD, or CLOSE. The portfolio is “dollar neutral,” meaning the long and short sides are equal in dollar size, so net dollar exposure is zero. It uses the same return/risk metrics as Trading.
- Market Insights: The agent writes a weekly investment report (with sections like summary, rating, risks) for one stock and gives a rating (STRONG_BUY to STRONG_SELL). The report must combine prices, news, filings, and peer comparisons. The team judges:
  - Report quality (structure, accuracy, evidence, reasoning),
  - Whether following the ratings would have made money (using the same trading metrics).
- Auditing: The agent checks a specific number in a company’s official XBRL filing (a digital format for financial statements). It must:
  - Find the reported value,
  - Compute the correct value using the filing’s calculation links and GAAP rules (the accounting standards),
  - Report both and flag whether there’s an error.
  Accuracy is judged in steps: whether the output is valid, whether the right number was extracted, and whether the math/logic was correct.
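The three return/risk metrics named for Trading and Hedging can be sketched in a few lines. This is a minimal illustration, not the benchmark's own code: the function names, the 252-day annualization, the zero default risk-free rate, and the convention of reporting a dollar-neutral pair return per dollar of gross exposure are all assumptions made here.

```python
import math


def cumulative_return(daily_returns):
    """Compound simple daily returns into one total return."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return divided by its sample std."""
    excess = [r - risk_free_daily for r in daily_returns]
    if len(excess) < 2:
        return 0.0  # assumption: undefined cases are reported as 0.0
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    std = math.sqrt(variance)
    if std == 0.0:
        return 0.0  # assumption: zero volatility is reported as 0.0
    return mean / std * math.sqrt(periods_per_year)


def max_drawdown(daily_returns):
    """Largest peak-to-trough decline of the equity curve, as a positive fraction."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def dollar_neutral_pair_returns(long_returns, short_returns):
    """Daily return of an equal-dollar long/short pair.

    Assumption: half of the capital is long one stock and half is short the
    other, so the result is per dollar of gross exposure.
    """
    return [0.5 * (rl - rs) for rl, rs in zip(long_returns, short_returns)]
```

For example, daily returns of +10%, -50%, and +20% compound to a cumulative return of about -34%, and the maximum drawdown is about 50% (the fall from the peak after day one to the trough after day two).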
To test fairness and robustness, the researchers ran five different agent frameworks with four different LLMs (including advanced closed-source models and smaller open-source ones). They turned off web search and persistent memory to make sure the agents relied only on the provided financial data and tools.
Main findings and why they matter
The results show clear patterns. Here are the highlights:
- Agents do better on “talking and summarizing” tasks than on “precise, step-by-step” tasks:
  - Market Insights: Many agents wrote high-quality reports (often scoring above 9/10). However, good writing didn’t always mean good investment results.
  - Trading: Some agents beat a simple “Buy & Hold” baseline, but gains were small and inconsistent.
  - Hedging: Agents struggled. This task requires tracking positions over time and understanding relationships between two stocks, skills current agents found hard.
  - Auditing: This was the hardest. The best systems reached about 66% accuracy, but many made structural mistakes (like not following the required format or process), and calculation errors were common. This shows strict, rule-based financial checking is still very tough for AI.
- The agent’s design matters as much as the LLM:
  - Agents with strong execution control (good at structured tool use and following protocols) did much better, especially in Auditing.
  - Agents that rely on a simple “think-then-act” loop often broke the rules or failed to complete tasks over long time spans.
- Bigger LLMs help, but they aren’t a magic fix:
  - Advanced models improved performance overall.
  - Still, being great at writing a convincing report didn’t mean the agent could verify accounting numbers correctly or manage a hedged position reliably.
- The core gap:
  - Today’s agents can reason and write well, but turning that reasoning into consistent, correct actions over time, especially under strict rules, remains a major challenge.
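The Auditing failure types mentioned above (structural mistakes, wrong extracted numbers, wrong calculations) suggest a stepwise grader in which each check runs only if the previous one passed. The sketch below is a hypothetical illustration of that idea: the JSON output format, the key names `extracted_value` and `calculated_value`, the label strings, and the tolerance are assumptions, not the benchmark's actual protocol.

```python
import json
import math
from collections import Counter

# Hypothetical output keys; the real benchmark schema may differ.
REQUIRED_KEYS = ("extracted_value", "calculated_value")
LABELS = ("correct", "structural_error", "extraction_error", "calculation_error")


def grade_audit_output(raw_output, true_reported, true_calculated, rel_tol=1e-6):
    """Grade one audit answer in steps: format, then extraction, then calculation."""
    # Step 1: the output must parse and contain numeric values for both keys.
    try:
        parsed = json.loads(raw_output)
    except (TypeError, ValueError):
        return "structural_error"
    if not isinstance(parsed, dict):
        return "structural_error"
    for key in REQUIRED_KEYS:
        value = parsed.get(key)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return "structural_error"
    # Step 2: the value read from the filing must match the reported fact.
    if not math.isclose(parsed["extracted_value"], true_reported, rel_tol=rel_tol):
        return "extraction_error"
    # Step 3: the recomputed value must match the reference calculation.
    if not math.isclose(parsed["calculated_value"], true_calculated, rel_tol=rel_tol):
        return "calculation_error"
    return "correct"


def error_rates(labels):
    """Fraction of cases falling into each label (all zeros for an empty list)."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (counts[label] / total if total else 0.0) for label in LABELS}
```

Grading in this order means a malformed answer is never credited for a number it happened to get right, which matches the "judged in steps" description of the Auditing task.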
Implications: What this means going forward
This benchmark pushes AI beyond simple Q&A and toward real professional work. The findings suggest:
- To make AI agents truly useful in finance, we need better “execution brains,” not just better “language brains.” That means stronger:
  - State tracking (remembering what’s already done or open),
  - Tool orchestration (using the right tools, in the right order, with the right parameters),
  - Verification (checking math and rules carefully).
- Teams building financial AI should focus on workflow stability, not just flashy reasoning. An agent that can follow rules, maintain consistent state, and verify results is more valuable—and safer—than one that only sounds smart.
- HERCULEAN gives researchers a common, realistic testbed. It can help the community compare methods fairly and improve agents so they can handle high-stakes tasks more reliably.
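To make "state tracking" and rule-following concrete, here is a small sketch of an execution-control layer that validates each Hedging action before it is applied. The action names LONG_SHORT, HOLD, and CLOSE come from the workflow description; the `PairState` class, the `apply_action` function, and the specific transition rules (for example, rejecting CLOSE when nothing is open) are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Action vocabulary taken from the Hedging description.
ALLOWED_ACTIONS = {"LONG_SHORT", "HOLD", "CLOSE"}


@dataclass
class PairState:
    """Minimal record of whether a pair position is open, plus accepted actions."""
    is_open: bool = False
    history: list = field(default_factory=list)


def apply_action(state, action):
    """Validate an agent's action against the current state, then update it.

    Raises ValueError instead of silently accepting an invalid action, so a
    surrounding loop can ask the agent to retry.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if action == "LONG_SHORT":
        state.is_open = True
    elif action == "CLOSE":
        # Assumed rule: closing requires an open position.
        if not state.is_open:
            raise ValueError("CLOSE requested but no pair position is open")
        state.is_open = False
    # HOLD leaves the position unchanged.
    state.history.append(action)
    return state
```

Keeping the state outside the LLM and checking every action against it is one way to stop a long-running agent from drifting, such as closing a position it never opened.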
Note: The authors released code and data for research purposes. The benchmark uses public information and doesn’t offer financial advice. It focuses on US markets and English-language filings, so results may not generalize globally.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
Below is a consolidated list of unresolved issues that future work could address to strengthen the benchmark’s validity, coverage, and scientific conclusions:
- External validity across markets: Assess generalization beyond large-cap US equities to small/mid-cap stocks, international markets, and non-English filings.
- Accounting regime coverage: Extend Auditing from US GAAP to IFRS and cross-standard mappings; evaluate mixed-standard corpora.
- Asset-class breadth: Add fixed income, FX, commodities, options/derivatives (e.g., Greeks-based hedging), ETFs, and crypto to test cross-asset reasoning.
- Longer horizons and live evaluation: Run multi-quarter/multi-year out-of-sample and longer live periods with regime shifts; report stability over time.
- Market realism: Incorporate transaction costs, slippage, fees, borrow availability, margin, liquidity/impact, and order execution constraints.
- Position sizing and risk: Move beyond discrete BUY/SELL/HOLD to include position sizing, leverage, VaR/stop-loss, and risk budgets; evaluate portfolio-level constraints.
- Intraday/microstructure: Introduce intraday data and execution tasks (limit/market orders, order book dynamics) to test timing and microstructure-aware decisions.
- Trading semantics clarity: Specify inventory/position persistence, shorting rules, and how BUY/SELL map to exposure changes; include stronger baselines (e.g., momentum/mean-reversion) with and without costs.
- Hedging strategy realism: Support dynamic pair re-selection, dynamic hedge ratios (e.g., rolling OLS), z-score thresholds, cointegration tests, and rebalancing cadence.
- Hedging constraints: Model short-sale constraints, borrow costs, margin calls, and re-hedging under volatility spikes.
- Portfolio workflows: Evaluate multi-asset portfolio construction (cross-sectional long–short, risk parity) rather than single-asset or single-pair tasks.
- Market Insights impact mapping: Calibrate and justify the mapping from weekly ratings to trading strategies; compare to analyst-style benchmarks and buy-side baselines.
- Report quality measurement validity: Validate the rubric and judge-LLM with expert auditors/analysts; report inter-rater agreement and robustness to prompt variations.
- Evidence fidelity and citation: Enforce source-linked claims with verifiable citations; score citation correctness and coverage beyond rubric pass/fail.
- Data leakage audits: Rigorously audit chronology for prices/news/filings and the weekly metrics aggregator; publish leakage checks and unit tests.
- News summarization bias: Quantify how the aggregation/summarization pipeline affects evidence fidelity and downstream decisions; compare to raw/newswire feeds.
- Auditing ground truth scale: Expand beyond 65 instances; include diverse concept types, periods, dimensional contexts (hypercubes), footnotes, segment data, and restatements.
- Deterministic auditing labels: Provide human-verified numeric ground truth (not LLM-judged) for a substantial subset to benchmark judge accuracy and agent correctness.
- Narrative–numeric cross-checks: Add tasks that reconcile narrative disclosures (MD&A/notes) with numeric facts and detect inconsistencies.
- Judge-LLM bias and contamination: Quantify judge sensitivity to fluency and backbone family; use multiple orthogonal judges and human adjudication to estimate bias.
- Variance and reliability: Report run-to-run variance, seeds, confidence intervals, and sensitivity to reasoning depth and temperature; analyze failure rates over time.
- Prompt/tool sensitivity: Ablate skill prompts, tool schemas, and MCP API designs; quantify how interface changes affect execution stability and outcomes.
- Memory and retrieval effects: Evaluate the impact of persistent memory, retrieval augmentation, and external web search on all workflows under identical constraints.
- Execution-control methods: Compare self-verification, planning, program-of-thought, code execution, and constrained decoding; quantify their effect on Auditing/Hedging.
- Learning inside the environment: Explore RL/fine-tuning/curriculum learning for trajectory control and verification, measuring sample efficiency and safety.
- Multi-agent and human-in-the-loop: Test role-specialized agent teams and escalation-to-human protocols; measure coordination overhead and error reduction.
- Adversarial robustness: Introduce corrupted/noisy/mislabeled data, adversarial tool responses, and stress scenarios; measure brittleness and recovery.
- Cost–performance trade-offs: Report token/tool-call budgets, latency, throughput, and cost-normalized performance; study compute-aware agent design.
- Process analytics: Publish telemetry on tool-call counts, depth, time per step, and failure taxonomy (SER/EER/CER analogs across workflows) to link behavior to outcomes.
- Cross-workflow transfer: Test whether improvements in Auditing (verification) transfer to Trading/Hedging execution; study shared skills vs. specialization.
- Additional workflows: Add credit risk/underwriting, AML/KYC/compliance monitoring, portfolio rebalancing, corporate actions processing, and earnings forecasting.
- Alternative data and macro: Incorporate macro indicators, options surface/skew, supply-chain and satellite data, and sell-side transcripts to test multimodal integration.
- Safety and governance: Define metrics for safe failure, escalation criteria, and policy constraints to prevent high-risk actions; audit for hallucinated or non-compliant outputs.
- Interoperability and standards: Evaluate portability beyond MCP to other tool protocols; propose reference schemas for financial tools to reduce framework-induced variance.
- Reproducibility of agent setups: Provide full agent configuration, prompts, and deterministic seeds; document API versioning to enable faithful replication.
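Several of the hedging items above (dynamic hedge ratios via rolling OLS, z-score thresholds) refer to standard pairs-trading building blocks. The sketch below shows what they could look like; it is not part of the benchmark. The function names, the entry/exit thresholds of 2.0 and 0.5, and the signal labels are hypothetical.

```python
import statistics


def ols_hedge_ratio(x_prices, y_prices):
    """Slope beta of the least-squares fit y = a + beta * x.

    Calling this on only the most recent N observations gives a rolling estimate.
    """
    mean_x = statistics.fmean(x_prices)
    mean_y = statistics.fmean(y_prices)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_prices, y_prices))
    var = sum((x - mean_x) ** 2 for x in x_prices)
    if var == 0.0:
        raise ValueError("x_prices has no variation; hedge ratio is undefined")
    return cov / var


def spread_zscore(spread, window):
    """Z-score of the latest spread value against the last `window` values."""
    if window < 2 or len(spread) < window:
        raise ValueError("need window >= 2 and at least `window` observations")
    recent = spread[-window:]
    mean = statistics.fmean(recent)
    std = statistics.stdev(recent)  # sample standard deviation
    if std == 0.0:
        return 0.0
    return (recent[-1] - mean) / std


def threshold_signal(z, entry=2.0, exit_=0.5):
    """Map a z-score to a hypothetical signal label using assumed thresholds."""
    if z >= entry:
        return "SHORT_SPREAD"  # spread unusually wide: bet on it narrowing
    if z <= -entry:
        return "LONG_SPREAD"   # spread unusually narrow: bet on it widening
    if abs(z) <= exit_:
        return "CLOSE"         # spread back near its mean
    return "HOLD"
```

Here the spread would typically be `y - beta * x` with `beta` from `ols_hedge_ratio`; that definition is also an assumption, since the source does not specify one.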
Practical Applications
Immediate Applications
The following applications can be deployed now, leveraging the benchmark’s MCP-based skill environments, evaluation protocols, and observed agent performance characteristics.
- Industry: AI agent vendor evaluation and procurement harness
  - What: Use HERCULEAN as a standardized, workflow-faithful testbed to compare agent frameworks/backbones for Trading, Hedging, Market Insights, and Auditing before purchase or deployment.
  - Sector(s): Finance, Software
  - Tool/Product/Workflow: Evaluation dashboards reporting CR/SR/MDD for Trading/Hedging, rubric scores for Market Insights, and ACC/SER/EER/CER for Auditing; repeatable MCP-based test suites; vendor scorecards.
  - Dependencies/Assumptions: US large-cap equities, GAAP/XBRL focus, benchmark time windows; LLM/API access; does not model transaction costs or market frictions.
- Industry: Research analyst co-pilot with rubric-based quality gates
  - What: Generate weekly, structured investment reports (ratings plus 8-section Markdown) and gate them with the benchmark’s LLM-as-judge rubrics (structure, fidelity, accuracy, reasoning).
  - Sector(s): Finance (buy-side/sell-side), Enterprise Research Ops
  - Tool/Product/Workflow: Market Insights MCP skill + rubric evaluator as a CI/CD-like gate for research notes; report templates for portfolio meetings.
  - Dependencies/Assumptions: Human-in-the-loop review to mitigate LLM-judge bias toward fluency; coverage limited to benchmark asset universe unless extended.
- Industry/Enterprise: Pre-filing XBRL self-checker for internal audit/compliance
  - What: Use the Auditing skill as a pre-screen tool to detect calculation-network inconsistencies and sign/balance issues in draft filings.
  - Sector(s): Audit, Corporate Finance, RegTech
  - Tool/Product/Workflow: Auditing MCP server integrated into SEC-reporting workflows; red-flag reports with traceable calculation steps.
  - Dependencies/Assumptions: Current agent accuracy is uneven; treat as assistive triage, not a replacement for professional audit; GAAP taxonomy coverage; requires internal mapping to company-specific extensions.
- Engineering/MLOps: Execution-control–oriented agent selection and hardening
  - What: Prefer CLI-oriented or schema-enforcing agent frameworks (lower SER) for tool-heavy workflows (e.g., Auditing), based on benchmark findings.
  - Sector(s): Software, Finance
  - Tool/Product/Workflow: Agent orchestration policies (typed tools, schema validators, trajectory monitors), regression suites using HERCULEAN scenarios.
  - Dependencies/Assumptions: Access to frameworks that expose tool typing and strict I/O schemas; ops discipline to maintain evaluation baselines.
- Academia/Education: End-to-end finance labs for teaching and evaluation
  - What: Course modules that mirror professional workflows (trading, hedging, insights, auditing) rather than static QA.
  - Sector(s): Education
  - Tool/Product/Workflow: MCP servers + DuckDB datasets as lab infrastructure; assignments on pair selection, weekly reporting, and XBRL verification.
  - Dependencies/Assumptions: Faculty/IT support to deploy MCP; curated guardrails to avoid “live trading” misconceptions.
- Industry: Paper-trading sandboxes for strategy prototyping
  - What: Rapid prototyping of single-asset trading and pairs hedging in a controlled, reproducible environment with tool-mediated data access.
  - Sector(s): Asset Management, Fintech
  - Tool/Product/Workflow: Broker-simulated backtests driven via MCP; scenario libraries (different assets/time windows).
  - Dependencies/Assumptions: Historical-only and limited universe; no slippage/fees unless added; agents show instability on Hedging, so keep them in the sandbox.
- Policy/RegTech: SupTech prototypes for disclosure consistency checks
  - What: Pilot automated audits of EDGAR filings for calculation-network consistency and taxonomy conformance.
  - Sector(s): Policy, Regulation, Audit
  - Tool/Product/Workflow: Batch auditing of recent filings with triage dashboards; human reviewer queue for flagged facts.
  - Dependencies/Assumptions: Treat outputs as leads, not determinations; align with regulator data-access and security policies; extend to new taxonomy updates.
- Daily life/Prosumer: Transparent weekly market summaries
  - What: Consumer-facing “explain like I’m an analyst” reports using the Market Insights skill on covered tickers, emphasizing evidence links (news/filings) and risks.
  - Sector(s): Personal Finance, Education
  - Tool/Product/Workflow: Web app that generates weekly reports with caveats and learning prompts; no trading execution.
  - Dependencies/Assumptions: Strict disclaimers (not investment advice); limited to public data; encourage diversified, long-term investing principles.
Long-Term Applications
These applications require further research, scaling, and/or integration work (e.g., stronger execution control, broader data, regulatory acceptance).
- Industry: Production-grade agent portfolio managers with market-neutral modules
  - What: End-to-end agents that trade/hedge live with robust state tracking, risk budgeting, and compliance-aware execution.
  - Sector(s): Asset Management, Brokerage
  - Tool/Product/Workflow: Live MCP skills wired to real-time market data and broker APIs; policy engines enforcing position limits, P&L stop-outs, and audit trails.
  - Dependencies/Assumptions: Significant improvements in long-horizon coordination and cross-asset reasoning; model risk governance; full treatment of costs and slippage.
- Policy/RegTech: Continuous, automated XBRL auditing at scale
  - What: Always-on agents that recompute and cross-verify reported facts across issuers/periods, raising probabilistic anomalies for human follow-up.
  - Sector(s): Regulation, Exchanges, Audit
  - Tool/Product/Workflow: High-throughput auditing pipelines; concept graph reasoning across company-specific extensions; cross-filing consistency checks.
  - Dependencies/Assumptions: Higher ACC and lower CER; regulator acceptance; robust handling of taxonomy evolution and restatements.
- Standardization: Certification regimes for AI financial agents
  - What: Industry-standard tests (built on HERCULEAN-like workflows) certifying execution stability and verification competence before live deployment.
  - Sector(s): Finance, Standards Bodies, Risk Management
  - Tool/Product/Workflow: Tiered benchmarks and thresholds per workflow; periodic re-certification; incident reporting tied to benchmark regressions.
  - Dependencies/Assumptions: Broad community adoption; transparent, versioned datasets; governance for test leakage.
- Software/Tooling: Execution-control platforms for agentic finance
  - What: Products that provide typed tool schemas, deterministic verification loops, state stores, and “trajectory stabilizers” as a layer atop LLMs.
  - Sector(s): Software, Finance, Compliance
  - Tool/Product/Workflow: Agent SDKs with schema enforcement, retry/rollback, and deterministic calculators for financial primitives; audit logs.
  - Dependencies/Assumptions: Integration with diverse data vendors; compatibility with MCP and future agent standards.
- Cross-sector MCP skill libraries
  - What: Extend the skill-based, MCP-grounded approach to insurance underwriting, credit risk scoring, procurement auditing, and energy trading.
  - Sector(s): Insurance, Banking, Supply Chain, Energy
  - Tool/Product/Workflow: Domain-specific skills with canonical tools, constraints, and evaluation criteria (e.g., loss triangles, PD/LGD estimation, invoice matching).
  - Dependencies/Assumptions: Domain ontologies, regulatory rulesets, and high-quality labeled data; sector buy-in.
- Learning: RL/DPO training for workflow competence
  - What: Use HERCULEAN environments as training grounds to optimize for execution metrics (low SER/EER/CER; improved CR/SR with risk constraints).
  - Sector(s): AI Research, Finance
  - Tool/Product/Workflow: Offline RL with logged trajectories; curriculum learning from Market Insights to Auditing; verifier-in-the-loop optimization.
  - Dependencies/Assumptions: Reliable reward shaping without overfitting; cost-effective training; safety constraints.
- Multi-agent systems with verifier and memory roles
  - What: Architectures where a “doer” agent is paired with a “verifier” and a “state manager” to ensure deterministic checks and consistent long-horizon behavior.
  - Sector(s): Finance, Software
  - Tool/Product/Workflow: Agent ensembles with explicit role APIs; shared state stores and calculation graphs; escalation to humans on uncertainty.
  - Dependencies/Assumptions: Coordination overhead and latency budgets; secure shared memory; robust arbitration policies.
- Enterprise: Automated research pipelines from ingestion to publish
  - What: Semi-autonomous production of house views and sector decks, with evidence linking, peer-relative benchmarking, and compliance review.
  - Sector(s): Investment Research, Corporate Strategy
  - Tool/Product/Workflow: End-to-end content generation with rubric gates, fact-citation checks, and compliance sign-offs; knowledge-base integration.
  - Dependencies/Assumptions: Strong hallucination control; IP/document access rights; alignment with editorial standards.
- Trading infrastructure: Broker and OMS/EMS integration with compliance guards
  - What: Agents that propose actions which must pass pre-trade checks (mandates, exposure limits) and post-trade surveillance using audit logs.
  - Sector(s): Brokerage, Asset Management
  - Tool/Product/Workflow: Policy-as-code libraries; explainability artifacts tied to each action; real-time risk dashboards.
  - Dependencies/Assumptions: Low-latency tool orchestration; regulator-ready auditability; robust kill-switches.
- Policy and education: Model risk management and regulatory curricula
  - What: Regulator and practitioner training programs built around workflow benchmarks to teach where agents fail (e.g., verification-heavy tasks).
  - Sector(s): Policy, Education, Risk
  - Tool/Product/Workflow: Case libraries, hands-on labs, and simulation exercises; guidelines for acceptable use and controls.
  - Dependencies/Assumptions: Cross-institution collaboration; continuously updated examples reflecting new agent capabilities and failure modes.
Glossary
- Adjusted close: A stock’s closing price adjusted for corporate actions like splits and dividends to reflect true economic value. "OHLCV prices with adjusted close"
- Agentic: Refers to AI systems that can act autonomously using tools and multi-step interactions. "the first skilled benchmark for agentic financial intelligence"
- Alpha: Excess return relative to a benchmark, often attributed to skill or unique insights. "single-stock alpha, momentum, and sector-relative beta blocks"
- Backbone models: The underlying LLMs that power agent frameworks. "Each agent system is tested on four backbone models"
- Balance semantics: XBRL/accounting property indicating whether a concept increases with debits or credits. "the concept’s balance semantics"
- Beta: Sensitivity of an asset’s returns relative to a market or sector benchmark. "sector-relative beta blocks"
- Buy&Hold baseline: A benchmark strategy that buys an asset and holds it without trading. "outperform the negative Buy&Hold baseline"
- Calculation linkbase: The XBRL file that encodes arithmetic relationships among reported concepts. "(instance, calculation linkbase, schema, definition linkbase, label linkbase, presentation linkbase)"
- Calculation network: The graph of linked XBRL concepts and formulas used to compute or validate values. "the filing’s calculation network"
- Cumulative return (CR): Total percentage gain or loss over a period. "cumulative return (CR), Sharpe ratio (SR), and maximum drawdown (MDD)."
- Definition linkbase: The XBRL file capturing semantic relationships among concepts beyond pure arithmetic. "(instance, calculation linkbase, schema, definition linkbase, label linkbase, presentation linkbase)"
- Dimensional context: The XBRL specification of dimensions (e.g., segments, products) qualifying a reported fact. "dimensional-context resolution"
- Dollar-neutral portfolio: A long-short position with equal dollar amounts on each side, yielding zero net exposure. "Any open pair position is implemented as a dollar-neutral portfolio"
- DuckDB: An in-process analytical database used here to store and query market data offline. "is materialized in an offline DuckDB"
- Equal-weighted: A portfolio or basket where each constituent has the same weight. "equal-weighted sector basket"
- Extraction error rate (EER): The fraction of audit cases where the system extracted the wrong value from a filing. "extraction error rate (EER)"
- Form 10-K: The SEC’s annual report filing that provides a comprehensive overview of a company’s business and financials. "Form 10-K and Form 10-Q"
- Form 10-Q: The SEC’s quarterly report filing summarizing interim financial performance. "Form 10-K and Form 10-Q"
- Hedging: Strategies designed to reduce or offset risk, often via offsetting positions. "Hedging strategies seek to profit not from predicting market direction"
- Hierarchical fact-verification: A structured evaluation approach that checks multiple layers of correctness when verifying reported facts. "hierarchical fact-verification task lineage"
- Instance document: The XBRL file that contains actual numeric facts reported by a company. "instance document"
- Label linkbase: The XBRL file that provides human-readable labels for taxonomy concepts. "label linkbase"
- LLM-as-a-judge: An evaluation paradigm where an LLM assesses correctness or quality of outputs. "hierarchical LLM-as-a-judge framework"
- Market-neutral: A strategy designed to have minimal net market exposure, focusing on relative performance. "market-neutral pairs trading strategies"
- Market timing: Making buy/sell decisions based on forecasts of short-term market movements. "daily market-timing decisions"
- Maximum drawdown (MDD): The largest peak-to-trough decline in a portfolio over a period. "maximum drawdown (MDD)."
- MCP (Model Context Protocol): A standardized protocol that packages tools and interactions for agents to use within an environment. "built following the Model Context Protocol (MCP)"
- MCP server: The service that implements MCP tools and enforces environment state and evaluation logic. "an MCP server that exposes workflow-specific observations, tools, actions, and evaluation criteria."
- Momentum: A signal based on the tendency of assets with recent strong performance to continue performing well (and vice versa). "single-stock alpha, momentum, and sector-relative beta blocks"
- Notional exposure: The total value controlled by a position, irrespective of leverage effects. "equal absolute notional exposure"
- OHLCV: Open, High, Low, Close, Volume — standard fields in price time series data. "OHLCV prices with adjusted close"
- Pair trading: A market-neutral strategy that exploits relative mispricings between two correlated assets. "Pair trading, one of the most representative market-neutral hedging strategies"
- Parametric memory: Knowledge encoded in model parameters rather than retrieved from external data/tools. "parametric memory"
- Peer mapping: A mapping from a company to its sector peers used for relative comparisons. "a static peer mapping"
- Presentation linkbase: The XBRL file that specifies how concepts are organized for display. "(instance, calculation linkbase, schema, definition linkbase, label linkbase, presentation linkbase)"
- Schema (XBRL schema): The XBRL file defining the taxonomy’s elements and their data types. "(instance, calculation linkbase, schema, definition linkbase, label linkbase, presentation linkbase)"
- SEC EDGAR: The SEC’s Electronic Data Gathering, Analysis, and Retrieval system that hosts company filings. "SEC EDGAR"
- Sector basket: A portfolio of peer companies within the same sector used for relative performance benchmarking. "equal-weighted sector basket"
- Sharpe ratio (SR): Risk-adjusted return metric defined as excess return divided by volatility. "cumulative return (CR), Sharpe ratio (SR), and maximum drawdown (MDD)."
- U.S. GAAP taxonomy: The standardized set of accounting concepts used in US financial reporting within XBRL. "against the U.S. GAAP taxonomy"
- XBRL: eXtensible Business Reporting Language used for machine-readable financial reporting. "verify individual XBRL numeric facts"
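The calculation linkbase and calculation network entries above can be illustrated with a toy consistency check. In XBRL, each calculation arc carries a weight (typically +1 or -1), and a parent concept should equal the weighted sum of its children. The sketch below assumes the arcs have already been parsed into `(child, weight)` pairs; the function names, the rounding tolerance, and all numbers are made up for illustration.

```python
def recompute_parent(arcs, facts):
    """Weighted sum of child facts, as encoded by calculation arcs."""
    return sum(weight * facts[child] for child, weight in arcs)


def audit_fact(parent, arcs, facts, tolerance=0.5):
    """Compare a reported parent value with the value implied by its children.

    `tolerance` is an assumed allowance for rounding in the reported figures.
    """
    reported = facts[parent]
    recomputed = recompute_parent(arcs, facts)
    return {
        "concept": parent,
        "reported": reported,
        "recomputed": recomputed,
        "consistent": abs(reported - recomputed) <= tolerance,
    }


# Toy example with invented values: GrossProfit = Revenues - CostOfRevenue.
example_arcs = [("Revenues", 1.0), ("CostOfRevenue", -1.0)]
example_facts = {"Revenues": 1000.0, "CostOfRevenue": 600.0, "GrossProfit": 450.0}
example_result = audit_fact("GrossProfit", example_arcs, example_facts)
```

In this invented case the children imply 400.0 while the filing reports 450.0, so `example_result["consistent"]` is `False` and the fact would be flagged for review.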