Applicability of Program-of-Thoughts to semantic commonsense reasoning

Determine whether Program-of-Thoughts prompting underperforms chain-of-thought prompting on semantic commonsense reasoning tasks such as StrategyQA.

Background

The paper proposes Program-of-Thoughts (PoT) prompting for numerical and symbolic reasoning tasks and demonstrates strong empirical gains on math and finance benchmarks.

In the discussion, the authors state that PoT is suitable for problems requiring highly symbolic reasoning, but they conjecture that it is not the best option for semantic commonsense tasks (e.g., StrategyQA). Because this is offered only as a conjecture, PoT’s effectiveness relative to other prompting methods outside numerical domains remains an open question.
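
To make the contrast concrete, here is a minimal sketch of how a PoT-style answer is obtained: the model's output is a short program, and the final answer comes from executing that program rather than from reading it off free-form text. The question (compound interest on 1000 at 5% for 3 years), the `generated_program` string, and the `run_program` helper are hypothetical illustrations and are not taken from the paper; no model is called here.

```python
# Hypothetical program text standing in for what a model might emit
# under PoT prompting for a compound-interest question (assumption).
generated_program = """
principal = 1000.0
rate = 0.05
years = 3
ans = principal * (1 + rate) ** years
"""


def run_program(program: str, answer_var: str = "ans"):
    """Execute program text in a fresh namespace and return one variable.

    Note: exec() on untrusted model output is unsafe; a real setup would
    need sandboxing. This helper is illustrative only.
    """
    namespace = {}
    exec(program, namespace)
    return namespace[answer_var]


# The arithmetic is delegated to the interpreter, not to the model.
answer = run_program(generated_program)
print(answer)
```

A yes/no commonsense question has no comparable computation to hand off to an interpreter, which is one plausible reading of why the authors expect PoT to be a weaker fit for such tasks.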

References

We believe PoT is suitable for problems which require highly symbolic reasoning skills. For semantic reasoning tasks like commonsense reasoning (StrategyQA), we conjecture that PoT is not the best option.