DIGIC: Domain Generalizable Imitation Learning by Causal Discovery (2402.18910v1)
Abstract: Causality has been combined with machine learning to produce robust representations for domain generalization. Most existing methods of this type require massive data from multiple domains to identify causal features through cross-domain variations, which can be expensive or even infeasible to collect and may lead to misidentification in some cases. In this work, we take a different approach and leverage the demonstration data distribution to discover the causal features for a domain generalizable policy. We design a novel framework, called DIGIC, that identifies the causal features by finding the direct cause of the expert action from the demonstration data distribution via causal discovery. Our framework can achieve domain generalizable imitation learning with only single-domain data and can serve as a complement to cross-domain variation-based methods under non-structural assumptions on the underlying causal models. Our empirical study on various control tasks shows that the proposed framework clearly improves domain generalization performance while remaining comparable to the expert in the original domain.
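To make the idea of "finding the direct cause of the expert action" concrete, the following is a minimal toy sketch and not the DIGIC algorithm itself. It assumes linear-Gaussian demonstration data in which no state feature is an effect of the action, so that a nonzero partial correlation with the action (read off the inverse covariance matrix) marks a candidate direct cause. The function name `direct_cause_candidates`, the threshold, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def direct_cause_candidates(states, actions, threshold=0.1):
    """Select state features whose partial correlation with the action,
    given all other state features, exceeds `threshold` in magnitude.

    Assumption (illustrative only): linear-Gaussian data and no feature
    that is caused by the action.
    """
    data = np.column_stack([states, actions])
    # Inverse covariance (precision) matrix of features and action.
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    a = data.shape[1] - 1  # column index of the action
    # Partial correlation: rho_ij = -P_ij / sqrt(P_ii * P_jj).
    partial = -precision[a, :a] / np.sqrt(
        precision[a, a] * np.diag(precision)[:a]
    )
    selected = [i for i in range(a) if abs(partial[i]) > threshold]
    return selected, partial

# Synthetic single-domain demonstrations (hypothetical data).
rng = np.random.default_rng(0)
n = 5000
causal = rng.normal(size=n)                       # drives the expert action
spurious = causal + 0.5 * rng.normal(size=n)      # only correlated with it
action = 2.0 * causal + 0.1 * rng.normal(size=n)  # expert action

states = np.column_stack([causal, spurious])
selected, partial = direct_cause_candidates(states, action)
marginal_spurious = np.corrcoef(spurious, action)[0, 1]
```

In this toy setup the spurious feature is strongly correlated with the action marginally, yet its partial correlation given the causal feature is close to zero, so only the causal feature is selected. A policy trained on the selected features alone would not rely on the spurious correlation, which is the intuition behind learning from direct causes.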