Intrinsically motivated graph exploration using network theories of human curiosity (2307.04962v4)
Abstract: Intrinsically motivated exploration has proven useful for reinforcement learning, even without additional extrinsic rewards. When the environment is naturally represented as a graph, how best to guide exploration remains an open question. In this work, we propose a novel approach for exploring graph-structured data motivated by two theories of human curiosity: the information gap theory and the compression progress theory. These theories view curiosity as an intrinsic motivation to optimize for topological features of subgraphs induced by nodes visited in the environment. We use the proposed features as rewards for graph neural-network-based reinforcement learning. On multiple classes of synthetically generated graphs, we find that trained agents generalize to longer exploratory walks and larger environments than are seen during training. Our method also computes more efficiently than greedy evaluation of the relevant topological properties. The proposed intrinsic motivations bear particular relevance for recommender systems: we demonstrate that next-node recommendations accounting for curiosity are more predictive of human choices than PageRank centrality in several real-world graph environments.
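The reward described in the abstract — a topological feature of the subgraph induced by the visited nodes — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the edge-list interface, and the choice of the first Betti number (independent cycle count, a common proxy for "gaps" in a knowledge network) as the feature are all assumptions made for the example.

```python
def induced_betti_1(edges, visited):
    """First Betti number beta_1 = |E| - |V| + #components of the
    subgraph induced by the set of visited nodes."""
    visited = set(visited)
    sub_edges = [(u, v) for u, v in edges if u in visited and v in visited]

    # Union-find over visited nodes to count connected components.
    parent = {v: v for v in visited}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in sub_edges:
        parent[find(u)] = find(v)
    components = len({find(v) for v in visited})

    return len(sub_edges) - len(visited) + components

# Toy environment: a 4-cycle with a dangling node attached.
E = [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4)]
print(induced_betti_1(E, [0, 1, 2]))     # 0: a path induces no gap yet
print(induced_betti_1(E, [0, 1, 2, 3]))  # 1: closing the cycle opens one gap
```

In a reinforcement-learning loop, a quantity like this would be recomputed (or incrementally updated) after each node visit and supplied as the intrinsic reward signal.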