Monte-Carlo Based Online planning Under Partial Observability : Solving Single and Multi-Agent Problems

do Carmo Alves, Matheus Aparecido and Soriano Marcolino, Leandro and Elkhatib, Yehia (2024) Monte-Carlo Based Online planning Under Partial Observability : Solving Single and Multi-Agent Problems. PhD thesis, Lancaster University.

Text (2024matheusphd)
2024matheusphd.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

This thesis explores the integration of statistical and reinforcement learning techniques, aiming to provide fresh perspectives and solutions that improve on current state-of-the-art methods for autonomous agents that learn, plan and estimate online in single- and multi-agent systems. We address a critical demand in the field by steering away from the prevailing dependence on intensive computational resources and large amounts of data as a requirement for peak performance. Our primary focus is on studying and refining solutions in the "online planning under uncertainty" research area. We venture beyond the existing literature, extending our proposals to more complex and challenging problems. As concrete contributions, we introduce three new algorithms: IB-POMCP, an online planning algorithm that uses information entropy to augment a single agent's decision-making capabilities; OEATE, a type and parameter estimation method for coordinating with multiple unknown teammates in cooperative environments; and BAE, a method capable of detecting, on the fly, adversarial agents disguised as teammates in cooperative environments. Our proposals contribute to the evolution of autonomous systems and are supported by empirical and theoretical results. We demonstrate that our new perspectives on agents' reasoning processes can yield generic and extensible solutions to diverse scenarios and problems. Finally, during the PhD, we developed and presented to the research community a new framework designed to aggregate relevant baselines and benchmarks for multi-agent systems: AdLeap-MAS. The AdLeap-MAS framework is a novel tool centred on the implementation and simulation of ad-hoc reasoning domains for multi-agent, collaborative, and adversarial contexts. The framework aims to facilitate the execution of experiments and the reuse of existing code across different environments. We provide a user-friendly environment that not only extends the frontiers of our research but also serves as a valuable resource for the research community.

Item Type:

Thesis (PhD)

Subjects:

planning under uncertainty, multi-agent systems, information-guided planning, adversarial detection, ad-hoc teamwork, online planning, estimation methods

Departments:

Faculty of Science and Technology > School of Computing & Communications

ID Code:

222684

Deposited By:

ep_importer_pure

Deposited On:

05 Aug 2024 11:45

Refereed?:

Published?:

Published

Last Modified:

19 Sep 2025 00:47

URI:

https://eprints.lancs.ac.uk/id/eprint/222684
