This is the first paper (in review) in the Multi-agent learning in dynamic systems series. It develops new algorithms to optimise task allocation in multi-agent systems, combining Q-learning, historical reward convolution, and dynamically adaptable risk-based system exploration.