Peter Sunehag
Google - DeepMind
Verified email at google.com
Title
Cited by
Year
Value-decomposition networks for cooperative multi-agent learning
P Sunehag, G Lever, A Gruslys, WM Czarnecki, V Zambaldi, M Jaderberg, ...
arXiv preprint arXiv:1706.05296, 2017
Cited by: 1940; Year: 2017
Deep reinforcement learning in large discrete action spaces
G Dulac-Arnold, R Evans, H van Hasselt, P Sunehag, T Lillicrap, J Hunt, ...
arXiv preprint arXiv:1512.07679, 2015
Cited by: 773; Year: 2015
Scalable evaluation of multi-agent reinforcement learning with melting pot
JZ Leibo, EA Duéñez-Guzmán, A Vezhnevets, JP Agapiou, P Sunehag, ...
International conference on machine learning, 6187-6199, 2021
Cited by: 97; Year: 2021
The sample-complexity of general reinforcement learning
T Lattimore, M Hutter, P Sunehag
International Conference on Machine Learning, 28-36, 2013
Cited by: 75; Year: 2013
Learning to incentivize other learning agents
J Yang, A Li, M Farajtabar, P Sunehag, E Hughes, H Zha
Advances in Neural Information Processing Systems 33, 15208-15219, 2020
Cited by: 73; Year: 2020
Deep reinforcement learning with attention for slate Markov decision processes with high-dimensional states and actions
P Sunehag, R Evans, G Dulac-Arnold, Y Zwols, D Visentin, B Coppin
arXiv preprint arXiv:1512.01124, 2015
Cited by: 60; Year: 2015
Malthusian reinforcement learning
JZ Leibo, J Perolat, E Hughes, S Wheelwright, AH Marblestone, ...
arXiv preprint arXiv:1812.07019, 2018
Cited by: 49; Year: 2018
Wearable sensor activity analysis using semi-Markov models with a grammar
O Thomas, P Sunehag, G Dror, S Yun, S Kim, M Robards, A Smola, ...
Pervasive and Mobile Computing 6 (3), 342-350, 2010
Cited by: 47; Year: 2010
Variable metric stochastic approximation theory
P Sunehag, J Trumpf, SVN Vishwanathan, N Schraudolph
Artificial Intelligence and Statistics, 560-566, 2009
Cited by: 45; Year: 2009
Value-decomposition networks for cooperative multi-agent learning. arXiv 2017
P Sunehag, G Lever, A Gruslys, WM Czarnecki, V Zambaldi, M Jaderberg, ...
arXiv preprint arXiv:1706.05296, 2017
Cited by: 38; Year: 2017
Melting Pot 2.0
JP Agapiou, AS Vezhnevets, EA Duéñez-Guzmán, J Matyas, Y Mao, ...
arXiv preprint arXiv:2211.13746, 2022
Cited by: 37; Year: 2022
Reinforcement learning agents acquire flocking and symbiotic behaviour in simulated ecosystems
P Sunehag, G Lever, S Liu, J Merel, N Heess, JZ Leibo, E Hughes, ...
Artificial life conference proceedings, 103-110, 2019
Cited by: 33; Year: 2019
A review of cooperation in multi-agent learning
Y Du, JZ Leibo, U Islam, R Willis, P Sunehag
arXiv preprint arXiv:2312.05162, 2023
Cited by: 25; Year: 2023
Q-learning for history-based reinforcement learning
M Daswani, P Sunehag, M Hutter
Asian Conference on Machine Learning, 213-228, 2013
Cited by: 25; Year: 2013
Rationality, optimism and guarantees in general reinforcement learning
P Sunehag, M Hutter
The Journal of Machine Learning Research 16 (1), 1345-1390, 2015
Cited by: 21; Year: 2015
Semi-Markov kmeans clustering and activity recognition from body-worn sensors
MW Robards, P Sunehag
2009 Ninth IEEE International Conference on Data Mining, 438-446, 2009
Cited by: 19; Year: 2009
Axioms for rational reinforcement learning
P Sunehag, M Hutter
Algorithmic Learning Theory, 338-352, 2011
Cited by: 17; Year: 2011
Feature reinforcement learning: state of the art
M Daswani, P Sunehag, M Hutter
Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014
Cited by: 16; Year: 2014
Adaptive context tree weighting
A O'Neill, M Hutter, W Shao, P Sunehag
2012 Data Compression Conference, 317-326, 2012
Cited by: 15; Year: 2012
Optimistic agents are asymptotically optimal
P Sunehag, M Hutter
AI 2012: Advances in Artificial Intelligence: 25th Australasian Joint …, 2012
Cited by: 15; Year: 2012