Victoria Krakovna
Other names: Viktoriya Krakovna
Senior Research Scientist at Google DeepMind
Verified email at google.com
Title
Cited by
Year
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 3533; Year: 2023
AI safety gridworlds
J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ...
arXiv preprint arXiv:1711.09883, 2017
Cited by: 371; Year: 2017
Specification gaming: the flip side of AI ingenuity
V Krakovna, J Uesato, V Mikulik, M Rahtz, T Everitt, R Kumar, Z Kenton, ...
https://deepmind.com/blog/article/Specification-gaming-the-flip-side-of-AI …, 2020
Cited by: 139*; Year: 2020
Reinforcement Learning with a Corrupted Reward Channel
T Everitt, V Krakovna, L Orseau, M Hutter, S Legg
IJCAI AI & Autonomy, 2017
Cited by: 132; Year: 2017
Reward tampering problems and solutions in reinforcement learning: A causal influence diagram perspective
T Everitt, M Hutter, R Kumar, V Krakovna
Synthese 198 (Suppl 27), 6435-6467, 2021
Cited by: 109; Year: 2021
Increasing the Interpretability of Recurrent Neural Networks Using Hidden Markov Models
V Krakovna, F Doshi-Velez
ICML Workshop on Human Interpretability (WHI 2016), arXiv preprint arXiv …, 2016
Cited by: 89; Year: 2016
Goal misgeneralization: Why correct specifications aren't enough for correct goals
R Shah, V Varma, R Kumar, M Phuong, V Krakovna, J Uesato, Z Kenton
arXiv preprint arXiv:2210.01790, 2022
Cited by: 83; Year: 2022
Evaluating Frontier Models for Dangerous Capabilities
M Phuong, M Aitchison, E Catt, S Cogan, A Kaskasoli, V Krakovna, ...
arXiv preprint arXiv:2403.13793, 2024
Cited by: 74*; Year: 2024
Penalizing side effects using stepwise relative reachability
V Krakovna, L Orseau, R Kumar, M Martic, S Legg
arXiv preprint arXiv:1806.01186, 2018
Cited by: 67; Year: 2018
The ethics of advanced AI assistants
I Gabriel, A Manzini, G Keeling, LA Hendricks, V Rieser, H Iqbal, ...
arXiv preprint arXiv:2404.16244, 2024
Cited by: 63; Year: 2024
Avoiding Side Effects By Considering Future Tasks
V Krakovna, L Orseau, R Ngo, M Martic, S Legg
NeurIPS 2020, arXiv preprint arXiv:2010.07877, 2020
Cited by: 53; Year: 2020
Specification gaming examples in AI
V Krakovna
tinyurl.com/specification-gaming, 2018
Cited by: 47*; Year: 2018
Modeling AGI safety frameworks with causal influence diagrams
T Everitt, R Kumar, V Krakovna, S Legg
arXiv preprint arXiv:1906.08663, 2019
Cited by: 26; Year: 2019
Measuring and avoiding side effects using relative reachability
V Krakovna, L Orseau, M Martic, S Legg
arXiv preprint arXiv:1806.01186, 2018
Cited by: 22; Year: 2018
Power-seeking can be probable and predictive for trained agents
V Krakovna, J Kramar
arXiv preprint arXiv:2304.06528, 2023
Cited by: 16*; Year: 2023
REALab: An embedded perspective on tampering
R Kumar, J Uesato, R Ngo, T Everitt, V Krakovna, S Legg
arXiv preprint arXiv:2011.08820, 2020
Cited by: 13; Year: 2020
Memory-bounded left-corner unsupervised grammar induction on child-directed input
C Shain, W Bryce, L Jin, V Krakovna, F Doshi-Velez, T Miller, W Schuler, ...
Proceedings of COLING 2016, the 26th International Conference on …, 2016
Cited by: 11*; Year: 2016
Avoiding tampering incentives in deep RL via decoupled approval
J Uesato, R Kumar, V Krakovna, T Everitt, R Ngo, S Legg
arXiv preprint arXiv:2011.08827, 2020
Cited by: 10; Year: 2020
Possible takeaways from the coronavirus pandemic for slow AI takeoff
V Krakovna
https://vkrakovna.wordpress.com/2020/05/31/possible-takeaways-from-the …, 2020
Cited by: 8; Year: 2020
A Minimalistic Approach to Sum-Product Network Learning for Real Applications
V Krakovna, M Looks
ICLR 2016 workshop, arXiv preprint arXiv:1602.04259, 2016
Cited by: 6; Year: 2016
Articles 1–20