Victoria Krakovna

Cited by

	All	Since 2019
Citations	2150	2059
h-index	13	13
i10-index	17	16

Citations per year

Year	Citations
2017	9
2018	62
2019	106
2020	160
2021	165
2022	188
2023	250
2024	1183

Public access (based on funding mandates)

1 article available
0 articles not available

Co-authors

Tom Everitt, Staff Research Scientist at Google DeepMind (verified email at google.com)
Laurent Orseau, Research Scientist at Google DeepMind (verified email at google.com)
Ramana Kumar, DeepMind (verified email at cl.cam.ac.uk)
Miljan Martic, DeepMind (verified email at google.com)
Jonathan Uesato (verified email at mit.edu)
Marcus Hutter, Researcher@DeepMind & Professor at ANU (verified email at anu.edu.au)
Zachary Kenton, Google DeepMind (verified email at google.com)
Pedro A. Ortega, Artificial Intelligence & Machine Learning (verified email at adaptiveagents.org)
Jan Leike, OpenAI (verified email at openai.com)
Richard Ngo, OpenAI (verified email at openai.com)
Matthew Rahtz, Google DeepMind (verified email at google.com)
Finale Doshi-Velez, Professor, Harvard (verified email at seas.harvard.edu)
Vladimir Mikulik, DeepMind (verified email at google.com)
Rohin Shah, Research Scientist, Google DeepMind (verified email at deepmind.com)
Vikrant Varma, DeepMind (verified email at deepmind.com)
Mary Phuong, IST Austria (verified email at ist.ac.at)
Janos Kramar, DeepMind (verified email at google.com)
Yannis Assael, Staff Research Scientist, Google DeepMind (verified email at google.com)
Sarah Cogan, Software Engineer, Google DeepMind (verified email at google.com)
Anian Ruoss, Google DeepMind (verified email at google.com)


Victoria Krakovna

Other names: Viktoriya Krakovna

Senior Research Scientist at Google DeepMind

Verified email at google.com

AI Alignment, Model Evaluations, Specification Gaming, Agent Incentives


Title	Cited by	Year
Gemini: a family of highly capable multimodal models. G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023	1042	2023
AI safety gridworlds. J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ... arXiv preprint arXiv:1711.09883, 2017	338	2017
Reinforcement Learning with a Corrupted Reward Channel. T Everitt, V Krakovna, L Orseau, M Hutter, S Legg. IJCAI AI & Autonomy, 2017	125	2017
Specification gaming: the flip side of AI ingenuity. V Krakovna, J Uesato, V Mikulik, M Rahtz, T Everitt, R Kumar, Z Kenton, ... https://deepmind.com/blog/article/Specification-gaming-the-flip-side-of-AI …, 2020	108*	2020
Reward tampering problems and solutions in reinforcement learning: A causal influence diagram perspective. T Everitt, M Hutter, R Kumar, V Krakovna. Synthese 198 (Suppl 27), 6435-6467, 2021	92	2021
Increasing the Interpretability of Recurrent Neural Networks Using Hidden Markov Models. V Krakovna, F Doshi-Velez. ICML Workshop on Human Interpretability (WHI 2016), arXiv preprint arXiv …, 2016	86	2016
Penalizing side effects using stepwise relative reachability. V Krakovna, L Orseau, R Kumar, M Martic, S Legg. arXiv preprint arXiv:1806.01186, 2018	60	2018
Goal misgeneralization: Why correct specifications aren't enough for correct goals. R Shah, V Varma, R Kumar, M Phuong, V Krakovna, J Uesato, Z Kenton. arXiv preprint arXiv:2210.01790, 2022	56*	2022
Avoiding Side Effects By Considering Future Tasks. V Krakovna, L Orseau, R Ngo, M Martic, S Legg. NeurIPS 2020, arXiv preprint arXiv:2010.07877, 2020	46	2020
Specification gaming examples in AI. V Krakovna. tinyurl.com/specification-gaming, 2018	44*	2018
Modeling AGI safety frameworks with causal influence diagrams. T Everitt, R Kumar, V Krakovna, S Legg. arXiv preprint arXiv:1906.08663, 2019	24	2019
Measuring and avoiding side effects using relative reachability. V Krakovna, L Orseau, M Martic, S Legg. arXiv preprint arXiv:1806.01186, 2018	20	2018
Evaluating Frontier Models for Dangerous Capabilities. M Phuong, M Aitchison, E Catt, S Cogan, A Kaskasoli, V Krakovna, ... arXiv preprint arXiv:2403.13793, 2024	18*	2024
REALab: An embedded perspective on tampering. R Kumar, J Uesato, R Ngo, T Everitt, V Krakovna, S Legg. arXiv preprint arXiv:2011.08820, 2020	13	2020
Power-seeking can be probable and predictive for trained agents. V Krakovna, J Kramar. arXiv preprint arXiv:2304.06528, 2023	11*	2023
The ethics of advanced AI assistants. I Gabriel, A Manzini, G Keeling, LA Hendricks, V Rieser, H Iqbal, ... arXiv preprint arXiv:2404.16244, 2024	10	2024
Memory-bounded left-corner unsupervised grammar induction on child-directed input. C Shain, W Bryce, L Jin, V Krakovna, F Doshi-Velez, T Miller, W Schuler, ... Proceedings of COLING 2016, the 26th International Conference on …, 2016	10*	2016
Avoiding tampering incentives in deep RL via decoupled approval. J Uesato, R Kumar, V Krakovna, T Everitt, R Ngo, S Legg. arXiv preprint arXiv:2011.08827, 2020	8	2020
Interpretable selection and visualization of features and interactions using Bayesian forests. V Krakovna, J Du, JS Liu. Statistics and its Interface 2018 (Volume 11 Number 3), arXiv preprint arXiv …, 2015	6*	2015
A generalized-zero-preserving method for compact encoding of concept lattices. M Skala, V Krakovna, J Kramár, G Penn. Proceedings of the 48th annual meeting of the Association for Computational …, 2010	6	2010
