Anikait Singh
Verified email at stanford.edu
Title
Cited by
Year
RT-2: Vision-language-action models transfer web knowledge to robotic control
B Zitkovich, T Yu, S Xu, P Xu, T Xiao, F Xia, J Wu, P Wohlhart, S Welker, ...
Conference on Robot Learning, 2165-2183, 2023
Cited by: 1100*; Year: 2023
Open X-Embodiment: Robotic learning datasets and RT-X models
JJ Lim
IEEE International Conference on Robotics and Automation, 2024
Cited by: 570*; Year: 2024
When should we prefer offline reinforcement learning over behavioral cloning?
A Kumar, J Hong, A Singh, S Levine
International Conference on Learning Representations, 2021
Cited by: 157*; Year: 2021
Cal-QL: Calibrated offline RL pre-training for efficient online fine-tuning
M Nakamoto, S Zhai, A Singh, M Sobol Mark, Y Ma, C Finn, A Kumar, ...
Advances in Neural Information Processing Systems 36, 62244-62269, 2023
Cited by: 122; Year: 2023
A workflow for offline model-free robotic reinforcement learning
A Kumar, A Singh, S Tian, C Finn, S Levine
arXiv preprint arXiv:2109.10813, 2021
Cited by: 101; Year: 2021
Preference fine-tuning of LLMs should leverage suboptimal, on-policy data
F Tajwar, A Singh, A Sharma, R Rafailov, J Schneider, T Xie, S Ermon, ...
arXiv preprint arXiv:2404.14367, 2024
Cited by: 74; Year: 2024
Pre-training for robots: Offline RL enables learning new tasks from a handful of trials
A Kumar, A Singh, F Ebert, M Nakamoto, Y Yang, C Finn, S Levine
arXiv preprint arXiv:2210.05178, 2022
Cited by: 73; Year: 2022
Robotic offline RL from internet videos via value-function pre-training
C Bhateja, D Guo, D Ghosh, A Singh, M Tomar, Q Vuong, Y Chebotar, ...
arXiv preprint arXiv:2309.13041, 2023
Cited by: 22; Year: 2023
Offline RL with realistic datasets: Heteroskedasticity and support constraints
A Singh, A Kumar, Q Vuong, Y Chebotar, S Levine
arXiv preprint arXiv:2211.01052, 2022
Cited by: 20*; Year: 2022
RT-2: Vision-language-action models transfer web knowledge to robotic control, 2023
A Brohan, N Brown, J Carbajal, Y Chebotar, X Chen, K Choromanski, ...
URL https://arxiv.org/abs/2307.15818
Cited by: 20
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
V Xiang, C Snell, K Gandhi, A Albalak, A Singh, C Blagden, D Phung, ...
arXiv preprint arXiv:2501.04682, 2025
Cited by: 16; Year: 2025
A mobile application for keyword search in real-world scenes
S Pundlik, A Singh, G Baghel, V Baliutaviciute, G Luo
IEEE Journal of Translational Engineering in Health and Medicine 7, 1-10, 2019
Cited by: 12; Year: 2019
Adaptive inference-time compute: LLMs can predict if they can do better, even mid-generation
R Manvi, A Singh, S Ermon
arXiv preprint arXiv:2410.02725, 2024
Cited by: 7; Year: 2024
D5RL: Diverse datasets for data-driven deep reinforcement learning
R Rafailov, K Hatch, A Singh, L Smith, A Kumar, I Kostrikov, ...
arXiv preprint arXiv:2408.08441, 2024
Cited by: 6; Year: 2024
Robotic offline RL from internet videos via value-function learning
C Bhateja, D Guo, D Ghosh, A Singh, M Tomar, Q Vuong, Y Chebotar, ...
2024 IEEE International Conference on Robotics and Automation (ICRA), 16977 …, 2024
Cited by: 4; Year: 2024
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
K Gandhi, A Chakravarthy, A Singh, N Lile, ND Goodman
arXiv preprint arXiv:2503.01307, 2025
Cited by: 1; Year: 2025
Personalized Preference Fine-tuning of Diffusion Models
M Dang, A Singh, L Zhou, S Ermon, J Song
arXiv preprint arXiv:2501.06655, 2025
Cited by: 1; Year: 2025
Test-time alignment via hypothesis reweighting
Y Lee, J Williams, H Marklund, A Sharma, E Mitchell, A Singh, C Finn
arXiv preprint arXiv:2412.08812, 2024
Cited by: 1; Year: 2024
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users
A Singh, S Hsu, K Hsu, E Mitchell, S Ermon, T Hashimoto, A Sharma, ...
arXiv preprint arXiv:2502.19312, 2025
Year: 2025
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
A Albalak, D Phung, N Lile, R Rafailov, K Gandhi, L Castricato, A Singh, ...
arXiv preprint arXiv:2502.17387, 2025
Year: 2025