Prashanth L.A.
Associate Professor, Department of Computer Science and Engg., IIT Madras
Verified email at cse.iitm.ac.in - Homepage
Title
Cited by
Year
Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods
S Bhatnagar, HL Prasad, LA Prashanth
Springer 434, 302, 2013
Cited by: 451*; Year: 2013
Reinforcement Learning With Function Approximation for Traffic Signal Control
P LA, S Bhatnagar
Intelligent Transportation Systems, IEEE Transactions on, 1-10, 2011
Cited by: 380; Year: 2011
Actor-critic algorithms for risk-sensitive MDPs
P La, M Ghavamzadeh
Advances in neural information processing systems 26, 2013
Cited by: 339; Year: 2013
Reinforcement learning with average cost for adaptive control of traffic lights at intersections
LA Prashanth, S Bhatnagar
2011 14th International IEEE Conference on Intelligent Transportation …, 2011
Cited by: 89; Year: 2011
Cumulative prospect theory meets reinforcement learning: Prediction and control
LA Prashanth, C Jie, M Fu, S Marcus, C Szepesvári
International Conference on Machine Learning, 1406-1415, 2016
Cited by: 86; Year: 2016
Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs
LA Prashanth, M Ghavamzadeh
arXiv preprint arXiv:1403.6530, 2014
Cited by: 80; Year: 2014
Policy gradients for CVaR-constrained MDPs
LA Prashanth
International Conference on Algorithmic Learning Theory, 155-169, 2014
Cited by: 71; Year: 2014
Two-timescale algorithms for learning Nash equilibria in general-sum stochastic games
HL Prasad, P LA, S Bhatnagar
Proceedings of the 2015 International Conference on Autonomous Agents and …, 2015
Cited by: 70; Year: 2015
Concentration bounds for empirical conditional value-at-risk: The unbounded case
RK Kolla, LA Prashanth, SP Bhat, K Jagannathan
Operations Research Letters 47 (1), 16-20, 2019
Cited by: 55; Year: 2019
Threshold tuning using stochastic optimization for graded signal control
LA Prashanth, S Bhatnagar
IEEE Transactions on Vehicular Technology 61 (9), 3865-3880, 2012
Cited by: 53; Year: 2012
Concentration of risk measures: A Wasserstein distance approach
SP Bhat, P LA
Advances in neural information processing systems 32, 2019
Cited by: 52; Year: 2019
On TD (0) with function approximation: Concentration bounds and a centered variant with exponential convergence
N Korda, P La
International conference on machine learning, 626-634, 2015
Cited by: 51; Year: 2015
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions
LA Prashanth, K Jagannathan, RK Kolla
Proceedings of the 37th International Conference on Machine Learning, 5577-5586, 2020
Cited by: 49; Year: 2020
Stochastic optimization in a cumulative prospect theory framework
C Jie, LA Prashanth, M Fu, S Marcus, C Szepesvári
IEEE Transactions on Automatic Control 63 (9), 2867-2882, 2018
Cited by: 48; Year: 2018
Risk-sensitive reinforcement learning: A constrained optimization viewpoint
LA Prashanth, M Fu
arXiv 2018, 2018
Cited by: 35; Year: 2018
Adaptive system optimization using random directions stochastic approximation
LA Prashanth, S Bhatnagar, M Fu, S Marcus
IEEE Transactions on Automatic Control 62 (5), 2223-2238, 2017
Cited by: 35; Year: 2017
Risk-sensitive reinforcement learning via policy gradient search
LA Prashanth, MC Fu
Foundations and Trends® in Machine Learning 15 (5), 537-693, 2022
Cited by: 27; Year: 2022
Analysis of stochastic approximation for efficient least squares regression and LSTD
LA Prashanth, N Korda, R Munos
arXiv preprint arXiv:1306.2557, 2013
Cited by: 26*; Year: 2013
Concentration bounds for temporal difference learning with linear function approximation: The case of batch data and uniform sampling
LA Prashanth, N Korda, R Munos
Machine Learning 110 (3), 559-618, 2021
Cited by: 17; Year: 2021
Risk-aware multi-armed bandits using conditional value-at-risk
RK Kolla, LA Prashanth, K Jagannathan
arXiv preprint arXiv:1901.00997, 2019
Cited by: 17; Year: 2019
Articles 1–20