Zap Q-Learning
- Adithya M. Devraj, Sean P. Meyn
- 2017
Computer Science, Mathematics
The Zap Q-learning algorithm introduced in this paper improves on Watkins' original algorithm and recent competitors in several respects, and the analysis suggests that the approach leads to stable and efficient computation even in non-ideal parameterized settings.
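To make the idea concrete, here is a minimal tabular sketch of the two-time-scale Zap update: a running estimate of the mean linearization matrix is maintained on a faster time scale and used as a Newton-Raphson-style gain. The toy `RandomMDP`, the step-size exponents, and the absence of regularization on the gain matrix are illustrative assumptions, not the paper's prescriptions.

```python
import numpy as np

class RandomMDP:
    """Toy random MDP (illustrative only): random transitions and rewards."""
    def __init__(self, n_states, n_actions, seed=0):
        self.rng = np.random.default_rng(seed)
        self.P = self.rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
        self.R = self.rng.standard_normal((n_states, n_actions))
        self.x = 0

    def step(self, u):
        r = self.R[self.x, u]
        self.x = int(self.rng.choice(self.P.shape[-1], p=self.P[self.x, u]))
        return self.x, r

def zap_q_learning(env, n_states, n_actions, beta=0.9, n_iter=20_000, rho=0.85):
    d = n_states * n_actions
    theta = np.zeros(d)     # tabular Q-function: Q(x, u) = theta[x*n_actions + u]
    A_hat = -np.eye(d)      # gain-matrix estimate; -I keeps early solves well posed
    rng = np.random.default_rng(1)
    x = env.x
    for n in range(1, n_iter + 1):
        alpha = 1.0 / n     # slow step size for the parameter update
        gamma = n ** (-rho) # faster step size for the matrix estimate (two time scales)
        u = int(rng.integers(n_actions))   # exploratory randomized policy
        x_next, r = env.step(u)
        i = x * n_actions + u
        q_next = theta[x_next * n_actions:(x_next + 1) * n_actions]
        j = x_next * n_actions + int(np.argmax(q_next))
        phi = np.zeros(d); phi[i] = 1.0
        phi_next = np.zeros(d); phi_next[j] = 1.0
        td = r + beta * theta[j] - theta[i]        # temporal-difference term
        A_n = np.outer(phi, beta * phi_next - phi) # one-sample linearization matrix
        A_hat += gamma * (A_n - A_hat)             # running mean of A_n
        theta -= alpha * np.linalg.solve(A_hat, phi * td)  # Newton-Raphson-style step
        x = x_next
    return theta.reshape(n_states, n_actions)

Q = zap_q_learning(RandomMDP(5, 2), n_states=5, n_actions=2)
```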
Model-Free Primal-Dual Methods for Network Optimization with Application to Real-Time Optimal Power Flow
- Yue-Chun Chen, A. Bernstein, Adithya M. Devraj, Sean P. Meyn
- 28 September 2019
Engineering, Computer Science
This paper examines the problem of real-time optimization of networked systems and develops online algorithms that steer the system towards the optimal trajectory without explicit knowledge of the system model, leveraging an online zero-order primal-dual projected-gradient method.
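As an illustration of the general mechanism (not the paper's exact algorithm), the sketch below runs a zero-order primal-dual projected-gradient iteration on a toy constrained problem, using a standard two-point gradient estimator; the step sizes, box projection, and toy objective are assumptions.

```python
import numpy as np

def zo_grad(h, x, delta=1e-2, rng=None):
    """Two-point zero-order gradient estimate of h at x (model-free)."""
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    return len(x) * (h(x + delta * u) - h(x - delta * u)) / (2 * delta) * u

def primal_dual_zo(f, g, x0, n_iter=2000, alpha=1e-2, seed=0):
    """Minimize f(x) subject to g(x) <= 0 using only function evaluations."""
    rng = np.random.default_rng(seed)
    x, lam = x0.copy(), 0.0
    for _ in range(n_iter):
        # Lagrangian L(x, lam) = f(x) + lam * g(x): descend in x, ascend in lam
        grad_x = zo_grad(f, x, rng=rng) + lam * zo_grad(g, x, rng=rng)
        x = np.clip(x - alpha * grad_x, -10.0, 10.0)   # projection onto a box
        lam = max(0.0, lam + alpha * g(x))             # projection onto lam >= 0
    return x, lam

# Toy instance: minimize ||x - c||^2 subject to sum(x) <= 1.
c = np.array([1.0, 2.0])
x_opt, lam_opt = primal_dual_zo(lambda x: np.sum((x - c) ** 2),
                                lambda x: np.sum(x) - 1.0,
                                x0=np.zeros(2))
```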
Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation
- Shuhang Chen, Adithya M. Devraj, Ana Bušić, Sean P. Meyn
- 7 February 2020
Mathematics, Computer Science
It is shown that the mean-square error achieves the optimal rate of $O(1/n)$, subject to conditions on the step-size sequence, which is of great value in algorithm design.
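The rate claim is easy to visualize in the simplest setting: the sketch below simulates a scalar linear stochastic-approximation recursion with step size $\alpha_n = g/n$ and checks that $n \cdot \mathrm{MSE}$ flattens to a constant. The scalar model, the gain $g$, and the noise model are illustrative, not the paper's general setup.

```python
import numpy as np

def linear_sa_mse(theta_star=1.0, g=1.0, n_iter=10_000, n_runs=200, seed=0):
    """MSE over independent runs of theta += (g/n) * (-(theta - theta*) + W_n)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(n_runs)           # independent runs, vectorized
    mse = np.empty(n_iter)
    for n in range(1, n_iter + 1):
        alpha = g / n                  # step size; g > 1/2 is needed for O(1/n)
        noise = rng.standard_normal(n_runs)
        theta += alpha * (-(theta - theta_star) + noise)
        mse[n - 1] = np.mean((theta - theta_star) ** 2)
    return mse

mse = linear_sa_mse()
# n * MSE should flatten to a constant, consistent with the O(1/n) rate:
print(mse[999] * 1_000, mse[9_999] * 10_000)
```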
Fastest Convergence for Q-learning
- Adithya M. Devraj, Sean P. Meyn
- 12 July 2017
Computer Science, Mathematics
The Zap Q-learning algorithm introduced in this paper improves on Watkins' original algorithm and recent competitors in several respects. It is a matrix-gain algorithm designed so that its…
The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning
- V. Borkar, Shuhang Chen, Adithya M. Devraj, Ioannis Kontoyiannis, S. Meyn
- 27 October 2021
Computer Science, Mathematics
The main results now allow for parameter-dependent noise, as is often the case in applications to reinforcement learning.
Q-Learning With Uniformly Bounded Variance
- Adithya M. Devraj, S. Meyn
- 24 February 2020
Computer Science, Mathematics
It is shown that the asymptotic covariance of the tabular Q-learning algorithm with an optimized step-size sequence is a quadratic function of a factor that tends to infinity as the discount factor approaches 1; this is an essentially known result.
On Matrix Momentum Stochastic Approximation and Applications to Q-learning
- Adithya M. Devraj, Ana Bušić, Sean P. Meyn
- 1 September 2019
Computer Science, Mathematics
It is shown that the parameter estimates obtained from the PolSA algorithm couple with those of the optimal-variance SNR (stochastic Newton-Raphson) algorithm at a rate of $O(1/n^{2})$, and numerical results confirm the coupling of PolSA and SNR.
Zap Q-Learning With Nonlinear Function Approximation
- Shuhang Chen, Adithya M. Devraj, Ana Bušić, Sean P. Meyn
- 11 October 2019
Computer Science
This class of algorithms is generalized in this paper, and stability is established under very general conditions; the result applies to a wide range of algorithms found in reinforcement learning.
Stochastic Variance Reduced Primal Dual Algorithms for Empirical Composition Optimization
- Adithya M. Devraj, Jianshu Chen
- 1 July 2019
Computer Science, Mathematics
This work reformulates the original minimization objective into an equivalent min-max objective, which brings out all the empirical averages that are originally inside the nonlinear loss functions, and develops a stochastic primal-dual algorithm, SVRPDA-I, which is shown to converge at a linear rate when the problem is strongly convex.
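Schematically, the reformulation is the standard Fenchel-conjugate device: each outer loss $f_i$ is replaced by its variational representation, so the inner empirical average appears linearly in the dual variables. The display below sketches that step under an assumed form of the composition objective; the paper's indexing and regularization may differ.

```latex
% Schematic min-max reformulation via the convex (Fenchel) conjugate f_i^*,
% assuming the composition objective takes the form shown on the left.
\min_{\theta} \frac{1}{N}\sum_{i=1}^{N}
  f_i\!\Big(\frac{1}{M}\sum_{j=1}^{M} g_j(\theta)\Big)
\;=\;
\min_{\theta} \max_{y_1,\dots,y_N} \frac{1}{N}\sum_{i=1}^{N}
  \Big[\Big\langle \frac{1}{M}\sum_{j=1}^{M} g_j(\theta),\, y_i \Big\rangle
       - f_i^{*}(y_i)\Big]
```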
Revisiting the ODE Method for Recursive Algorithms: Fast Convergence Using Quasi Stochastic Approximation
- Shuhang Chen, Adithya M. Devraj, A. Bernstein, S. Meyn
- 1 October 2021
Mathematics, Computer Science
A brief survey of recent research in machine learning that shows the power of algorithm design in continuous time, followed by careful approximation to obtain a practical recursive algorithm.
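A minimal discrete-time sketch of the quasi-stochastic idea: replace the i.i.d. samples in a stochastic-approximation recursion with a deterministic, equidistributed probing sequence. The Weyl sequence used here is an illustrative stand-in; the probing signals in this line of work are typically sums of sinusoids in continuous time.

```python
import numpy as np

def qsa_mean(h, n_iter=100_000):
    """Estimate the mean of h over [0, 1] with a deterministic probing signal."""
    phi = (np.sqrt(5) - 1) / 2      # golden-ratio increment: equidistributed mod 1
    theta = 0.0
    for k in range(1, n_iter + 1):
        xi = (k * phi) % 1.0        # deterministic "quasi-stochastic" sample
        theta += (1.0 / k) * (h(xi) - theta)   # SA recursion with gain 1/k
    return theta

# Example: estimate the integral of h(x) = x^2 over [0, 1] (true value 1/3).
print(qsa_mean(lambda x: x * x))
```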
...