
Bibliography

[RGB13] R. Ranganath, S. Gerrish, and D. M. Blei. "Black box variational inference." In: arXiv preprint arXiv:1401.0118 (2013) (cit. on p. 76).

[RMW14] D. J. Rezende, S. Mohamed, and D. Wierstra. "Stochastic backpropagation and approximate inference in deep generative models." In: arXiv preprint arXiv:1401.4082 (2014) (cit. on pp. 64, 66, 67, 76, 81).

[Sch+15a] J. Schulman, N. Heess, T. Weber, and P. Abbeel. "Gradient estimation using stochastic computation graphs." In: Advances in Neural Information Processing Systems. 2015, pp. 3528–3536 (cit. on p. 7).

[Sch+15b] J. Schulman, P. Moritz, S. Levine, M. Jordan, and P. Abbeel. "High-dimensional continuous control using generalized advantage estimation." In: arXiv preprint arXiv:1506.02438 (2015) (cit. on p. 7).

[Sch+15c] J. Schulman, S. Levine, P. Moritz, M. I. Jordan, and P. Abbeel. "Trust region policy optimization." In: CoRR abs/1502.05477 (2015) (cit. on pp. 7, 54, 55, 61).

[Sil+14] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller. "Deterministic policy gradient algorithms." In: ICML. 2014 (cit. on pp. 3, 67).

[Sil+16] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al. "Mastering the game of Go with deep neural networks and tree search." In: Nature 529.7587 (2016), pp. 484–489 (cit. on p. 3).

[Str+06] A. L. Strehl, L. Li, E. Wiewiora, J. Langford, and M. L. Littman. "PAC model-free reinforcement learning." In: Proceedings of the 23rd International Conference on Machine Learning. ACM, 2006, pp. 881–888 (cit. on p. 85).

[SB98] R. S. Sutton and A. G. Barto. Introduction to Reinforcement Learning. MIT Press, 1998 (cit. on pp. 2, 7, 49, 50, 53).

[SPS99] R. S. Sutton, D. Precup, and S. Singh. "Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning." In: Artificial Intelligence 112.1 (1999), pp. 181–211 (cit. on p. 85).

[Sut+99] R. S. Sutton, D. A. McAllester, S. P. Singh, Y. Mansour, et al. "Policy gradient methods for reinforcement learning with function approximation." In: NIPS. Vol. 99. 1999, pp. 1057–1063 (cit. on pp. 4, 16, 64, 81).

[Sze10] C. Szepesvári. "Algorithms for reinforcement learning." In: Synthesis Lectures on Artificial Intelligence and Machine Learning 4.1 (2010), pp. 1–103 (cit. on p. 2).

[SL06] I. Szita and A. Lörincz. "Learning Tetris using the noisy cross-entropy method." In: Neural Computation 18.12 (2006), pp. 2936–2941 (cit. on pp. 4, 11, 31).