
Bibliography

[KL02] S. Kakade and J. Langford. "Approximately optimal approximate reinforcement learning." In: ICML. Vol. 2. 2002, pp. 267–274 (cit. on pp. 19–21, 28, 34).

[KK98] H. Kimura and S. Kobayashi. "An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function." In: ICML. 1998, pp. 278–286 (cit. on pp. 45, 46).

[KB14] D. P. Kingma and J. Ba. "Adam: A method for stochastic optimization." In: arXiv preprint arXiv:1412.6980 (2014) (cit. on p. 16).

[KW13] D. P. Kingma and M. Welling. "Auto-encoding variational Bayes." In: arXiv preprint arXiv:1312.6114 (2013) (cit. on pp. 64, 66, 76, 81).

[KW14] D. P. Kingma and M. Welling. "Efficient gradient-based inference through transformations between Bayes nets and neural nets." In: arXiv preprint arXiv:1402.0480 (2014) (cit. on p. 76).

[KBP13] J. Kober, J. A. Bagnell, and J. Peters. "Reinforcement learning in robotics: A survey." In: The International Journal of Robotics Research (2013), p. 0278364913495721 (cit. on p. 1).

[KT03] V. R. Konda and J. N. Tsitsiklis. "On Actor-Critic Algorithms." In: SIAM Journal on Control and Optimization 42.4 (2003), pp. 1143–1166 (cit. on pp. 61, 62).

[KSH12] A. Krizhevsky, I. Sutskever, and G. E. Hinton. "ImageNet classification with deep convolutional neural networks." In: Advances in Neural Information Processing Systems. 2012, pp. 1097–1105 (cit. on pp. 1, 2).

[LP03] M. G. Lagoudakis and R. Parr. "Reinforcement learning as classification: Leveraging modern classifiers." In: ICML. Vol. 3. 2003, pp. 424–431 (cit. on p. 25).

[LCR02] G. Lawrence, N. Cowan, and S. Russell. "Efficient gradient estimation for motor control learning." In: Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., 2002, pp. 354–361 (cit. on p. 86).

[LeC+98] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. "Gradient-based learning applied to document recognition." In: Proceedings of the IEEE 86.11 (1998), pp. 2278–2324 (cit. on p. 64).

[LPW09] D. A. Levin, Y. Peres, and E. L. Wilmer. Markov Chains and Mixing Times. American Mathematical Society, 2009 (cit. on p. 37).

[LA14] S. Levine and P. Abbeel. "Learning neural network policies with guided policy search under unknown dynamics." In: Advances in Neural Information Processing Systems. 2014, pp. 1071–1079 (cit. on p. 29).