OPTIMIZING EXPECTATIONS: FROM DEEP REINFORCEMENT LEARNING TO STOCHASTIC COMPUTATION GRAPHS

…computation graphs that we introduce. For these examples, the variance-reduced gradient estimators derived in prior work are special cases of the results in Sections 5.3 and 5.4.

The contributions of this chapter are as follows:

• We introduce a formalism of stochastic computation graphs, and in this general setting we derive unbiased estimators for the gradient of the expected loss.

• We show how this estimator can be computed as the gradient of a certain differentiable function (which we call the surrogate loss); hence, it can be computed efficiently using the backpropagation algorithm. This observation enables a practitioner to write an efficient implementation using automatic differentiation software.

• We describe variance reduction techniques that can be applied to the setting of stochastic computation graphs, generalizing prior work from reinforcement learning and variational inference.

• We briefly describe how to generalize some other optimization techniques to this setting: majorization-minimization algorithms, by constructing an expression that bounds the loss function; and quasi-Newton / Hessian-free methods [Mar10], by computing estimates of Hessian-vector products.

The main practical result of this chapter is that to compute the gradient estimator, one only needs to make a simple modification to the backpropagation algorithm, in which extra gradient signals are introduced at the stochastic nodes. Equivalently, the resulting algorithm is just the backpropagation algorithm applied to the surrogate loss function, which has extra terms introduced at the stochastic nodes. The modified backpropagation algorithm is presented in Section 5.5.

5.2 preliminaries

5.2.1 Gradient Estimators for a Single Random Variable

This section discusses computing the gradient of an expectation taken over a single random variable; the estimators described here will be the building blocks for more complex cases with multiple variables. Suppose that x is a random variable, f is a function (say, the cost), and we are interested in computing ∂/∂θ E_x[f(x)]. There are a few different ways that the process for generating x could be parameterized in terms of θ, and these lead to different gradient estimators.

• We might be given a parameterized probability distribution x ∼ p(·; θ). In this case,
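
As a concrete illustration of the surrogate-loss idea described above, the following is a minimal sketch (not taken from the thesis) of the case named in the last bullet: only the sampling distribution is parameterized, and ∂/∂θ E_x[f(x)] is estimated with the standard score-function (likelihood-ratio) estimator, written as a surrogate loss so that ordinary backpropagation produces the gradient estimate. The Gaussian distribution N(θ, 1), the cost f, the sample count, and the use of PyTorch as the automatic differentiation software are all illustrative assumptions, not choices made in the text.

import torch

# Illustrative sketch (not from the thesis): score-function (likelihood-ratio)
# estimator for d/dtheta E_{x ~ p(.;theta)}[f(x)], computed by backpropagating
# through a surrogate loss.

theta = torch.tensor([0.0], requires_grad=True)   # parameter of the sampling distribution

def f(x):
    # example cost; gradients never flow through f, since x is sampled
    # (detached), so f could even be non-differentiable
    return (x - 2.0) ** 2

n_samples = 10_000
dist = torch.distributions.Normal(loc=theta, scale=1.0)   # x ~ p(.; theta) = N(theta, 1)
x = dist.sample((n_samples,))                              # sampling blocks gradient flow

# Surrogate loss: mean of log p(x; theta) * f(x). Its gradient w.r.t. theta is
# the Monte Carlo score-function estimate of
#   E[ (d/dtheta log p(x; theta)) * f(x) ].
surrogate = (dist.log_prob(x) * f(x)).mean()
surrogate.backward()

print("score-function gradient estimate:", theta.grad.item())
# For N(theta, 1) and f(x) = (x - 2)^2, E[f(x)] = (theta - 2)^2 + 1, so the true
# gradient at theta = 0 is 2*(theta - 2) = -4; the estimate should be close to -4.

With enough samples the printed estimate should be close to -4. The raw estimator can be noisy, which is one motivation for the variance reduction techniques mentioned among the contributions above (for example, subtracting a baseline from f(x) inside the surrogate).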
