OPTIMIZING EXPECTATIONS: FROM DEEP REINFORCEMENT LEARNING TO STOCHASTIC COMPUTATION GRAPHS

3.14 Experiment parameters

Figure 5: Computation of factored discrete probability distribution in Atari domain.
[Diagram: the 8-d output of the final fully-connected layer is sliced into three groups; a softmax is applied to each slice to produce the probabilities.]

Table 2: Parameters for continuous control tasks, vine and single path (SP) algorithms.

                                    Swimmer   Hopper   Walker
State space dim.                    10        12       20
Control space dim.                  2         3        6
Total num. policy params            364       4806     8206
Sim. steps per iter.                50K       1M       1M
Policy iter.                        200       200      200
Stepsize (D_KL)                     0.01      0.01     0.01
Hidden layer size                   30        50       50
Discount (γ)                        0.99      0.99     0.99
Vine: rollout length                50        100      100
Vine: rollouts per state            4         4        4
Vine: Q-values per batch            500       2500     2500
Vine: num. rollouts for sampling    16        16       16
Vine: len. rollouts for sampling    1000      1000     1000
Vine: computation time (minutes)    2         14       40
SP: num. path                       50        1000     10000
SP: path len.                       1000      1000     1000
SP: computation time                5         35       100
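The computation in Figure 5 can be illustrated with a minimal sketch: slice the final-layer output, apply a softmax to each slice, and treat the resulting factors as independent. The figure specifies only an 8-d output and three softmax groups, so the 3 + 3 + 2 split used here is an assumption, and the names `SLICE_SIZES`, `factored_probs`, `sample_action`, and `log_prob` are hypothetical helpers rather than anything taken from the thesis.

```python
import math
import random

# Assumed slice sizes: the figure shows an 8-d output and three softmaxes,
# but not how the 8 dimensions are divided. 3 + 3 + 2 is illustrative only.
SLICE_SIZES = (3, 3, 2)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def factored_probs(output, slice_sizes=SLICE_SIZES):
    """Slice the final-layer output and apply a softmax to each slice."""
    if len(output) != sum(slice_sizes):
        raise ValueError("output length must equal the sum of slice sizes")
    probs, start = [], 0
    for size in slice_sizes:
        probs.append(softmax(output[start:start + size]))
        start += size
    return probs


def sample_action(probs, rng=random):
    """Draw one index per factor; the factors are sampled independently."""
    return tuple(rng.choices(range(len(p)), weights=p)[0] for p in probs)


def log_prob(probs, action):
    """Log-probability of a factored action: sum of per-factor log-probs."""
    return sum(math.log(p[a]) for p, a in zip(probs, action))


# Usage: a stand-in 8-d network output (all zeros gives uniform factors).
output = [0.0] * 8
probs = factored_probs(output)
action = sample_action(probs, random.Random(0))
```

Under this factorization the network needs only 8 outputs to parameterize a distribution over 3 × 3 × 2 = 18 joint actions, at the cost of assuming the factors are independent given the input.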

