Table 2. Parameters used in DDQN
Parameter | Value |
---|---|
Parameter ε of the ε-greedy algorithm | 0.8 |
Discount factor γ | 0.9 |
Learning rate α | 0.01 |
Maximum number of transitions M in replay memory | 30,000 |
Size of the mini-batch training set Ms | 64 |
Descending degree tar | 0.1 |
Number of steps τ between target network updates | 200 |
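
As an illustration of how the values in Table 2 could enter a DDQN training loop, the following minimal Python sketch collects them in a configuration object and shows the standard Double DQN target, in which the online network selects the next action and the target network evaluates it. The names `DDQNConfig`, `ddqn_target`, and `should_sync_target` are hypothetical and do not come from the original work. The sketch also assumes a hard copy of the online weights into the target network every τ steps, which the table suggests but does not state. The "descending degree" entry is stored but not used, since its exact role is not defined here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DDQNConfig:
    """Values copied from Table 2; the field names are illustrative only."""
    epsilon: float = 0.8            # parameter ε of the ε-greedy algorithm
    gamma: float = 0.9              # discount factor γ
    learning_rate: float = 0.01     # learning rate α
    replay_capacity: int = 30_000   # maximum number of transitions M
    batch_size: int = 64            # mini-batch size Ms
    descending_degree: float = 0.1  # "descending degree" from the table; unused below
    target_update_steps: int = 200  # τ, steps between target-network updates


def ddqn_target(reward: float, done: bool,
                q_online_next: Sequence[float],
                q_target_next: Sequence[float],
                gamma: float) -> float:
    """Double DQN target for one transition.

    q_online_next and q_target_next hold the action values of the next
    state under the online and target networks, respectively.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # The online network picks the action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and the target network supplies its value.
    return reward + gamma * q_target_next[best_action]


def should_sync_target(step: int, cfg: DDQNConfig) -> bool:
    """True when the target network is due for an update (assumed hard copy)."""
    return step > 0 and step % cfg.target_update_steps == 0


if __name__ == "__main__":
    cfg = DDQNConfig()
    y = ddqn_target(1.0, False, [0.2, 0.5], [1.0, 2.0], cfg.gamma)
    print(y, should_sync_target(200, cfg))
```

Decoupling action selection from action evaluation in this way is what distinguishes the Double DQN target from the plain DQN target, which takes the maximum over the target network's own values.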