Value Function Iteration with Grid Search

In this notebook, I will solve a simple Dynamic Programming Problem: The Neoclassical Growth Model without uncertainty. I will solve the DP problem by discretizing the Bellman equation and performing value function iteration. The maximization problem is carried on using grid search (brute force).

This note aims to be pedagogical, and does not try to achieve maximum performance.

Neoclassical growth model

Consider the Neoclassical growth model: the production function is kα and capital depreciates at rate δ. The law of motion for the capital stock is:

kt+1=(1δ)kt+ktαct.

The representative household chooses a consumption sequence to maximize the following utility function:

t=0βtlog(ct).

It is easy to show that the steady-state capital stock is given by:

kss=(α1/β(1δ))11α.

Bellman Equation

The Bellman Equation associated to this problem is:

V(k)=maxk{log(kα+(1δ)k)+βV(k)},

with the policy function: k=g(k).

Numerical Implementation

First, define the parameters of the model: α=0.3, β=0.96, and δ=0.1. Second, discretize the capital space in nk=200 points. The points are defined in a grid (a vector) linearly spaced between 2×kss/nk and 2×kss.

5.5 ms
203 ms
Value Function Iteration

I will now perform the iteration of the Value Function. Define the Value Function in the grid nk: V(ki)=Vi. That is, V(ki) is the associated value at capital ki.

The general procedure involve to guess a value function, Vn(ki), solve the maximization problem and compute the next iteration, Vn+1(ki):

Vn+1(ki)=maxkj{log(cij)+βVn(kj)},

where i is the subscript associated with the state ki, while j is the grid point associated with control kj. cij is the consumption associated with state i and control j.

Afterwise, compute the (maximum) distance between the Vn and Vn+1:

d=maxi=1,..,nk|Vin+1Vin|.

If this distance is smaller than some tolerance, d<ε, stop. Otherwise, use Vn+1(ki) as the new guess and repeat the procedure until convergence.

In the code below, I will initialize the Value function (Vi0=0.0, but it could be different) and the Policy function in the grid nk. Note that the array g will tell us that if we are in state i, we will choose the point j in the capital grid, so gi=j (so j is the grid point NOT the capital value).

It will be useful to pre-compute the consumption array (matrix nk×nk) with elements cij defined as:

cij=kiα+(1δ)kikj.

I will also define the tolerance and the maximum number of iterations. See the consumption matrix below:

14.6 μs
200×200 Matrix{Float64}:
 0.343538  0.31433   0.285121  0.255913  0.226705  …  -5.41048  -5.43969   -5.4689
 0.449907  0.420699  0.391491  0.362283  0.333074     -5.30411  -5.33332   -5.36253
 0.531366  0.502158  0.47295   0.443742  0.414534     -5.22265  -5.25186   -5.28107
 0.601075  0.571866  0.542658  0.51345   0.484242     -5.15294  -5.18215   -5.21136
 0.663719  0.634511  0.605303  0.576095  0.546887     -5.0903   -5.11951   -5.14872
 0.721574  0.692366  0.663157  0.633949  0.604741  …  -5.03245  -5.06165   -5.09086
 0.775931  0.746723  0.717515  0.688307  0.659098     -4.97809  -5.0073    -5.0365
 ⋮                                                 ⋱                       
 6.78208   6.75287   6.72366   6.69445   6.66524       1.02806   0.998847   0.969639
 6.81095   6.78174   6.75253   6.72333   6.69412   …   1.05693   1.02772    0.998515
 6.83982   6.81061   6.7814    6.75219   6.72298       1.0858    1.05659    1.02738
 6.86867   6.83947   6.81026   6.78105   6.75184       1.11465   1.08545    1.05624
 6.89752   6.86831   6.83911   6.8099    6.78069       1.1435    1.11429    1.08509
 6.92636   6.89715   6.86794   6.83874   6.80953       1.17234   1.14313    1.11392
211 ms

Note that some cij are negative. Be careful to rule out this possibility during the maximization.

In the code below, I will implement the value function iteration. The outer loop iterate over n. Then, in the first inner loop, I start to solve for every value function Vi. In short, I iterate over the state i.

Then, in the most inner loop, I choose the j that maximizes the VF. There are many ways you can do this step. Here, I am just looping over all j values (but skipping the ones that have negative consumption), sometimes people refer to this approach as grid search. The procedure is summarized as following: first, initialize Vmax to large negative number; second, for every j, compute the VF associated with the control j, and in case Vj>Vmax, set Vmax=Vj.

Note: The loop can be very inefficient in some languages. Nevertheless, it is pretty easy to incorporate extensions where we exploit the monotonicity/concavity of the VF.

2.4 ms
142 ms

Let us now see the array V and g:

6.5 μs
136 ns
129 ns

It is useful to construct the policy function with capital values instead of grid points.

19.8 μs
g
3.4 μs
2.5 ms
1.2 ms