import numpy as np
import matplotlib.pyplot as plt
import random
np.random.seed(12345)
The simulation of random walks provides an illustrative application of utilizing array operations. Let’s first consider a simple random walk starting at 0 with steps of 1 and –1 occurring with equal probability.
Here is a pure Python way to implement a single random walk with 1,000 steps using
the built-in random
module:
position = 0
walk = [position]
steps = 1000
for i in range(steps):
step = 1 if random.randint(0, 1) else -1
position += step
walk.append(position)
plt.figure()
plt.plot(walk[:100])
[<matplotlib.lines.Line2D at 0x11eb25e20>]
Now let's use numpy instead of using for loop to generate the array walk
. You are expected to use three lines of code to generate a single random walk with 2,000 steps and save the position after each step in a numpy array walk
.
steps
.steps
to -1, which means walking to the -1 direction. walk
.nsteps = 2000
# line 1:
steps = np.random.randint(0, 2, size=nsteps)
# line 2
steps[steps==0] = -1
# line 3
walk = steps.cumsum()
Replace the question mark and print out the statistics of the array walk
on each line.
print("The most left (minimal) position of the walk is ", walk.min())
print("The most right (maximal) position of the walk is ", walk.max())
print("The average position of the walk is ", walk.mean())
print("The median position of the walk is ", np.median(walk))
print("The variance of the position of the walk is ", walk.var())
The most left (minimal) position of the walk is -3 The most right (maximal) position of the walk is 40 The average position of the walk is 16.114 The median position of the walk is 16.0 The variance of the position of the walk is 79.79500399999999
A more complicated statistic is the first crossing time, the step at which the random walk reaches a particular value. Here we might want to know how long it took the random walk to get at least 10 steps away from the origin 0 in either direction.
You are expected to use two lines of code to return the first crossing time that this walk needs to cross either 10 or -10.
On line 1, use a boolean condition to generate a boolean array walk10
from walk
to indicate if or not the walk crosses 10 or -10.
On line 2, use np.argmax to return the index of the first True from walk10
.
Note the method above may not be efficient since we only want to know the first crossing time in this question.
# line 1
walk10 = np.abs(walk) >= 10
# line 2
walk10.argmax()
37
Simulating Many Random Walks at Once
If your goal was to simulate many random walks, say 5,000 of them, you can generate all of the random walks with minor modifications to the preceding code.
Please modify your previous code a little bit to compute all 5,000 random
walks with 1000 steps in each and save the result in a two dimensional array walks
, in which each row is one walk and each column represnts current position of the walk.
nwalks = 5000
nsteps = 1000
# line 1: modify the size to generate a two dimensional random array.
steps = np.random.randint(0, 2, size=(nwalks, nsteps))
# line 2
steps[steps==0] = -1
# line 3: collapes steps along the column to calculate the cumulative sum of each row.
walks = steps.cumsum(axis = 1)
Replace the question mark and print out the statistics of the array walks
on each line.
print("The maximum value obtained over all of the walks is: ", walks.max())
print("The minimum value obtained over all of the walks is: ", walks.min())
The maximum value obtained over all of the walks is: 138 The minimum value obtained over all of the walks is: -133
Out of these walks, let’s compute the first crossing time to 30 or –30 for each walk. This is slightly tricky because not all 5,000 of them reach 30. Please answer following two questions:
hits30 = (np.abs(walks) >= 30).any(axis = 1)
hits30
hits30.sum() # Number that hit 30 or -30
3411
# method 1
crossing_times = (np.abs(walks) >= 30).argmax(axis = 1)
crossing_times[crossing_times != 0].mean() # do not count 0
499.00996775139254
# method 2
hits30 = (np.abs(walks) >= 30).any(axis = 1)
crossing_times = (np.abs(walks[hits30]) >= 30).argmax(axis = 1) # slice out walks who eventually cross 30/-30
crossing_times.mean()
499.00996775139254