HOME ARCHIVE TAGS ABOUT RSS

FiveThirtyEight: The Riddler (2016-Feb-05)

For the past few months FiveThirtyEight has been posting a weekly riddle. I decided to give it a shot this week. The title for the riddle is “How Many Cars Will Get Stuck In Traffic?”. Below is the entire question:

There is a very long, straight highway with some number of cars (N) placed somewhere along it, randomly. The highway is only one lane, so the cars can’t pass each other. Each car is going in the same direction, and each driver has a distinct positive speed at which she prefers to travel. Each preferred speed is chosen at random. Each driver travels at her preferred speed unless she gets stuck behind a slower car, in which case she remains stuck behind the slower car. On average, how many groups of cars will eventually form? (A group is one or more cars travelling at the same speed.) For example, if the car in the very front happens to be slowest, there will be exactly one group — everybody will eventually pile up behind the slowpoke. If the cars happen to end up in order, fastest to slowest, there will be Ngroups — no car ever gets stuck behind a slower car.

Seems fairly simple? To get started I wrote a quick Monte Carlo simulation in Python. Below is the code:

#------------------------------------------------------------------------------+
#
#   Nathan A. Rooy
#   FiveThirtyEight: The Riddler [July 8th, 2016]
#
#------------------------------------------------------------------------------+

#--- IMPORT DEPENDENCIES ------------------------------------------------------+

from __future__ import division
from collections import defaultdict
import random

#--- MAIN ----------------------------------------------------------------+

def traffic(ncars):
    cars=[]

    # GENERATE RANDOM CAR SPEEDS
    for i in range(ncars):
        cars.append(random.random())

    groups=0
    slowest_car=cars[0]

    # CYCLE THROUGH CAR LIST
    for i in range(ncars-1):
        if cars[i]<=slowest_car:
            slowest_car=cars[i]
            groups+=1

    # CHECK LAST CAR
    if cars[-1] < slowest_car:
        groups+=1

    return groups

def main(iters,ncars):
    group_hist=[]
    for i in range(0,iters):
        group_hist.append(traffic(ncars))
    return group_hist

#--- RUN ---------------------------------------------------------------------+

ncars_list=[3,4,5,6,7,8,9,10,
            15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,
            150,200,250,300,350,400,450,500,550,600,650,700,750,800,850,900,950,1000,
            2000,3000,4000,5000,6000,7000,8000,9000,10000]

iters=1000000
for ncars in ncars_list:
    group_hist=main(iters,ncars)

    # RESULTS
    results_dict=defaultdict(int)
    for item in group_hist:
        results_dict[item]+=1

    results_sorted=sorted(results_dict.items(),key=lambda(group,count):group,reverse=False)

    # SAVE RESULTS TO CSV
    with file('results.csv','a') as csvFile:
        csvFile.write(str(ncars)+','+str(results_sorted[:30])+'\n')
    csvFile.close()

#--- END ----------------------------------------------------------------------+

This script will generate random speeds for a given number of cars (ncars). From this information, we can then calculate the number of groups that will form. However because of the randomness of this particular situation, we can get a better answer by simulating it many times and taking the average. So for every value of ncars that was tested, I ran 10 million simulations and averaged the result to determine the mean group size. I let this run for few hours and let it cycle through ncar values ranging from 3 to 10,000. Below is an animation of the results…

It looks a bit jumpy because I didn’t use an even increment for ncars, I skipped around a bit. After I collected the results, I was able to plot the mean number of groups that formed for every value ncars:

Right of the bat, I noticed that the results resembled a ln relationship. This proved to be fairly close, but not exact. I then tried ln(2*ncars) and that got really close but still not perfect.

It was at this point that I got a bit lucky. I was trying to fit different variations of the ln curve when I remembered harmonic series from calculus. For my first try I went with: groups(ncars)=sum(1/1:ncars). Below I plotted the new curve with the results of the Monte Carlo simulation.

Looks good to me! Just as a final check, I plotted the percent difference between the Monte Carlo simulation results and my fitted curves.

The ln(ncars) solution is under predicting while ln(2*ncars) solution is slightly over predicting. The power series curve is definitely the best option. It is interesting to see how close the natural log comes though. I guess it’s really not that surprising considering the natural log in this case is just the integral of from 1 -> ncars of (1/ncars). That’s why the two solutions parallel each other. I wrote out the solution in standard notation below (n=ncars).

$$ groups(n)=\sum_{i=1}^n \left(\frac{1}{n}\right)$$

FiveThirtyEight has not posted the solution yet, but this was my final answer anyway. I’m sure there is a much simpler way of solving this. In fact, I won’t be surprised at all when the solution is just a couple lines. Ah well, I still got there…

The Python code can be downloaded from my GitHub (here).