This document shows the output of a parameterized, reproducible research process. It pulls train delay data from WMATA, then runs some simple analyses on that data.
5466 delays were loaded.
The data was filtered to include only the 1675 events on the Red line. The first delay on that line was at 2012-04-30 08:11:00; the last was at 2013-07-07 17:20:00. The three most common causes of delays on that line were: a brake problem, a door problem, an equipment problem.
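The filtering step above can be sketched roughly as follows. This is a minimal illustration, not the code that produced this report: the records, the field names (`line`, `time`, `cause`) and the helper `summarize_line` are all hypothetical, and the real WMATA data may be shaped differently.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records; the field names are assumptions,
# not the actual schema of the WMATA delay data.
delays = [
    {"line": "Red", "time": "2012-04-30 08:11:00", "cause": "a brake problem"},
    {"line": "Red", "time": "2012-05-02 17:45:00", "cause": "a door problem"},
    {"line": "Blue", "time": "2012-05-03 09:10:00", "cause": "a signal problem"},
    {"line": "Red", "time": "2012-06-11 07:30:00", "cause": "a brake problem"},
    {"line": "Red", "time": "2012-08-19 12:05:00", "cause": "an equipment problem"},
    {"line": "Orange", "time": "2012-09-01 18:20:00", "cause": "a door problem"},
    {"line": "Red", "time": "2013-01-15 08:55:00", "cause": "a door problem"},
    {"line": "Red", "time": "2013-07-07 17:20:00", "cause": "a brake problem"},
]


def summarize_line(records, line, top_n=3):
    """Keep the records for one line; report count, time span, and top causes."""
    selected = [r for r in records if r["line"] == line]
    times = [datetime.strptime(r["time"], "%Y-%m-%d %H:%M:%S") for r in selected]
    causes = Counter(r["cause"] for r in selected)
    return {
        "n": len(selected),
        "first": min(times) if times else None,
        "last": max(times) if times else None,
        "top_causes": [cause for cause, _ in causes.most_common(top_n)],
    }


# The line name is the parameter; changing it re-runs the same analysis.
summary = summarize_line(delays, line="Red")
print(summary["n"], summary["first"], summary["last"])
print(", ".join(summary["top_causes"]))
```

Passing the line name as an argument is what makes the summary parameterizable: the same function can be called with a different line to produce the equivalent paragraph.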
This table shows the mean delay and counts of the most frequent causes:
Cause | mean_delay | n |
---|---|---|
a brake problem | 8.337 | 499 |
a door problem | 7.291 | 255 |
an equipment problem | 6.94 | 215 |
a signal problem | 9.991 | 112 |
expressed for schedule adherence/improved train spacing | NaN | 102 |
an operational problem | 6.337 | 101 |
a sick customer | 7.25 | 66 |
did not operate | 6.929 | 60 |
police activity | 6.822 | 46 |
(blank) | 11.64 | 37 |
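A table like the one above can be built by grouping events by cause, then computing a mean and a count per group. The sketch below is illustrative only: the sample values and the helper `summarize_causes` are hypothetical, and treating a group with no recorded delay values as `NaN` is an assumption about how a `NaN` mean could arise, not a confirmed description of this report's method.

```python
import math
from collections import defaultdict

# Hypothetical (cause, delay) pairs; None marks an event with no recorded delay.
events = [
    ("a brake problem", 8.0),
    ("a brake problem", 10.0),
    ("a door problem", 7.0),
    ("expressed for schedule adherence/improved train spacing", None),
    ("a brake problem", 6.0),
    ("a door problem", 5.0),
]


def summarize_causes(pairs):
    """Return (cause, mean_delay, n) rows, most frequent cause first."""
    groups = defaultdict(list)
    for cause, delay in pairs:
        groups[cause].append(delay)

    rows = []
    for cause, values in groups.items():
        known = [v for v in values if v is not None]
        # Assumption: average only the recorded delays, and report NaN
        # when a cause has no recorded delay at all.
        mean_delay = sum(known) / len(known) if known else math.nan
        rows.append((cause, mean_delay, len(values)))

    # Sort by event count, descending, to match the table's ordering.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


rows = summarize_causes(events)
for cause, mean_delay, n in rows:
    print(f"{cause} | {mean_delay:.4g} | {n} |")
```

Note that `n` counts every event for a cause, including those without a recorded delay, so a cause can have a large count and still show a `NaN` mean.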
And this graph shows when delays happened by date and hour.