Data Sets
So far, most of the data sets that I have are randomly generated. I would
love to get more real data sets: a complete schedule for an airline or a
city bus network would be great. To see the best reported tours, check the
Speedy Tourist Page.
Railway Schedules
This data is from the Indian Bradshaw Railway guide (January 1996). My current
effort has been to include the full set of routes between "major"
cities in South India. A city was considered major if I had heard of it
previously, or if it was a railway junction.
The data was entered by hand, so it probably has a
few bugs. The schedule is presented as a daily schedule, with the
incorrect assumption that trains run every day. (A more accurate schedule
would be a weekly schedule). Another minor nit is that I have not
distinguished between different railway stations in the same city, so you
are allowed to depart from Madras Central one minute after your arrival from
Madras Egmore.
Data from real schedules has different properties from the artificial data.
The vertex degree is generally small (except for a few junctions), but the
number of parallel edges can be very high. There can be a strong correlation
between arrival and departure times, so (for good connections), the waiting
time in a city can be very small.
This data set is now complete, with the 75 city example covering essentially
all the rail routes in southern India.
Random Instance Generators
Several different data sets have been generated. For some of the data
sets, I have used real cities, and then randomly generated bus routes
between pairs of cities. The source code for the generators is
available and a sketch of the generation methods is given below.
The graphs are checked for strong connectivity before being included
in the official data set.
- Complete Graph A complete graph with a route between each pair
of cities. The start time is chosen uniformly at random, and the duration
is a based upon the euclidean distance betwen the cities multiplied by
a random value.
- Sparse Graph The coordinates of the cities are based on
a real map. The routes out of a city are chosen randomly, with
greater likelihood of routes to near by cities than to far away
cities. The algorithm to generate the routes leaving city X sorts the
cities by their distance from city X. The city that is at position
k in this list has probability ~ 1/k of being the destination of a
route leaving X. (This means the expected degree is O(log n) for
whatever that is worth).
- Triangular Graph
The graph is a triangle, with five random routes on each edge.
- Mesh Graph
The graph is a mesh, with five random routes on each edge.
Data Sets
Each data set consists of a pair of files, a schedule file, and a city
file. The schedule file has all of the information about the routes.
The city file gives names to the cities, and display coordinates.
(The city files aren't really necessary for the problem).
There are several families of data sets. Within each family, a common
generation algorithm was used. Each problem instance specificies a
starting time, and a starting city. (The times and cities are given
as integers - the minutes since midnight, and the city number.)
The instances are specified: schedule_file, city_file, start_city, start_time
- Southern Railway
- south-sch.12
south.12 1 480 (Bangalore, 8 am)
- south-sch.18
south.18 2 480 (Bangalore, 8 am)
- south-sch.25
south.25 2 480 (Bangalore, 8 am)
- south-sch.36
south.36 2 480 (Bangalore, 8 am)
- south-sch.49
south.49 3 480 (Bangalore, 8 am)
- south-sch.66
south.66 3 480 (Bangalore, 8 am)
- south-sch.75
south.75 3 480 (Bangalore, 8 am)
- USA Complete
- usa-sch.10
usa.10 9 60 (Yakima, 1 am)
- usa-sch.12
usa.12 3 600 (Seattle, 10 am)
- usa-sch.15
usa.15 3 0 (Walla Walla, Midnight)
- usa-sch.20
usa.20 18 720 (Weed, Noon)
- USA Sparse
- usa-sch.25
usa.25 1 510 (Reno, 8:30am)
- usa-sch.30
usa.30 11 720 (Roswell, Noon)
- usa-sch.35
usa.35 27 720 (San Francisco, Noon)
- usa-sch.40
usa.40 38 1200 (Waycross, 8:00 pm)
- usa-sch.45
usa.45 9 540 (Traverse City, 9:00 am)
- usa-sch.50
usa.50 38 60 (Weed, 1:00 am)
- usa-sch.60
usa.60 59 180 (Toronto, 3:00 am)
- usa-sch.80
usa.80 31 1200 (Vicksburg, 8:00 pm)
- usa-sch.100
usa.100 56 360 (Regina, 6:00 am)
- usa-sch.128
usa.128 98 1080 (Waco, 6:00 pm)
- UK Sparse
- uk-sch.25
uk.25 17 510 (St. Bees Head, 8:30 am)
- uk-sch.50
uk.50 4 720 (Middle Wallop, Noon)
- uk-sch.75
uk.75 39 180 (Liverpool, 3:00 am)
- uk-sch.100
uk.100 99 600 (Glenlivet, 10:00 am)
- uk-sch.150
uk.150 100 500 (Lizard Lighthouse, 8:20 am)
- uk-sch.200
uk.200 193 1000 (London, 4:40 pm)
- uk-sch.284
uk.284 268 900 (Edinburgh, 3:00 pm)
- Meshes
- mesh-sch.3
mesh.3 0 100 (Calf of Man, 1:40 am)
- mesh-sch.4
mesh.4 15 200 (Newton, 3:20 am)
- mesh-sch.5
mesh.5 21 300 (Jubilee Corner, 5:00 am)
- mesh-sch.6
mesh.6 3 400 (Leeds, 6:40 am)
- mesh-sch.7
mesh.7 48 500 (Anvil Green, 8:20 am)
- mesh-sch.8
mesh.8 3 600 (Bentwaters, 10:00 am)
- mesh-sch.9
mesh.9 7 700 (Fair Isle, 11:40 am)
- mesh-sch.10
mesh.10 68 800 (Guernsey, 1:20 pm)
- Triangles
- tri-sch.3
tri.3 0 100 (North Rona Island, 1:40 am)
- tri-sch.4
tri.4 0 200 (Lydd, 3:20 am)
- tri-sch.5
tri.5 0 300 (Hawarden, 5:00 am)
- tri-sch.6
tri.6 1 400 (Lundy Island, 6:40 am)
- tri-sch.7
tri.7 13 500 (Greenham Common, 8:20 am)
- tri-sch.8
tri.8 35 600 (Sculthorpe, 10:00 am)
- tri-sch.9
tri.9 44 700 (Oban, 11:40 am)
- tri-sch.10
tri.10 21 800 (Inverness, 1:20 pm)
- tri-sch.11
tri.11 59 900 (St. Bees Head, 3:00 pm)
- tri-sch.12
tri.12 63 1000 (Muckle Flugga, 4:40 pm)