Public Transport Info (Czech) – data
This directory contains the database used by the Czech Public Transport Info system, i.e. a list of public transportation stops, time expressions etc. that are understood by the system.
The main database module is located in database.py. You may obtain a dump of the database by running
To build all needed generated files that are not versioned, run
Contents of additional data files
Some of the data (for the less populous slots) is included directly in the code of database.py, but most of the data (e.g., stops and cities) is located in additional list files.
Resources used by public transport direction finders
The sources of the data that are loaded by the application are:
cities.expanded.txt – list of known cities and towns in the Czech Republic (tab-separated: slot value name + possible surface forms separated by semicolons; lines starting with ‘#’ are ignored)
stops.expanded.txt – list of known stop names (same format)
cities_stops.tsv – “compatibility table”: lists compatible city-stop pairs, one entry per line (city and stop are separated by a tab). Only the primary stop and city names are used here.
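The file formats listed above can be read with a few lines of Python. The following is only a minimal sketch, not the loader used in database.py: the function names `parse_expanded` and `parse_compatibility` are hypothetical, and the sample rows (including the surface forms) are made up for illustration.

```python
from collections import defaultdict


def parse_expanded(lines):
    """Parse 'value<TAB>form1;form2;...' lines into {value: [forms]}.

    Lines starting with '#' and empty lines are skipped.
    """
    values = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        name, _, forms = line.partition("\t")
        values[name] = [f.strip() for f in forms.split(";") if f.strip()]
    return values


def parse_compatibility(lines):
    """Parse 'city<TAB>stop' lines into {city: {stops}}."""
    stops_by_city = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        city, stop = line.split("\t", 1)
        stops_by_city[city].add(stop)
    return stops_by_city


# Made-up sample data in the formats described above.
sample_stops = ["# comment line", "Anděl\tanděl;anděla;andělu"]
sample_pairs = ["Praha\tAnděl", "Praha\tMůstek"]

stops = parse_expanded(sample_stops)
compat = parse_compatibility(sample_pairs)
print(stops)
print(sorted(compat["Praha"]))
```

With real files, the same functions could be fed an open file object, e.g. `parse_expanded(open("stops.expanded.txt", encoding="utf-8"))`; the UTF-8 encoding is an assumption.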
The stops.expanded.txt file is generated from stops.txt using the expand_stops.py script (see the documentation in the file itself; the MorphoDiTa Python bindings must be installed to run this script). Please note that the surface forms in the expanded files are lowercased and do not include any punctuation (this is controlled by parameters of the expand_stops.py script, including -p).
Colloquial variants of stop names that are added by hand are located in the stops-add.txt file and are appended to stops.txt before performing the expansion.
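To illustrate what a lowercased, punctuation-free surface form looks like, here is a small sketch. It is not taken from expand_stops.py; the function name `normalize_surface_form` is hypothetical, and replacing punctuation with spaces (rather than deleting it) is an assumption. Note that `string.punctuation` covers ASCII punctuation only.

```python
import string

# Translation table that maps every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans({ch: " " for ch in string.punctuation})


def normalize_surface_form(text):
    """Lowercase the text, turn punctuation into spaces, collapse whitespace."""
    return " ".join(text.lower().translate(_PUNCT_TO_SPACE).split())


print(normalize_surface_form("Praha, hl.n."))
print(normalize_surface_form("Náměstí Míru"))
```

Applying the same normalization to recognized user input before lookup would keep it comparable with the surface forms stored in the expanded files.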
Additional resources for the CRWS/IDOS directions finder
Since the CRWS/IDOS directions finder uses abbreviated stop names that need to be spelled out in ALEX, there is an additional resource file loaded by the system:
idos_map.tsv – a mapping from the slot value names (city + stop) to abbreviated CRWS/IDOS names (stop list + stop)
The convert_idos_stops.py script is used to expand all possible abbreviations and produce a mapping from/to the original CRWS/IDOS stop names as they appear, e.g., at the IDOS portal.
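A two-way lookup over such a mapping could be sketched as follows. The column layout is an assumption (four tab-separated fields: city, stop, CRWS/IDOS stop list, CRWS/IDOS stop name), the function name `load_idos_map` is hypothetical, and the sample row is made up; check the actual idos_map.tsv before relying on any of this.

```python
def load_idos_map(lines):
    """Build lookups in both directions from assumed 4-column TSV lines."""
    to_idos = {}    # (city, stop) -> (idos_list, idos_stop)
    from_idos = {}  # (idos_list, idos_stop) -> (city, stop)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        # Assumed layout: city, stop, IDOS stop list, IDOS stop name.
        city, stop, idos_list, idos_stop = line.split("\t")
        to_idos[(city, stop)] = (idos_list, idos_stop)
        from_idos[(idos_list, idos_stop)] = (city, stop)
    return to_idos, from_idos


# Made-up sample row for illustration only.
sample = ["Praha\tHlavní nádraží\tPID\tHlavní nádr."]
to_idos, from_idos = load_idos_map(sample)
print(to_idos[("Praha", "Hlavní nádraží")])
print(from_idos[("PID", "Hlavní nádr.")])
```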
Resources used by the weather information service
The weather service uses one additional file:
cities_locations.tsv – this file contains GPS locations of all cities in the Czech Republic.