RepeatAfterMe (RAM) for Czech - speech data collection
This application is useful for bootstrapping speech data. It asks the caller to repeat sentences that are randomly sampled from a set of preselected sentences.
- The Czech sentences (sentences_cs.txt) are taken from Karel Čapek's plays Matka and RUR, and from the Prague Dependency Treebank.
- The Spanish sentences (sentences_es.txt) are taken from the Internet.
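The random sampling described above can be illustrated with a minimal sketch. It assumes the sentence files hold one sentence per line in UTF-8, which the text does not state; `load_sentences` and `sample_sentences` are hypothetical helpers and are not taken from ram_hub.py.

```python
import random
from pathlib import Path


def load_sentences(path):
    """Read sentences from a text file, assuming one sentence per line."""
    text = Path(path).read_text(encoding="utf-8")
    # Skip blank lines and strip surrounding whitespace.
    return [line.strip() for line in text.splitlines() if line.strip()]


def sample_sentences(sentences, n, seed=None):
    """Pick up to n distinct sentences uniformly at random."""
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return rng.sample(sentences, min(n, len(sentences)))
```

Sampling without replacement keeps a single call from asking for the same sentence twice; whether the actual application behaves this way is an assumption.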
If you want to run ram_hub.py on a specific phone number, then specify the appropriate extension config:
$ ./ram_hub.py -c ram_hub_LANG.cfg ../../resources/private/ext-PHONENUMBER.cfg
After collecting the desired number of calls, use copy_wavs_for_transcription.py to extract the wave files from the call_logs subdirectory for transcription. The files will be copied into
The calls must then be transcribed using Transcriber or similar software.
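As a rough illustration of the extraction step, the sketch below gathers wave files from a log directory into a single folder. It is not the code of copy_wavs_for_transcription.py: `collect_wavs` is a hypothetical helper, the destination directory is supplied by the caller, and the assumption that recordings are `*.wav` files nested in per-call subdirectories is not confirmed by the text.

```python
import shutil
from pathlib import Path


def collect_wavs(call_logs_dir, dest_dir):
    """Copy every *.wav found under call_logs_dir into dest_dir (flat layout)."""
    src = Path(call_logs_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() materializes the file list before any copying starts.
    for wav in sorted(src.rglob("*.wav")):
        # Prefix with the parent directory name so that files with the same
        # name from different calls do not overwrite each other.
        target = dest / f"{wav.parent.name}-{wav.name}"
        shutil.copy2(wav, target)
        copied.append(target)
    return copied
```

A flat output directory is convenient for loading the recordings into a transcription tool, and the directory-name prefix keeps each file traceable to its call.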