1. TV Listings with TAD
How would TAD perform in a real-world application?
The following scenario attempts to answer this by using TAD as a database
of TV listings (e.g. those downloaded from Schedules Direct).
Various resource usage metrics are then compared against MySQL, as
currently used by MythTV.
The downloaded data is 7 MB of XML schedule
information for 40 channels over 14 days.
The Tcl XML utility XTL is used to convert the data into Tcl list form.
A script then post-processes this into the
output tables: programs, schedules and stations.
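As a rough illustration of this post-processing step, the sketch below splits a list of listing records into one Tcl array per table. The record layout (a list of dicts with programid, title, stationid and time fields) and the helper name splitTables are assumptions made for this example; the actual XTL output format and the real script are not shown here.

```tcl
# Hypothetical input: each record is a dict holding one schedule entry.
# The field names are assumptions, not the real XTL output format.
set records {
    {programid EP000441070120 title "Show A" stationid 20203 time 2008-06-04T03:00:00Z}
    {programid EP000441070063 title "Show B" stationid 10136 time 2008-06-04T04:00:00Z}
    {programid EP000441070120 title "Show A" stationid 10136 time 2008-06-05T03:00:00Z}
}

# Split the records into three global arrays, one per output table.
proc splitTables {records} {
    global programs schedules stations
    foreach rec $records {
        set pid  [dict get $rec programid]
        set sid  [dict get $rec stationid]
        set when [dict get $rec time]
        # A program is stored once, however often it is scheduled.
        set programs($pid) [list title [dict get $rec title]]
        # A station row; only the id is known in this sketch.
        set stations($sid) [list stationid $sid]
        # Schedule rows are keyed on the pair {time stationid}.
        set schedules([list $when $sid]) [list programid $pid]
    }
}

splitTables $records
puts "programs: [array size programs]"
puts "schedules: [array size schedules]"
puts "stations: [array size stations]"
```

Keeping each program only once in its own array is what lets the schedule rows stay small: they carry a programid rather than a copy of the program details.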
1.1 TAD Memory Usage
The initial raw TAD table data has a total size of about 3 MB on disk, in about 22,000 rows.
The base memory consumed by Tcl+TAD is 2.5 MB before
loading tables.
After loading the db and full-scanning each table, the memory footprint rises to 9 MB, e.g.:
% tclsh tad.tcl
table load tvdb
q schedules 1&1 -count 1
q programs 1&1 -count 1
q stations 1&1 -count 1
This memory usage compares favorably with MySQL's consumption
in MythTV, which in this scenario uses about 100 MB (30 MB resident),
including indexes and incidental tables.
1.2 Schema and CPU Performance
With TAD, full scans of large tables are inherently slow.
However, as TAD uses Tcl arrays, advantage can be taken of hashing to give
fast key lookups.
A suitable choice of key fields allows fast lookups in the typical case.
The key fields chosen for the TV listings are as follows:
TABLE       KEY
stations    stationid
programs    programid
schedules   {date stationid}
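The effect of these key choices can be illustrated in plain Tcl, independent of TAD itself. In the minimal sketch below, the array sched, its row contents and the two helper procs are hypothetical. It only shows why a lookup by the composite key {date stationid} is a single hash probe, while a search on a non-key field has to visit every element.

```tcl
# Hypothetical schedule rows, keyed on the list {date stationid}.
array set sched {
    {2008-06-04T03:00:00Z 20203} {programid EP000441070120 tvRating TV-14}
    {2008-06-04T03:00:00Z 10136} {programid EP000441070063 tvRating TV-G}
    {2008-06-04T04:00:00Z 10136} {programid EP000441070120 tvRating TV-14}
}

# Keyed lookup: one hash probe, regardless of table size.
proc lookup {date stationid} {
    global sched
    set key [list $date $stationid]
    if {[info exists sched($key)]} {
        return $sched($key)
    }
    return {}
}

# Full scan: every row is examined to test a non-key field.
proc scanByRating {rating} {
    global sched
    set hits {}
    foreach {key row} [array get sched] {
        if {[dict get $row tvRating] eq $rating} {
            lappend hits $key
        }
    }
    return [lsort $hits]
}

puts [lookup 2008-06-04T03:00:00Z 10136]
puts [scanByRating TV-14]
```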
The query command in TAD provides simple ways to avoid
the overhead of full table scans.
The -match option limits processing to elements whose key matches a given
pattern. Alternatively,
-keys can specify an exact list of keys to use.
Examples are given in the following sections.
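How options of this kind can avoid a full scan may be easier to see in plain Tcl. The following is only a sketch of the general idea, not TAD's actual implementation: selectMatch and selectKeys are hypothetical helpers, with array names and a glob pattern standing in for -match, and an explicit key list standing in for -keys.

```tcl
# Hypothetical program rows keyed on programid.
array set programs {
    EP000441070120 {title "Episode 120"}
    EP000441070063 {title "Episode 63"}
    MV000012340000 {title "A Movie"}
}

# Analogue of -match: the pattern is tested against key names only,
# so row contents are never evaluated for non-matching keys.
proc selectMatch {arrName pattern} {
    upvar 1 $arrName arr
    return [lsort [array names arr -glob $pattern]]
}

# Analogue of -keys: only the listed keys are probed; missing ones are skipped.
proc selectKeys {arrName keys} {
    upvar 1 $arrName arr
    set found {}
    foreach key $keys {
        if {[info exists arr($key)]} {
            lappend found $key
        }
    }
    return $found
}

puts [selectMatch programs EP00044*]
puts [selectKeys programs {EP000441070120 EP999999999999}]
```

The key-list form does the least work, since its cost depends on the number of keys requested rather than on the size of the table.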
Programs
The programs table is keyed on programid. Thus there are
3 query forms:
% conf -time 1
# 1 - Slow full-scan.
% q programs {$rowid == "EP000441070120"}
# ...
366201 microseconds per iteration
# 2 - Limit to matching key pattern.
% q programs 1 -match EP00044*
# ...
28695 microseconds per iteration
# 3 - List of specific keys
% q programs 1 -keys {EP000441070120 EP000441070063}
# ...
2500 microseconds per iteration
Similar queries are used for stations.
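The timings above resemble the output of Tcl's built-in time command. As a rough, TAD-independent sketch, a similar comparison can be run on a synthetic array. The table size, the key format and the procs fullScan and keyLookup are arbitrary assumptions for this example, and absolute timings will differ by machine.

```tcl
# Build a synthetic table of 20000 rows keyed on a made-up program id.
for {set i 0} {$i < 20000} {incr i} {
    set prog([format "EP%012d" $i]) [list title "Program $i"]
}

# Full scan: test every key in turn.
proc fullScan {target} {
    global prog
    set hits {}
    foreach key [array names prog] {
        if {$key eq $target} { lappend hits $key }
    }
    return $hits
}

# Direct lookup: a single hash probe.
proc keyLookup {target} {
    global prog
    if {[info exists prog($target)]} { return [list $target] }
    return {}
}

set target [format "EP%012d" 12345]
puts "full scan:  [time {fullScan $target} 10]"
puts "key lookup: [time {keyLookup $target} 10]"
```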
Schedules
As the schedules table key is a concatenation of datestamp and stationid,
pattern matching can be used to find shows by station and/or time, e.g.:
# A (slow) full-scan count of all shows on a station.
% q schedules {$station == "20203"} -count 1
306
572371 microseconds per iteration
# Same as above, but uses a key pattern match.
% q schedules 1 -match "* 20203" -count 1
306
23781 microseconds per iteration
# All shows at a certain time, on any station.
% q schedules 1 -match "2008-06-04T03:00:00Z*" -count 1
33
9190 microseconds per iteration
# All shows on a station for the given day.
% q schedules 1 -match "2008-06-04T* 10136" -count 1
28
9912 microseconds per iteration
# Same as above, but only shows with closed captions (CC).
% q schedules {$closeCaptioned} -match "2008-06-04T* 10136" -count 1
28
10181 microseconds per iteration
# Same as above, and rated TV-14
% q schedules {$closeCaptioned == "true" && $tvRating == "TV-14"} \
-match "2008-06-04T* 10136" -count 1
3
8817 microseconds per iteration
# A full-scan time query.
% q schedules {$time<=[clock scan 2008-06-04]} -count 1
1217
792456 microseconds per iteration
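The full-scan time query relies on clock scan to turn a date into epoch seconds for a numeric comparison. As a small sketch, assuming Tcl 8.5 or later for the -format and -gmt options, an ISO 8601 datestamp of the kind used in the schedule keys can be converted in the same way. The procs isoToEpoch and startsOnOrBefore are hypothetical helpers and not part of TAD.

```tcl
# Convert an ISO 8601 UTC datestamp (as used in the schedule keys) to epoch seconds.
proc isoToEpoch {stamp} {
    return [clock scan $stamp -format "%Y-%m-%dT%H:%M:%SZ" -gmt 1]
}

# True if the show starts at or before midnight UTC on the cutoff date.
proc startsOnOrBefore {stamp cutoffDate} {
    set cutoff [clock scan $cutoffDate -format "%Y-%m-%d" -gmt 1]
    return [expr {[isoToEpoch $stamp] <= $cutoff}]
}

puts [startsOnOrBefore 2008-06-03T23:30:00Z 2008-06-04]
puts [startsOnOrBefore 2008-06-04T03:00:00Z 2008-06-04]
```

Giving an explicit format and time zone avoids depending on the free-form parser and on the local time zone of the machine running the query.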
© 2008 Peter MacDonald