

1.  TV Listings with TAD

How would TAD perform in a real-world application? The following scenario attempts to answer this by using TAD as a database of TV listings (i.e. like those downloaded from Schedules Direct). Resource usage is then compared against MySQL, as currently used by MythTV.

The downloaded data is 7 MB of XML schedule information for 40 channels over 14 days. The Tcl XML utility XTL converts the data into Tcl list form, and a script then post-processes this into the output tables: programs, schedules and stations.

1.1  TAD Memory Usage

The raw TAD table data totals about 3 MB on disk in about 22,000 rows. Tcl+TAD consumes a base 2.5 MB of memory before any tables are loaded. After loading the db and full-scanning each table, the memory footprint rises to 9 MB, e.g.:

% tclsh tad.tcl
table load tvdb
q schedules 1&1 -count 1
q programs 1&1 -count 1
q stations 1&1 -count 1

This memory usage compares favorably with MySQL in MythTV, which in this scenario uses about 100 MB (30 MB resident) including indexes and incidental tables.
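
The figures quoted in this section imply a few ratios that are worth spelling out. The following is only a back-of-envelope sketch in Tcl using the approximate numbers given above; the variable names are made up for illustration, and 1 MB is assumed to mean 1024*1024 bytes.

```tcl
# Approximate figures taken from the text, in megabytes.
set baseMb   2.5    ;# Tcl+TAD before loading tables
set loadedMb 9.0    ;# after loading the db and full-scanning
set diskMb   3.0    ;# raw table data on disk
set rows     22000  ;# total rows in the tables
set mysqlMb  100.0  ;# MySQL total in the MythTV scenario

# Memory attributable to the loaded tables themselves.
set dataMb   [expr {$loadedMb - $baseMb}]
# In-memory size relative to on-disk size.
set overhead [expr {$dataMb / $diskMb}]
# Average bytes of memory per row (assumes 1 MB = 1024*1024 bytes).
set perRow   [expr {$dataMb * 1024 * 1024 / $rows}]
# MySQL total footprint relative to the TAD footprint.
set ratio    [expr {$mysqlMb / $loadedMb}]

puts [format "table data in memory: %.1f MB" $dataMb]
puts [format "memory/disk ratio:    %.2f" $overhead]
puts [format "bytes per row:        %.0f" $perRow]
puts [format "MySQL/TAD footprint:  %.1f" $ratio]
```

In other words, the tables occupy roughly twice their on-disk size once loaded, at about 300 bytes per row.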

1.2  Schema and CPU Performance

With TAD, full scans of large tables are inherently slow. However, because TAD uses Tcl arrays, hashing can be used to give fast key lookups, and a suitable choice of key fields makes the typical lookups fast. The key fields chosen for the TV listings are as follows:

   TABLE       KEY
   stations  - stationid
   programs  - programid
   schedules - {date stationid}

The query command in TAD provides simple ways to avoid the overhead of full table scans. The -match option limits processing to elements whose key matches a given pattern. Alternatively, -keys specifies an exact list of keys to use. Examples are given in the following sections.
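
The difference between these access paths can be illustrated in plain Tcl, independent of TAD. The sketch below is not TAD's implementation: the array `programs`, its sample rows, and the procedures `fullScan`, `matchKeys` and `getKeys` are all hypothetical, and `dict` assumes Tcl 8.5 or later. It contrasts testing the contents of every row, globbing over key names only, and looking up known keys directly.

```tcl
# Hypothetical in-memory table: a Tcl array keyed on programid.
# Each value is a dict holding the row; the data is invented.
array set programs {
    EP000441070120 {title "Show A" rating TV-14}
    EP000441070063 {title "Show B" rating TV-G}
    EP000999990001 {title "Show C" rating TV-14}
}

# 1 - Full scan: visit every row and test a field of its contents.
proc fullScan {arrName field value} {
    upvar 1 $arrName tbl
    set hits {}
    foreach key [array names tbl] {
        if {[dict get $tbl($key) $field] eq $value} {
            lappend hits $key
        }
    }
    return [lsort $hits]
}

# 2 - Key pattern: only the key names are compared against the glob.
proc matchKeys {arrName pattern} {
    upvar 1 $arrName tbl
    return [lsort [array names tbl $pattern]]
}

# 3 - Exact keys: one hashed lookup per key, skipping missing keys.
proc getKeys {arrName keys} {
    upvar 1 $arrName tbl
    set rows {}
    foreach key $keys {
        if {[info exists tbl($key)]} {
            lappend rows $key $tbl($key)
        }
    }
    return $rows
}

puts [fullScan programs rating TV-14]
puts [matchKeys programs EP00044*]
puts [getKeys programs {EP000441070063 EP000441070120}]
```

Note that a glob over key names still has to look at every key, but it avoids evaluating an expression against each row, whereas the exact-key form touches only the rows requested.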

Programs

The programs table is keyed on programid. Thus there are three query forms:


  % conf -time 1

  # 1 - Slow full-scan.
  % q programs {$rowid == "EP000441070120"}
  # ...
  366201 microseconds per iteration

  # 2 - Limit to matching key pattern.
  % q programs 1 -match EP00044*
  # ...
  28695 microseconds per iteration

  # 3 - List of specific keys
  % q programs 1 -keys {EP000441070120 EP000441070063}
  # ...
  2500 microseconds per iteration


Similar queries are used for stations.
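
The timings above are reported in the same format that Tcl's built-in `time` command returns, so comparable measurements can be taken in plain Tcl. The sketch below is one way to do that and is not a description of what `conf -time 1` does internally; the procedure `usPerIter` and the sample array `tbl` are hypothetical.

```tcl
# Build a throwaway array with 20000 keys (invented data).
for {set i 0} {$i < 20000} {incr i} {
    set tbl([format "EP%012d" $i]) "row $i"
}

# Run a script in the caller's scope under [time] and return only
# the numeric microseconds-per-iteration figure.
proc usPerIter {script {count 10}} {
    # [time] returns a string of the form "N microseconds per iteration".
    return [lindex [uplevel 1 [list time $script $count]] 0]
}

# Full scan: test the value of every element.
set scan [usPerIter {
    set n 0
    foreach {k v} [array get tbl] {
        if {$v eq "row 19999"} { incr n }
    }
}]

# Direct hashed lookup of a single known key.
set lookup [usPerIter { set v $tbl(EP000000019999) } 1000]

puts "full scan:  $scan microseconds per iteration"
puts "key lookup: $lookup microseconds per iteration"
```

Averaging over several iterations, as the second argument to `time` does, smooths out noise for operations as short as a single key lookup.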

Schedules

As the schedules table key is a concatenation of datestamp and stationid, pattern matching can be used to find shows by channel and/or time, e.g.:


  # A (slow) full-scan count of all shows on a station.
  % q schedules {$station == "20203"} -count 1
  306
  572371 microseconds per iteration

  # Same as above, but uses a key pattern match.
  % q schedules 1 -match "* 20203" -count 1
  306
  23781 microseconds per iteration

  # All shows at a certain time, on any station.
  % q schedules 1 -match "2008-06-04T03:00:00Z*" -count 1
  33
  9190 microseconds per iteration

  # All shows on a station for the given day.
  % q schedules 1 -match "2008-06-04T* 10136" -count 1
  28
  9912 microseconds per iteration

  # Same as above, but with CC
  % q schedules {$closeCaptioned} -match "2008-06-04T* 10136" -count 1
  28
  10181 microseconds per iteration

  # Same as above, and rated TV-14
  % q schedules {$closeCaptioned == "true" && $tvRating == "TV-14"} \
      -match "2008-06-04T* 10136" -count 1
  3
  8817 microseconds per iteration

  # A full-scan time query.
  % q schedules {$time<=[clock scan 2008-06-04]} -count 1
  1217
  792456 microseconds per iteration
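
The way a glob pattern selects rows from a composite "datestamp stationid" key can be tried out in plain Tcl. This is a minimal sketch with invented keys, assuming the two key fields are joined by a single space as in the patterns above; the procedure `selectKeys` is hypothetical and is not part of TAD.

```tcl
# Hypothetical schedule keys in "datestamp stationid" form (invented rows).
set keys {
    "2008-06-04T03:00:00Z 20203"
    "2008-06-04T03:00:00Z 10136"
    "2008-06-04T04:30:00Z 10136"
    "2008-06-05T03:00:00Z 10136"
}

# Return the keys that match a glob pattern, in their original order.
proc selectKeys {keys pattern} {
    return [lsearch -all -inline -glob $keys $pattern]
}

# All shows on one station, at any time.
puts [selectKeys $keys "* 20203"]
# All shows starting at one time, on any station.
puts [selectKeys $keys "2008-06-04T03:00:00Z*"]
# All shows on one station for one day.
puts [selectKeys $keys "2008-06-04T* 10136"]
```

Putting the datestamp first in the key means a day or an exact time can be selected with a fixed prefix, while the station is selected by the suffix after the space.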

© 2008 Peter MacDonald

Page last modified on August 31, 2008, at 10:04 AM