While trying to scan them, I kept running up against the same buzzwords and -phrases, and kept promising myself to write a tool that would hit back and filter out those entries from a text file downloaded from What's New Too. I had expected to be done with a simple parametrization of agrep; in the end, Byron Rakitzis' version of Tom Duff's rc design turned out to be the best tool for the job.
#!/home/pub/bin/rc
nl='
'
beep='^G'
exclude_file=/home/kbs/jutta/etc/exclude
source_file=/home/kbs/jutta/tmp/newtoo/in
source=``($beep){sed 's/^$/'$beep/ $source_file}
exclude=``($nl){cat $exclude_file}
for (s in $source) if (! ~ $s * ^ $exclude ^ *) echo $s

(The ^G above should be control-G, not circumflex G.)
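For readers without rc at hand, the same idea can be sketched in Python. This is only an illustrative re-expression, not the script that was actually run. The function names are hypothetical, and it assumes that entries are separated by empty lines and that each exclude pattern is a shell-style glob, which the standard fnmatch module only approximates.

```python
from fnmatch import fnmatchcase


def split_entries(text):
    """Split text into entries separated by one or more empty lines."""
    entries, current = [], []
    for line in text.splitlines():
        if line == "":
            # An empty line ends the current entry, if there is one.
            if current:
                entries.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        entries.append("\n".join(current))
    return entries


def keep_entry(entry, patterns):
    """Return True if no pattern matches anywhere inside the entry."""
    # Wrapping each pattern in '*' mirrors the  * ^ $exclude ^ *  idiom.
    return not any(fnmatchcase(entry, "*" + p + "*") for p in patterns)


def filter_entries(text, patterns):
    """Return the entries of text that match none of the patterns."""
    return [e for e in split_entries(text) if keep_entry(e, patterns)]
```

Calling filter_entries with the contents of the downloaded file and the non-empty lines of the exclude file returns the surviving entries, which can then be printed one by one.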
With my list of patterns, the filter took about 10 seconds to throw out one third of all entries from the 300 K file. And I'll never have to read a triple exclamation mark again.