Submission rejected as potential spam (Maximum number of external links per post exceeded, Akismet says content is spam)
I wanted to post this here
Code:
''' Fighting Image SPAM with FuzzyOCR and !ImageInfo plugins'''
First we need some new packages:
{{{
# apt-get install lsb-base libice6 libx11-6 xlibs-data libsm6 x11-common
# apt-get install gifsicle ocrad
}}}
You should not have both giflib-bin and libungif-bin installed.[[BR]]
Simulate removing giflib-bin to see what would happen:
{{{
# apt-get -s remove giflib-bin
}}}
If it's not installed, you can move on. If it's the only package that would be removed, then remove it. Note: an etch system may offer to remove libungif-bin instead of giflib-bin. There is no need to remove libungif-bin.
{{{
# apt-get remove giflib-bin
}}}
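A small helper can make the simulation easier to read. This is only a sketch: `would_remove` is a hypothetical name, and it assumes that `apt-get -s` marks each package it would remove with a line beginning with `Remv`.

```shell
# List the packages that a simulated apt-get run would remove.
# Reads `apt-get -s` output on stdin; assumes removal lines start with "Remv".
would_remove() {
    awk '$1 == "Remv" { print $2 }'
}

# Usage (as root):
#   apt-get -s remove giflib-bin | would_remove
```

If the only name printed is giflib-bin, the removal is safe to run for real. If nothing is printed, the package is not installed.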
Continue by installing the other required packages:
{{{
# apt-get install netpbm bzip2 libmldbm-perl libstring-approx-perl libmldbm-sync-perl
# apt-get install liblog-agent-perl libdbi-perl libdbd-mysql-perl libtie-cache-perl
}}}
Download, extract, patch, compile and install libungif (even if you have libungif-bin installed)
{{{
# cd /usr/local/src
# wget http://internap.dl.sourceforge.net/sourceforge/libungif/libungif-4.1.4.tar.gz
# tar xzvf libungif-4.1.4.tar.gz
# cd libungif-4.1.4/util
# wget http://users.own-hero.net/~decoder/fuzzyocr/giftext-segfault.patch
# patch giftext.c < giftext-segfault.patch
# cd ..
# ./configure --prefix=/usr && make && make install
}}}
Download, extract, compile and install gocr
{{{
# cd /usr/local/src
# wget http://www-e.uni-magdeburg.de/jschulen/ocr/gocr-0.43.tar.gz
# tar xzvf gocr-0.43.tar.gz
# cd gocr-0.43
# ./configure --with-netpbm=/usr/lib --prefix=/usr && make && make install
}}}
At this point we should have all of these programs installed in /usr/bin:
{{{
# which gifsicle
# which giffix
# which giftext
# which gifinter
# which giftopnm
# which jpegtopnm
# which pngtopnm
# which bmptopnm
# which tifftopnm
# which ppmhist
# which pamfile
# which ocrad
# which gocr
# which pnmnorm
# which pnminvert
# which ppmtopgm
}}}
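The same check can be done in one pass. The sketch below uses a hypothetical helper, `missing_tools`, that prints only the programs it cannot find. Note that it searches the whole PATH rather than /usr/bin specifically.

```shell
# Print each named program that is NOT found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The program list is the one checked by hand above.
missing_tools gifsicle giffix giftext gifinter giftopnm jpegtopnm pngtopnm \
    bmptopnm tifftopnm ppmhist pamfile ocrad gocr pnmnorm pnminvert ppmtopgm
```

No output means everything was found. Any name that is printed still needs to be installed.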
Install FuzzyOCR 3.5.1
{{{
# cd /usr/local/src
# wget http://users.own-hero.net/~decoder/fuzzyocr/fuzzyocr-3.5.1-devel.tar.gz
# tar xzvf fuzzyocr-3.5.1-devel.tar.gz
# cd FuzzyOcr-3.5.1
}}}
If you are using netpbm < 10.34 (Debian uses 10.0-10.1) you need to apply these patches. They disable some features only available in newer versions:
{{{
# wget http://www200.pair.com/mecham/spam/FuzzyOcr-3.5.0-rc1.netpbm_less_than_10.34.patch
# wget http://www200.pair.com/mecham/spam/FuzzyOcr-3.5.0-rc1.netpbm_less_than_10.34.patch2
# wget http://www200.pair.com/mecham/spam/FuzzyOcr-3.5.0-rc1.netpbm_less_than_10.34.patch3
# patch -p0 < FuzzyOcr-3.5.0-rc1.netpbm_less_than_10.34.patch
# patch -p0 < FuzzyOcr-3.5.0-rc1.netpbm_less_than_10.34.patch2
# patch -p0 < FuzzyOcr-3.5.0-rc1.netpbm_less_than_10.34.patch3
}}}
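To make the "older than 10.34" test explicit, here is a minimal sketch of the comparison. `version_lt` is a hypothetical helper, and it assumes you type in the version yourself as a plain MAJOR.MINOR string (for example 10.0); it does not query netpbm.

```shell
# Return success (0) if version $1 is lower than version $2.
# Assumes both are plain MAJOR.MINOR strings such as 10.0 or 10.34.
version_lt() {
    a_major=${1%%.*}; a_minor=${1#*.}
    b_major=${2%%.*}; b_minor=${2#*.}
    if [ "$a_major" -ne "$b_major" ]; then
        [ "$a_major" -lt "$b_major" ]
    else
        [ "$a_minor" -lt "$b_minor" ]
    fi
}

# Example with the Debian version mentioned above.
if version_lt 10.0 10.34; then
    echo "netpbm is older than 10.34: apply the three patches"
fi
```

The minor number is compared as an integer, so 10.1 counts as older than 10.34.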
Our Debian version of Netpbm (10.0) may not contain both of these programs:
{{{
# which pamtopnm
# which pamditherbw
}}}
Those programs are used together. If either one is missing, we need to disable any scansets that use them and remove them from the preprocessors file. If you have both of these programs, you do not need to apply these patches. If you are missing either one of them, apply these patches:
{{{
# wget http://www200.pair.com/mecham/spam/gary.3.5.0-rc1.old.netpbm.patch1
# wget http://www200.pair.com/mecham/spam/gary.3.5.0-rc1.old.netpbm.patch2
# wget http://www200.pair.com/mecham/spam/gary.3.5.0-rc1.old.netpbm.patch3
# patch -p0 < gary.3.5.0-rc1.old.netpbm.patch1
# patch -p0 < gary.3.5.0-rc1.old.netpbm.patch2
# patch -p0 < gary.3.5.0-rc1.old.netpbm.patch3
}}}
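The rule for these patches (apply them if either program is missing) can be sketched as a small test. `any_missing` is a hypothetical helper name, and it looks for the programs on the PATH.

```shell
# Succeed (0) when at least one of the named programs is missing from PATH,
# which is the case where the old-netpbm patches are needed.
any_missing() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || return 0
    done
    return 1
}

if any_missing pamtopnm pamditherbw; then
    echo "pamtopnm or pamditherbw is missing: apply the gary.3.5.0-rc1 patches"
else
    echo "both programs found: skip the gary.3.5.0-rc1 patches"
fi
```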
Copy the files into the SpamAssassin configuration directory:
{{{
# cp -r FuzzyOcr /etc/mail/spamassassin
# cp FuzzyOcr.cf /etc/mail/spamassassin
# cp FuzzyOcr.pm /etc/mail/spamassassin
# cp FuzzyOcr.preps /etc/mail/spamassassin
# cp FuzzyOcr.scansets /etc/mail/spamassassin
# cp FuzzyOcr.words /etc/mail/spamassassin
}}}
Configure !FuzzyOcr.cf:
{{{
# vi /etc/mail/spamassassin/FuzzyOcr.cf
}}}
Set log level to 2 (only while we test):
{{{
#focr_verbose 3
focr_verbose 2
}}}
Enable logging by uncommenting this:
{{{
focr_logfile /tmp/FuzzyOcr.log
}}}
Uncomment focr_timeout:
{{{
focr_timeout 15
}}}
Uncomment focr_threshold:
{{{
focr_threshold 0.20
}}}
Set focr_base_score to 3:
{{{
#focr_base_score 5
focr_base_score 3
}}}
I change focr_add_score to 0.5:
{{{
#focr_add_score 0.375
focr_add_score 0.5
}}}
I lower focr_corrupt_score:
{{{
#focr_corrupt_score 2.5
focr_corrupt_score 1.5
}}}
I lower focr_corrupt_unfixable_score:
{{{
#focr_corrupt_unfixable_score 5
focr_corrupt_unfixable_score 2.5
}}}
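After editing, it can help to list only the settings that are actually active. The sketch below uses a hypothetical helper, `active_focr_settings`, and assumes every FuzzyOcr option in the file starts with `focr_` as in the examples above.

```shell
# Print the active (uncommented) focr_* settings in a config file.
active_focr_settings() {
    grep -E '^[[:space:]]*focr_' "$1"
}

cf=/etc/mail/spamassassin/FuzzyOcr.cf
if [ -r "$cf" ]; then
    active_focr_settings "$cf"
fi
```

Lines that begin with `#` are skipped, so the old values that were commented out do not show up.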
Start the test by linting spamassassin:
{{{
# spamassassin --lint
}}}
If you get this error
{{{
Subroutine FuzzyOcr::O_NONBLOCK redefined at /usr/share/perl/5.8/Exporter.pm line 65.
at /usr/lib/perl/5.8/POSIX.pm line 19
}}}
it appears to be related somehow to Net::Ident. If you run spamd with the --auth-ident option, you need this module and will have to live with the error message. It is harmless unless you depend on a clean --lint. If you don't need Net::Ident (no program on the system uses it), then I suggest you remove it: 'apt-get remove libnet-ident-perl'
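If you want to check for this warning from a script, a minimal sketch follows. `has_nonblock_warning` is a hypothetical helper that looks for the message text quoted above on its standard input.

```shell
# Succeed (0) if the lint output on stdin contains the O_NONBLOCK warning.
has_nonblock_warning() {
    grep -q 'FuzzyOcr::O_NONBLOCK redefined'
}

# Usage:
#   spamassassin --lint 2>&1 | has_nonblock_warning && echo "warning present"
```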