- To: php-general@xxxxxxxxxxxxx
- Subject: looking for a PHP text indexer
- From: Mihamina Rakotomandimby <mihamina@xxxxxxxxx>
- Date: Mon, 11 Jun 2012 12:12:41 +0300
- User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20120430 Thunderbird/12.0.1
Hi all,
I have a small job ad website where some posters tend to flood the site with
the same ad, just to stay at the top of the most-recent sort.
To defeat the strict duplicate detection (yes, it's weak), they add
one or two words so that each copy differs.
The result is many near-duplicate ads.
I would like to find duplicates by looking for ads that share 80%-90% of
their words, treat those as the same ad, and group them.
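
To make the "80%-90% same words" idea concrete, here is a rough sketch of the
kind of comparison I have in mind. It is only an illustration, written in
Python for brevity and easy to port to PHP. The function names, the 0.8
threshold, and the choice of Jaccard similarity over word sets (shared words
divided by all distinct words) are my own assumptions, not an existing tool.

```python
import re


def word_set(text):
    """Lowercase the ad and return its distinct words."""
    return set(re.findall(r"\w+", text.lower()))


def similarity(a, b):
    """Jaccard similarity: shared words / all distinct words (0.0 to 1.0)."""
    wa, wb = word_set(a), word_set(b)
    if not wa and not wb:
        return 1.0  # two empty ads are treated as identical
    return len(wa & wb) / len(wa | wb)


def group_duplicates(ads, threshold=0.8):
    """Group ad indices; an ad joins the first group whose first ad is
    at least `threshold` similar to it, otherwise it starts a new group."""
    groups = []
    for i, ad in enumerate(ads):
        for group in groups:
            if similarity(ads[group[0]], ad) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups
```

This compares each ad against one representative per group, so it is
quadratic in the worst case. That is probably acceptable for a one-off pass
over a small site, but it is not a substitute for a real index.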
Of course, a rate-limiting mechanism or even moderation is
planned, but I want to process the existing ads first.
I don't want to use MySQL for indexing; I believe text indexers are the best
tools for this. Am I wrong?
What would you suggest for processing the ads and looking up duplicates in that
situation?
--
RMA.
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php