[Dailydave] approximate string matching
Mateusz Berezecki
mateuszb at gmail.com
Fri Sep 1 06:48:50 EST 2006
Hello Arun,
On 9/1/06, Arun Koshy <arunkoshy at gmail.com> wrote:
> On 9/1/06, Mateusz Berezecki <mateuszb at gmail.com> wrote:
> > Is anyone aware of a good implementation of any of these algorithms
> > in C or perhaps some opensource C library for that purpose?
> > Do you have any recommendations?
>
> Check :
>
> http://www.dcs.shef.ac.uk/~sam/stringmetrics.html#jaccard
>
> The above links into a sourceforge project that has an implementation
>
> http://sourceforge.net/projects/simmetrics/
>
> Hope that helps
>
Well, sort of :-) I did check the simmetrics project, but it's in C#,
and reimplementing the interfaces and all the required tokenizer
libraries is too much effort for now.
I want something fast yet simple, like the Bitap algorithm
(http://en.wikipedia.org/wiki/Bitap_algorithm), which does approximate
matching in terms of Levenshtein distance.
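To make the request concrete, here is a minimal sketch in C of the kind
of routine meant: a bit-parallel (Wu-Manber style) Bitap search that
accepts up to k Levenshtein edits. The function name bitap_fuzzy_search,
the BITAP_MAX_K limit and the return convention are assumptions made for
illustration and are not taken from any existing library. The pattern
must fit in one unsigned long, and k must be smaller than the pattern
length.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITAP_MAX_K 8   /* hypothetical cap on the allowed edit count */

/*
 * Returns the index in text of the LAST character of the first substring
 * that matches pattern with at most k edits (insert, delete, substitute).
 * Returns -1 if there is no such substring or the arguments are
 * unsupported (empty or too long pattern, k too large).
 */
long bitap_fuzzy_search(const char *text, const char *pattern, unsigned k)
{
    unsigned long mask[UCHAR_MAX + 1] = {0};
    unsigned long R[BITAP_MAX_K + 1];
    unsigned long accept;
    size_t m = strlen(pattern);
    size_t i;
    unsigned d;

    if (m == 0 || m > sizeof(unsigned long) * CHAR_BIT ||
        k > BITAP_MAX_K || k >= m)
        return -1;

    /* Bit j of mask[c] is set when pattern[j] == c. */
    for (i = 0; i < m; i++)
        mask[(unsigned char)pattern[i]] |= 1UL << i;

    /* Bit j of R[d] means: pattern[0..j] matches a suffix of the text
     * read so far with at most d edits. Before any text is read, a
     * prefix of length d "matches" by dropping d pattern characters. */
    for (d = 0; d <= k; d++)
        R[d] = (1UL << d) - 1;

    accept = 1UL << (m - 1);

    for (i = 0; text[i] != '\0'; i++) {
        unsigned long cm = mask[(unsigned char)text[i]];
        unsigned long prev_old = R[0];  /* R[d-1] before this character */

        R[0] = ((R[0] << 1) | 1UL) & cm;
        for (d = 1; d <= k; d++) {
            unsigned long cur_old = R[d];

            R[d] = (((cur_old << 1) | 1UL) & cm) /* characters match     */
                 | prev_old                      /* extra char in text   */
                 | (prev_old << 1)               /* substituted char     */
                 | (R[d - 1] << 1)               /* char missing in text */
                 | 1UL;                          /* one edit covers pattern[0] */
            prev_old = cur_old;
        }
        if (R[k] & accept)
            return (long)i;
    }
    return -1;
}
```

For example, bitap_fuzzy_search("hello world", "wurld", 1) reports a
match ending at index 10, while the same call with k set to 0 reports
no match. Reporting the end index rather than the start keeps the sketch
short, since the start of an approximate match is not unique.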
Thank you for the quick reply and for the reminder about simmetrics.
If there is no other alternative, I'll try porting it to C and post the
link to the list, so that it's available to anyone else who needs it.
Thanks,
Mateusz Berezecki