In these days of Big Data, data deduplication is needed more than ever. A common example is contact list deduplication. Yahoo Mail offers this, but it doesn’t handle two contacts that are identical except for one field. That extra field may not even be important to you or me.
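To make the "identical except for one field" case concrete, here is a minimal sketch in Python. It is not how Yahoo Mail or bbdb work. The field names, the `max_diff` threshold, and the case-insensitive comparison are all assumptions chosen for illustration.

```python
from itertools import combinations

def differing_fields(a, b):
    """Return the set of field names whose values differ between two contacts.

    Assumes every value is a string; a missing field counts as an empty string.
    Comparison ignores case and surrounding whitespace (an assumption).
    """
    keys = set(a) | set(b)
    return {
        k for k in keys
        if a.get(k, "").strip().lower() != b.get(k, "").strip().lower()
    }

def near_duplicates(contacts, max_diff=1):
    """Yield (i, j, fields) for contact pairs that differ in at most max_diff fields."""
    # Compares every pair, so this is quadratic: fine for a personal
    # address book, too slow for a very large one.
    for (i, a), (j, b) in combinations(enumerate(contacts), 2):
        diff = differing_fields(a, b)
        if len(diff) <= max_diff:
            yield i, j, sorted(diff)

# Hypothetical sample data
contacts = [
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555-0100"},
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555-0199"},
    {"name": "Bob Ray", "email": "bob@example.com", "phone": "555-0100"},
]

pairs = list(near_duplicates(contacts))
for i, j, fields in pairs:
    print(f"contacts {i} and {j} differ only in: {fields}")
```

Reporting which field differs, rather than merging automatically, leaves the decision about whether that field matters to the person who owns the contact list.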
I still haven’t found a contact deduplicator that I like, though bbdb does a pretty good job. Before I write one myself, I’m surveying what is out there in the open source world, so here are some links:
- general information
- bbdb related
- Sacha Chua’s article on bbdb – smartphone integration
- file system de-duplication