A lead engineer at eHarmony has given a talk on how the dating company uses big data to improve its matching algorithm.
David Gevorkyan, principal software engineer at eHarmony, speaks about using Hadoop, an open-source Java framework for processing huge amounts of data on large clusters of commodity hardware.
In an hour-long presentation, posted on eHarmony’s engineering blog, Gevorkyan explains “how we take a billion+ potential matches that we find through MongoDB, store them in a Voldemort NoSQL datastore, and then run multiple Hadoop jobs to come up with a filtered list based on Machine Learned models.”
Watch the talk below, and see Gevorkyan’s slide set here.