Yariv’s Blog » Blog Archive » Parallel merge sort in Erlang
Parallel merge sort in Erlang
Posted by Yariv on March 09, 2009
I’ve been thinking lately about the problem of scaling a service like Twitter or the Facebook news feed. When a user visits the site, you want to show her a list of all the recent updates from her friends, sorted by date. It’s easy when the user doesn’t have too many friends and all the updates are on a single database (as in Twoorl’s case :P). You use this query:
"select * from update where uid in ([fid1], [fid2], ...) order by creation_date desc limit 20"
(After making sure you created an index on uid and creation_date, of course :) )
However, what do you when the user has many thousands of friends, and each friend’s updates are stored on a different database? Clearly, you should fetch those updates in parallel. In Erlang, it’s easy. You use pmap():