As more people use the internet to share statuses, tweets, links, and other content, the task of separating the wheat from the chaff is quickly becoming more important. Luckily, a number of approaches to finding the most interesting content are in use across the internet, both analyzing the content itself and giving users the tools to identify what is good. Our panel will explore the details of how sites we use every day have attempted to solve this problem. We’ll talk about voting systems where democracy works on a smaller scale, social systems that try to figure out who you care about or whose style you share, content analysis approaches that try to show you things based on your explicit or implicit set of interests, and other interesting algorithms for scoring and ranking content. We’ll also talk about implementation, touching on scaling distributed databases, training machine learning models, etc. We’ll talk about some common issues across these systems. Something as simple as counting votes can actually turn into a long lesson in statistics. And there are other factors our algorithms must balance, including showing recent content vs. the overall best, ensuring new content gets a fair chance to prove itself, and keeping a site simple with all this complexity happening behind the scenes. Finally, we’ll talk about how algorithms that control content distribution end up being big targets for gaming and abuse.
http://schedule.sxsw.com/events/event_IAP6966
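
To illustrate why counting votes can turn into a statistics lesson, here is a minimal sketch of one widely used approach: ranking by the lower bound of the Wilson score confidence interval rather than by raw vote share. The abstract does not say which methods the panel covers, so this is an assumed example; the function name `wilson_lower_bound` and the default `z=1.96` (roughly 95% confidence) are illustrative choices.

```python
import math


def wilson_lower_bound(upvotes, downvotes, z=1.96):
    """Lower bound of the Wilson score interval for the upvote fraction.

    Hypothetical helper for illustration. z=1.96 corresponds to roughly
    95% confidence; items with no votes score 0.0.
    """
    n = upvotes + downvotes
    if n == 0:
        return 0.0
    p = upvotes / n  # observed upvote fraction
    denominator = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / denominator


# An item with 1 upvote and 0 downvotes has a 100% raw approval rate,
# but very little evidence behind it, so it ranks below 90 up / 10 down.
items = {"new_post": (1, 0), "established_post": (90, 10)}
ranked = sorted(items, key=lambda k: wilson_lower_bound(*items[k]), reverse=True)
print(ranked)
```

Sorting by this bound instead of the raw ratio keeps a single early upvote from outranking content with a long, mostly positive track record, while still letting new items climb as votes accumulate.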