Here is a short blog post to celebrate our company's presence at ECUG Con 2010 (ECUG stands for Effective Cloud User Group, formerly the Erlang China User Group) in Beijing this weekend (16 and 17 October 2010). Our most famous coworker, Alvaro Videla, will be giving a talk about scaling web applications using RabbitMQ, alongside fellow Erlangists from China as well as Bob Ippolito, CTO of MochiMedia.
Stay tuned for the post-conference report; in the meantime, I will quickly present how we are using Erlang in our projects. Erlang is a functional language with built-in support for concurrency, distribution, messaging and fault tolerance; it was developed by Ericsson to run telecommunication systems. These properties make it possible to build robust, scalable applications that we can use to power our projects.
Here is a list of the tools we are using:
* RabbitMQ: The famous messaging system helped us move a lot of heavy processing to the background and reduce the response time of our sites. For example, signing up is now a matter of milliseconds for our users, as IP logging, statistics updates and cache clearing are done asynchronously. We also use it for thumbnail generation, email notifications, error logging… We are running a simple setup with 2 servers, processing 27 million messages per day for the moment. The current load is very low, so we could easily use queues in more parts of our system.
* Ejabberd: This is an instant messaging server using the XMPP protocol. We use it to track whether a user is online and as an in-browser chat. We have also started building a real-time notification platform on top of it, so that we can notify users in their browser window when one of their friends logs in or when they receive a poke or message. The current setup is one server holding connections for up to 300 concurrent users, and we are doing capacity testing to reach 30,000 sessions.
* Riak: For one of our projects, we are storing around 200 million messages sent between users. We currently store them on 4 servers using a master-slave setup and basic partitioning. To alleviate the situation, we want to use a distributed key-value database such as Riak. We should then be able to grow easily and store enormous amounts of data without having to worry. Also, over the years we realized that denormalization and partitioning are critical to improving our SQL databases, so why not give a NoSQL database a try; the important thing is to understand the data and how it is used. One of our colleagues, Joseph Lambert, worked for several weeks on an indexing solution for Riak to speed up retrieving items by given criteria, which worked much better than relying on Map/Reduce. Promising stuff, until Riak Search was finally released last week with indexing, Lucene syntax and whatnot. Well, at least we got to play with Riak internals and now understand it better. The next step will be to finalize the prototype and deploy it for live testing. Exciting :)
* Tsung: I nearly forgot this one, but it is also important. Our overall goal in using Erlang is to build high-performance systems able to keep up with rapid user growth. Still, this doesn’t make much sense if we are not able to prove it, and this is where Tsung comes in. Tsung is a load testing tool that can generate different kinds of load, such as MySQL queries, HTTP requests, XMPP connections… It works with scenarios that simulate user traffic, and it is distributed so that it can simulate heavy loads. We use it to stress test our systems; for example, we ran some experiments for our migration from MyISAM to XtraDB on our main MySQL database, so that we could see what the real impact would be in production and whether the migration was worth it for our kind of traffic (note: it is!).
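To make the RabbitMQ point above a bit more concrete, here is a minimal sketch of the "do the essential work now, queue the rest" pattern behind the fast sign-up. It is not our production code: it uses Python's standard-library `queue` and a thread as an in-process stand-in for a RabbitMQ broker and its consumers, and every name in it (`sign_up`, `log_ip`, `update_stats`, `clear_cache`, `run_demo`) is made up for illustration.

```python
import queue
import threading

# Hypothetical background handlers. In a real setup these would be
# consumers reading from a RabbitMQ queue, possibly on other machines.
def log_ip(event, results):
    results.append(("ip_logged", event["user"]))

def update_stats(event, results):
    results.append(("stats_updated", event["user"]))

def clear_cache(event, results):
    results.append(("cache_cleared", event["user"]))

HANDLERS = [log_ip, update_stats, clear_cache]

def worker(tasks, results):
    """Consume events until a None sentinel arrives."""
    while True:
        event = tasks.get()
        try:
            if event is None:
                break
            for handler in HANDLERS:
                handler(event, results)
        finally:
            tasks.task_done()

def sign_up(user, ip, tasks):
    """Create the account, then enqueue the slow side effects."""
    account = {"user": user}  # the only work done in the request itself
    tasks.put({"type": "signup", "user": user, "ip": ip})
    return account  # returns without waiting for the handlers

def run_demo():
    tasks = queue.Queue()  # stand-in for the message broker
    results = []
    consumer = threading.Thread(target=worker, args=(tasks, results))
    consumer.start()

    account = sign_up("alice", "203.0.113.7", tasks)

    tasks.join()     # demo only: wait until the event has been processed
    tasks.put(None)  # ask the worker to stop
    consumer.join()
    return account, results

if __name__ == "__main__":
    print(run_demo())
```

The point of the design is that `sign_up` returns as soon as the event is queued, so the user never waits for logging, statistics or cache work. With a real broker the queue also survives a web server restart and the consumers can be scaled independently of the web tier.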
Finally, we are also working on improving our Erlang skills, both to better understand how these tools work and, at some point, to build our own, more business-oriented services in a scalable way.