New possibilities offered by smart deployment scripts
Over the past year we used several methods to deploy our PHP code to the production machines. We now finally have a deployment method that seems to be very stable and gives us many advantages.
Deployment methods that we’ve tried
The main problems during the deployments have always been the filesystem-related caches. We have two of those:
- The higher-level filesystem cache is generated by our PHP framework. It gathers a lot of configs from the whole filesystem, compiles them into PHP files and then stores them in some dedicated places.
- On the lower level there is APC, a PHP module which does opcode caching. For performance reasons we’ve set its stat option (apc.stat) to 0. This means that once APC has cached a compiled PHP file, it ignores every later change made to that file.
When we first built the deployment scripts for the new environment we often had the problem that we didn’t know how to clear the filesystem cache safely.
Let’s assume we first clear the filesystem caches, then deploy the PHP code. While the PHP code is still being deployed, the filesystem caches already start to regenerate, so we end up with an inconsistent cache: part of it is generated by the old code and part by the new one.

On the other hand, let’s assume we first deploy code and then clear the filesystem caches. This would mean that after the deployment is finished we do have a clean site with consistent caches, but during the time of the deployment nobody knows what exactly is going to happen because there is already new code running with the old caches.

We also do not want to rely on APC and just assume that it really caches every single file. If we could rely on the fact that it cached every single file, we could simply deploy the new code and then clear APC. But if we deploy twice in a short time span we can’t be sure that each of the PHP nodes already cached each of the PHP files.
Furthermore, none of the deployment methods described above allows us to warm the caches before we release the code into the wild. On nightly deployments that’s no problem, but unfortunately we often ran into situations where we had to fix critical bugs and deploy during the daytime.
A first step in the right direction
After running into many problems caused by the issues described above, we decided that we can’t deploy to a machine while it is in production use. Fortunately we have quite a lot of machines in our PHP cluster, which allowed us to split them into two halves. It helps to know that we run Nginx in front of the environment, which uses FastCGI to connect back to the PHP servers. In this setup it didn’t take much to make the Nginx temporarily use only the first half of the backend servers while the deployment script deployed to the second half. Then we swapped: the Nginx used the second half for production and the script deployed to the first half. That way we could deploy cleanly, and the problems described above were solved.

The only problem with this solution was that we have many caches to clean and rsyncing the new code can take up to a minute, so altogether we halved the capacity of the cluster for around 5 minutes. Additionally, the half that received the new code first had to run the whole site in the second phase and regenerate its caches at the same time. During peak times we couldn’t afford that loss of computing power, and we knew we had to find a better solution.
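The two-phase swap described above can be sketched in a few lines of Python. This is only an illustration of the idea, not our actual deployment script: the function names and the host names are made up for the example.

```python
def split_in_halves(backends):
    """Split the backend list into two halves (the first half gets any extra node)."""
    middle = (len(backends) + 1) // 2
    return backends[:middle], backends[middle:]


def deployment_phases(backends):
    """Return the two phases as (serving, deploying) pairs."""
    first, second = split_in_halves(backends)
    return [
        (first, second),  # phase 1: first half serves, second half gets new code
        (second, first),  # phase 2: second half serves, first half gets new code
    ]


if __name__ == "__main__":
    nodes = ["php-01", "php-02", "php-03", "php-04"]  # hypothetical host names
    for number, (serving, deploying) in enumerate(deployment_phases(nodes), start=1):
        print(f"phase {number}: serve from {serving}, deploy to {deploying}")
```

The sketch also makes the cost visible: in each phase only half of the nodes are in the serving list.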
Final solution
I think we can say that we have now finally found a solution that has none of the disadvantages of the approaches described above and also solves all the problems described.
The trick is that we version the directories on the PHP servers in which our code is stored. When the Nginx sends a FastCGI request back to the PHP cluster, it always passes the absolute path of the PHP file it wants processed as a FastCGI parameter (SCRIPT_FILENAME). On the Nginx it’s very simple to change the prefix for all those PHP files and thereby send all FastCGI requests to another version of the code on the backend servers, while all the different versions of the code coexist on the PHP cluster in different directories. This solves the problem with APC’s file caching, because the same file has a different absolute path in two different versions of the code: they reside in different document roots. The problem of the framework caches is solved as well, because the framework cache directories are specific to each version of the code and reside together with the code in the versioned directory.
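To illustrate why a new directory sidesteps the stale-cache problem, here is a toy model in Python. It is not how APC is implemented; PathKeyedCache and script_filename are hypothetical names, and the only assumption is the one stated above: the cache is keyed by absolute path and never re-checks the file on disk.

```python
class PathKeyedCache:
    """Toy stand-in for an opcode cache that never re-checks files on disk."""

    def __init__(self):
        self._compiled = {}

    def load(self, absolute_path, read_source):
        # Compile only on the first request for a path; afterwards the cached
        # entry is returned even if the file on disk has changed.
        if absolute_path not in self._compiled:
            self._compiled[absolute_path] = read_source(absolute_path)
        return self._compiled[absolute_path]


def script_filename(version, script):
    """Build the absolute path inside the versioned code directory."""
    return f"/srv/www/vhosts/{version}.code/web{script}"


if __name__ == "__main__":
    disk = {script_filename(13, "/index.php"): "old code"}  # fake filesystem
    cache = PathKeyedCache()
    print(cache.load(script_filename(13, "/index.php"), disk.get))  # old code

    # Overwriting the file in place is invisible to the cache...
    disk[script_filename(13, "/index.php")] = "new code"
    print(cache.load(script_filename(13, "/index.php"), disk.get))  # old code

    # ...but a new versioned directory is a new key, so the new code is used.
    disk[script_filename(14, "/index.php")] = "new code"
    print(cache.load(script_filename(14, "/index.php"), disk.get))  # new code
```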

Here is an example snippet of our Nginx config to make this a little clearer:
set $code_version 13;
# Hand every request for a PHP script to the versioned code directory.
location ~* ((.*)\.php(.*)) {
    fastcgi_intercept_errors on;
    error_page 404 = @404;
    # Split the URI into the script and any trailing path info.
    set $script $uri;
    set $path_info "";
    if ($uri ~ "^(.+\.php)(/.+)") {
        set $script $1;
        set $path_info $2;
    }
    include /etc/nginx/fastcgi_params;
    # The version prefix selects which copy of the code the backend runs.
    fastcgi_param SCRIPT_FILENAME /srv/www/vhosts/$code_version.code/web$script;
    fastcgi_param SCRIPT_NAME $script;
    fastcgi_param REQUEST_URI $uri;
    fastcgi_pass backend.cgi;
}
Now the deployment script simply deploys the new code into a new directory on all the PHP servers, and then we request certain PHP files on each of the backend nodes to warm the cache. As a final step it changes the line “set $code_version” to the new version and tells the Nginx to reload its config, without any user interruption and without crazy high load due to cache regeneration.
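The final step, changing the version line, could look roughly like the following Python sketch. It is an assumption about how such a script might be written, not our actual one: switch_code_version is a hypothetical helper that only rewrites the config text. Writing the file back and reloading the Nginx (for example with nginx -s reload) is left out.

```python
import re

# Matches a line such as "set $code_version 13;" and captures the number.
VERSION_LINE = re.compile(
    r"^([ \t]*set[ \t]+\$code_version[ \t]+)(\d+)([ \t]*;)", re.MULTILINE
)


def switch_code_version(config_text, new_version):
    """Return the config text with the $code_version value replaced."""
    updated, count = VERSION_LINE.subn(
        lambda match: f"{match.group(1)}{new_version}{match.group(3)}",
        config_text,
    )
    if count != 1:
        # Refuse to guess if the line is missing or appears more than once.
        raise ValueError(f"expected exactly one code_version line, found {count}")
    return updated


if __name__ == "__main__":
    config = "set $code_version 13;\nfastcgi_pass backend.cgi;\n"
    print(switch_code_version(config, 14))
```

Because the function only touches that one line, the same call with an older version number serves as a rollback.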
New possibilities
Since we now have multiple versions of the PHP code on each of the backend nodes, we can switch between them in seconds: a script simply edits the version number saved in the Nginx config and then tells the Nginx to reload the config. This allows us to roll back without any major service interruption in case we deploy something that causes problems.
The coolest new possibility is that we can now compare the efficiency of different code versions live on the production systems. We deploy two versions of the code into different directories on each of the backend blades without making the Nginx use the new versions. Let’s say one version is 51 and the other one is 52. Then we simply create a filesystem symlink with the name 53; on half of the nodes it points to 51 and on the other half it points to 52. Then we make the Nginx use version 53, which means that half of the nodes run one version and the other half run the other. Once we want to switch everything to one of the two versions, we simply make the Nginx use version 51 or 52.
First half node:
rsid-a-20:/srv/www/vhosts # ls -lha
total 19M
drwxr-xr-x 71 user   users 4.0K Feb 10 11:55 .
drwxr-xr-x  7 root   root  4.0K Dec  7 04:40 ..
drwxr-xr-x 14 wwwrun www   4.0K Feb  4 09:31 51.code
drwxr-xr-x 14 wwwrun www   4.0K Feb  4 09:31 52.code
lrwxrwxrwx  1 root   root    28 Feb  4 09:33 53.code -> /srv/www/vhosts/51.code
Second half node:
rsid-a-20:/srv/www/vhosts # ls -lha
total 19M
drwxr-xr-x 71 user   users 4.0K Feb 10 11:55 .
drwxr-xr-x  7 root   root  4.0K Dec  7 04:40 ..
drwxr-xr-x 14 wwwrun www   4.0K Feb  4 09:31 51.code
drwxr-xr-x 14 wwwrun www   4.0K Feb  4 09:31 52.code
lrwxrwxrwx  1 root   root    28 Feb  4 09:33 53.code -> /srv/www/vhosts/52.code
That way we can compare different versions of the code while they run on different backend nodes in production, and monitor live whether one of them has efficiency or load problems.
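Creating the split symlinks could be sketched like this in Python. It is a minimal sketch under a few assumptions: point_alias and split_versions are hypothetical helpers, the nodes run a POSIX system, and each entry in node_dirs stands in for the vhosts directory of one node (in practice the call would run on each node separately).

```python
import os
import tempfile


def point_alias(vhosts_dir, alias, target_version):
    """Create or replace <alias>.code as a symlink to <target_version>.code."""
    link = os.path.join(vhosts_dir, f"{alias}.code")
    target = os.path.join(vhosts_dir, f"{target_version}.code")
    temporary = link + ".tmp"
    if os.path.lexists(temporary):
        os.remove(temporary)
    os.symlink(target, temporary)
    # Rename over any existing link so the alias never disappears in between.
    os.replace(temporary, link)


def split_versions(node_dirs, alias, version_a, version_b):
    """Point the alias at version_a on the first half of the nodes, version_b on the rest."""
    middle = (len(node_dirs) + 1) // 2
    for index, vhosts_dir in enumerate(node_dirs):
        point_alias(vhosts_dir, alias, version_a if index < middle else version_b)


if __name__ == "__main__":
    # Simulate two nodes in a temporary directory instead of real servers.
    with tempfile.TemporaryDirectory() as root:
        nodes = []
        for name in ("node-a", "node-b"):  # hypothetical node names
            vhosts = os.path.join(root, name)
            os.makedirs(os.path.join(vhosts, "51.code"))
            os.makedirs(os.path.join(vhosts, "52.code"))
            nodes.append(vhosts)
        split_versions(nodes, 53, 51, 52)
        for vhosts in nodes:
            print(os.readlink(os.path.join(vhosts, "53.code")))
```

Calling point_alias again with a different target is enough to move a node from one version to the other without touching the Nginx config.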




