Best Practices and the Law of Unintended Consequences

A post I wrote for the Expensify blog about how the “recommended” Alternative PHP Cache (APC) configuration couldn’t handle our traffic:

At Expensify, everything we do is a balance. As a startup we can’t build every feature we and our users want, or install as many servers as we can imagine. Sometimes, though, we see a change we can make that won’t cost much (in time or money) and will benefit our users. Here’s the story of one of those times that didn’t work out as well as we hoped.

Most of our website is written in PHP. While there is some healthy debate among our engineering staff, most of us like PHP for its rapid development and ease of deployment. Our web servers use the Alternative PHP Cache (APC) to cache compiled code and speed up requests for our users. A few months ago we updated our web server configuration to use less memory for each PHP process.


Reconfiguring PHP

Earlier this summer, I undertook the process of writing a new PHP configuration file for the web servers running our core application. Until now, almost every PHP installation I’ve worked on has used a lightly edited (or sometimes heavily edited) version of the default files that ship with the PHP source or packages. This time, for reasons I’ll explain below, I started with a blank file and built up our configuration from scratch. Along the way I learned a bit about PHP history, found some unanswered questions, and hatched a few ideas.

Why Do It?

The primary motivation for writing the new file was our upcoming upgrade to PHP 5.4. At a minimum, the file we were using would need editing. Several configuration directives were removed in 5.4, and a few of them (most notably allow_call_time_pass_reference) would cause a startup error if they were defined. The task fell to me as the member of the team with the most interest in the inner workings of PHP.

Server-level configuration files occupy an interesting position in the web stack. Management of these files is typically handled by the operations team, but the values set in the configuration file directly affect the performance of the application and often need to be chosen by the developers.

It’s common for a server-level configuration file to be set up at the start of an application’s lifespan and not have much attention paid to it afterwards, even as other components get upgraded. As a result, older configuration directives that have been deprecated or removed stay around. Even if they don’t cause any harm, leaving them in can make it difficult to edit the configuration when changes are necessary. Also, new directives that may be helpful are left undefined. As I worked through the process, I determined that our php.ini file was based on the sample file from PHP 5.2. The upgrade to 5.3 had been made without much attention to the configuration file, and I did not want to repeat that for 5.4.

A Brief History of php.ini

This is all based on my own research going through the php-src repository and the PHP Museum.

  • PHP/FI 1.0 does not appear to have had any external configuration.
  • PHP/FI 2.0 did not have an external configuration file. All run-time configuration was done in the Apache configuration file. There were very few directives, mostly related to system-level configuration (directories, databases).
  • PHP 3.0 marked the appearance of a separate configuration file, then called php3.ini. The file was included in the source as php3.ini-dist. The file included the following comment: “All the values in the php3.ini-dist file correspond to the builtin defaults.”
  • PHP 4.0 renamed the configuration file to php.ini (and php.ini-dist). A new file php.ini-recommended (originally called php.ini-optimized) was added to the source with recommended non-default settings.
  • PHP 5.3 replaced php.ini-dist in source with two files, php.ini-development and php.ini-production, with recommended values for each environment.

The biggest changes were those made for 3.0 and 5.3. Also worth noting are the changes made in 4.0. Starting with that version, some of the values in the php.ini-dist file are commented out. This means that although the file only contains default values, some of the directives are being “reset” to the default value, while others are only shown in comments. This was a source of confusion as I worked on our new configuration file.

No matter which version of PHP is installed, one key issue is that packaged distributions install one version of the file into /etc, and additional source-provided files may not be installed or are installed in an additional, unobvious location.

Writing a New File

Once I had decided to write a new configuration file, I had to decide which approach to take. The first option was the one taken by many developers, and used for our application in the past: start with the file from the source or package, and make changes as needed. The option I chose, however, was different: start with an empty file, and add only what was needed. I chose this method because I expected it to produce a configuration file that was easier to read (in part because it would be much shorter) and clearer about which non-default values are being set.

To get there I needed to know which configuration directives were important to us. I made a copy of our current php.ini file. First, I removed any directive that was no longer valid in 5.4. Then, I determined which directives had been changed from their defaults. I did this by cross-referencing each directive with the PHP manual and the source files for 5.2, 5.3, and 5.4. (Only the approximately 20 directives that are changed for the “development” and “production” sample files have their default values in comments.) When finished, I had a list of directives that were not set to the current defaults. I reviewed this list and removed a couple that were no longer needed. Finally, I checked the list of new directives for 5.3 and 5.4 to see if any should be added to our configuration.
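I did this cross-referencing by hand, but the same steps could be roughly sketched in a short script. The Python below is only an illustration, not the process we actually used: the function names, the sample file, and the two lookup tables are hypothetical, and the tables hold only a handful of entries that would need to be filled in from the manual and the migration guide.

```python
# Illustrative subset of directives removed in PHP 5.4; not the full list.
REMOVED_IN_54 = {
    "register_globals",
    "magic_quotes_gpc",
    "allow_call_time_pass_reference",
    "safe_mode",
}

# Illustrative built-in defaults; a real table would come from the manual.
DEFAULTS = {
    "memory_limit": "128M",
    "default_socket_timeout": "60",
}


def parse_ini(text):
    """Return {directive: value} for active (uncommented) lines only."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and section headers such as [PHP].
        if not line or line.startswith(";") or line.startswith("["):
            continue
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


def audit(text, defaults, removed):
    """Split active directives into removed, changed, and redundant."""
    settings = parse_ini(text)
    gone = sorted(name for name in settings if name in removed)
    changed = {}
    redundant = []
    for name, value in settings.items():
        if name in removed or name not in defaults:
            continue  # unknown directives need a manual look
        if value == defaults[name]:
            redundant.append(name)  # merely restates the default
        else:
            changed[name] = value
    return gone, changed, sorted(redundant)


# Hypothetical legacy file used only to exercise the sketch.
OLD_INI = """
; Legacy settings carried over from an older release
[PHP]
allow_call_time_pass_reference = Off
memory_limit = 256M
default_socket_timeout = 60
;display_errors = Off
"""

removed, changed, redundant = audit(OLD_INI, DEFAULTS, REMOVED_IN_54)
print("remove:", removed)
print("keep (non-default):", changed)
print("restates default:", redundant)
```

The comparison is a plain string match, so values written differently from the table (for example On versus 1) would still need a human to review them.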

To assemble the final file, I wrote in all of the non-default directives that we had. I also added a small number of default values in cases when the application strongly depends on those values. (For example, we have a lot of timeout-sensitive SOA-style calls, so default_socket_timeout made the list.) Importantly, for every directive I added a comment briefly describing the setting, the default, and our reason for changing (or not changing) it. When we want to use different values in development and production (like error_reporting), both values are listed in the file and we uncomment the correct value for the environment.
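To make the commenting convention concrete, here is a hypothetical excerpt in the style described above, wrapped in a small Python check that every active directive has a comment directly above it. The excerpt is not copied from our real file, the values are illustrative, and the checker function is an invention for this example.

```python
# Hypothetical excerpt: each directive carries a description, the default,
# and the reason; environment-specific values are listed side by side.
DOCUMENTED_INI = """\
; Timeout in seconds for socket-based streams. Default: 60.
; Left at the default, but stated because our service calls depend on it.
default_socket_timeout = 60

; Which errors PHP reports.
; Development:
;error_reporting = E_ALL
; Production:
error_reporting = E_ALL & ~E_DEPRECATED & ~E_STRICT
"""


def undocumented_directives(text):
    """Return active directives that lack a comment on the line above."""
    missing = []
    previous = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            previous = ""  # a blank line separates a comment from what follows
            continue
        if not line.startswith(";") and "=" in line:
            if not previous.startswith(";"):
                missing.append(line.split("=", 1)[0].strip())
        previous = line
    return missing


print(undocumented_directives(DOCUMENTED_INI))
print(undocumented_directives(DOCUMENTED_INI + "\nmemory_limit = 256M\n"))
```

The first call should report nothing, while the second flags the uncommented memory_limit line that was appended.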

Takeaways

The resulting file, including comments, is 194 lines long and includes 27 declarations. (The current php.ini-development in 5.4 is 1865 lines long.) We’re using the file in our development and production environments with easily-comparable changes between the two.

While doing this project, I searched for any php.ini analyzer or recommendations, but found very little beyond the comments in the source files. What does exist is almost entirely focused on security. I feel that a tool that takes a php.ini as input and outputs recommended values based on version and environment would be useful to many people.
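As a very rough sketch of what the core of such a tool might look like (nothing here is an existing project, and every name is hypothetical), the Python below compares the effective value of each directive against a per-environment recommendation. The table has two entries, with defaults and suggestions as I understand them for PHP 5.4; a real tool would need a full table for each PHP version.

```python
# Hypothetical table: built-in default plus a suggested value per environment.
# Assumes values are spelled On/Off; a real tool would normalize 1/0/true/false.
RECOMMENDED = {
    "display_errors": {
        "default": "On", "development": "On", "production": "Off",
    },
    "display_startup_errors": {
        "default": "Off", "development": "On", "production": "Off",
    },
}


def active_settings(text):
    """Return {directive: value} for uncommented lines of a php.ini text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith((";", "[")) and "=" in line:
            name, value = line.split("=", 1)
            settings[name.strip()] = value.strip()
    return settings


def recommend(text, environment):
    """Return (directive, effective, suggested) tuples where they differ."""
    current = active_settings(text)
    advice = []
    for name, row in sorted(RECOMMENDED.items()):
        # A directive missing from the file falls back to its default.
        effective = current.get(name, row["default"])
        suggested = row[environment]
        if effective.lower() != suggested.lower():
            advice.append((name, effective, suggested))
    return advice


for name, effective, suggested in recommend("display_errors = On\n", "production"):
    print(f"{name}: currently {effective}, suggested {suggested}")
```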

As noted above, much of the time spent on this project involved reviewing every directive in our existing file and determining if, and how, it had been changed. This would have been much easier if the distributed files did not contain so many directives that are merely restatements of the defaults. In my opinion, the php.ini-development and php.ini-production files should only include the directives that have been changed from their defaults, and the php.ini-dist file should be brought back with all directives commented out, so that any change made is obvious.

(Special thanks to Brock for proof-reading a draft of this post.)