Lights-off for a Side Project

I received an email from my domain registrar last month. Actually, it was four emails. The domains for one of my side projects are up for renewal. It’s got me thinking about the project, what I put in, and what I got out.

The project in question is Electrious. You should click that link now because pretty soon it’s not going to work. The inspiration came from something as simple as my electric bill. One month it came with a small insert that described the “power mix” of the service: how much of our energy was being generated with nuclear, coal, hydro, and other types of generation. It got me thinking about what you could do with this data. What struck me was the possibility of knowing the environmental impact of my electric usage, like how many pounds of CO2 were released into the air to run my refrigerator for a month.

It was another couple of years before I had a sense of what I wanted to do with this idea. Just being able to look up power plant emissions or Energy Star data wasn’t very exciting, and it certainly wasn’t anything more than looking up data that was already available. It finally came together in my head in early 2010. Around that time an odd service called Blippy was making headlines. By entering your credit card into the system, it would publicly share your purchase information. I realized that social was the key to making something like this work. The end result would be something like RunKeeper for your electric bill: enter the basic data (how much electricity did you use) and the site would put it in a larger context.

I worked on the site for much of the next year. I wrote a small proof-of-concept, then built it out into a full application. I came up with a name that I’m still proud of. I got the domain names (including .com, .net, .org, and the Twitter-friendly electrio.us). I spent several nights choosing colors, designing a logo, and drawing a cartoon-style background image. I set up a Facebook page and a Twitter account. I added a feedback service.

What happened next is entirely predictable: nobody used it except for me. “Calculate, compare, and share” the headline said, but after calculating your usage there wasn’t much to do if nobody else arrived. The site was limited to Maryland and DC at first (where I was living) because I had to manually enter the utility information into the database, and very few states made it easily available. For users, entering usage data by hand every month was an easy chore to forget.

In his post “Shutting down a side project”, Bach Le concludes with three things he learned:

  1. Engineers like to build stuff, even if no one is asking for it (an engineer that truly loves the craft, anyway).
  2. Most engineers have a hard time SELLING. Many engineers have trouble getting out of their comfort zone and talk to strangers.
  3. Work on solving a problem that people have. Working on a problem that people don’t have probably means that people won’t be willing to pay for that product.

I realized that these were the same issues that Electrious had. I built something for myself, and it wasn’t even solving a specific problem I had. Bach is responding to Josh Sharp’s post “Shutting down blaster.fm, my side project of two years”. Both of them were looking to build revenue-generating services. While I did have revenue ideas for Electrious (ads, of course, and partnerships), that was never my reason for building the site. Still, no users meant no revenue. Most importantly, though, I am terrible at selling. I could never bring myself to shamelessly promote the site. The few friends who saw it gave words of encouragement, but it was never seen by a wider audience.

Even though Electrious was mostly dead-on-arrival, it was still a useful experience. I proved that I really did know enough to plan, build, and deploy an entire application with no outside help. I learned that given enough effort I can turn out a passable (if not stunning) website design. I had code I could share with potential employers. Most importantly, I learned that building a service that needs a large audience to work properly isn’t for me. My more recent side projects, like Whenopedia, have been about building something that is useful for me. If other people like it and want to use it, it’s reaffirming, but not a necessity. I think there is still an opening for exposing the data that Electrious combined at a consumer level. Even though Opower and the Green Button make accessing utility data easier for customers, they don’t get into the environmental impact. I hope that eventually someone else will give it a try.

By the middle of 2011, I had mostly walked away from Electrious. Still, there are costs to keeping a site up, even one without any users. It’s another thing to test when the server gets updated. It’s prevented me from using newer versions of software for other projects because I didn’t want to update the old code. Domains are cheap, but not free. I renewed them once in 2012, but this time it doesn’t seem worth the $60 to keep the site up for another year. I’ve kept it there this long out of sentiment and inertia. Luckily, the domain registrar can help me with those.

Best Practices and the Law of Unintended Consequences

A post I wrote for the Expensify blog about how the “recommended” Alternative PHP Cache (APC) configuration couldn’t handle our traffic:

At Expensify, everything we do is a balance. As a startup we can’t build every feature we and our users want, or install as many servers as we can imagine. Sometimes, though, we see a change we can make that won’t cost much (in time or money) and will benefit our users. Here’s the story of one of those times that didn’t work out as well as we hoped.

Most of our website is written in PHP. While there is some healthy debate among our engineering staff, most of us like PHP for its rapid development and ease of deployment. Our web servers use the Alternative PHP Cache (APC) to cache compiled code and speed up requests for our users. A few months ago we updated our web server configuration to use less memory for each PHP process.

Reconfiguring PHP

Earlier this summer, I undertook the process of writing a new PHP configuration file for the web servers running our core application. Until now almost every PHP installation I’ve worked on has used a lightly edited (or sometimes heavily edited) version of the default files that ship with the PHP source or packages. This time, for reasons I’ll explain below, I started with a blank file and built up our configuration from scratch. Along the way I learned a bit about PHP history, found some unanswered questions, and hatched a few ideas.

Why Do It?

The primary motivation for writing the new file was our upcoming upgrade to PHP 5.4. At a minimum, the file we were using would need editing. Several configuration directives were removed in 5.4, and a few of them (most notably allow_call_time_pass_reference) would cause a startup error if they were defined. The task fell to me as the member of the team with the most interest in the inner workings of PHP.
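As a rough illustration of that first check (not the process actually used for this project), the sketch below scans the text of a php.ini for active directives that no longer exist in 5.4. The function name and the directive list are assumptions made for the example; the list is deliberately partial and should be checked against the migration notes for the target version.

```python
import re

# Partial, illustrative list of directives removed in PHP 5.4.
# Verify against the official migration notes before relying on it.
REMOVED_IN_54 = {
    "allow_call_time_pass_reference",
    "define_syslog_variables",
    "magic_quotes_gpc",
    "magic_quotes_runtime",
    "magic_quotes_sybase",
    "register_globals",
    "register_long_arrays",
    "safe_mode",
    "y2k_compliance",
}

# Matches "name = value" lines; lines starting with ";" or "[" never match.
DIRECTIVE_RE = re.compile(r"^\s*([A-Za-z0-9_.]+)\s*=")


def find_removed_directives(ini_text, removed=REMOVED_IN_54):
    """Return (line_number, name) for each active directive found in `removed`."""
    hits = []
    for lineno, line in enumerate(ini_text.splitlines(), start=1):
        match = DIRECTIVE_RE.match(line)
        if match and match.group(1) in removed:
            hits.append((lineno, match.group(1)))
    return hits
```

Commented-out directives are ignored on purpose, since they are never read at startup.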

Server-level configuration files occupy an interesting position in the web stack. Management of these files is typically handled by the operations team, but the values set in the configuration file directly affect the performance of the application and often need to be chosen by the developers.

It’s common for a server-level configuration file to be set up at the start of an application’s lifespan and not have much attention paid to it afterwards, even as other components get upgraded. As a result, older configuration directives that have been deprecated or removed stay around. Even if they don’t cause any harm, leaving them in can make it difficult to edit the configuration when changes are necessary. Also, new directives that may be helpful are left undefined. As I worked through the process, I determined that our php.ini file was based on the sample file from PHP 5.2. The upgrade to 5.3 had been made without much attention to the configuration file, and I did not want to repeat that for 5.4.

A Brief History of php.ini

This is all based on my own research going through the php-src repository and the PHP Museum.

  • PHP/FI 1.0 does not appear to have had any external configuration.
  • PHP/FI 2.0 did not have an external configuration file. All run-time configuration was done in the Apache configuration file. There were very few directives, mostly related to system-level configuration (directories, databases).
  • PHP 3.0 marked the appearance of a separate configuration file, then called php3.ini. The file was included in the source as php3.ini-dist. The file included the following comment: “All the values in the php3.ini-dist file correspond to the builtin defaults.”
  • PHP 4.0 renamed the configuration file to php.ini (and php.ini-dist). A new file php.ini-recommended (originally called php.ini-optimized) was added to the source with recommended non-default settings.
  • PHP 5.3 replaced php.ini-dist in source with two files, php.ini-development and php.ini-production, with recommended values for each environment.

The biggest changes were those made for 3.0 and 5.3. Also worth noting are the changes made in 4.0. Starting with that version, some of the values in the php.ini-dist file are commented out. This means that although the file only contains default values, some of the directives are being “reset” to the default value, while others are only shown in comments. This was a source of confusion as I worked on our new configuration file.

No matter which version of PHP is installed, one key issue is that packaged distributions install one version of the file into /etc, and additional source-provided files may not be installed or are installed in an additional, unobvious location.

Writing a New File

Once I had decided to write a new configuration file, I had to decide which approach to take. The first option was the one taken by many developers, and used for our application in the past: start with the file from the source or package, and make changes as needed. The option I chose, however, was different: start with an empty file, and add only what was needed. I chose this method because I expected it to produce a configuration file that was easier to read (in part because it would be much shorter) and more clear about what non-default values are being set.

To get there I needed to know which configuration directives were important to us. I made a copy of our current php.ini file. First, I removed any directive that was no longer valid in 5.4. Then, I determined which directives had been changed from their defaults. I did this by cross-referencing each directive with the PHP manual and the source files for 5.2, 5.3, and 5.4. (Only the approximately 20 directives that are changed for the “development” and “production” sample files have their default values in comments.) When finished, I had a list of directives that were not set to the current defaults. I reviewed this list and removed a couple that were no longer needed. Finally, I checked the list of new directives for 5.3 and 5.4 to see if any should be added to our configuration.
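The cross-referencing step could be partly automated along these lines. This is a minimal sketch, not the method used here: parse_ini and non_default_directives are hypothetical helpers, and the defaults table is assumed to have been copied by hand from the PHP manual for the target version.

```python
def parse_ini(text):
    """Collect active `name = value` pairs, skipping comments and section headers.

    Simplification: inline comments after a value are not stripped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";") or line.startswith("["):
            continue
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip().strip('"')
    return settings


def non_default_directives(current, defaults):
    """Split directives into those that differ from the default and those
    with no known default (which still need to be checked by hand)."""
    changed, unknown = {}, {}
    for name, value in current.items():
        if name not in defaults:
            unknown[name] = value
        elif value.lower() != defaults[name].lower():
            changed[name] = value
    return changed, unknown
```

The comparison is textual, so equivalent spellings such as On and 1 would be reported as a change; a real tool would need to normalize values per directive type.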

To assemble the final file, I wrote in all of the non-default directives that we had. I also added a small number of default values in cases when the application strongly depends on those values. (For example, we have a lot of timeout-sensitive SOA-style calls, so default_socket_timeout made the list.) Importantly, for every directive I added a comment briefly describing the setting, the default, and our reason for changing (or not changing) it. When we want to use different values in development and production (like error_reporting), both values are listed in the file and we uncomment the correct value for the environment.
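To make that layout concrete, here is a small sketch that renders one directive in the style described: a short comment block, then one value per environment with only the active one left uncommented. The render_directive helper, the comment wording, and the example values are all assumptions for illustration, not excerpts from our actual file.

```python
def render_directive(name, description, default, reason, values, active_env):
    """Render one directive with its explanatory comments.

    `values` maps an environment name to the value used there; every
    environment except `active_env` is emitted commented out.
    """
    lines = [
        f"; {name}: {description}",
        f"; Default: {default}",
        f"; Reason: {reason}",
    ]
    for env, value in values.items():
        marker = "" if env == active_env else ";"
        lines.append(f"; {env}:")
        lines.append(f"{marker}{name} = {value}")
    return "\n".join(lines)


# Hypothetical example values, chosen only to show the output format.
print(render_directive(
    name="error_reporting",
    description="which errors PHP reports",
    default="(see the manual for your version)",
    reason="report everything while developing, less in production",
    values={"development": "E_ALL", "production": "E_ALL & ~E_DEPRECATED"},
    active_env="production",
))
```

Keeping both values next to each other means the development and production files differ only in which line is commented out, which keeps a diff between the two short.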

Takeaways

The resulting file, including comments, is 194 lines long and includes 27 declarations. (The current php.ini-development in 5.4 is 1865 lines long.) We’re using the file in our development and production environments with easily-comparable changes between the two.

While doing this project, I searched for any php.ini analyzer or recommendations, but found very little beyond the comments in the source files. What does exist is almost entirely focused on security. I feel that a tool that takes a php.ini as input and outputs recommended values based on version and environment would be useful to many people.

As noted above, much of the time spent on this project involved reviewing every directive in our existing file and determining if, and how, it had been changed. This would have been much easier if the distributed files did not contain so many directives that are merely restatements of the defaults. In my opinion, the php.ini-development and php.ini-production files should only include the directives that have been changed from their defaults, and the php.ini-dist file should be brought back with all directives commented out, so that any change made is obvious.

(Special thanks to Brock for proof-reading a draft of this post.)

GimmeTable – a table-based viewer for Gimme Bar

Two weeks ago, Ed Finkler (aka Funkatron) published his MicroPHP Manifesto. Alongside this, he’s put together a great collection of lightweight PHP libraries that he has published on Gimme Bar.

Gimme Bar is a great tool for collecting pages, images, videos–basically anything you can find on the web. The collection pages are beautiful, but I found it hard to browse Ed’s collection because the page shows screenshots and not text. So using the Gimme Bar API I created a tool that can show a collection as a table. Boring? Yes, but also easy to read quickly.

Check it out: http://www.kerzap.com/etc/gimmetable/

Proudly Mapped in DC

I’ll be the first to admit, if the subject ever comes up, that I like maps. So last week, when the New York Times ran a story titled “On the Move, in a Thriving Tech Sector”, I was interested as much in the accompanying map as I was in the article. I also wondered, what would this map look like for the DC Tech community?

After many hours of data wrangling, here’s the answer: http://www.kerzap.com/projects/dc-startup-map/.

I started with the list from Proudly Made in DC. (Thanks to Zvi and Michael for the assistance.) I had to limit each listing to one category, so there were some editorial decisions. I then visited each web site to find their (offline) address. The data was imported into Google Fusion Tables, which also handles the mapping. Some things I found out while doing this:

  • Only about half of the companies on Proudly Made in DC had addresses listed on their web sites.
  • Corralling everyone into a small set of categories that I could color-code led to some interesting decisions. This is the part of the map that needs the most clean-up.
  • I was surprised how far out I had to zoom to show everyone by default: our community extends from Fredericksburg to the Maryland-Pennsylvania border.

If you want your company to be shown on this map, please do two things. (If you’ve already done the second, make sure you do the first.)

  1. Add your address to your web site if it isn’t there already. (A street address is preferred; a P.O. Box is acceptable.)
  2. Get your company listed on Proudly Made In DC. Scroll down to the bottom of their page for details on submitting your company.

The introduction to the Times’ map implies that it was created using SEC filings to get the list of funded companies. I imagine using that data for DC and the surroundings would produce a very different, but equally interesting, map. Maybe someday I’ll try to pull that together.

Old Presentations

I’ve finally uploaded some old presentations to SlideShare:

First, a talk titled “Faking Data” that I gave at DC PHP in July. I talked about the process used to generate a good demo for AddThis analytics.

Second, a short overview of 3rd-party authentication and authorization that I put together for some coworkers. Not much meat, but it might be useful to somebody else.

I’ve decided not to upload my madcap sharing data Pecha Kucha from the PHP Community Conference in April. If you were there, you probably know why not.