I feel this way about questions that disparage good software practices.
There was an email chain at work about a possible bug that got sidetracked into a discussion about how all work should be done in standardized environments; there should be some kind of local VM setup, shared development environment or similar. Something about this didn’t sit well with me and I’d like to explore why. At the very least it is in direct conflict with my firm belief that your new hire should be able to check out a project on a clean machine and have it Just Work. Either my gut feeling about standardized environments is on track or that belief is misguided.
The reasons for this stance on environments make sense: if your development environment is identical to your production environment, there is less room for surprises like subtle gotchas in interpreter, database, and other service versions. You don’t have to worry about URLs to services being different between environments. If you need some kind of cache you can set it up once and roll it out for everyone to use. If there’s a file you need to access you always know where to find it, and…
There is a danger that lurks in the shadows. The closer you work with one specific environment, the more your application will become tangled up with it. You take for granted that a file exists at a specific place, so you have a class or function access it directly. Or that a hostname will always be the same, or that credentials to a database will never change, so you hardcode them. You become reliant on an esoteric extension to your environment and then forget about it. These assumptions creep into your codebase and ever so slowly make it brittle.
Now that your database connections or web service calls are baked in, you lose the flexibility to swap in a stub for testing or to create a new implementation when you need to store the data differently. When the reads or writes to that file create race conditions or become unscalable, it’s going to be tough to remedy. And what happens when that esoteric extension stops being maintained but you want to upgrade language versions? Over 1k classes depend directly on it! Should you just stay a version behind? 2 versions behind? 3? 10?
I hear a bit of, “That’s impossible! There are databases to connect to! Web services to retrieve data from! This application needs those things or it cannot run!” I’ll posit this means your application is too coupled to the data, or to the way the data is stored and retrieved. Have an in-memory database for development and add some preliminary data as part of the build process. Use a hard-coded stub in place of those web service calls. That 3rd party thing that requires a license? Stub that out too. You are building to interfaces, right?
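The “building to interfaces” point can be sketched in a few lines of PHP. The names here (`RateProvider`, `StubRateProvider`, the example URL) are hypothetical, but the shape is what matters: application code depends only on an interface, and a hard-coded stub stands in for the real web service during development and testing.

```php
<?php
// Hypothetical interface the application codes against.
interface RateProvider
{
    public function getRate(string $currency): float;
}

// Production implementation calls the real web service (sketch only;
// the URL is made up for illustration).
class HttpRateProvider implements RateProvider
{
    public function getRate(string $currency): float
    {
        $json = file_get_contents("https://rates.example.com/{$currency}");
        return (float) json_decode($json, true)['rate'];
    }
}

// Development stub: no network, no credentials, works on a clean machine.
class StubRateProvider implements RateProvider
{
    public function getRate(string $currency): float
    {
        $rates = ['USD' => 1.0, 'EUR' => 0.85];
        return $rates[$currency] ?? 1.0;
    }
}

// Application code never knows which implementation it was handed.
function priceInCurrency(RateProvider $rates, float $usd, string $currency): float
{
    return $usd * $rates->getRate($currency);
}
```

Swapping `StubRateProvider` for `HttpRateProvider` is then a wiring decision, not a code change, which is exactly the flexibility the baked-in version loses.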
Automate all that configuration with a good Configuration Management tool. Don’t let the developers stumble through where the stubs live and how to configure them. Don’t make DevOps worry about what extensions and services need to be installed in production and how they’re set up.
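One minimal sketch of that idea in PHP, assuming environment variables are the handoff point between the configuration management tool and the application (the variable names and defaults are invented for illustration): every environment-specific value flows through one helper with a safe development default, so a clean checkout still runs.

```php
<?php
// Hypothetical helper: read a setting from the environment, falling back
// to a default that works on a fresh developer machine.
function config(string $key, string $default): string
{
    $value = getenv($key);
    return $value === false ? $default : $value;
}

// Developers get working defaults out of the box; the configuration
// management tool sets the real values in staging and production.
$dbDsn   = config('APP_DB_DSN', 'sqlite::memory:');
$useStub = config('APP_USE_STUBS', '1') === '1';
```

The point is not this particular helper but the discipline: nothing environment-specific lives in the source, and nobody has to remember which knobs exist because they are all set in one managed place.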
At the end of the day having standardized environments makes development easier for your engineers but it does not make it simpler. Don’t get me wrong, your pipeline should still take your releases through a production-like environment at some point but don’t constrain your development cycle to use one. You want that new gal who just started to break something because she’s proficient with a new language feature. Now you know that there’s reason to consider upgrading the language. That guy who just started on your team and ran into an issue because he didn’t know how to configure the application? Well now you know you need to stub that out and add it to your configuration manager.
These little gotchas help suss out problems long before they become paralyzing. You want to catch them before you wake up one day and you’re working with a version of your language that is 10 years old, held together with strange hacks and patches, on an operating system that is 15 years old. And you can’t upgrade because, well, you’re just not sure what would happen.
All these little costs can be hard to measure. How much developer time is being wasted on ceremony when there’s a new library that automates this common task? How much time goes into writing unit tests to cover cases that a newer version of the language prevents from happening? How many opportunities to work remotely are lost because your application requires 100 remote connections per request and the airline wifi isn’t fast enough? How many good interview candidates walked in the door and then right back out because they would have had to work in a stack that was 10 years old?
My friend Jon and I are taking the week to explore creating a proper Deployment Pipeline for a PHP project. PHP projects are often deployed by checking out all the appropriate source code to prescribed directories on the target server, running shell scripts that determine environments and perform search/replaces in source files, soft-linking environmental config in, copying files around, compressing and moving resources around, and running unit and integration tests.

At first blush it seems you could automate this by stringing together some small programs that run each of these steps and then hand off control for the next step, but that would only be scripting. This is an important distinction. Automation buys a concept and relieves cognitive load; scripting does not. Scripting would be the equivalent of, “…move the shifter into 1st, release the clutch until you feel the pressure point, give a little gas and let the clutch out all the way, give more throttle to accelerate, let off the gas and put in the clutch, move the shifter to 2nd…” whereas Automation would be, “Put the car in Drive.” There’s certainly nothing wrong with scripting, and I personally prefer to drive stick, but if you run a company where shifting a transmission is incidental to your core business you should automate it.
One of the first issues with the above scenario is that we have a mutable build. The build is either transformed at every step or rebuilt in every environment. The artifact we start out with that passes our unit tests is not the same thing that gets delivered to production later. This lowers our confidence in the build, which manifests as unknown issues with the deploy. Maybe you just try again. Maybe you try again and copy `config.php` from `sample/` to `prod/` because someone told you to last week but it slipped your mind. Hopefully that will work. Depending on your background this may be a little foreign. In Java, for example, you would never copy your compiled classes between servers because it’s easier to package them as a `jar` or `war` and then copy that (not that I haven’t seen `.class` files copied individually to servers). The tooling makes that easy. In PHP-land this concept is very much in its infancy, which makes it non-obvious and harder to do than just copying source files.
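One cheap way to enforce an immutable build, sketched in PHP (the function names are illustrative): fingerprint the artifact once at build time, carry the checksum alongside it, and refuse to deploy anything that doesn’t match. The artifact that passed the tests is then provably the artifact that reaches production.

```php
<?php
// Compute a stable fingerprint of a build artifact.
function fingerprint(string $artifact): string
{
    return hash_file('sha256', $artifact);
}

// At build time: record fingerprint('lobs.phar') with the build.
// At deploy time: verify before promoting to the next environment.
function verifyArtifact(string $artifact, string $expected): bool
{
    return hash_equals($expected, fingerprint($artifact));
}
```

Any step that “transforms” the build along the way will change the checksum and fail this check, which is exactly the feedback you want.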
The first thing we did was define a deliverable. This was a little harder than we thought, as the current definition of a project is often “source code files copied to a certain directory on a server.” It’s easy enough to define the deliverable as a directory, tarball, or zip file, but we ended up going with PHP’s webPhar. Phar is PHP’s answer to Java’s JAR, and webPhar is similar to WAR. The immediate benefit is that you have an artifact you can drop into your web server that will Just Work, but the deeper benefit is that this gets people away from thinking of deploys as “copying files” and creates a clear distinction between source code and deliverable. This distinction comes for free with compiled languages but it still exists in interpreted ones, it’s just somewhat hidden; the file you edit may be byte-for-byte the same as the one you ship, but one is source code and one is a deliverable.
The Nitty Gritty
Our project name is `lobs` for myriad reasons so try not to worry about that. This is our `create_phar.php` file:
<?php
// Note: phar.readonly is a system-level setting; ini_set() alone can't
// override it at runtime, which is why -dphar.readonly=0 is passed on
// the command line below.
ini_set('phar.readonly', 0);

// The third argument is the alias used to reference the phar internally.
$phar = new Phar('lobs.phar', 0, 'lobs.phar');

// Pull everything under src/ into the archive.
$phar->buildFromDirectory(__DIR__ . '/src');

// The stub runs first; Phar::webPhar() routes web requests to files
// inside the archive.
$phar->setStub('<?php Phar::webPhar(); __HALT_COMPILER();');

// Sign the archive so its integrity can be verified.
$phar->setSignatureAlgorithm(Phar::SHA256);
There's a little bit of ceremony involved, but tools like `phing` can probably automate some of it. To run this and create the Phar:
php -dphar.readonly=0 create_phar.php
We’re using Apache as our web server and, provided it’s already set up to understand PHP, you need only tell it about the `.phar` extension:
AddType application/x-httpd-php .php .phar
And voilà: http://localhost/lobs.phar/index.php
We were pretty pumped about getting this far. I had dabbled with Phar files before and though they had “worked” there was a lot of trial and error and I wasn’t 100% sure _why_ it worked which was very unsatisfying. I had also never gotten it working with Apache. There is just not much documentation out there on using and interacting with Phars. Working together we were able to overcome hurdles pretty quickly and it was very exciting the first time we copied our Phar to our deploy directory and were viewing files in the browser.
Join us next time when we try to rope in Dependency Management, Pipelines, CI, Build Tools, Static Code Analysis and maybe break some dishes.