Building Things Properly

February 23, 2016

As I'm working on Grawl to push it to MVP, I thought I would share some thoughts about building things properly.

The Client That Wanted an MVP

A few months ago, a prospective client asked for an estimate to build an MVP. That MVP would have access to users' bank accounts and be able to perform real ACH transactions on their behalf.

The client wanted to build something extremely quickly to be able to prove the market fit of his idea, which really does make sense. So I crafted an estimate that would include the desired minimal feature set. He was surprised by the amount of time needed to build that MVP (a few weeks).

For many reasons, I think he is a smart guy with a great business idea. However, he had some misconceptions about what it means to build a web application.

For a lot of people, the difference between a website and a web application can be unclear, and that's really understandable! I have to explain it from time to time to clients, friends and family members.

So, that client asked for some explanation. Why would it take so much time to develop his app?

A Website vs An Application

First of all, let me explain some differences between a website and a web application. You can use both through your web browser, like Chrome, Safari or Internet Explorer, but conceptually, they are not the same.

A Website Is…

… content that you consume, that you read. Here are some examples:

A restaurant's website where you can look up the menu and the phone number.
A furniture factory's website where you can see a list of what they make, how to apply for a job, phone numbers of each department, etc.
A blog or a newspaper's website, where you can read articles.

A Web Application Is…

… software that you can use to do things. In most cases, you can log in. Here are some examples:

A timesheet software where you can enter the time you worked at your day job.
A to-do application, where you add tasks that you have to do, and check tasks that you have done.
A bank's web application where you can log in, see your latest transactions, send money to someone, pay invoices.

The Line Is Blurry

It's not always super clear. Here are some examples where the line is blurry:

You can reserve a table at a restaurant using their website; this is handled by software.
Some administrator has to log into a content management system (CMS) to add/edit/remove content from the furniture factory's website.
A user can log in to the newspaper's website and post comments on articles.
The bank has a regular website, where it showcases its products. On the top right corner of the page, it says that "You are logged in as Antoine Leclair", which has to come from the application.

Developing a Web Application Is Not Developing a Website

The lifecycle of a website is not the same as the lifecycle of a web application.

The Lifecycle of a Website

The lifecycle of the code base of a website is similar to a brochure. Some process (meetings, design, production, etc.) leads to a final product. The brochure is printed, ready to be read. The website is online ready to be read.

At some point in the future, it will get too old. A new one will be produced. The same process starts over (meetings, design, production, etc.). The new one is ready, the old one is discarded.

The code base of the website will not change much during its whole life. A new social network appears, we must add an icon. The client did not realize he needed a page that lets people know how to get to the factory, so a new page must be added, with a new link to that page on the home page. If the content has to be changed (more articles in a blog, new products in the catalog), the code base will rarely change. The changes will rather be done in the CMS without touching the code base.

You rarely find bugs in a website because the code is simple: it gets data from a database and shows it in a page, it affects how things are presented (the color for example), it performs simple animations. If there are any bugs, they are generally really simple to fix and deploy. Even someone unfamiliar with the code base will be able to do it easily.

The Lifecycle of a Web Application

The lifecycle of the code base of a web application is extremely different. It really is software dressed like a website.

You can rarely foresee every aspect of a piece of software before actually using it. There's no solution, it's like that. Software is intangible. Talking about it without touching it only makes things worse. So, the way successful software is generally done is by delivering the minimal feature set that has value for the users, and then augmenting/changing that feature set as time goes on.

It is rarely a better idea to create a new version of a piece of software and discard the old one. For example, if you discover that your timesheet application would also need to do some accounting and that users should have the ability to add details about the hours worked, those features will be added to the existing application, instead of creating a new application.

This requires a code base that is evolvable.

Some Considerations When Building a Web Application

Automated Testing

In order to make the code base evolvable, the first thing that comes to my mind is automated testing. There are many types of automated testing: unit testing, functional testing, integration testing… I won't go into the details of each type. Some types of testing may be more appropriate than others depending on the project.

An automated test is code that will reproduce a use case in an automated manner and verify the result is correct. It could be code that tests that when feeding 10 to the function compute_taxes, it actually returns 1.52. Or it could be remote controlling a real browser to simulate that a user logs in, fills his daily timesheet then logs out, making sure that the changes were saved correctly and the side effects (update a report, etc.) were done properly.
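To make the first example concrete, here is a minimal sketch in Python of what such a test could look like. The flat 15.2% rate is an assumption, picked only so that 10 gives 1.52 as in the text. The body of compute_taxes is also hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical flat rate, chosen only so that 10 gives 1.52 as in the text.
TAX_RATE = Decimal("0.152")


def compute_taxes(amount):
    """Return the tax owed on `amount`, rounded to the cent."""
    tax = Decimal(str(amount)) * TAX_RATE
    return tax.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def test_tax_on_ten():
    # The use case from the text: feed 10, expect 1.52.
    assert compute_taxes(10) == Decimal("1.52")


def test_tax_on_zero():
    # An edge case that is cheap to check and easy to break by accident.
    assert compute_taxes(0) == Decimal("0.00")
```

The test functions are plain functions with assertions, so a test runner can collect them or they can be called directly. Decimal is used instead of floats so that the comparison with 1.52 is exact.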

To someone not familiar with writing software, it may seem like trivial test examples that do not really add value. I'll give you some examples where those tests will come in handy.

Three years after the software was initially deployed, the person who wrote that code does not work there anymore. The business expands and the function compute_taxes must be updated to compute the taxes according to the state where the product is sold. Instead of receiving only 10, it will also receive NY or CA, and the function should return the tax amount according to the state. You can update the existing tests to include the state that was implied before (for example, CA), which makes sure the working case still works after the modification, then add new tests to cover other states. The developer does not have to manually test each case. He just runs the tests, and that way he can be confident the rest of the application still works correctly.
A bug report comes in. The taxes for NY are not computed correctly. If the code base is not covered by tests, the developer will look at the code, fix the bug, and test the application manually to see if the bug from the report he received is now fixed. It seems to be fixed, so he pushes his change to production. But by fixing the bug, he also introduced another subtle bug that made the tax calculation wrong for every other state. If the code base is covered by tests, the developer can run the test suite and catch the error right away. Also, since there was a bug in the first place, the developer will write a new test to make sure this bug never comes back.
A new requirement is that the timesheet includes a new box to enter the overtime, separated from the regular time. The existing browser test should still work. It may look like a very simple change, but it's often in those unexpected cases that something breaks somewhere else. In this example, loading the reports could make the application crash. The browser test will find it really fast.
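The first two scenarios could look something like the sketch below once compute_taxes receives a state. The rates are made up for illustration and are not real tax rates. The error raised for an unknown state is my own assumption about how the function should behave.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rates for illustration only; they are not real tax rates.
TAX_RATES = {
    "CA": Decimal("0.152"),  # the rate that was implied before the change
    "NY": Decimal("0.100"),
}


def compute_taxes(amount, state):
    """Return the tax owed on `amount` in `state`, rounded to the cent."""
    if state not in TAX_RATES:
        raise ValueError(f"no tax rate configured for state {state!r}")
    tax = Decimal(str(amount)) * TAX_RATES[state]
    return tax.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def test_existing_case_still_works():
    # The old test, updated to pass the state that used to be implied.
    assert compute_taxes(10, "CA") == Decimal("1.52")


def test_new_york():
    # The regression test written after the NY bug report.
    assert compute_taxes(10, "NY") == Decimal("1.00")


def test_unknown_state_is_rejected():
    try:
        compute_taxes(10, "ZZ")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unknown state")
```

If a fix for NY accidentally changed the CA rate, test_existing_case_still_works would fail before the change ever reached production.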

Most bugs are then discovered while the code is being written rather than through bug reports, and fixing them as they are introduced is much faster. The test suite is also there to catch other bugs that the fix could introduce.

To summarize, automated testing really helps to keep a code base evolvable.

Pre-made Modules

When building a website, pre-made modules/plugins are often useful. Need to create an animated slideshow? There's a jQuery plugin for that. Need a private section on your website where only authenticated users can upload files? There's a WordPress plugin for that.

When building a web application, using those pre-made modules is often not the optimal solution for many reasons.

While it's normally accepted to adapt the requirements of a website according to what the existing plugins can do, it's normally not the case for a web application. For example, it could be important that a user provide his mobile phone number while signing up, but the existing modules do not ask for that information.
The requirements of a website will rarely change over time. If they change, there is a good chance that a new website will be built to replace the old one. The requirements of an application do change. If a pre-made "users" module was used to allow users to authenticate and access a private area, then when the requirements change (they will), it could put you in a place where you can't modify the application to add the new behaviors. For example, when first writing the application, we did not need to ask for the phone number during signup, but two years later, we need that information to send an SMS near the end of the signup process, and the module that was used does not allow it.
It's often really hard to incorporate the pre-made modules into a complete test suite. They generally lack the access points that would be needed to test correctly, in the context of your application.
Using pre-made modules is often more about hacking the module to make it work in your context. Hacking code together to ship a website is perfectly acceptable. Doing the same for an application will make very brittle code, with really big consequences.

When building a website, it makes sense to use those modules because it makes it so much faster to ship! Not using them could mean 10x more time needed. Since you don't need to maintain that code, you would rather code 9 new websites over time than invest time in the first one to make it really maintainable by writing custom code. The first one will be thrown away in 2-4 years anyway.

When building a web application, you can't afford to lose the flexibility that pre-made modules will take away from you. It could be acceptable in some cases, but probably never in core parts of the application. For example, if there is an animated slideshow in the application, it's probably OK to use that jQuery plugin. Replacing it with another one in the future will be easy since there is nothing that really depends on it.

However, using pre-made "users" modules, or "administration area" modules, is often a really bad idea when building software, because it will lock you into using those modules forever, with their limitations.

APIs and Mocking

When building websites, we sometimes have to access external APIs (data access points that software can use). Most of the time, for a website, it is very straightforward:

Fetching the latest tweets of a user and displaying them on a page.
Getting the latest value of a stock and displaying it on a page.
Getting the weather forecast of a city and displaying it on a page.

You get the pattern: read data from one access point, output it on a page in the desired format. Even better: there are often libraries or code examples that do the data-fetching for you. Copy-paste and tada!

When building a web application, the consumption of those APIs is generally not that straightforward.

There are often many calls to perform in a sequence.
Data has to be saved in the database.
That data in the database has to be architected to fit the exact business needs of the current application. For example, if a user enters a credit card (that is submitted to an external API), in your exact business needs, that credit card could be associated with a group of users, even if in the external API it's associated with only one user.
The API is only a data-access layer. It's very easy to do one API call. But it requires time to code all the meat around it. For example, if a user entered a credit card to subscribe to your service, it would probably mean that you have to keep track of the payments received for that user, and keep the user account active as long as the payments allow it. That means doing something special when the account becomes inactive. Also, what do we do when that card has expired? What do we do when a payment fails?
Since we are generally also sending data, there are generally error cases that have to be handled. The user entered a wrong credit card number, we have to tell the user that something failed.
External APIs will also want to tell your application about events that happen. You will have to write code that listens to those events and does something about them. For example, a user subscribed to your services using his credit card. You'll have to listen for successful payments to increase the duration of the subscription of the user.
A call to an API could fail because the external provider is down, or there is a network problem. You generally have to handle it in more complicated ways than "just hide the tweets if you can't read them", as you would do on a website.
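To give an idea of "the meat" around a single API call, here is a rough sketch of the code that reacts to payment events. Everything in it is hypothetical: the event names, the 30-day period and the Subscription fields are assumptions, not any real provider's API.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Subscription:
    paid_until: date
    failed_payments: int = 0

    def is_active(self, today):
        return today <= self.paid_until


def handle_payment_event(subscription, event, today):
    """Update a subscription from a (hypothetical) provider event."""
    kind = event["type"]
    if kind == "payment_succeeded":
        # Extend from whichever is later, so a late payment does not
        # buy days that are already in the past.
        start = max(subscription.paid_until, today)
        subscription.paid_until = start + timedelta(days=30)
        subscription.failed_payments = 0
    elif kind == "payment_failed":
        # Only counted here; deciding when to warn or suspend the user
        # is more business logic that still has to be written.
        subscription.failed_payments += 1
    else:
        raise ValueError(f"unhandled event type: {kind!r}")
    return subscription
```

Even this small function already has to answer questions the API does not answer for you, like what a late payment should extend and what to do with an event type nobody planned for.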

Also, since those APIs are external services, it can be hard or impossible to write automated test suites (see above) that incorporate them. There are a few reasons for that:

Every time you begin each of the automated tests, you have to be able to reset the environment to some initial known state. For example, if you had a test where a user logged in, added time to his timesheet, then looked at the updated report, you have to be able to reset the hours to some initial state before the test. Otherwise, every time you run the test, one more entry would be added, and the updated report would be unusable for testing because you will not be able to make sure that the last x hours have been entered, because they are already there 240 times from the previous runs of the tests. You need to be able to reset the state. External APIs often do not have that ability. They sometimes only offer a sandbox where you can't reset the state to an initial condition.
Some API providers don't even offer a sandbox. It means all the calls that your application will do, even during testing, will be made against the live systems. It's impractical for some, if not most, testing. For example, you would have to submit a real credit card number and do real transactions on it. Every time you run the tests. Which is many times a day.
The external API sandboxes are extremely unreliable. They are often offline, the provider closes your sandbox account or changes your settings without telling you, they are testing new features that break the old ones, etc.

Since you can't include them in your tests directly, you have to create something that behaves like them for the context of the test cases. It's called a "mock". During the tests, your application will talk with that mock instead of talking with the real API endpoint. It's a mini-version of the external service. For example, if you mocked a payment provider, it could accept all credit card numbers except for one, that will be used for testing when the user enters an invalid number. The payments could always work for all credit cards but one number, where the payments always fail. Or it could refuse payments of 29.99 on one specific card number. Or it could only allow for 2 payments of 29.99 on one specific credit card number, then bounce the next ones.

Since you wrote that mock and you host it, you have complete control over it:

It will always be online when you need it.
You can reset it when you want.
The reset you apply brings the state to the exact case that you need.
You can simulate specific use cases (e.g. two payments that work, then a third one that fails, etc.).
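As a rough illustration, the payment provider mock described above could start as small as the sketch below. It is an in-process stand-in rather than a hosted service, and the class name, card numbers and exceptions are all made up for the example.

```python
class InvalidCard(Exception):
    pass


class PaymentDeclined(Exception):
    pass


class MockPaymentProvider:
    # Hypothetical test card numbers, each one triggering a specific case.
    INVALID_CARD = "4000000000000002"
    ALWAYS_DECLINED_CARD = "4000000000000003"
    TWO_PAYMENTS_CARD = "4000000000000004"

    def __init__(self):
        self.reset()

    def reset(self):
        """Bring the mock back to a known, empty state."""
        self.charges = []

    def charge(self, card_number, amount_cents):
        if card_number == self.INVALID_CARD:
            raise InvalidCard(card_number)
        if card_number == self.ALWAYS_DECLINED_CARD:
            raise PaymentDeclined(card_number)
        if card_number == self.TWO_PAYMENTS_CARD:
            previous = [c for c in self.charges if c[0] == card_number]
            if len(previous) >= 2:
                # Two payments went through; bounce the next ones.
                raise PaymentDeclined(card_number)
        self.charges.append((card_number, amount_cents))
        return len(self.charges)  # stands in for a charge identifier
```

A test would call reset() before it starts, charge 2999 cents twice on TWO_PAYMENTS_CARD, then check that the application reacts correctly when the third charge is declined.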

Those mocks are really needed and are really useful, but they also take time to write.

Deployment

Deploying a website is generally done by uploading the new files to the server that hosts those files. The developer sees that it looks good on his laptop, then he uploads it. Or even faster, some directly edit the files on the production server.

Deploying a web application is generally more involved. You push your work to a source code repository (think GitHub), a continuous integration (CI) server builds the source (more on that below), runs all the tests, and allows you to deploy if all the tests are green. Doing the setup of the CI server (how it pulls the code from the repository, how it builds, how it runs the tests, how it deploys) requires some time.

Also, there's often a staging server to maintain, where a copy of the application is running with fake accounts. That way, you can demo new changes for approval without deploying to the production server.

You must use those good practices when developing an application because otherwise, bugs that did not occur on the developer's laptop emerge in production. The exact source of the bugs is always a surprise, but it includes:

The timezone was not the same on the developer laptop and on the production server.
The developer runs Mac OS X or Ubuntu, but the production server runs another operating system, like Debian.
The developer has a library installed on his laptop, but it's not installed on the production server. This allowed the developer to scale an image on his laptop, but doing the same thing on the production server crashes.
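The timezone case is easy to underestimate, so here is a small sketch of how it can bite. The two fixed offsets are assumptions standing in for the laptop's and the server's local time zones.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets standing in for two machines' local time zones.
LAPTOP_TZ = timezone(timedelta(hours=-5))
SERVER_TZ = timezone.utc


def timesheet_day(moment, tz):
    """Return the calendar day a timesheet entry falls on in `tz`."""
    return moment.astimezone(tz).date()


# A user saves an entry at 22:30, laptop time.
saved_at = datetime(2016, 2, 23, 22, 30, tzinfo=LAPTOP_TZ)

laptop_day = timesheet_day(saved_at, LAPTOP_TZ)  # February 23
server_day = timesheet_day(saved_at, SERVER_TZ)  # February 24
```

The same instant lands on two different days, so a daily report that looked right on the laptop can put the hours on the wrong day in production.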

Database Migrations

When a website has been deployed, it rarely happens that the database schema will change. For example, it will rarely happen that, suddenly, you need captions in the pictures of the slideshow, while you did not need them when the website was first published. Also, when that happens, you can connect to the database, write the command that adds the new column, and execute it.

For an application, a change to the database schema is everyday life. But, unlike the website, you can't just connect to the database and write some commands. It needs to be executed in a reproducible manner. Some reasons are:

More than one person may work on a project. Everyone has to keep their schema up-to-date.
There is more than one server to apply the changes to: production, staging, etc.
The changes have to be tested before they are executed in production. Some changes may fail or have unwanted behaviors. This has to be correct the first time it runs in production.
Many different features may be developed at the same time in parallel. The database schema has to match the current branch of the code. In order to do that, you have to apply changes and also be able to apply the reverse of those changes easily when you switch branches.

This is why we need database migrations: a system to do database schema changes in a reproducible manner. The simplest form could be a folder containing SQL files. But it's generally in the form of code that applies those migrations for you when you ask for it.
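To show the idea, here is a deliberately tiny sketch of such a system using SQLite. The table names, the schema_version bookkeeping table and the migrate function are all assumptions for the example. A real project would normally use the migration tool that comes with its framework.

```python
import sqlite3

# Hypothetical migrations: (version, SQL to apply, SQL to revert).
MIGRATIONS = [
    (1, "CREATE TABLE timesheet (id INTEGER PRIMARY KEY, hours REAL)",
        "DROP TABLE timesheet"),
    (2, "CREATE TABLE overtime (id INTEGER PRIMARY KEY, hours REAL)",
        "DROP TABLE overtime"),
]


def current_version(conn):
    """Return the highest applied migration number, or 0 if none."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn, target):
    """Move the schema up or down until it is at version `target`."""
    version = current_version(conn)
    if target > version:
        for number, up, _down in MIGRATIONS:
            if version < number <= target:
                conn.execute(up)
                conn.execute(
                    "INSERT INTO schema_version VALUES (?)", (number,))
    else:
        # Going back: revert the newest migrations first.
        for number, _up, down in reversed(MIGRATIONS):
            if target < number <= version:
                conn.execute(down)
                conn.execute(
                    "DELETE FROM schema_version WHERE version = ?", (number,))
    conn.commit()
    return current_version(conn)
```

Calling migrate(conn, 2) on an empty database creates both tables, and migrate(conn, 1) afterwards drops only the second one, which is what you need when switching to a branch that does not know about it yet.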

Doing the setup of the migration system and writing proper migrations is another task that requires time and is not needed when building a website.

Compiled and Minified Resources

When building a website, you may want to minify your resources. That means that you take your JavaScript and CSS (style sheets) and use a minifier that will reduce the size of those files by using many strategies, such as renaming long variables like numberOfChildren to b and removing line breaks. Also, there are new languages that developers sometimes prefer to use instead of JavaScript and CSS, that will be compiled by a compiler to become JavaScript and CSS. CoffeeScript and SASS are two examples. When you build a website, building those resources is something that could be done manually. For example, just before you deploy, you minify all your resources by using the command line.

When building a web application, that process has to be automated. Some reasons are:

You do not simply copy your code to the server when you want to deploy (see deployment above). Some other server will deploy for you.
The code is kept under source control. When the code is compiled and minified, it's generally on one single line, which means that the whole file that contains all of your JavaScript changes every time you change one single character in one of your files. This results in huge changes in every commit, which makes the repository bigger (takes longer to clone) and harder to understand.
The minified version also needs to be tested by the CI server.
The version deployed has to be the one tested on the CI server.
The code changes a lot. Building the minified files manually every time would not make sense.

Doing the setup of the automated minification/compiling is something that also requires time and could be skipped when building a website.

The Discardable Prototype To Test Market Fit

Until now, I talked about a website vs a web application. I used the website comparison, but it could probably also be valid with a prototype application that you only want to use to discover if your idea has a market fit.

If you intend to throw away your prototype application right after you discover whether or not your idea has a market fit, you could consider prototype applications to have the same characteristics that a website would have. In theory that is true. You will never need to maintain the code, so you don't have to bother about making it evolvable.

However, there are some more considerations before you can go down that path.

First, it's really easy to fall into the trap of thinking that you will discard your prototype application really soon. In practice, the code base rarely gets thrown away. A total rewrite is extremely rare. The shortcut you take now will stay with you for a really long time. It will cost you money and agility.

Second, it's not always possible to do a quick prototype like that. I'm going back to the example I started with: the client wanted an application that would have access to users' bank accounts, see transaction history and other data and perform ACH transactions. This is a case where the application will store really sensitive data and would be able to perform really sensitive operations. A bug or a security breach would be a disaster for the users, the client, the developer and everyone in between. In those cases, you have to write reliable software. That means using the best practices mentioned above, like automated testing.

Back to Grawl

Like I was saying in the introduction, I'm currently pushing Grawl to bring it to an MVP level. In this case, I'm my own client. I would really like it to get there faster.

I'm currently writing the code that handles the credit cards. It seems quite simple at first, but a lot of use cases have to be covered by the automated tests. Otherwise, I could be charging too much to someone, leak private data, allow someone to use the application without paying the correct amount, etc.

It takes time.

But building that part properly is required. That's not an optional feature.

In Summary

This was a super long article, but I think writing a shorter one would not have covered enough to be taken seriously.

Building software the right way requires time, but everyone wins in the end.