Introduction to static site generators and a new paradigm for building websites
note: This post was written at the beginning of 2014; the tools mentioned in this post are now outdated, but the ideas behind the paradigm are growing ever more relevant today.
The Very Brief History
In 2012 developers for the Obama re-election campaign moved from the ExpressionEngine-based website of the prior campaign to a practically retro HTML site with no content management system. Only this wasn't 1999 and the website was actually quite modern, having been generated by the popular open source software called Jekyll^1.
A hugely successful site^2, it had over 81 million page views and was a model of performance and stability. It was a milestone in the movement among web developers toward a less bloated technology stack than had become the norm over the past decade^3.
Static site generators, like Jekyll, convert text documents into HTML using one or more templates for laying out the pages. There is no content management system and no database. While once the realm of "hackers" only, there's a growing list of use cases for sites created this way because they're incredibly fast, simple and flexible.
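To make the idea concrete, here is a minimal sketch in Python of what a generator does at its core: take plain text documents, pour each one into a layout, and get HTML pages back. This is not how Jekyll is implemented; the layout, the function names (render_page, build_site) and the "blank line means new paragraph" rule are all assumptions made up for this illustration.

```python
import html
from pathlib import PurePosixPath
from string import Template

# Hypothetical layout for this sketch; real generators use richer template languages.
LAYOUT = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1>$body</body></html>"
)


def paragraphs_to_html(text: str) -> str:
    """Wrap each blank-line-separated block of text in a <p> element."""
    blocks = [block.strip() for block in text.split("\n\n") if block.strip()]
    return "".join(f"<p>{html.escape(block)}</p>" for block in blocks)


def render_page(title: str, text: str) -> str:
    """Fill the layout with one document's title and body."""
    return LAYOUT.substitute(title=html.escape(title), body=paragraphs_to_html(text))


def build_site(documents: dict[str, tuple[str, str]]) -> dict[str, str]:
    """Map each source file name to an output .html name and its rendered page."""
    site = {}
    for name, (title, text) in documents.items():
        out_name = str(PurePosixPath(name).with_suffix(".html"))
        site[out_name] = render_page(title, text)
    return site


if __name__ == "__main__":
    pages = build_site({"about.txt": ("About", "Hello there.\n\nNo database here.")})
    for path, page in pages.items():
        print(path, page)
```

Everything happens once, at build time. What gets uploaded to the server is just the finished HTML, which is why there is nothing left to query when a visitor arrives.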
The heart of a "traditional" CMS-driven website is a SQL database where all the site's content and settings are rigidly organized into tables with rows and columns, something like a very complicated spreadsheet. Each CMS has its own scheme for organizing this information, and on top of the database there is an application layer that converts the data into pages, routes those pages to web addresses to serve to end users, and manages various user functions, whether or not we need them.
Querying the database every time a page loads takes time, which, even when measured in milliseconds, has been shown to affect how people perceive and interact with your website^4. Many CMSs cache (save pre-rendered web pages in memory) to speed up the process, though that merely adds another layer of complexity rather than solving the original problem.^5
SQL databases are not only potential sources of performance bottlenecks but also targets for hacking. What's more, they can be difficult to maintain in a version control system, and act as a barrier--when content is organized around them--to moving content from one system to another. Even free open source systems may come at a great cost of maintenance and hosting at scale, as well as a reliance on a variety of add-on developers that becomes expensive over time.
This is not to suggest that database-driven websites are inherently bad in all cases, but for a vast number of independent websites a database-driven content management system may very well be overkill, like using a cannon when a rifle will do. As websites get fatter and slower^6 it becomes ever more important to question how they are built and look for new efficiencies. While static websites aren't a panacea, they may very well be a gateway toward a new way of thinking.
The Post-CMS World
Ben Balter, GitHub Gov 2.0 Evangelist and former Presidential Innovation Fellow, calls this the “Post-CMS” world^7, and I believe he's right. Solutions such as Jekyll or other static site generators (there are many) are not content management systems because they only process files (i.e., convert documents and templates into HTML pages). However, it's only the database and the system of application layers that are missing. Developers and website owners are beginning to use a variety of loosely connected tools that are taking the place of formerly integrated content management systems, choosing only the bits they need and only the ones that work best for them.
DocPad is an excellent embodiment of this paradigm, billing itself as a "next generation" web architecture where you can choose pretty much any feature you need, whether that be a database, a particular templating system or editor, or some other function. No assumptions and nothing you don't need.
Documents and Data
Content management for a static site generator is nothing more than a set of files in folders, much like what we are all used to using on a day-to-day basis. In the real world we organize our little bits of data into spreadsheets and our text into documents, yet in the world of CMS-driven websites we lump all of our documents into spreadsheet-like cells, which in turn creates a separation between us and our content.
Over the years I've worked with writers and editors to help them move from writing in a word processor (Word is very sticky!) to copying their work into one or more little boxes, each with a new set of formatting buttons, so that each piece can be injected into a database. Some get it, many don't, and everyone seems relatively uncomfortable with it.
This is what a document might look like in a static site:
    ---
    title: Title
    category: Category
    layout: article
    date: November 10, 2013
    author: authorname
    ---
    content goes here
This is a text document that can be moved to any system or shared with ease. In a database-driven website the information above may be separated into as many as three or four different tables. While a highly organized database isn't inherently wrong, document-based templating systems maintain the connection between the creator of the content and the resulting document.
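Reading such a document back takes very little machinery. The Python sketch below splits the settings between the two --- lines from the content that follows. It is only an illustration under simplifying assumptions: it handles flat "key: value" lines and nothing else, whereas real generators typically hand that header to a full YAML parser. The function name split_front_matter is made up for this example.

```python
def split_front_matter(source: str) -> tuple[dict[str, str], str]:
    """Split a document into (metadata, body).

    Assumes the metadata is a block of flat "key: value" lines between
    two lines containing only "---" at the very top of the document.
    """
    lines = source.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, source  # no metadata block at all
    closing = next(
        (i for i in range(1, len(lines)) if lines[i].strip() == "---"), None
    )
    if closing is None:
        return {}, source  # opening marker was never closed
    metadata = {}
    for line in lines[1:closing]:
        if ":" not in line:
            continue  # skip anything that is not "key: value"
        key, value = line.split(":", 1)
        metadata[key.strip()] = value.strip()
    body = "\n".join(lines[closing + 1:]).strip()
    return metadata, body


if __name__ == "__main__":
    document = """---
title: Title
category: Category
layout: article
date: November 10, 2013
author: authorname
---
content goes here
"""
    metadata, body = split_front_matter(document)
    print(metadata["layout"], "|", body)
```

The point is that the title, category, layout, date and author travel with the text in one file, so moving the content elsewhere means copying a file rather than exporting several tables.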
Cost Effectiveness & Faster Iterations
Static sites don't need complex web server setups, so they are much cheaper to host. In fact, depending on the circumstances, they can be hosted for free on GitHub's Pages service, which ties the site's version control tightly to its hosting.
Healthcare.gov (not the problematic back-end application) was originally planned to be on a commercial CMS but was later moved to static. This website has over a thousand pages organized for various users and was built with the same tools as the Obama campaign website. As a result of the move from a CMS to static, the site's hosting requirements went from 32 servers down to one.^8
The lack of rigidity in developing sites without a CMS makes changes much faster to put into effect. The ease of version control also means that a developer or developers anywhere can contribute to the site's development, and a site can easily be "forked," that is, copied for use on another site, in a matter of minutes. The entire website can be treated as an open source project (it can also be kept private in this setup).
Static site generation is quickly moving out of the realm of "blogs for hackers" and being used for a variety of use-cases. There are still gaps in the usability of these systems for content creators and the software can only be used by developers comfortable in the realm of working without the interface that many rely on.
While there are hurdles, the landscape is changing quickly. Siteleaf is an excellent example where some smart developers understand the need for an interface for content creators while still maintaining the essence of static site generation. There are also platforms like Harp that facilitate publishing static sites through Dropbox.
There's no shortage of innovation and I believe these tools will soon be in the mainstream. Both Cactus and Mixture.io are tools that make the development side as easy as downloading software, and soon we will likely see a new form of hybrid content management systems that are more like a loose collection of tools curated rather than developed by one person.
I'd highly recommend you read through some of the articles listed below, particularly Development Seed's "How We Build CMS Free Websites," Rob Muhlestein's "The Static Web Returns," and Ben Balter's "Welcome to the Post-CMS World" where he has recently updated his performance analysis.