Google AMP, and Dependency Graphs for Improved WPO


This morning’s post concerns a couple of recent projects that promise improvements in web efficiency.

Both of these projects address web complexity – the steady rise in ever more elaborate web pages and associated Software as a Service (SaaS) APIs. As discussed elsewhere, web pages just keep getting bigger and bulkier.

Growth of average web page size and number of objects from 1995 to 2014

Source: http://websiteoptimization.com

The average mobile page is 3x larger today than it was a few years ago:

http://www.soasta.com/blog/mobile-web-performance-page-bloat-july-2015/

A detailed history of the whole rise in bloatware sites may be found at:

http://www.yottaa.com/company/blog/application-optimization/a-brief-history-of-web-page-size/

Fixes That Might Not Be Fixes?

Some tech fixes have been announced in recent months which promise things will get better. However, after reading this wonderfully scorching rant on webpage bloat, it’s not clear if these fixes will actually work.

http://idlewords.com/talks/website_obesity.htm

I begin to wonder how beneficial these changes and tech fixes will actually be…

So, in the following I’d like to discuss these new projects, along with a critique from a sustainability perspective.

  1. Polaris Dependency Graph Project
  2. Google’s AMP
  3. Facebook’s Instant Articles

All Your Webpage is Fat…

The ever-thickening waistline of web pages isn’t slowing down much. According to Craig Buckler’s summary of trends in the HTTP Archive, the average web page size went up 16% in 2015.

http://www.sitepoint.com/average-page-weight-increased-another-16-2015/
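To put that 16% figure in perspective, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the growth rate stays constant from year to year, which the HTTP Archive numbers do not promise. The function names are my own.

```python
import math

def doubling_time(annual_growth):
    """Years needed to double at a constant annual growth rate (0.16 means 16%)."""
    return math.log(2) / math.log(1 + annual_growth)

def projected_size(size_now, annual_growth, years):
    """Size after compounding the same growth rate for a number of years."""
    return size_now * (1 + annual_growth) ** years

# Assumption: the 16% yearly increase simply continues unchanged.
years_to_double = doubling_time(0.16)
print(f"Doubling time at 16% per year: {years_to_double:.1f} years")
```

Under that assumption the average page would double in weight in a bit under five years.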

And, the most recent HTTP Archive results.

According to the SitePoint article:

  • 25% of sites do not use GZIP compression
  • 101 HTTP file requests are made — up from 95 a year ago
  • Pages contain 896 DOM elements — up from 862
  • Resources are loaded from 18 domains
  • 49% of assets are cacheable
  • 52% of pages use Google libraries such as Analytics
  • 24% of pages now use HTTPS
  • 36% of pages have assets with 4xx or 5xx HTTP errors
  • 79% of pages use redirects
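The GZIP item at the top of that list is the easiest one to try for yourself. The following is a minimal sketch using Python’s standard gzip module. The sample markup is a made-up, highly repetitive stand-in for a real page, so the savings it reports will be more flattering than what a typical site sees.

```python
import gzip

def gzip_savings(payload: bytes) -> float:
    """Return the fraction of bytes saved by gzip-compressing the payload."""
    if not payload:
        return 0.0
    compressed = gzip.compress(payload)
    return 1 - len(compressed) / len(payload)

# Hypothetical markup: the same list item repeated 200 times.
sample_html = ('<li class="item"><a href="/page">Link</a></li>\n' * 200).encode("utf-8")

saved = gzip_savings(sample_html)
print(f"gzip saved {saved:.0%} of {len(sample_html)} bytes")
```

Real pages mix text with already-compressed images and video, so the overall gain is smaller, but text assets such as HTML, CSS and JavaScript usually compress well.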

The problem is particularly acute for CMS systems. Many (like the infamous WordPress this site runs on) send the same, overly complex web pages down to mobiles as to desktops, just layering on extra dollops of CMS. And that doesn’t even consider the bloat that creeps into CMS themes regardless of their target.

Website Complexity and Dependency Trends in 2016

Web pages download a bunch of media, scripts, and other files they don’t always need in the particular use context. While all these files are used some of the time, the typical download only requires a fraction of them to be present.

And the rising inter-dependency of the page elements – another aspect of rising web page complexity – makes it difficult to streamline the pages.

A more detailed study of web complexity noted that the “cloud” is making sites more complex – multiple servers typically interact to produce a single web page these days, and 1/3 of the bytes on a typical web page came from non-origin servers.

http://web.eecs.umich.edu/~harshavm/papers/imc11.pdf
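For a single page, the “non-origin” share is easy to estimate once you have a list of resource URLs and sizes (from a HAR export, say). The sketch below assumes that “non-origin” simply means a different hostname from the page itself, which may be cruder than the definition used in the study. The resource list is invented for illustration.

```python
from urllib.parse import urlparse

def non_origin_share(page_url, resources):
    """Fraction of bytes fetched from hosts other than the page's own host.

    resources is a list of (url, size_in_bytes) pairs.
    """
    origin_host = urlparse(page_url).hostname
    total = sum(size for _, size in resources)
    if total == 0:
        return 0.0
    foreign = sum(size for url, size in resources
                  if urlparse(url).hostname != origin_host)
    return foreign / total

# Hypothetical page: two first-party files and two third-party scripts.
resources = [
    ("https://example.com/index.html", 40_000),
    ("https://example.com/site.css", 20_000),
    ("https://cdn.example.net/framework.js", 25_000),
    ("https://ads.example.org/tracker.js", 5_000),
]
share = non_origin_share("https://example.com/", resources)
print(f"{share:.0%} of bytes came from non-origin hosts")  # prints 33%
```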

An additional problem is “dependencies” linking the bloatware assets on the client side. Client-side frameworks, like ReactJS, AngularJS, and the like often require lots of helper libraries, media and data files to run. Trouble is, unlike stuff copied into the HTML template on the server, these dependencies are “cryptic” – they aren’t obvious from examining the source code of a web page, and only manifest themselves when the JavaScript programs are running.

JavaScript-based APIs are touted as the wave of the future, and JS use continues to rise on web pages, so the problem is only likely to grow:

Change in JavaScript use on web pages between July 2011 and July 2015, by Soasta

Source: http://www.soasta.com/wp-content/uploads/2015/07/mobile-page-bloat-JS-requests-2015.png


1. Polaris to the Rescue?

In this light, I was interested to read about Polaris, a predictive library for stock browsers that promises to load pages up to 34% faster.

http://news.mit.edu/2016/system-loads-web%20pages-34-percent-faster-0309

Here’s the research paper:

http://web.mit.edu/ravinet/www/polaris_nsdi16.pdf

Polaris is a collaboration between MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Harvard University. It is a “Dependency Tracker” system, designed to fix cryptic dependencies in pages that lead to stalled pages and additional trips to the web server.

  1. The first component, Scout, is a measurement platform that monitors data flows during website operation to uncover hidden dependencies (e.g. a JavaScript library must be downloaded before it can fetch images used elsewhere on the web page). In particular, it addresses the “blocking” nature of JavaScript. Embedded <script> tags halt page rendering until the file is downloaded and parsed. This is well-known to web designers, who learn early that most JavaScript should be placed at the bottom of their web pages (next to </body>, in fact). This synchronous interaction makes the browser inefficient, in that it can’t grab just what is actually needed during the particular page load.
  2. Scout differs from other WPO tools in tracking dependencies at a basic level. For example, it looks down to the level of individual JS variables to see which ones are shared by multiple libraries and functions.
  3. The second component, Polaris, is a JS library that is downloaded early. Using the dependency graph from the site analysis by Scout, it schedules downloads in a more efficient order.
  4. The combined approach pushes efficiency to the client side, rather than relying on server pre-optimization, as is done with Opera Mini.

Source: http://web.mit.edu/ravinet/www/polaris_nsdi16.pdf
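To make the idea concrete, here is a toy sketch of the two steps described above: deriving a dependency graph from variable-level records, then using it to decide what can be downloaded in parallel. This is not the Scout or Polaris code, which had not been released at the time of writing. The record format, function names and file names are all my own assumptions, and it needs Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter

def build_graph(writes, reads, dynamic_fetches):
    """Map each resource to the set of resources it must wait for.

    writes: resource -> set of JS variable names it defines
    reads: resource -> set of JS variable names it uses
    dynamic_fetches: resource -> the script that requests it at run time
    """
    definer = {name: res for res, names in writes.items() for name in names}
    graph = {res: set() for res in set(writes) | set(reads) | set(dynamic_fetches)}
    for res, names in reads.items():
        for name in names:
            owner = definer.get(name)
            if owner is not None and owner != res:
                graph[res].add(owner)
    for res, requester in dynamic_fetches.items():
        graph.setdefault(requester, set())
        graph[res].add(requester)
    return graph

def fetch_waves(graph):
    """Group resources into waves; everything in one wave can load in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves

# Hypothetical page: main.js uses variables from two other scripts,
# and hero.jpg is only requested once gallery.js has run.
writes = {"framework.js": {"App"}, "gallery.js": {"Gallery"}, "analytics.js": {"track"}}
reads = {"gallery.js": {"App"}, "main.js": {"Gallery", "track"}}
dynamic_fetches = {"hero.jpg": "gallery.js"}

waves = fetch_waves(build_graph(writes, reads, dynamic_fetches))
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

In this example analytics.js and framework.js can be fetched together straight away, even though a browser reading the markup top to bottom would treat each script tag as blocking.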

The best thing about the system is that it was designed with built-in testing, and the authors conducted tests with lots of large sites. So their claim of increased efficiency is not just conceptual (as are many claims about web frameworks) but backed by hard, scientific data.

In practice, it’s possible to see how the system could deliver efficiency. Larger websites, with many designers and developers all fighting over what goes on a web page, are likely to create “kitchen sink” solutions that download lots of stuff that is either loaded out of order (leading to latency problems) or just not needed on all but a few web pages. Analysis with Scout, plus Polaris optimization, could help with this.

In fact, it could form the basis for a useful “Green Boilerplate” that I’ve been working on for several years. I experimented with some of the ideas in the paper, but did not try to create a dynamic dependency graph based on web traffic.

So this One’s a Win…

I don’t see much of a downside, except for smaller websites, or for CMS systems which do all their page processing on the server. Unfortunately, WordPress is one of these, at least until the WordPress JavaScript REST API becomes widespread later this year.

However, Polaris seems to share some features with Google’s AMP (see below). Until the libraries are released, it is difficult to see how this academic project compares to commercial solutions.


 

2. Google’s AMP

Bloatware is a big concern to Google, which, along with a very few other vendors, controls much of the “cloud”. In particular, Google has shown a continuing effort to reward sites that make their pages mobile-friendly, and that includes reducing page size. Enter AMP.

https://www.ampproject.org/

The goals of the project are to increase performance for mobile devices by re-defining web pages for fast loading.

A list of goals from the main site:

https://www.ampproject.org/docs/get_started/technical_overview.html

Google’s solution is a throwback to the ancient, pre-iPhone days of non-HTML mobile markup such as WML. Their new spec defines additional non-HTML5 tags required for AMP pages to work.

Besides these new tags, you’re creating HTML5, preferably with Schema.org markup.

<!doctype html>
<html ⚡>
  <head>
    <meta charset="utf-8">
    <link rel="canonical" href="hello-world.html">
    <meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1">
    <style amp-boilerplate>body{-webkit-animation:-amp-start 8s steps(1,end) 0s 1 normal both;-moz-animation:-amp-start 8s steps(1,end) 0s 1 normal both;-ms-animation:-amp-start 8s steps(1,end) 0s 1 normal both;animation:-amp-start 8s steps(1,end) 0s 1 normal both}@-webkit-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-moz-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-ms-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-o-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}</style><noscript><style amp-boilerplate>body{-webkit-animation:none;-moz-animation:none;-ms-animation:none;animation:none}</style></noscript>
    <script async src="https://cdn.ampproject.org/v0.js"></script>
  </head>
  <body>Hello World!</body>
</html>

Some additional page requirements from https://github.com/ampproject/amphtml/blob/master/spec/amp-html-format.md

  • A <link rel="canonical" href="$SOME_URL" /> tag inside their head that points to the regular HTML version of the AMP HTML document, or to itself if no such HTML version exists.
  • A <meta charset="utf-8"> tag as the first child of their head tag.
  • A <meta name="viewport" content="width=device-width,minimum-scale=1"> tag inside their head tag. It’s also recommended to include initial-scale=1.
  • A <script async src="https://cdn.ampproject.org/v0.js"></script> tag inside their head tag.
  • The AMP boilerplate code in their head tag.
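The requirements in that list are simple enough to check mechanically. Below is a rough sketch using Python’s built-in html.parser. It is not the official AMP validator and covers only the five items above. The class and function names are mine, and the boilerplate CSS in the sample is an abbreviated placeholder rather than the real thing.

```python
from html.parser import HTMLParser

RUNTIME_SRC = "https://cdn.ampproject.org/v0.js"

class AmpHeadChecker(HTMLParser):
    """Record which of the required <head> elements are present."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.tags_seen_in_head = 0
        self.found = dict.fromkeys(
            ["charset_first", "canonical", "viewport", "runtime", "boilerplate"], False)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
            return
        if not self.in_head:
            return
        self.tags_seen_in_head += 1
        a = dict(attrs)  # attributes without a value map to None
        if tag == "meta" and (a.get("charset") or "").lower() == "utf-8":
            # Must be the very first element inside <head>.
            if self.tags_seen_in_head == 1:
                self.found["charset_first"] = True
        if tag == "link" and a.get("rel") == "canonical" and a.get("href"):
            self.found["canonical"] = True
        if tag == "meta" and a.get("name") == "viewport":
            content = a.get("content") or ""
            if "width=device-width" in content and "minimum-scale=1" in content:
                self.found["viewport"] = True
        if tag == "script" and "async" in a and a.get("src") == RUNTIME_SRC:
            self.found["runtime"] = True
        if tag == "style" and "amp-boilerplate" in a:
            self.found["boilerplate"] = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

def missing_requirements(html):
    """Return the sorted names of the checks that the document fails."""
    checker = AmpHeadChecker()
    checker.feed(html)
    checker.close()
    return sorted(name for name, ok in checker.found.items() if not ok)

# Sample page; the boilerplate CSS is shortened for readability.
sample = """<!doctype html>
<html amp>
  <head>
    <meta charset="utf-8">
    <link rel="canonical" href="hello-world.html">
    <meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1">
    <style amp-boilerplate>body{visibility:hidden}</style>
    <script async src="https://cdn.ampproject.org/v0.js"></script>
  </head>
  <body>Hello World!</body>
</html>"""

print(missing_requirements(sample))
```

A page that passes this check can still fail real AMP validation, since the full spec restricts much more than the head, so treat it as a quick smoke test only.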

Interestingly, AMP seems to be similar in spirit to Polaris – it includes custom JavaScript which makes page loading asynchronous, thereby controlling for dependencies. The long list of improved performance features would seem to require this. But it isn’t clear that AMP creates a dynamic dependency graph. Google does cache the pages, so cached page download might be optimized in some special way.

 

Does it work?

The first question is technical. Since so many websites on mobile are actually CMS systems shoehorning mobile styles onto their desktop displays, it is not clear that they can easily adapt to AMP. The CMS space includes at least 1/3 of the entire Web, since nearly one third of the web is WordPress alone, which by itself accounts for about half of CMS-based sites. We won’t know until we get AMP stats from HTTP Archive.

Second…the sites used to announce these efficiency increases are themselves bloatware! From idlewords:

…the page describing AMP is technically infinite in size. If you open it in Chrome, it will keep downloading the same 3.4 megabyte carousel video forever.

If you open it in Safari, where the carousel is broken, the page still manages to fill 4 megabytes.

Geez. Design is the problem. Getting people, including coders, to design in more sustainable ways requires web bloatware…


 

3. Facebook’s Instant Articles

Apparently, Facebook has a similar plan in mind for mobile iOS and Android. Their Instant Articles site details the coming framework, due to be opened to all and discussed in detail at the Facebook Developers conference in 2016. However, many large sites are already using the system.

http://www.fastcompany.com/3051882/how-facebook-instant-articles-works-a-publishers-perspective

Here’s their fluffy, high carbon-footprint site, apparently catering to “executive decision makers” and Creative Directors who need bloatware to decide stuff.

https://instantarticles.fb.com/

And the Developer’s site:

https://developers.facebook.com/docs/instant-articles

And the blog:

https://developers.facebook.com/ia/blog/

One feature of note is that your company has to be approved to support Instant Articles. After setting up a secure RSS feed (see below), you have to submit the RSS feed for approval by Facebook:

https://developers.facebook.com/docs/instant-articles/publishing#review

You also need to map your Facebook Page URL:

https://developers.facebook.com/docs/instant-articles/claim-url

Here’s the page you need to get it working (look at menu across top)

https://developers.facebook.com/docs/instant-articles/guides/businessmanagersetup#access-tools

Some features of Instant Articles Styling:

  1. CSS is NOT SUPPORTED
  2. Semantic HTML5 tags are required
  3. A canonical link is required
  4. OGP (Open Graph Protocol) is used, as on regular pages

From the FB page….

<head>
  <meta charset="utf-8">
  <meta property="op:markup_version" content="v1.0">

  <!-- The URL of the web version of your article --> 
  <link rel="canonical" href="http://example.com/article.html">
  
  <!-- The style to be used for this article --> 
  <meta property="fb:article_style" content="myarticlestyle">
</head>
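If you are generating these articles from a CMS, the head block above is easy to template. Here is a minimal Python sketch that reproduces only the three tags shown in Facebook’s example. The function name is my own, and anything else Facebook may require is not modelled.

```python
from html import escape

def instant_article_head(canonical_url, style_name, markup_version="v1.0"):
    """Build the <head> block shown above for one article."""
    return "\n".join([
        "<head>",
        '  <meta charset="utf-8">',
        f'  <meta property="op:markup_version" content="{escape(markup_version)}">',
        f'  <link rel="canonical" href="{escape(canonical_url)}">',
        f'  <meta property="fb:article_style" content="{escape(style_name)}">',
        "</head>",
    ])

head = instant_article_head("http://example.com/article.html", "myarticlestyle")
print(head)
```

The escape() call keeps characters such as & or quotation marks in a URL from breaking the attribute values.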

Articles need to be placed in a secure RSS feed:

https://developers.facebook.com/docs/instant-articles/publishing

Once the RSS feed is submitted and approved, you can go to a manual editor on Facebook to publish individual articles on your Facebook Page (under Publishing Tools). You can also create articles manually. There’s also a debugger that validates your article.

This system, like AMP, is screaming out for validation that it actually is faster. The purpose of the system is to allow Facebook to host content directly from suppliers, rather than reference other websites. However, it should improve WPO, and hence sustainability. But actual numbers will tell the tale.

 

But Does it Work?

Remember that sustainability is more than web optimization: it includes design, UX, and all development. The following article details the resistance Facebook encounters from “creatives” who don’t care about page load.

Further down the page, you’ll find a 41 megabyte video, the only way to find out more about the project. In the video, this editor rhapsodizes about exciting misfeatures of the new instant format like tilt-to-pan images, which means if you don’t hold your phone steady, the photos will drift around like a Ken Burns documentary.

And supporting evidence from The Atlantic about the resistance of Creative Directors to efficiency in design.

http://www.theatlantic.com/technology/archive/2015/10/72-hours-with-facebook-instant-articles/412171/

Geez, Design is STILL the problem… Getting people to accept Instant Articles requires bloatware.

And The Real Problem is Design (Again)

All these examples promise sustainability gains from a technical perspective. However, there seems to be a real concern that people won’t build slimmed-down websites, and need bloatware to make the case that they should. To my mind, this implies that people who are unmoved by sustainability arguments need bytestorms of convincing to make the larger web sustainable. So, the people who are in charge of making the web more sustainable don’t feel that those rules apply to them, at least when introducing new products and services.

…More on how “Creative Directors Damage the Earth” in a future post.

 

 

 

 
