unhosted web apps

freedom from web 2.0's monopoly platforms

24. Decentralizing the web by making it federated

Clients and servers

The end-to-end principle that underlies the basic design of the internet treats all devices in the same neutral way. But in practice, that is not what the internet does. There is quite a strict distinction between client devices in one role, and (clusters of) servers in the other.

Generally speaking, a server is addressable, and a client is not. A server usually has a domain name pointing to it, and a client device usually doesn't. A big chunk of internet traffic is communication between one client device and one server. When two client devices communicate with each other, we call that peer-to-peer communication. And the third category is server-to-server traffic, where two or more servers communicate with each other.

Server-to-server traffic

The actual consumption of the usefulness of the internet happens on the client devices. The servers only ever play an intermediate role in this. This role can be to execute a certain data processing task, to store data for future use, or to be an addressable intermediate destination where data can be routed to.

The ability to store and process data is pretty generic and replaceable. It is the addressibility of servers, and restrictions based on that addressability, that can give them an irreplaceable role in certain communication sessions.

As an example, let's have a look at XMPP. This protocol is used a lot for text chat and other real-time communication. The standard describes a client-to-server protocol and a server-to-server protocol. If [email protected] wants to contact [email protected], then she tells her client device to connect to her alice.com server. Bob also tells his client device to connect to the bob.com server, so that he is online and reachable. After that, the alice.com server can contact the bob.com server to successfully deliver a message to Bob. So the message travels from Alice's client, to Alice's server, to Bob's server, to Bob's client.

It is obvious why the message cannot travel directly from Alice's client device to Bob's client device: even if Bob is online (his client device is connected to the internet), client devices are not (practically) addressable, so at least one server is needed as an intermediate hop that can bounce the message on to Bob's current network location.

What is maybe not so obvious is why the client-server-server-client trapezium is needed, instead of a client-server-client triangle. The reason for this is that it makes it easier to check sender identity.

When the bob.com server receives a message from somewhere, this message might be spam. It needs some way to check that the message really came from Alice. The alice.com server can in this case guard and guarantee Alice's sender identity. The XMPP protocol uses SASL to allow the bob.com server to know that it is the alice.com server that is contacting it.

It would be possible to do something like that in a client-server-client triangle route, but that would require the client to provide a proof of identity, for instance if the sending client creates a message signature from the private part of an asymmetric key pair, like PGP does. Even then, people usually relay outgoing email through their mailserver because email from domestic IP addresses is often blocked, both by the actual ISP, and by many recipient mailservers.

The client-server-server-client trapezium can also help to act as a proof-of-work on part of the sender, thus making massive spam operations more expensive: the sender needs to maintain a sending server, which is more work than just connecting directly to the receiving server.

Web hosting is a prime example where the client-server-client triangle route is used, but in this case it is the sender's server that is used, since the recipient of the content of a web page does not need to be addressable. Messaging protocols don't have that luxury, so most of them use the client-server-server-client trapezium route where server-to-server traffic forms the central part.

The internet as a federation of servers

For internet traffic that follows a trapezium-shaped client-server-server-client route, both the sending and the receiving user connect their client device to their own server. So clients are grouped around servers like moons around planets, and server-to-server connections form the actual federated network, where any server that has a domain name pointing to it and that speaks the right protocols, can participate.

Alas, not really

Unfortunately, this picture is only how the federated internet was designed. It doesn't truthfully reflect the current-day situation in practice. First of all, the vast majority of users don't run their own server, and don't even have their own domain name; not even a subdomain. In fact, only a handful of DNS domainnames are used to create user identities, "*@facebook.com" currently being the most important one.

Furthermore, a lot of the interconnections between servers are missing. This means that a user needs to establish a client-to-server connection to the Google XMPP interface as well as the Facebook XMPP interface, in order to successfully chat with all their friends.

Facebook chat has always been a "walled garden" in this sense; you can connect your own client device using its xmpp interface, but you're connecting into "the Facebook chat world", which is limited to conversations with both ends inside the Facebook namespace. Google used to offer proper xmpp hosting, just like it offers email hosting, but they recently announced that they are discontinuing XMPP federation.

When you search for a user, Facebook search will only return Facebook users among the results, so to a user who has their identity on facebook.com, it looks as if only the people exist who happen to be on facebook.com too. Google Search still returns results from web pages that are hosted on other domain names, but the Google+ people search does not show results from Facebook, it only returns links to profiles of people who happen to host their identity on plus.google.com.

The federated social web effort consists of projects like buddycloud, diaspora, friendika, statusnet and others, which offer software and services that fix this. But as discussed in episode 1, these project often only federate with themselves, not with each other, let alone with the "big players" like Google, Facebook, Twitter, etcetera.

The big identity providers have stopped federating, so in order to still communicate with humans, your server should become a client to each platform, rather than federating with it. This is a sad fact, but it's the situation we face. Instead of simply running an xmpp server to connect with users on other domains, you now also have to run multiple xmpp clients to connect to users on Google and Facebook.

A non-profit hosting provider

Ten years ago, when the web was still federated, I used to work at a hosting company. People who wanted to be on the web would register a domain name, and upload a website via FTP. They would set up email accounts through our control panel, and maybe add our "site chat" widget to their site, so their website visitors could also contact them for live chat. Nowadays, people who want to be on the web, sign up at Facebook, Twitter, Google, or a combination of them. The federated web that used to exist has been replaced by a "platformized" web.

In the light of this situation, I think it is necessary that someone sets up a non-profit hosting provider. This provider would offer web hosting, but not with an ftp or ssh interface; instead, the user experience would be very similar to setting up your profile on Facebook or Twitter. The service could be entirely open source, be federated using web standards, allow people to use their own domain name, and be based on unhosted web apps, so people can choose to only use the app, or only use the data hosting, or both.

The business model could be similar to how Mozilla develops Firefox (they get money for setting the default search engine), in-app sales, a freemium model, a "free for personal use" setup, or simply asking its own users for a contribution once a year, like Wikipedia do.

During the past three years we have worked on developing specs, software libraries, apps, and campaigns. We can use unhosted web apps ourselves, but for most end-users the entry threshold is still too high. We can convince sysadmins (like for instance those at European universities) to offer remoteStorage accounts to their users, and we can convince app developers to accept unhosted accounts, but we're not reaching any end-users directly with that. Using remoteStorage and Sockethub as the basis for a "Mozilla-style Facebook" would be a much more tangible result.

Also as Laurent remarked, it would simply be very funny if Unhosted becomes a Hosting provider. :) So I'll take a short break from writing this blog, and look into setting something like that up during the summer months. If anyone would like to comment on this plan, or on anything else mentioned in this episode, please do!. In any case, I'll let you know when I have something ready for alpha testing, and hope to see you all in Unhošť!

UPDATE November 2014: Niklas, Adrian and I got an office in Berlin and worked on this plan during the summer, but after that decided we didn't have enough solid ground to actually launch it just yet. In 2014 I wrote the remaining episodes of this blog, and then teamed up with Pierre Ozoux to finally launch this non-profit migration-oriented network of hosting providers, from whom people can get their own managed personal server, running any free software they may need: IndieHosters.

Next: Anonymity