21. Client-side sessions, origins, browser tabs, and WebIntents
Last week we looked at how you can manage your login-identity when you do your computing on multiple servers and/or multiple clients. Sessions turned out to play a significant role in this, because they can link the current user on a client to a current user on a server, and remove this link when the user on the client finishes the session. This way, you only have to establish trust once, and can then use the remote server without any further login prompts.
This week we'll look a bit more closely into this concept, especially in relation to client-side applications. Because web browsers have a (sandboxed) way of retrieving source code from a remote server, and then running it client-side, it is possible to establish a session that stays within this sandbox, and does not exist on the server that served up the web app.
This is, in my view, the core concept of unhosted web apps, and the reason I keep nagging about the No Cookie Crew, disabling all cookies by default, with the goal of pushing the web away from server-side sessions, and towards client-side sessions. Sorry we had to get to episode 21 before I bring up its fundamental details. :)
Behaving dynamically is of course again a vague term, but it definitely implies that the application's output cannot be mapped one-to-one to the user's latest action.
A rudimentary form of client-side dynamic behavior is for instance a
textarea element that builds up a text in response to consecutive keystrokes. Unless
the app puts the entire current state into the URL, the app has "client-side state", and its behavior can be called dynamic.
When there is no server-side session, a lot of things change. Most web developers have formed their brain by thinking in terms of the LAMP stack and similar server-side frameworks, and it was only in recent years that people started to make the paradigm shift to thinking about client-side sessions. For instance, OAuth did not have a client-side flow before 2010, and the webfinger spec has only mentioned CORS headers since 2011.
Web app origins
Even if the code runs on the client-side, the trust model of the web keeps its basis in origins: depending on the server that hosted the code that has been loaded into the current browser tab (the 'host', or 'origin' of the web app), it may interact with other pieces of code in the same and other tabs.
The same-origin policy describes the exact rules for this. For instance, a web app is often not allowed to include mixed content, the X-Frame-Options header can restrict iframing, and technologies like window.postMessage, DOM storage and Web workers are also usually sandboxed by script origin.
For data URIs, file: URLs and bookmarklets, you can get into all sorts of edge-cases. For instance, this bookmarklet inherits the scope of the page you're on when you click it, but this one will execute parent-less, with the empty string as its origin, because it was encapsulated into a redirection.
Another way to reach the empty origin from a bookmarklet is to become a hyperlink and click yourself. :)
Don't try these things at home though, I'm only mentioning this for entertainment value. In general, your unhosted web apps should always have a unique origin,
that starts with
As the browser starts to replace some of the roles of the operating system (mainly, the window manager), it becomes more and more likely that we want the app that runs in one tab to interact with an app that runs in another one. Or, an "in-tab dance" may be necessary: App 1 redirects to app 2, and app 2 redirects back to app 1. OAuth is based on this concept, as we've already discussed in episode 15. We use OAuth's in-tab dance in the remoteStorage widget, to connect an unhosted web app with the user's remote storage provider. Another example is when opening an app from an apps launch screen, like for instance this example launcher.html. Such a launch screen eventually replaces the 'Start' menu of your window manager, or the home screen of your smart phone.
The user experience of opening a new tab, and of performing in-tab dances, is always a bit clumsy. Browser tabs are not as versatile on a client device as native citizens of the windows manager, mainly because of how they are sandboxed. The barrier for opening a URL is lower than the barrier for installing a native application, and a lot of UX research is currently going into how to let a user decide and understand the difference between system apps that have elevated privileges and sandboxed apps that should not be able to exert any significant lasting effects on your device.
A prime example of this is native support for Mozilla Persona in browsers: the browser takes over the role of managing the single sign-on, which is the main point where native Persona is fundamentally more powerful than OpenID.
Using redirects to implement interaction between browser tabs is not only clumsy because of how the result looks, but also because of the difficulties of apps discovering each other. This can be resolved with WebIntents. First, app 2 (the app that needs to be opened) tells the browser that it can handle a certain type of intents. Then, app 1 only needs to tell the browser that it wants an intent of a certain type to be handled. The browser then takes on the task of tracking which app can handle what.
Most of this is all still research in progress, and it brings up a lot of questions about what "installing a web app" means in terms of privileges. We may at some point use WebIntents to improve the "connect your remote storage" experience of the remoteStorage widget, but for now this is definitely an area in which the web still lags behind on "other" device operating systems. :)
Next: How to locate resources