Leaping into the Cloud – Views on "Cloud Computing"

A couple of years ago I wrote about being "OS-agnostic" and using open formats for storing your data so that you never have to be dependent on one desktop operating system. This enables you to move between different computer systems relatively seamlessly while still being able to access the data that's important to you.

I mentioned that the next stage I was working towards was to host all my data across web servers and use a web application to manage each type of data. This would free me completely from dependence on any one device and allow me to access my data on any web-connected appliance. Since I wrote that blog post the notion of "cloud computing" has become increasingly popular and the proliferation of netbook computers seems to indicate that consumers are investing in this approach to computing as an alternative or as an addition to their desktop computer.

I thought I'd write an update on the kind of software I use today and my opinions on the advantages and disadvantages of cloud computing.

First, let me refer back to the table of applications I created back in January 2007 (with some additions) and list the web equivalents I use today.

Application	Desktop	Web
Email	Thunderbird	Google Mail
Calendar	Sunbird	Google Calendar
Contacts	Thunderbird	(Google Contacts)
Todo List	Thunderbird	Remember the Milk
Bookmarks	Firefox	Delicious
Documents	OpenOffice	Google Docs
Multimedia	VLC	(Ampache & LastFM)
Pictures	The GIMP/Inkscape	(None)
Code	Eclipse	(Trac)
News	Thunderbird	Google Reader
Chat	Pidgin	Google Talk Gadget
Desktop	Windows/OS X/Gnome	iGoogle
Wiki	(None)	MediaWiki
Blog	(None)	Drupal
Microblogging	(None)	Twitter
Social Networking	(None)	Facebook

Access from Anywhere

All of these web applications can be accessed from any web connected device – PCs, internet tablets, netbooks, nettops, kiosks, mobile phones and all sorts of other emerging appliance form factors. This doesn't even have to be a device I own.

Commercially Hosted vs. Self-Hosted

There is a mix of commercially hosted applications like Google Mail and Google Docs and self-hosted applications like MediaWiki and Ampache on this list. Before there were so many commercially hosted options it was my intention to host all of the web applications myself on a home server. I soon realised that this took up a huge amount of time for system administration because there was no good standard system for package management of web applications – they often use their own individual methods for installation, security updates and feature updates, and this was just too difficult to keep track of.

The massive advantage of commercially hosted applications is that all of this system administration is taken care of for you. Security updates and feature additions appear automatically (this can be both exciting and confusing at times). Backup is taken care of for you and large organisations like Google have huge data redundancy a home user could not even dream of – plus a very good track record for uptime and not losing data!

The disadvantage of commercially hosted applications is a loss of a certain amount of control over your data and software and currently storage is a limitation. I choose to keep my music collection on a home server because there is simply no commercial organisation who would offer enough storage space at an affordable price and be willing to risk the potential legal issues of hosting so much copyrighted content. However, by hosting my own data on a home server my data is not truly in "the cloud" and I have to take care of my own backups – my data is much more vulnerable to risks such as hardware failure, fire, flood or theft.

What's Missing and What's New?

There are certainly gaps in what you can do on the web. I still use desktop applications for image editing – Inkscape (which I love) for vector graphics and The GIMP (which I don't love) for raster graphics. It's interesting to note that Inkscape actually uses the SVG format for graphics which is itself a web standard. I'd really like to see more web applications which utilise SVG and X3D to create 2D and 3D graphical web applications respectively – but this requires a serious increase in browser support. It's a bit of a chicken and egg situation. I also wouldn't currently use a web application for other graphically intensive tasks like video editing.

I've recently started looking into Google Contacts due to their new syncing features, but I still think there's a gap for a really good "Web 2.0" address book, perhaps supporting LDAP.

Multimedia in the browser is still a real pain point in my opinion – I see the use of Adobe Flash for audio & video as a temporary hack until something like HTML5 standardises embedded multimedia on the web. I think there's an awful lot of scope for improvement here to turn web browsers into 2D & 3D graphical tools which provide new opportunities for interactive IPTV.

There's no really good web based code editor or IDE that I know of for software engineers. Google have some modest offerings for JavaScript based applications that run on their own platform, but the recently announced Mozilla Bespin project looks like it might turn into more of a general purpose tool. I do use Trac for its ticketing system and as a read-only front end to Subversion repositories.

There are new applications which now exist in the networked world of cloud computing that never really existed on the desktop. Social networking, blogging and wikis are a few notable examples, but there are new ideas popping up all the time which leverage the "wisdom of crowds" and other great marketing terms like "the network effect" and "the long tail". See Tim O'Reilly's 2005 blog post on Web 2.0 and the book Wikinomics.

Offline Access

One of the biggest problems with web applications is what happens when you have no Internet connection. Although high speed network access is certainly becoming more widespread, it's hardly ubiquitous. For me, the big innovation here is Gears. Gears is a browser plugin (though the aim is to make it a standard part of every web browser) which allows web applications to preserve some functionality, even without a network connection. This means that I can edit documents, read my archived email and view my calendar and todo list even if I'm not connected to the Internet.
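To make the idea of offline access a little more concrete, here is a minimal sketch in Python of the general pattern: keep a local copy of the data, queue up edits while offline and replay them on reconnecting. This is not how Gears itself works or what its API looks like. The OfflineStore class and its send callback are names I've made up for illustration, and the sketch ignores the hard parts such as conflicting edits and failed uploads.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class OfflineStore:
    """Hypothetical local store: reads come from a cache, writes queue up while offline."""

    send: Callable[[str, str], None]  # assumed callback that pushes one edit to the server
    cache: dict = field(default_factory=dict)    # local copy of the data
    pending: list = field(default_factory=list)  # edits made offline, oldest first
    online: bool = False

    def read(self, key):
        # Reads never need the network; they are served from the local copy.
        return self.cache.get(key)

    def write(self, key, value):
        # Always update the local copy, then send now or queue for later.
        self.cache[key] = value
        if self.online:
            self.send(key, value)
        else:
            self.pending.append((key, value))

    def reconnect(self):
        # Replay queued edits in the order they were made.
        self.online = True
        while self.pending:
            key, value = self.pending.pop(0)
            self.send(key, value)


# Usage: a plain dict stands in for the remote server.
server = {}
store = OfflineStore(send=server.__setitem__)
store.write("todo", "buy milk")            # made while offline
todo_while_offline = store.read("todo")    # still readable locally
server_before_sync = dict(server)          # nothing has reached the server yet
store.reconnect()                          # the queued edit is sent now
```

The point of the design is that the application only ever talks to the local store, so it behaves the same whether or not the network is there.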

Performance

The trade-off for being able to access your data and software from anywhere is that the business logic is stored on the server. All that exists on the client is a user agent (a web browser) which downloads text, images, audio, video and scripts and renders them using layout engines, graphics engines and scripting engines. This traditionally has meant that web applications are inherently slower than desktop applications. A lot of the processing happens on the server side and has to be transferred over the network. It could be argued that this is a good thing because it forces software developers to keep their applications much more lightweight and avoid the ridiculous feature bloat we've seen on popular desktop applications. However, even a lightweight application can be made to run faster.

There are a few key changes happening in this respect. Firstly, although web applications reside on a server, a lot of the processing is increasingly being done on the client side using JavaScript and realtime graphics engines. The web browser is far from a dumb terminal, it is a very powerful piece of software in its own right which can take advantage of modern hardware and graphics acceleration. SVG and X3D plugins can use OpenGL to render graphics for example.

In some ways JavaScript/ECMAScript is becoming the assembly language of the web. Because JavaScript engines were never really intended to be used this heavily, they've historically not been terribly efficient or performed particularly well. One of the biggest innovations in Google's Chrome browser was the brand new V8 JavaScript engine which brings performance to a new level and introduces some clever new features. Another major innovation in Chrome is process separation. Each web application runs as a separate process on the underlying operating system. This might not sound like very much, but it means that a bug in one web page can't bring the entire browser down and it moves a step closer to the notion of a WebOS.

"WebOS"

"Whoa! Hold on there, what on earth is a WebOS?" I hear you say.

Sorry, yes. I acknowledge that in computer science "WebOS" is a ludicrous and nonsensical term. I could go on all day about semantics if you like. I don't particularly like a lot of buzzword type terms like "web 2.0" (which I think is really "web 1.0") and "cloud computing" which is… well, what is it? But there are certainly new approaches to technology emerging and if we're going to talk about them we have to call them something.

So if there is such a thing as a WebOS, what does it look like? If we casually discard the real meaning of "Operating System" for a moment and use "OS" to describe an entire software platform – what does that include? Very simplistically you've got this thing in the middle that gives you access to the hardware via hardware drivers called a kernel, you've got a bunch of APIs for doing stuff with the hardware like networking and storage and you've got some kind of management of processes. For interactive systems there then has to exist some kind of user interface.

From the point of view of end users you don't need to worry about what "operating system" you're using. You're just using the web, with some kind of appliance. But developers can be OS-agnostic too. When you're programming for the cloud, it doesn't matter what kernel is being used underneath, what's important to you is the APIs (application programming interfaces).

If a WebOS exists, it exists as a series of web services which can be accessed programmatically via APIs. This includes services such as the Google App Engine, Amazon's S3 (Simple Storage Service) and EC2 (Elastic Compute Cloud). They allow developers to create applications which are hugely scalable, have masses of redundancy underlying them and require little low level system administration. The risk involved with current offerings is getting locked into a particular development platform and a mismatch between the scaling of revenue and costs.
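One common way to soften the lock-in risk is to write the application against a small interface of your own and keep each provider's API behind it. Here is a rough sketch of that idea in Python. BlobStore, InMemoryStore, save_note and load_note are all hypothetical names invented for this example, and the in-memory backend is only a stand-in: a real backend would wrap a provider's storage API, which I haven't attempted to show here.

```python
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Hypothetical minimal storage interface that the application codes against."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        """Store data under key, replacing anything already there."""

    @abstractmethod
    def get(self, key: str) -> bytes:
        """Return the data stored under key."""


class InMemoryStore(BlobStore):
    """Stand-in backend for illustration; a real one would call a storage service."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def save_note(store: BlobStore, title: str, text: str) -> None:
    # Application code only knows about BlobStore, not about any provider.
    store.put(f"notes/{title}", text.encode("utf-8"))


def load_note(store: BlobStore, title: str) -> str:
    return store.get(f"notes/{title}").decode("utf-8")


# Usage: swapping providers means swapping this one constructor call.
store = InMemoryStore()
save_note(store, "cloud", "It's all in the cloud")
note = load_note(store, "cloud")
```

This doesn't remove the lock-in risk, since providers differ in much more than their storage calls, but it does keep the provider-specific code in one place.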

If we're counting the user interface as part of the OS, then the equivalent of the desktop is probably the homepage. iGoogle is a personalised homepage which displays a collection of interactive "gadgets" or "widgets" that provide a summary of information aggregated from resources all over the web. This acts as a starting point for a browsing session and negates the need to constantly visit a whole range of separate web sites. There are other similar services like NetVibes and there are also alternatives which attempt to emulate the desktop experience on the web like EyeOS. I am very skeptical of this "desktop on the web" business and I think it's an unfortunate diversion until much more useful UI paradigms emerge.

Security

How do you know your data is secure? You don't. By hosting your data with a commercial service you are placing a huge level of trust in the organisation providing the service. It may be wise to look out for the usual things like HTTPS connections, but ultimately you are only as strong as the security team of the service provider. I remember MJ Ray mentioning that GMail doesn't support public key encryption to encrypt and sign emails which is another good point.

Privacy

If you store your data on another organisation's servers, you don't have physical access to that data. You are trusting them not to disclose any private information to anyone else and you have no way of knowing what a "delete" button actually does behind the scenes. Once you have stored a piece of data in the cloud, you have let go of a certain amount of control that you can never take back. For many organisations this is a real issue, especially if they don't have a contractual relationship with the service provider.

There are some things the end user can do about privacy if it's something that concerns them. It's scary what can be achieved by piecing together lots of small amounts of information about a person spread across the web so you should always think about what information you are making public in total – not just in a single place. You should also think about how much you trust an organisation before you give them any personal details at all, though this is extremely difficult because you have no real way of knowing about their affiliations. Overall all we can do is make decisions based on the information we have available. To a certain extent just the fact that information exists about every one of us on a networked computer system somewhere in the world means that our privacy is already out of our control. We have to accept that privacy means something completely different to what it meant fifty years ago.

Openness

There was a time when my data was all stored on a single hard disk on a single PC running on software for which I could access all the source code. For me that situation will never exist again.

In some ways cloud computing is the most open computing platform we've ever seen, but in other ways web services are the ultimate proprietary software. As a programmer, if I'm sufficiently motivated I could change any part of my Open Source desktop operating system, but there is no way that I can change the source code of GMail – the only controls I have are the controls the service provider chooses to give me.

For a hacker (in the old fashioned sense of a tinkerer) cloud computing offers opportunities to tinker with computing platforms comprising thousands of processors distributed across the world, something which simply wasn't possible before. But as the systems become more complex and depend on commercial organisations, we've lost a huge amount of control over our computing systems.

Whilst I'm really enthusiastic about software as a service, in many cases it effectively means that I'm moving away from open source software to something which is not only highly proprietary, but I can't even run it on my own system. This saddens me and I can only hope that service providers recognise the benefits of the open source development model and release the source code of their software for others to tinker with and improve. The Internet made the Open Source movement, and in return the Open Source movement provided the tools that made the Internet grow into what it is today. I hope that in the future we see more openness, not less.

Ownership

It's important to think about whether you "own" the information you put in the cloud. This might be in a legal sense where content you create using an online tool has certain copyright or licensing restrictions attached to it, or it might be in the sense of how easy it is to leave the service and take your data with you.
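On the practical side of ownership, "taking your data with you" usually comes down to whether you can get it out in an open format. As a small illustration, here is a sketch in Python that writes the same address book out as both JSON and CSV using only the standard library. The export_contacts function and the name and email fields are assumptions made up for this example, not any particular service's export feature.

```python
import csv
import io
import json


def export_contacts(contacts):
    """Return (json_text, csv_text) for a list of dicts with 'name' and 'email' keys."""
    json_text = json.dumps(contacts, indent=2, sort_keys=True)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "email"])
    writer.writeheader()
    writer.writerows(contacts)
    return json_text, buffer.getvalue()


# Usage with made-up example data.
contacts = [
    {"name": "Ada Example", "email": "ada@example.org"},
    {"name": "Bob Example", "email": "bob@example.org"},
]
json_text, csv_text = export_contacts(contacts)
```

Either output can be read by almost any other tool, which is really the test of whether the data is yours to move.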

Future Innovations

As we see software migrate from the desktop to the web I think we will see lots of technological innovations to make the web better.

I've talked before about how I'd like to see 2D and 3D graphical interfaces using SVG and X3D and also voice interfaces using VoiceXML where natural language is used as a powerful means of interaction. I think the web in a few years' time will be much richer, more immersive and more interactive than the web we have today. But equally I think the web will have incarnations which are much less visible and not at all immersive, but are instead a part of the world around us in a web of things. Electronic sensors will become part of the web and consumer appliances will be using the web without us even thinking about it. I think we'll see information appliances dedicated to a particular type of information or task for which the inclusion of networking is no more remarkable than the inclusion of, say, an electric motor. We'll have more of a focus on the task we're carrying out rather than the fact that we're using the web.

I hope that network appliances will leave behind the two-and-a-half-dimensional world of the "desktop environment" and will have UI paradigms which are much more relevant to the kinds of tasks computers are used for today. This might be through a physical interface or through different UI paradigms like a zoomable interface rather than overlapping "windows".

We're still very much thinking in terms of web "applications" like we have desktop "applications", but really the web is a collection of resources provided as services and there's no real reason to think in terms of applications. With a different UI the "cloud" doesn't necessarily need to be presented as a collection of distinct applications, but can simply be a collection of interactive information resources displayed as a literal cloud of information or presented through a continuous natural language discussion.

Conclusions

Behind the buzzword of "cloud computing" are some real shifts in how we think about computing and how we use our data. We're at the beginning of the transition from the desktop PC to the World Wide Web and there are lots more innovations in the pipeline.

With the new opportunities come new risks and there are certain questions you should ask yourself before choosing to use a web application:

Can I access the app from any web connected device? Does it use web standards for its web interface?
Am I happy with my relationship with the service provider (e.g. are they providing a free service in return for showing me advertising)?
What happens when I have no network access, can I still access my data offline?
Does the application perform well enough for me to achieve what I want to achieve?
Am I happy with the level of security provided?
Is my privacy under my control? Can I sufficiently control what I share with whom?
Is the app flexible enough for what I want to do?
How well does the app integrate with other apps I use?
How easy is it to leave? Is my data stored or exportable in open standards?
Do I legally own any content I create?

Finally, the proof of the pudding is in the eating.

The day I realised I really had shifted my data into the cloud was the day the hard disk failed on my MacBook. I took the machine into the Apple Store in Birmingham and the "Genius" asked if I would like them to try and retrieve the data from the disk. I hesitated for a moment, but then politely declined the offer. "It's OK, it's all backed up" I said. But what I really meant was "It's OK, it's all in the cloud".