world wide web

Web Agent Robot

Synopsis

A World Wide Web connected personal robot which gives a web agent a mechanical body with which to interact with the user and its environment.

Rationale

A web agent in the form of a robot would allow the user to interact with resources on the world wide web without turning on and using a desktop PC. Conversely, the robot will also allow the user to interact with the real world from a virtual world.

Features

When in the proximity of the user, the robot can interact directly with the user via audio, visual and tactile interfaces and pass information between the user and web resources using a network connection. When the user is not in the proximity of the robot, the robot can give the user information about its own environment via the web and bring about change on that environment using the same audio, visual and tactile interfaces.

As well as a physical presence in the form of its robot body, the agent may have additional virtual representations such as an IRC bot or an Avatar on the 3D Web.

Bob's Perfect Virtual World as the 3D Web

In this blog entry, I'd like to address Bob Sutor (of IBM)'s three blog posts about his requirements for a perfect 3D World, implemented as a direct extension to the World Wide Web, as described in my 3D Web design concept.

A pure offline Mode

I think this is part of a wider requirement for certain web applications to work offline. With the recently announced Google Gears and other projects from major industry players, such as Adobe's Apollo, Mozilla's Firefox 3 (and Parakey, currently vapourware), Django's Offline Toolkit, Microsoft's Silverlight and Joyent's Slingshot, I think this is going to become an extremely hot topic. I think we're going to see the boundary between web server and web user agent blur considerably into "Web Servents". In short, an offline mode can use the same technology as an online 3D web, with a local server or a local cache of data, logic and presentation.

A peer-to-peer model

By using web technology, we can take this for granted to the extent that anyone can run their own 3D web server and we can make hyperlinks between them. The peer-to-peer idea could be taken a lot further than this though, with users in the same virtual space swarming the data between each other. I don't know about that bit.

A model of many planets

Again, this is basically what the web is.

Much better zoning

This almost touches on the controversial subject of the .xxx domain.

However, in Second Life the geography works in much the same way as in First Life, with blocks of land having permanent neighbours. This is a limitation of real physical space which, while it might be nice to reflect in virtual worlds, is not necessary. We could have lots of areas of "virtual land" whose boundaries are defined only by their own content, with portals (hyperlinks) which allow you to move into another space. There is no reason to have permanent neighbours because your neighbours are simply whatever you link to, which is under your control. In this way zoning just becomes a result of the links people make, which works reasonably well on the current web.

If you do want to build a planet with geography like the real world (like Second Life), you can still do so, and you could decide to ban certain activities on that particular planet. Elsewhere, if there's some content you don't like you simply don't link to it, and it is only as close as six degrees of separation dictate.

In-world Secure Chat

I would argue that secure chat in general isn't particularly widespread on the Internet yet, so this is an issue for the Internet in general. However, see later for more discussion on in-world chat. In short, XMPP encryption extensions.

AI

This would just be part of a web application. A 3D web application that is a 3D game may have AI controlling faux avatars and objects; a sales site may have an AI shop assistant or a human-AI hybrid. It could be built with server-side scripting languages and JavaScript manipulating X3D files.

World-to-world communication

XMPP (Jabber) and either Jingle or SIP for voice (and video?) would be great for person to person chat. A couple of interesting points spring to mind:

Firstly, should the Jabber client be part of the 3D Web user agent or should it just be another web interface like Google Talk in GMail? This matters especially with regards to advertising the presence or status of the user (available, away, busy, offline).

Secondly, how do we deal with the issue of hearing people around you in a virtual space and adjusting the sound as people move, in addition to person-to-person conversation between worlds? We certainly don't have standards for this yet, so it will be interesting to watch Linden Lab.

World to world teleportation

Hyperlinks.

Do I need a membership in the other world or is there a notion of guest?

We have the same issue on the web. I think distributed authentication like OpenID is a giant leap forward in this field.

How do I deal with cross-world identity?

By using a URI as a person's identity, as in OpenID. You can still have your friendly screen name in Jabber, but the URI uniquely identifies you.

Can I bring my money with me?

That's a good question, but I think the answer is that if you want some kind of virtual currency, it simply becomes a service like PayPal where you buy credits of some kind and they sort out "exchange rates". You could then use that currency in any world or on any web site by using that service as a broker for payments. I'm obviously making this sound a lot simpler than it really is; nothing is straightforward where money is involved.

Can I bring my clothes with me?

Yes, your avatar and everything your avatar wears is hosted on an avatar server (just a 3D web server) and can simply be included in a scene. This only works if all the worlds use the X3D (or other) standard, which is one of the fundamental requirements of a 3D web in my opinion.

Can I bring more general objects between worlds?

The same as above: an "object" can be an X3D file hosted somewhere on the web which can be included in another X3D file dynamically. This requires a certain level of write access to all 3D web servers, which is probably going to cause all sorts of spam problems like we have on wikis. (Imagine a spamming company putting up billboards everywhere.)
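As a rough illustration of what "including" an avatar or object might look like, here is a minimal Python sketch that builds an X3D scene whose contents are pulled in from other servers using X3D's Inline node. The function name and the example.org URLs are made up for the example, and a real scene would also need positioning, permissions and so on.

```python
import xml.etree.ElementTree as ET


def build_scene(inline_urls):
    """Return an X3D document (as a string) that inlines each URL into one scene."""
    x3d = ET.Element("X3D", {"profile": "Immersive", "version": "3.0"})
    scene = ET.SubElement(x3d, "Scene")
    for url in inline_urls:
        # X3D's url field is a list of quoted strings, hence the inner quotes.
        ET.SubElement(scene, "Inline", {"url": f'"{url}"'})
    return ET.tostring(x3d, encoding="unicode")


# Hypothetical avatar and object locations, used only for illustration.
scene_xml = build_scene([
    "http://avatars.example.org/tola/avatar.x3d",
    "http://objects.example.org/hat.x3d",
])
print(scene_xml)
```

The point of the sketch is that the hosting world never stores the avatar itself; it only holds a link, in the same way an HTML page holds a link to an image on another server.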

Search

Don't worry, Google will sort that out ;). Seriously though, it could work the same way as the web with spiders and giant indexes.

Device and world compatible link redirection

Now that is a very interesting topic which you could call the Device Independent or Multimodal Web. I think this can be solved with HTTP Accept headers and content negotiation. This is a major part of what my Webscope project is about, a Multimodal Web User Agent.
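To make the content negotiation idea a little more concrete, here is a simplified Python sketch of how a server might pick a representation from an Accept header. The function names are my own, the list of available media types is an assumption about how 3D and voice representations might be labelled, and the matching is deliberately cruder than the full HTTP rules (it ignores specificity and media type parameters other than q).

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the order the client sent.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_representation(header, available):
    """Pick the first available media type the client accepts, else None."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # "text/*" -> "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
        elif media_type in available:
            return media_type
    return None


# Assumed representations of one resource: a page, a 3D space and a voice dialog.
AVAILABLE = ["text/html", "model/x3d+xml", "application/voicexml+xml"]
print(choose_representation("model/x3d+xml, text/html;q=0.5", AVAILABLE))
```

A 3D browser, a desktop browser and a voice browser could then all follow the same hyperlink and each be given the representation it can actually use.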

tola – Sat, 2007-06-09 13:12

Google Gears @ Google Developer Day 2007

London Google Developer Day

I spent yesterday at the London Google Developer Day, one of 10 simultaneous events in cities around the world. This international event was the first of its kind and Google took the opportunity to launch some new products relevant to developers. By far the most interesting project for me was "Google Gears", a web browser extension that allows web applications to work offline. I seriously believe this web browser advancement is as significant as the APIs which put the "A" in "AJAX".

Google Gears

Google Gears is interesting not only because of what it does (there are plenty of projects tackling the "offline problem"), but because of the way Google are going about it. Google are collaborating with important industry partners like Mozilla and even Adobe to try to create a standard API for offline applications that they hope all web clients will use. All major browsers are already supported, with Opera support in the works. Adoption of the standard by projects like Adobe's Apollo platform and discussion with projects like the Dojo Toolkit greatly increase the chances of proliferation of the standard.

Google Gears consists of three main parts - LocalServer, Database and WorkerPool.

LocalServer acts as a local HTTP server inside the web browser and caches and serves resources locally.

The Database part uses SQLite as a local store, a kind of giant cookie implemented as a relational database that web applications can access both online and offline.

WorkerPool creates a kind of multi-threading in JavaScript so that processor-intensive operations can run asynchronously, with a particular focus on preventing user interface lock-ups. This is useful not only for offline operation but also to increase the responsiveness of the user interfaces of online applications. The idea of Google Gears is that web applications will be usable offline when network connectivity is intermittent or non-existent, and that changes made by the user will be passed to the server opportunistically when a network connection returns.

The missing component

However, Google Gears is missing a key part of the solution required to make web applications work offline. Currently, if you write a web application which modifies data in the local SQLite database there is no provided method for synchronising those changes with the server. This is left for the developer to figure out on a per-application basis.
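To show the kind of thing each developer is left to write, here is a minimal Python sketch of one possible approach: every local change is also written to a queue table, and the queue is flushed in order whenever the server can be reached. This is not the Gears API (which is JavaScript); Python's sqlite3 module simply stands in for the local database, and the table names, function names and the send callback are all invented for the example. It also ignores the genuinely hard part, which is what to do when the server's copy has changed in the meantime.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the local store with a table of notes and a queue of unsent changes."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)")
    db.execute(
        "CREATE TABLE IF NOT EXISTS pending ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return db


def save_note(db, note_id, body):
    """Apply a change locally and queue it for the server in one transaction."""
    with db:
        db.execute("INSERT OR REPLACE INTO notes (id, body) VALUES (?, ?)", (note_id, body))
        db.execute(
            "INSERT INTO pending (payload) VALUES (?)",
            (json.dumps({"id": note_id, "body": body}),),
        )


def flush_pending(db, send):
    """Try to send queued changes in order; stop at the first failure."""
    sent = 0
    rows = db.execute("SELECT seq, payload FROM pending ORDER BY seq").fetchall()
    for seq, payload in rows:
        if not send(json.loads(payload)):
            break  # still offline, or the server refused: retry later
        with db:
            db.execute("DELETE FROM pending WHERE seq = ?", (seq,))
        sent += 1
    return sent
```

Here `send` is whatever function posts a change to the server and returns True on success, so the application can call `flush_pending` whenever it notices that a connection has come back.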

After the event I had a chat with Chris Prince over a pint (paid for by Google of course). Chris is one of the main engineers who has been working on Google Gears in Mountain View. He said that three separate teams were given the task of figuring out a standard method for synchronisation and all three came up with completely different answers. It turned out that they couldn't figure out a standard synchronisation method that worked well in most cases so they just left that bit out, for now.

I asked Chris whether he thought a dominant standard would emerge or whether things were always likely to be this way. He said that he expected a standard to emerge which worked well 80% of the time, with different methods for special cases, financial transactions being an example. He thinks that once Google Gears capability has been added to around three major applications (I suggested GMail, Google Calendar and Google Docs!), a useful standard method may emerge.

Other Happenings at GDD07

I attended a talk by Chris DiBona, Google's Open Source Programmes Manager, about Open Source in Google. I grilled him about how Google decides whether a project should be open source or not (with particular reference to hosted services like GMail and software bundled with Google Appliances) and asked him about GPLv3. I then sheepishly asked him to sign my copy of "Open Sources", which he co-edited.

I also attended talks on Google Gadgets and GData APIs and asked whether Google plans on supporting authentication mechanisms other than Google Accounts in their APIs, but was basically told to make a feature request.

I met up with Darren from PHPWM and talked with a Cambridge PhD student about his work on AI in virtual learning environments. I explained my business idea to him and had an interesting conversation about intellectual property in universities.

Sergey Brin gave an international live webcast to all the event attendees and gave an amusing and bizarre talk about how the Internet is now fuelling its own growth through relationships formed on dating websites which lead to offspring who go off to work on making the Internet better. He was referring to the fact that a child born as the result of the first dating sites would now be around 12 years old, and presumably old enough to use Google's web based IDEs for developing mashups and Google gadgets!

The food at the Google event was characteristically fantastic and the "Blogger Lounge" was full of lava lamps and floor cushions, with free WiFi and coffee. A goody bag was provided including a Google branded T-shirt, mouse mat, USB stick, yo-yo, notepad, pen, sweets and silly putty! All in all it was great to rub shoulders with Googlers and I had some extremely interesting conversations with lots of smart people. The food, coffee and beer were all provided by Google and were brilliant. The train journey and meanderings around the London Underground weren't even that troublesome.

tola – Fri, 2007-06-01 13:29

Is "The RADAR Architecture" trying to solve a problem which should be solved in the browser?

Designing web applications in a RESTful way from the User Interface perspective can be very challenging, and many people argue that REST just isn't suited to user interfaces and should be restricted to Application Interfaces. I've come across this recently when trying to understand why the Bongo Project uses a different URI scheme for the user interface from the one it uses for the application interface. I've always thought that a resource is a resource, regardless of whether you're a man or a machine. Different people and different machines want different representations of the resource, but it is still the same resource.

Dave Thomas is a pragmatist with knowledge and experience behind him. But in reply to Dave's insightful blog post, The RADAR Architecture: RESTful Application, Dumb-Ass Recipient, I play devil's advocate and put forward an idealistic alternative argument:

From a pragmatic point of view this makes an awful lot of sense because it's easier to bend your server application to the will of millions of existing web browsers than to change the way browsers work.

However, I wonder whether from an idealistic point of view this would be better addressed in the design of the "dumb browsers" you talk about. The dumb browser doesn't need to be quite as dumb if we don't want it to.

What if XHTML forms and Web Browsers *did* support the PUT and DELETE verbs? I'm not familiar with the reasons that browsers do not already support these HTTP methods, but I do know that the original vision Tim Berners-Lee had of a web browser was of an application which could write data as readily as it could read it.

Maybe it is a user interface problem.

Asbjørn Ulsberg asked what a request/response would look like if a web browser wants to GET a resource represented as an HTML form so it can edit the information and then PUT it back to the same URI. In answer to his question, I don't believe we should have a MIME type to put into an Accept header specifically for an HTML form view. It's still HTML after all!
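For what it's worth, here is roughly what that pair of requests looks like when written for a client that does support PUT, using Python's standard urllib. This is only a sketch: the function names and the URI are made up, and the requests are built but not sent (passing them to urllib.request.urlopen would send them).

```python
import urllib.request


def build_get(uri):
    """Request an ordinary HTML representation of the resource for editing."""
    return urllib.request.Request(uri, headers={"Accept": "text/html"}, method="GET")


def build_put(uri, html):
    """Send the edited HTML back to the very same URI."""
    return urllib.request.Request(
        uri,
        data=html.encode("utf-8"),
        headers={"Content-Type": "text/html; charset=utf-8"},
        method="PUT",
    )


# Hypothetical resource URI, used only for illustration.
uri = "http://example.org/notes/1"
fetch = build_get(uri)
store = build_put(uri, "<p>Edited note</p>")
print(fetch.get_method(), fetch.full_url)
print(store.get_method(), store.full_url)
```

Note that both requests use the same URI and the same media type; the only thing that changes is the HTTP method.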

Perhaps the answer is that when HTML is displayed in a browser, if the user has write permission, it is editable in the same way that a document in a word processor is editable, rather than relying exclusively on forms for user input. If you've ever used Google Docs you may have noticed that if you click on the title of a document, it becomes editable using JavaScript. There is no separate "edit" and "view" mode; it's all one thing, so you don't need to tag ;edit to the end of the URI. The reason web browsers themselves don't work this way is probably that the HTML view is only one representation of a resource, and it might be hard to translate changes made by the user to this presentation of the information into changes in the underlying data model. Using forms allows the application designer to restrict user input to specific fields in the underlying data model. This is something which needs more thought because I believe it's also the core reason behind the "offline problem", but that's a different story.

roberthahn writes that "The dumb browser doesn't provide the user with a way of submitting requested Mime types." Well, again, maybe this is a user interface issue with the browsers. Maybe they should! Remember that the web is just a collection of resources; HTML is only one representation of those resources.

If a "smart" client interface can specify which representation it wants, why can't a user? Maybe a user should be able to choose whether they want to browse the web in plain text, formatted text, or even 2D or 3D vector graphics or a voice representation of a resource. The user agent could allow the user to choose a particular mode in which to browse the web depending on their current environment, and offer alternative representations where they are available. For devices that are only capable of certain modes of interaction with the user (e.g. a voice interface), this could be fixed by the user agent.
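As a sketch of how a user agent might turn that choice into something a server understands, the following Python maps a browsing mode onto an Accept header, with any fallback modes given lower q-values. The mode names, the mapping and the function are all hypothetical; they are only meant to show that "let the user pick a mode" reduces to ordinary HTTP.

```python
# Hypothetical browsing modes mapped to the media types each would prefer.
MODE_MEDIA_TYPES = {
    "text": ["text/plain"],
    "formatted": ["application/xhtml+xml", "text/html"],
    "3d": ["model/x3d+xml"],
    "voice": ["application/voicexml+xml"],
}


def accept_header(preferred_mode, fallback_modes=()):
    """Build an Accept header that prefers one mode and ranks fallbacks below it."""
    parts = list(MODE_MEDIA_TYPES[preferred_mode])
    q = 0.9
    for mode in fallback_modes:
        for media_type in MODE_MEDIA_TYPES[mode]:
            parts.append(f"{media_type};q={q:.1f}")
        q = max(q - 0.1, 0.1)  # each later fallback is slightly less preferred
    return ", ".join(parts)


print(accept_header("3d", ["formatted", "text"]))
```

A voice-only device would simply call this with a fixed mode and no fallbacks, which is the "fixed by the user agent" case.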

We shouldn't entirely dismiss the idea of changing web user agents themselves, just because it is a difficult option.

tola – Sun, 2007-04-01 20:36

Tux Droid VoiceXML Web Browser

Synopsis

A VoiceXML web browser using a "Tux Droid" as the human-computer interface.

Codename

Vux

Rationale

A proof-of-concept for the multimodal web and a bit of fun.

Features

  • Audio output via audio playback and TTS (text to speech)
  • Input via the XML form of DTMF grammars described in the W3C Speech Recognition Grammar Specification, triggered by buttons on the Tux Droid remote.
  • Input via speech recognition grammar data in the XML form of the W3C Speech Recognition Grammar Specification.
  • Record audio received from the user

Extra input and output

Input

  • Push button on head - home
  • Menu button - home
  • Buttons 1-9, *, #, red, green, blue and yellow - DTMF tones
  • Left direction button - back
  • Right direction button - forward
  • Vol+ & Vol- - volume control

Output

  • Beak movements during TTS audio output
  • Error, status and notification messages using flashing eyes, moving wings and spinning

Use Cases

  • Voice webmail - reading email messages
  • Voice interface for home automation
  • Voice interface for music collection

Implementation

Possibly a front end to an existing VoiceXML interpreter such as Public VoiceXML which uses OpenVXI.

Developer Resources

Why *not* to make the "Metaverse" a direct extension of the web

Further to my previous blog entry, Why I would make the "Metaverse" a direct extension of the web, I have found a strong argument to the contrary in the documentation of the Virtual Object System.

In a section of their manual called The 3D Web the authors point out "three basic limitations of HTTP which have caused 10 years of pain, suffering and hacky workarounds for developers trying to build interactive applications over the web. These are that HTTP is a stateless protocol, that URLs represent opaque handles to resources, on which no reliable introspection is possible, and that HTTP is explicitly asymmetric so that a server typically cannot initiate sending new data to a client."

The response of the Virtual Object System community is to create an entirely new protocol stack which is a mirror of the technologies used on the web, but with a new technology for each layer:

  • VIP is like TCP
  • VOS is like HTTP
  • A3DL is like HTML
  • CSVOSA3DL is like an HTML rendering engine such as Gecko or KHTML
  • Ter'Angreal is like the web browser

The fact that HTTP is a synchronous, stateless protocol has come up in the past with regards to web applications - raising the possibility that AJAX is just a hack, waiting for a new protocol to replace it. Perhaps a replacement or extension of HTTP is due.

The current approach I am taking to a 3D Web client for Webscope is:

  • TCP is TCP
  • HTTP is HTTP
  • X3D is like XHTML
  • FreeWRL (and others) are like an HTML rendering engine such as Gecko
  • Webscope is the web browser.

Because of the limitations of HTTP I have considered building a protocol like XMPP into Webscope, and the argument the Virtual Object System community make will certainly prompt me to explore alternatives further.

What I think I would like to see is a solution that sits somewhere between the plain X3D over HTTP approach and the radical VOS approach of replacing the whole protocol stack. I don't want to throw away HTTP entirely because of its Content Negotiation abilities and the vision of the Multimodal Web.

I'd like to see some discussion on this by some people who know more about networking than I do.

tola – Sun, 2007-03-11 12:11

Why I would make the "Metaverse" a direct extension of the web

In answer to Bob Sutor's question "If we didn’t have web browsers as we do today and started today to do everything that you imagine [for a distributed 3D virtual world], what would you create to do all that?"

I would probably create something very much like Second Life and open source the server source code.

Anything anyone ever creates is based somehow on someone else's ideas (standing on the shoulders of giants and all that). If we didn't have the web but we had video games, I would start with an existing gaming engine. Then in the absence of a worldwide network of linked information resources, I would take the next best thing to existing technology, science fiction. I'd buy Snow Crash by Neal Stephenson and start writing network protocols and file formats!

I'd start by separating the storage of content, logic and presentation into different formats and come up with some kind of distributed TCP/IP streaming protocol with heavy compression.

I suspect that you're asking whether the web is really a suitable platform for all this, whether if we weren't stuck in the mind set of the existing world wide web we might come up with a better solution. Perhaps.

But if I was creating the web from scratch (but happened to benefit from the hindsight of all the great minds that came after me), I wouldn't use XML-like syntax for web pages; I would use something more efficient. I would try to make the DNS system more decentralised and URIs would be of the form http:uk.co.companyname.department/resource instead of http://department.companyname.co.uk/resource. I might make HTTP requests asynchronous, and build comment spam protection and Denial of Service protection into the protocols of the web. However, I wouldn't necessarily attempt to make those changes now.
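The URI idea is easy to show in a few lines of Python. This is purely an illustration of the hypothetical format above (the function name is made up, and it ignores ports, queries and fragments): the domain labels are reversed so that the whole identifier reads from most general to most specific.

```python
from urllib.parse import urlsplit


def to_reversed_form(uri):
    """Rewrite a URI so the host reads from the top-level domain downwards."""
    parts = urlsplit(uri)
    host = ".".join(reversed(parts.hostname.split(".")))
    return f"{parts.scheme}:{host}{parts.path}"


print(to_reversed_form("http://department.companyname.co.uk/resource"))
```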

What's amazing about the web for me isn't that it's perfect technology that could not have been done better, it's that its openness and adoption have made it almost ubiquitous in the world. Creating new protocols suited to new applications is definitely a good idea, but if the online 3D virtual world is to become as ubiquitous as the World Wide Web, we should learn from the lessons of how web technology was created and build on an already ubiquitous platform. Adoption of a well defined standard is more important than a perfect technology.

Another motivation behind making Stephenson's "Metaverse" a direct extension of the web is device independence. It's all very well creating a 3D virtual world which requires a large amount of processing to render, but what if I want to access the information on a small information appliance with little processing power? What if I live in a developing country and want to be able to access some information but only have a text based browser? What if I'm blind and can't see the virtual world and want to hear it instead? We need not carry over all the limitations of First Life into Second Life. I don't know about you, but I hate having to pay for physical objects and I love flying!

tola – Tue, 2007-02-27 12:24

3D Web

Synopsis

Browsing the world wide web as a three dimensional virtual world.

Rationale

Virtual online worlds like Second Life are like the AOL of the 3D web: they provide a walled garden in the virtual 3D world using proprietary software. Although the Second Life client is now Open Source, the server software remains closed and only Linden Lab is able to run Second Life servers. Just as AOL eventually had to open up users' access to the rest of the Internet, these innovative but proprietary solutions will eventually give way to a ubiquitous online space which is a direct extension of the web and uses web standards. Anyone will be able to host a 3D web server.

Features

3D Web Server

  • An existing HTTP server which serves 3D web pages in response to client requests carrying the relevant HTTP Accept header
  • Server side scripting
  • XML transformation if required
  • 3D web pages or "spaces" written in X3D and ECMAScript (with hyperlinks between spaces).
  • Web interface to chat server

Chat Server

  • Chat server, possibly using the XMPP protocol

Avatar Server

A special type of 3D web server which holds a person's 3D avatar. This could optionally include a distributed authentication mechanism like OpenID which identifies a user securely to others in a 3D space. When a user visits a 3D space, their avatar is served to that 3D world so that they appear to other users.

3D Web Browser

(Could be part of a Multimodal Web User Agent).

  • Rendering X3D
  • Executing ECMAScript
  • Sending HTTP requests with the relevant Accept headers to ask for a 3D representation of a resource

Implementation

Web3D Consortium
Metaverse Roadmap


Parakey

An article in IEEE Spectrum talks about Blake Ross and Joe Hewitt's project called Parakey.

Details are very sketchy at the moment but it sounds like Parakey is aiming to address some problems very close to my heart. It sounds like Free Software designed to replace (or supplement) your operating system with a unified interface where your desktop meets the web. Importantly it appears to offer a solution to the "offline problem" and other important Post-PC concepts. The interesting bits are on page 3. I look forward to hearing more about this project.

tola – Mon, 2006-11-06 23:25

Device Independent Web Server

Synopsis

A web server which uses content negotiation to return different representations of a web resource depending on the HTTP Accept header passed from a user agent.
