Tier-Agnostic Requests and Wadi

Tier-agnostic requests are being used by the Wadi project in Java server farms. This Java-specific implementation lends support to the general applicability of Request Based Distributed Computing (RBDC) for client-centric distributed computing.

WADI is an acronym for ‘WADI Application Distribution Infrastructure’. WADI started life as a solution to the problems surrounding the distribution of state in clustered web tiers. It has evolved into a more generalised distributed state and service framework.

The Wadi Docs describe an Invocation mechanism that shares features with RBDC. The Invocation Interface is a tier-agnostic encapsulation of a remote call. Here is a first-cut comparison:

Similarities

  • Both use the idea of Tier-Agnostic Requests
  • They have similar session-state mechanisms. See here for one proposal about state within RBDC.
  • Both delay decisions about the location of computing until run-time.

Differences

  • Wadi is targeted for use within a server farm, whereas RBDC is proposed to start in the client and work into the server farm using the same mechanism.
  • Wadi maintains centralised knowledge of the location of active Java objects, whereas RBDC works as an extension of http’s native request-by-request invocation pattern.
  • Wadi is programmed specifically for Java; RBDC is proposed as language neutral.
  • Wadi is visible to the Java programmer; RBDC could become as ubiquitous and ‘invisible’ as http requests are today.

In summary, the usage of Wadi in the server farm serves as a pointer to the potential of RBDC in both client and server.

Tier-agnostic Requests and Microsoft Volta

Recently my attention has been directed to Microsoft’s Volta split-tier technology. Volta addresses the same set of issues as Request Based Distributed Computing (RBDC). Key issues directly addressed by both Volta and RBDC include:

Drivers

  • Build Distributed Applications
  • Provide programmers with a unified programming model (i.e. no separate programming model on the client)
  • Build multi-tier applications
  • Use existing technologies
  • Delay architectural decisions about the splitting of workload between client and server
  • Enable applications to ‘run anywhere’
  • Provide a language-agnostic mechanism
  • Use a client-agnostic approach

Notwithstanding that Volta deserves credit for being a real, live (beta) product while RBDC is still in gestation as an architectural idea, I want to argue that RBDC, which is based on tier-agnostic requests, is an architecturally cleaner solution to the problems of client-centred distributed computing.

The three areas that I would like to highlight are simplicity, generality and run-time architectural decisions.

Simplicity
If you look at these illustrations of RBDC, you can see that the tier agnosticism of requests leads to an extremely simple mechanism for distributed computing with expressive power equivalent to Volta’s.

Generality
Volta specifically targets the .Net platform, whereas the RBDC mechanism is not just language agnostic but Virtual Machine agnostic. Of course, the Java VM crowd could duplicate the work done by the Volta team for .Net – but is that a good idea when a more general mechanism is available?

Run-Time Architecture decisions
While Volta allows architectural decisions to be made late in the development process, they still need to be made during the application build. In contrast, RBDC provides a mechanism where these decisions may be made at run-time.

A Question
Reading the Volta website has confirmed that the drivers of RBDC are real and perceived by others in the IT community – so what is the best approach to addressing them?

RBDC Illustrated

The purpose of this post is to illustrate the behaviour of Request Based Distributed Computing (RBDC). This is how I summarised RBDC in a recent post:

Request Based Distributed Computing is a small extension of the http protocol and notion of server, proxy and client. Rich Internet Applications, SOA architected applications and SETI@home type distributed computing alike can utilise a common unified programming model. No longer will technology dictate the locus of code execution – instead issues like availability of computing power, intellectual property and security will dictate this at run time.

Using the mechanisms explained below, the need for separate programming models on server and client is removed. RBDC is language neutral, but for illustration purposes, in the following example let’s assume that the server code is written in PHP.

Distributed computing may be facilitated by mobile code moving from the server to a browser that is equipped with one or more RBDC-compatible Virtual Machines. In Diagram 1 the example Virtual Machines (VMs) are shown in circles labeled “hXXX”, one for each of the three major web environments. The VMs in server and client are identical. Notice that the server does not return the requested “Resource A”, but rather the code that, when evaluated, will generate the resource. The server does this because the client has indicated, in a header of the http request, that it is ready to accept mobile code. The client caches the returned code in accordance with the http cache headers. The evaluation of the code is done in the client.

DIAGRAM 1
RBDC Diagram 1

While the cache entry for the returned code is still valid, the client can reuse it without communicating with the server. In Diagram 2 the client is again requesting resource A and is able to generate the resource autonomously.

DIAGRAM 2
RBDC Diagram 2

Meanwhile – other users of the same application are using thin clients or legacy browsers without RBDC VMs built-in. In Diagram 3 you can see a thin client making the same request of the server – in this case – the server automatically returns the requested resource. The code is evaluated on the server.

DIAGRAM 3
RBDC Diagram 3

In some circumstances, perhaps due to Intellectual Property or security concerns, a system owner will want code to always run on their server. In Diagram 4, even though the client is ready to receive mobile code, the code for resource B is marked as “Not Mobile” and therefore the code is evaluated on the server.

DIAGRAM 4
RBDC Diagram 4
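
The behaviour in Diagrams 1 to 4 condenses into a small amount of logic. Here is a minimal sketch in Python; the header name "X-Accept-Code" and the dictionary request/response shapes are illustrative assumptions, since nothing above specifies concrete names:

    CACHE = {}  # client-side code cache, keyed by URL (per http cache headers)

    def lookup_coderesource(url):
        # Hypothetical stand-in: code that, when evaluated, generates Resource A.
        return {"vm": "hXXX", "mobile": True, "source": "lambda: 'Resource A'"}

    def evaluate(code):
        return eval(code["source"])()  # run the code on the (shared) VM

    def server_handle(url, headers):
        # Return mobile code if the client advertised a compatible VM and the
        # code is marked Mobile; otherwise evaluate on the server.
        code = lookup_coderesource(url)
        if code["mobile"] and headers.get("X-Accept-Code") == code["vm"]:
            return {"type": "coderesource", "body": code}    # Diagram 1
        return {"type": "resource", "body": evaluate(code)}  # Diagrams 3 and 4

    def client_request(url, vm="hXXX"):
        # Reuse a valid cache entry without contacting the server (Diagram 2),
        # otherwise request the resource while offering to accept mobile code.
        if url in CACHE:
            return evaluate(CACHE[url])
        response = server_handle(url, {"X-Accept-Code": vm})
        if response["type"] == "coderesource":
            CACHE[url] = response["body"]  # cache per the http cache headers
            return evaluate(response["body"])
        return response["body"]

    print(client_request("/resource-a"))  # first call: code fetched and evaluated
    print(client_request("/resource-a"))  # second call: served from the local cache

A thin client simply omits the header, and the same server logic falls through to returning the evaluated resource, as in Diagram 3.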

See here for a more detailed description.

RBDC, Continuation Passing Style, Closures, Lazy Evaluation and Mobile Applets

It has been pointed out to me recently that the mechanisms underlying Request Based Distributed Computing – RBDC (see primer) – are related to Continuation Passing Style (CPS), Closures, Lazy Evaluation and Mobile Applets. This is a good insight. Let’s have a look at it.

The CPS pattern is where

the caller passes the callee code which the callee runs when the callee is done with his unit of work. Return never passes to the caller, but rather to the third party designated by the caller.

CPS is a widely used programming style, but it addresses different kinds of issues from RBDC. CPS does not address the question of the locus of code evaluation, whereas RBDC has an explicit mechanism that controls whether evaluation proceeds in the callee or the caller. Also, in CPS the callee operates with implicit trust that the caller will pass a sensible continuation; in RBDC the callee (server) does not trust the caller (client) and never receives code from the caller.
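
To make the contrast concrete, here is the CPS pattern in a few lines of Python:

    # Direct style: the result returns to the caller.
    def add_direct(a, b):
        return a + b

    # Continuation Passing Style: the caller supplies the continuation that
    # receives the result; control never returns to the caller's call site.
    def add_cps(a, b, k):
        k(a + b)

    add_cps(1, 2, lambda total: print("sum is", total))  # prints: sum is 3

Notice that nothing in add_cps says where the continuation runs; CPS is silent on the locus of evaluation, which is exactly the question RBDC addresses explicitly.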

A closure is a mechanism that associates a function with state that persists between invocations. Closures are often used in languages (like Lua) where functions are themselves first-class objects. Closures and RBDC share at least one similarity: when functions are first-class objects, variables may hold functions (i.e. code), whereas we are used to variables holding values. Similarly, with RBDC, http requests may transparently return functions (i.e. code), whereas we are used to http requests returning resources. Unlike closures, however, RBDC does not bind functions to state.

Lazy Evaluation is a computing strategy where an expression or function remains unevaluated until the result is required for further computation. RBDC can sometimes be characterised as a Lazy Evaluation strategy, but RBDC also supports Eager Evaluation. A key point of the RBDC paradigm is that the location of execution is decided at run-time based on availability of computing resources, intellectual property and security concerns.

Request Based Distributed Computing is quite similar to the movement of applets using java.net.ContentHandler.getContent, but a key difference is that the RBDC mechanism is generalised, independent of the language employed, and the programmer does not need to know where the code will execute.

Another implicit feature of RBDC (inherited from http) is that the caller (client) can cache code received from a callee (server) and can proceed autonomously while the function code cache entry remains valid.

Distributed Computing with the Browser

Recently, Subbu posted an interesting discussion of an xml analysis and presentation application – you can read it here: Distributed Computing with the Browser.

This design scenario is a good illustration of the limitations of our current programming situation: while the WWW allows a programmer to ignore the network path to an information resource, as programmers we can’t ignore where computing will be done. The programmer’s choice of technology (framework, language, etc.) carries with it an implicit choice about the location of computation (server or client).

An assumption behind Subbu’s post is that we need to decide the location of processing during the design phase. The purpose of this post is to explore how the application could be built using Request Based Distributed Computing (RBDC) (see backgrounder). With the application recast as an RBDC application, the location-of-processing decisions can be made at runtime based on the availability of computing power and storage, intellectual property, and security issues.

The XML analysis and presentation application using RBDC

(This description presumes that you have read the RBDC backgrounder.)

The key distributed process in this application is the initial analysis of the source XML text and the saving of the key features into a central database. Let’s call this “analyse-save”. With RBDC, the code that performs analyse-save may be written as mobile code that will run on the server, on a proxy or on the client. Analyse-save may be implemented as the code that responds to an http POST request that uploads the source file to the server. It analyses the uploaded file and then POSTs the results of the analysis to a central database.

When an RBDC-compliant server receives the analyse-save request it may perform the analysis itself on the server, or it may return the analyse-save code to the client. If the client receives code as the response to its analyse-save request then it executes the code locally. In either case, the results of the analysis are POSTed to the central database using http.

Clients that have local processing capabilities signal, through an http header in the POST request, that they are able to accept mobile code as a response to the request. Clients without processing ability make the same request without that header, signalling that they need the server to do all the processing.
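
Here is a sketch of the client side of this exchange in Python, reusing the hypothetical X-Accept-Code header from the illustrations above; the URLs and helper functions are invented for illustration:

    def run_on_vm(code, input_xml):
        # Hypothetical stand-in for executing mobile code on the local VM.
        ...

    def analyse_save(source_xml, post):
        # POST the source file, offering to accept the analyse-save code.
        response = post("/analyse-save", body=source_xml,
                        headers={"X-Accept-Code": "hXXX"})
        if response["type"] == "coderesource":
            # The client received the code and analyses the file itself.
            results = run_on_vm(response["body"], source_xml)
            post("/central-database", body=results, headers={})
        # Otherwise the server analysed the file and POSTed the results itself.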

In this way the architecture of the solution is the same for Subbu’s cases 1 and 3, with the decision about the location of processing being made at runtime, not as part of the design.

Code Mobility and Session State

Code mobility, as provided for by Request Based Distributed Computing (RBDC) (see backgrounder), is key to delivering On-Demand computing, Distributed Computing (e.g. SETI@home) and Rich Internet Applications.

RBDC enables the mobility of code that gets its input from http sources (url, request body, cookie, and passwordless GETs). This post looks into whether session state can be made mobile as well.

How can code that relies on session state be made mobile?

In a typical scenario, http servers associate session state with client request streams through the use of a server-unique session-id that is preserved between accesses via a cookie. An example of this is PHP’s handling of sessions. Under this scheme, the server-held session state prevents the code from being mobile: the session state is only available on the server that generated the client’s requested resource.

Using RBDC, code that relies on session state can be made securely mobile. Here is one way.

[[Since writing this post, I have realised that the mechanism described here is the same mechanism that makes Google Reader Public Pages both globally available and private.]]

Firstly, the server stores the session state using a globally-unique-id (GUID) instead of a server-unique-id as the key. The key is preserved between requests in the client cookie, as is done now. Then the server makes the session state publicly available at a well-known URL: for example, an xml-serialised version of the state could be available for GET and POST at a URL like https://www.myserver.com/sessionstate. The GUID used is sufficiently long to prevent guessing, so the session state is both securely and globally available.
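
A minimal sketch of the server side, assuming a hypothetical /sessionstate/<GUID> endpoint; Python's uuid4 provides 122 random bits, which is unguessable in practice:

    import uuid

    STATE = {}  # server-side store of xml-serialised session state, keyed by GUID

    def new_session(set_cookie):
        guid = uuid.uuid4().hex      # globally unique, unguessable key
        STATE[guid] = "<state/>"     # xml-serialised session state
        set_cookie("session", guid)  # preserved between requests, as today
        return guid

    def state_url(guid):
        # Anyone holding the GUID (the client, a proxy, or mobile code running
        # elsewhere) can GET or POST the state over plain http.
        return f"https://www.myserver.com/sessionstate/{guid}"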

With session state stored in such a secure and globally available fashion, code that requires session state may also be mobile.

HTC and Cloud and Grid Computing

The HyperText Computing (HTC) paradigm is not a “complete solution” to the challenges and opportunities afforded by Cloud and Grid computing — however this post argues that HTC is part of the solution. My angle into this question is via a recent blog post.

This is how Tim Foster, in a recent post at Grid Gurus, concludes his discussion of current and future trends of Cloud and Grid computing (emphasis mine):

In building this distributed “cloud” or “grid” (“groud”?), we will need to support on-demand provisioning and configuration of integrated “virtual systems” providing the precise capabilities needed by an end-user. We will need to define protocols that allow users and service providers to discover and hand off demands to other providers, to monitor and manage their reservations, and arrange payment. We will need tools for managing both the underlying resources and the resulting distributed computations. We will need the centralized scale of today’s cloud utilities, and the distribution and interoperability of today’s grid facilities.

The concepts that Tim highlights: “on-demand provisioning”, “configuring integrated virtual systems”, providing “precise capabilities” and a focus on the needs of the “end-user” are all addressed by the HyperText Computing (HTC) paradigm. HTC also addresses the need to view central resources through the same lens as localised ones.

HyperText Computing (or Request Based Distributed Computing – RBDC) is a small extension of http and of our conceptions of server, proxy and client. It creates a distributed computing platform that is built from the end-user’s perspective outwards, just as http does for information. It is built on a recognition of the equivalence between http resources and the code that, when executed, will return the resource. RBDC unifies programming models by applying browser-style sandboxed Virtual Machines (VMs) to our conception of proxies and servers.

Key benefits of RBDC are ultra-lightweight distributed computing, run-time code mobility, and backwards compatibility with http.

A fuller description of RBDC may be found here.

Http offers location transparency for retrieving data; a small http extension can also provide location transparency for code execution.

The HTC and Java Remote Method Invocation

Java Remote Method Invocation – JRMI (White Paper) – is a distributed computing capability for the Java Platform. Like the HTC, it is designed to facilitate “write once, run everywhere” and code mobility. Naturally, it does so within the paradigm of Java Objects.

The purpose of this post is to give a 30-second comparison of JRMI and the Hypertext Computer (HTC) paradigm.

The HTC is not so much an extension of a language’s Virtual Machine as a reconceptualised computer, implemented using an extension of the http protocol along with identical Virtual Machines on client, proxy and server. It is language neutral.

No doubt JRMI has many advantages of its own; however, I would like to identify one major benefit that the HTC confers over JRMI: the HTC does not rely on the designer choosing the locus of code execution at compile time (either on the client or on the server). To illustrate this, let’s use the following example from the JRMI white paper:

For example, you can define an interface for examining employee expense reports to see whether they conform to current company policy. When an expense report is created, an object that implements that interface can be fetched by the client from the server. When the policies change, the server will start returning a different implementation of that interface that uses the new policies. The constraints will therefore be checked on the client side – providing faster feedback to the user and less load on the server – without installing any new software on the user’s system. This gives you maximal flexibility, since changing policies requires you to write only one new Java class and install it once on the server host.

This same scenario is handled just as easily by the HTC paradigm (a sketch follows the list below). The user interface for examining employee expense reports is implemented in a client. To evaluate policy conformance, the client requests a server with an HTTP GET. However, the GET is extended with a request header that indicates to the server that the client has a particular virtual machine and is willing to receive a coderesource (i.e. a program) instead of the result of the GET. The server may (at its option) return the current coderesource that defines the policy. The client then executes the coderesource and caches the compiled version of the code. The server sets http caching parameters when it returns the coderesource, forcing the client to update its coderesource cache according to the application’s update cycle. The advantages of the HTC’s handling of this scenario are:

  1. Thin clients may issue the same GET without offering to execute a coderesource and so would transparently be served the computed result. Alternatively, the processing could be transparently trapped by a proxy serving a network of thin clients.
  2. While any particular implementation will choose one or more computer languages, the solution itself is language agnostic. It would work equally well for the JVM as for the .Net CLI.
  3. The solution is very lightweight
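
Here is a sketch of the expense-report check as HTC client code, with the same hypothetical header and invented names ("/policy/expenses", compile_for_vm) used in earlier sketches:

    POLICY_CACHE = {}  # compiled coderesources, refreshed per http cache headers

    def compile_for_vm(source):
        # Hypothetical stand-in for compiling a coderesource for the local VM.
        return eval(source)

    def check_expenses(report, get):
        # GET the conformance policy, offering to execute it locally.
        if "/policy/expenses" not in POLICY_CACHE:
            resp = get("/policy/expenses", headers={"X-Accept-Code": "hXXX"})
            # The server returned the current policy as a coderesource whose
            # max-age matches the application's update cycle.
            POLICY_CACHE["/policy/expenses"] = compile_for_vm(resp["body"])
        return POLICY_CACHE["/policy/expenses"](report)  # fast client-side check

When the policy changes, the cache entry expires and the next GET fetches the new coderesource; as in the JRMI example, no new software is installed on the user’s system.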

Request Based Distributed Computing – A rough sketch

The Hypertext Computing (HTC) paradigm that I have written about in this blog is built on the following observations:

  • There is a fundamental equivalence between http resources and code that, if executed, would generate the resource.
  • It is an accident of history that the scripting models of servers and clients on the web are different.
  • We have an opportunity to apply the lessons learnt about building secure scriptable clients to the building of servers and proxies.
  • While the WWW allows a programmer to ignore the network path to an information resource, as programmers we can’t (yet) ignore where computing will be done. The programmer’s choice of technology (framework, language, etc.) carries with it an implicit choice about the location of computation (server or client).
  • Grid computing must integrate the client’s available computing power rather than assuming that ‘the cloud’ will do everything. As we anticipate processors with hundreds of cores, a bet against the computing power available at the edges of the network is a poor one.
  • The http protocol can be orthogonally extended so that instead of returning the resource at the given URL, a server may instead return code that will generate the resource when executed on a compatible virtual machine.

Doing this will enable us to:

  • Unify the programming models associated with delivering rich user experiences and satisfying http requests on client, proxy and servers.
  • Enable the location of code execution to be determined at run time based on criteria like availability of computing power, security and intellectual property concerns, rather than just on the choice of technology as at present – thus making the location of code execution transparent to the end user and to the system designer.
  • Facilitate extremely lightweight distributed computing through code mobility from a canonical source to the computing environment that executes it.

Request Based Distributed Computing

[[Update: I have added a set of graphics that illustrate the RBDC architecture.]]

An alternative name for Hypertext Computing is “Request Based Distributed Computing”; that is the name I will use for the remainder of this article. This informal sketch of the Request Based Distributed Computing paradigm involves extending the definitions of the http protocol, clients, proxies and servers.

http

In achieving the aim of request based distributed computing, this proposal does not break the power and security inherent in the request-based http model. For example, it:

  • does not assume clients may be interrogated or polled by servers and
  • never expects clients to send code to a server for execution and
  • does not imply that servers become stateful and
  • does not assume that trust can be delegated to a third party process.

The http protocol defines resources which are located using URLs. Request Based Distributed Computing is enabled by extending the definition of “resource” to include the “coderesource”, identified by an extension header field. Coderesources are http resources that are executable on a known Virtual Machine; after execution on a VM, the result is indistinguishable from the resource that a webserver would send in response to the same URL. If a request returns a coderesource rather than the resource itself, a well-behaved server will return code WITHOUT reference to the particular data passed via the URL and/or cookie (the code receives those as inputs when it executes). Internally, during one invocation, a coderesource operates with full use of all language features, local variables and so on. A coderesource contains a single entry point. Coderesources are wrapped in xml that contains at least the following information:

  • The name and version of the VM on which the code may be executed. Well-behaved VMs are always able to execute legacy-version code according to the version number in the coderesource.
  • A mark, respected by the serving VM, that controls mobility: Mobile (or not).
  • A mark, respected by the serving HTC, that controls execution: Executable (or not).

Note: a coderesource marked Not Mobile and Executable corresponds to the behaviour of today’s .php scripts.
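
As a purely illustrative example, the envelope might look like the following; the element and attribute names are invented, since only the carried information is specified above:

    # The xml vocabulary below is invented for illustration; the sketch above
    # specifies only the information carried, not the concrete element names.
    CODERESOURCE_ENVELOPE = """\
    <coderesource vm="hXXX" vm-version="1.0" mobile="true" executable="true">
      <code><![CDATA[ ...program text for the named VM... ]]></code>
    </coderesource>
    """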

A coderesource gets its input from four sources:

  1. URL parameters
  2. A cookie
  3. GETs on public http URLs
  4. Private resources, such as a local database or a GET from a password-secured http URL

Code that gets its input from sources 1, 2 and 3 only is mobile code. A large amount of today’s web code can be written in a mobile form – especially code that facilitates Rich Internet Applications, gadgets and distributed computing projects like SETI@home.

Code that refers to http://localhost resources is not mobile; however, code that refers to http://client is mobile. While http://localhost is understood to refer to a resource local to the server on which the code is found, http://client is introduced to stand for resources on the initiator of the http request. Code containing references to http://client is NOT executable on the server (or on a proxy) since only the client has access to its state.

In addition to the extension response header field that identifies the content of a response as a coderesource, there is an extension request header field that indicates that the request is for mobile code, or that the requester is open to receiving mobile code. The absence of this request header indicates that the http request is a standard one where the resource itself is expected. An RBDC proxy server may add this header if it has a local VM, then trap the returned code, execute it, and return the resource to the client as expected.

Request Based Distributed Computing (RBDC) Servers

An RBDC Server is an extension of the common web server. It includes at least one sandboxed Virtual Machine (VM), similar to .Net’s CLI or a JVM. A key point is that the same virtual machines are used on servers, proxies and clients. If the VM’s primitive instructions are extensible (e.g. like PHP extensions) then the mechanism of extension is to request coderesources from canonical RBDC Servers. VMs contain a look-aside code cache that operates using the http caching mechanism.

Code executed on behalf of a client will generate an error if it refers to http://client resources. If the request that resulted in the failure indicates that the requesting client has the capacity to execute code, then the server may return the coderesource instead of the result.
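
A sketch of that decision path; ClientResourceError and the status codes are guesses, standing in for whatever the VM would raise when server-side execution touches http://client:

    class ClientResourceError(Exception):
        # Hypothetical: raised when executing code dereferences http://client.
        pass

    def serve(url, headers, lookup, execute):
        code = lookup(url)
        if not code["executable"]:
            return {"status": 403}  # marked Not Executable
        try:
            return {"status": 200, "body": execute(code)}
        except ClientResourceError:
            # Only the client can resolve http://client, so fall back to mobility.
            if code["mobile"] and headers.get("X-Accept-Code") == code["vm"]:
                return {"status": 200, "type": "coderesource", "body": code}
            return {"status": 500}  # a thin client leaves the server no way out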

RBDC compatible Proxies

RBDC-compatible proxies also include a VM. HTTP responses that are coderesources flowing through the proxy may be intercepted and executed on the proxy, with the resulting resource returned to the client. Naturally, if the client has specifically requested a coderesource, a well-behaved proxy will not attempt to execute it.

Proxies can be used on the perimeter of networks to automatically perform processing on behalf of thin clients.
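
A sketch of that interception logic, again using the hypothetical X-Accept-Code extension header:

    def proxy(request, forward, execute_on_vm):
        # Front thin clients: offer the proxy's VM and trap returned code.
        client_asked_for_code = "X-Accept-Code" in request["headers"]
        if not client_asked_for_code:
            request["headers"]["X-Accept-Code"] = "hXXX"  # offer the proxy's VM
        response = forward(request)
        if response.get("type") == "coderesource" and not client_asked_for_code:
            # The client expected the resource itself: execute here, return it.
            return {"type": "resource", "body": execute_on_vm(response["body"])}
        return response  # a requested coderesource passes through untouched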

RBDC compatible Clients

A representative example of an http client is a web browser. Clients that support Request Based Distributed Computing contain a Virtual Machine. The VM identifies and accesses ALL local resources via http://client. The local VM may satisfy requests for http://client without employing a full networking stack.
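
One way a client VM might route such references (the names are illustrative):

    LOCAL = {"http://client/dom": "<body>...</body>"}  # client-side resources

    def fetch(url, network_get):
        # Resolve http://client in-process; send everything else to the network.
        if url.startswith("http://client/"):
            return LOCAL[url]  # satisfied locally, no full networking stack
        return network_get(url)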

If a client is returned code that it can’t execute, it may re-request the URL with headers asking the server to return the resource rather than its coderesource. Such a re-request can be trapped, executed and answered by a proxy, or executed by the server, with the resulting resource returned either way.

RBDC clients deprecate existing scripting solutions that are not compatible with RBDC. Scripts embedded in web pages become either references to coderesources available on the web or anonymous functions on the VM. These scripts refer to the DOM via http://client.

RBDC clients have default sandbox security which may be relaxed by the user.

Conclusion

The Hypertext Computer paradigm (or Request Based Distributed Computing) is a small extension of the http protocol and notion of server, proxy and client. Rich Internet Applications, SOA architected applications and SETI@home type distributed computing alike can utilise a common unified programming model. No longer will technology dictate the locus of code execution – instead issues like availability of computing power, intellectual property and security will dictate this at run time.

Click here for discussion of RBDC compared to current technologies.

Pramati’s Dekoh and The Hypertext Computer

Pramati announced Dekoh this week. Dekoh is a platform that supports applications that run both over the network and on the desktop. It embodies some of the ideas of the Hypertext Computer (HTC):

Dekoh Desktop is a small footprint download that can be installed on a user’s desktop in a single click. Dekoh Desktop includes a web server on which applications written using open standards like JSP, Ajax, DHTML, Flash can be deployed and accessed thru a web browser. Applications deployed on Dekoh Desktop are automatically enabled for web 2.0 functions like tagging, sharing, commenting, rating, etc.

Dekoh Network allows controlled sharing of applications or content on the web. A user can share application/content on his or her desktop with a buddy, who can go to userID.dekoh.net and access it. The key thing to note is that the user is not required to upload different kind of content to different websites. Instead, the shared content and applications remain on the desktop and are served from there.

In particular, the presentation of the desktop’s computing resources to the world as a web server is an idea common to both Dekoh and the HTC. The biggest difference between the HTC and Dekoh is that Dekoh does not seem to address code mobility: the programmer’s choice of Dekoh carries with it a choice about the locus of processing – it will be on your desktop.

Intel’s Teraflop chip and The Hypertext Computer

A chip with 80 processing cores and capable of more than a trillion calculations per second (teraflops) has been unveiled by Intel.

see the BBC report.

This new chip presents a great challenge to the programming community. The proposed HTC may be part of meeting that challenge.

The BBC report continues.

The challenge

“It’s not too difficult to find two or four independent things you can do concurrently, finding 80 or more things is more difficult, especially for desktop applications.

“It is going to require quite a revolution in software programming.

“Massive parallelism has been the preserve of the minority – a few people doing high-performance scientific computing.

“But that sort of thing is going to have to find its way into the mainstream.”

What is one of the causes of this problem?

Current programming models are built on strong assumptions about continuity of the location of processing. This is true of common programming tools and languages (e.g. Java, C, C++, PHP, Visual Basic, Perl, Delphi, Pascal, Kylix, Python, SQL, JavaScript, SAS, COBOL, IDL, Lisp, Fortran, Ada, MATLAB, RPG) but is also true of explicitly distributed projects like seti@home and the Windows Communication Foundation.

One of the challenges in “finding 80 or more things” to do at once is overcoming this assumption of continuity of the locus of processing. Doing parallel programming with current programming models is tough: the programmer is constantly fighting the assumptions that underpin the language she is programming in.

Contribution of the HTC

The HTC is, in part, an attempt to eliminate the effect of programmers implicitly making choices about where processing will be done through their choice of technology. Core concepts of the HTC are that

  1. all computing resources are presented as the ability to complete HTTP requests,
  2. HTC programs reference all input information as URLs.
  3. the HTC depends on an extended HTTP which includes an offer of assistance along with the request for the information at a URL. The HTTP request becomes “please give me the information located in information space at this URL, and by the way, I have processing and storage available in my HTC and I am happy to help with the processing involved.” The HTC serving the request may
    • return the HTML of a page, or
    • return the code that calculates it. The returned code would, of course, reference its input data in the same way – as further URLs.

The HTC brings the network right into the core of programming and removes completely any assumptions about the location of processing. If the 80-core chip were programmed as an HTC, any request for a result could be performed on the same processor, on another of the 80 cores on the chip or, for that matter, on a computer with spare capacity half a world away.

Extending the typical RPC model with an offer to help compute the results enables, in one stroke:

  • code mobility,
  • removal of all assumptions of continuity of the locus of processing, and
  • the provision of “80 or more things” to do.

The HyperText Computer (HTC) and seti@home

SETI@home is a computing project that analyzes radio telescope data using the spare computing power available in internet-connected computers. Users who wish to offer their computers’ processors and storage to the project download and install BOINC – the Berkeley Open Infrastructure for Network Computing. BOINC accepts units of computing work from the seti@home server, does the work on your computer and then returns the results. BOINC also makes sure that the user’s other work is not interfered with by the seti@home work. The paper “Designing a Runtime System for Volunteer Computing (2006)” is a very readable description of how BOINC works.

BOINC shares many similarities with the proposed HyperText Computer (HTC). Let’s look at how an HTC could be used to serve a project like seti@home.

The HTC is, in part, an attempt to eliminate the effect of programmers implicitly making choices about where processing will be done through their choice of technology. Three core concepts of the HTC are: one, all computing resources are presented as the ability to complete HTTP requests; two, HTC programs reference all input information as URLs; and three, the HTC depends on an extended HTTP which includes an offer of assistance along with the request for the information at a URL. The HTTP request becomes “please give me the information located in information space at this URL, and by the way, I have processing and storage available in my HTC and I am happy to help with the processing involved.” Webservers may return the HTML of a page, or code that calculates it. This mechanism provides an alternative to in-browser Javascript. These ideas are discussed here.

These mechanisms may also provide a generic alternative to special software like BOINC. Here is how it might work. If the computing resources of my desktop computer are managed by an HTC, and the seti@home project is also hosted on an HTC, then from a user agent (browser) I could visit the seti@home website and request a “participate in the seti@home project” page. This page would return the project’s analysis code to my HTC, which would begin executing the code, pulling radio telescope data from the seti@home server as needed using HTTP GET and POSTing the results back to the seti@home server when complete.
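
A sketch of that flow in Python; the URLs and helper signatures are invented, and the real BOINC protocol is considerably more involved:

    def participate(get, post, run_on_vm):
        page = get("https://setiathome.example.org/participate",
                   headers={"X-Accept-Code": "hXXX"})
        analyse = run_on_vm(page["body"])  # the project's analysis code
        while True:
            work = get("https://setiathome.example.org/workunit", headers={})
            if work is None:
                break                       # no work units available
            result = analyse(work["body"])  # runs at low local priority
            post("https://setiathome.example.org/results", body=result, headers={})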

To accommodate the seti@home project, and other similar projects, an HTC on an end-user’s computer would need to adjust processing priorities based on the busyness of the computer and support long running threads.

Today, programmers who wish to use end-user computers’ spare cycles for their projects must adopt the special programming model offered by tools like BOINC. The proposed HTC provides an alternative in which the programming model is the same wherever the processing takes place. The unification of the programming model makes life easy for programmers, and also for those responsible for corporate IT infrastructure. For example, in a corporate environment, a proxy server could trap the returned code and execute it on behalf of a user’s browser without the programming model or the user being affected!

The HyperText Computer (HTC) and the Windows Communication Foundation

As part of .NET 3.0, the Windows Communication Foundation (WCF) is a unified programming model from Microsoft that delivers many of the benefits motivating the proposed HyperText Computer. Some of the key benefits of WCF: it replaces 7-8 different programming models with a single one; the programmer may interact with local and remote objects using the same language constructs; and WCF is designed to be flexible – it is not confined to using only HTTP as its network protocol. WCF is also designed to connect to other web services using the WS-* standards.

However, in my reading, the features of WCF differ in a few key ways from the proposed HTC. While the programming model of WCF replaces many other models from Microsoft and therefore may be described as “unified”, the WCF “plumbing” is quite visible. Programmers still have to choose to employ WCF technology, and through that choice of technology they influence the locus of execution. The vision of the HTC is that every resource is accessed using a network protocol and that the programming model is therefore a unity. This is discussed here.

Also, extending the unification of the programming model to the user agent (browser) level does not seem to be in view for WCF. The present situation (e.g. AJAX), where the server operates on one model and the client browser on a separate one, forces programmers, through their technology decisions, to choose the location of processing and storage. As discussed here, the proposed HTC suggests an alternative to browser-hosted languages and a mechanism for location-of-processing decisions to be made at run time. Related to this, the HTC offers a mechanism for automatic code mobility, which does not seem to be addressed by WCF.

These comments notwithstanding, the WCF is a major achievement and a step towards a future where programmers truly do have a single programming model. And … it is also an implemented reality!

The HTC does for programming what has already been done for information retrieval.

What has the web done for information? In short, the web has made the network invisible when retrieving information. The distance traveled and the technology employed to deliver a request from a browser, and to return the response, make no difference from the user’s perspective. That is part of the magic of the web. The web has been designed so that caching can be automated: information may be transparently moved in response to network conditions and local policy so that requests are satisfied in a way that is optimal for the owners of connected computing systems.

In contrast, the situation for programmers is that the network is still very visible. Most computing environments still force programmers to choose in advance where processing is done; often this choice is implicit in the choice of technology employed. PHP means processing done on the server; Javascript means processing done on the user’s computer. Most computing environments use widely different paradigms for accessing local resources versus remote resources. The computer carries within it implicit information about “here” and “out there on the net”.

The visibility of the network in the programming model means that programmers have to make explicit decisions about “here” and “there”. Just as this distinction has been erased for information retrieval, it is time to erase it from our programming models. One way of achieving this is to explicitly build a model computer on top of HTTP. This is the approach taken by the proposed HyperText Computer (HTC). The HTC accesses all resources across the network. This uniformity in the programming model allows the decision about where to execute the HTC’s code to be taken at runtime, similar to the way that caching makes runtime decisions about where to locate information. Depending on factors such as the availability of a local HTC, and the willingness of the owner of the code to allow it to be transported to other HTCs, computing could be done remotely or locally.

With the widely varying amounts of processing and storage available on a rapidly increasing array of devices, is it time to offer to programming the benefits that the web already offers for information retrieval, that is, make the network invisible?

The HyperText Computer (HTC) and IBM’s Infinity Project

This morning I found a recent report by Darryl K. Taft quoting Stefan Schoenauer of IBM, titled “Future Net: Expanding the Web from Pages to Data Sources”. It appears that IBM’s Infinity middleware project may be a proto-HyperText Computer. While the details are sketchy, here is what we know…

The HTC is a model computer that processes information by making HTTP requests and references information only through URLs. In an HTC all computing power is presented as the ability to complete HTTP requests.

What is IBM’s Infinity project?

What prompted the Infinity project was a great big “what if,” Schoenauer said: What if all the information stored in devices like cell phones, PDAs, RFID (radio-frequency identification) chips and USB sticks could be accessed much the way Web sites are today, or even more easily?

IBM’s [Infinity] prototype is notable because as yet there is no standard way to share data between diverse mobile devices directly in ad hoc networks. And because the variety of mobile operating systems offers so many different programming environments and interfaces, applications have to be custom-developed for each platform. The vast range of data types, database software and connection hardware involved make it difficult to achieve broad-spectrum mobile device integration. Infinity technology will improve cross-platform integration and communication for mobile applications, and will enable application developers to more easily develop applications for a variety of mobile devices, IBM said.

The goal is universal access to heterogeneous computing resources with a single programming model. This is, of course, very similar to the objective of the HTC.

The article is sketchy on the implementation details of the Infinity Project; however, what is stated sounds as if the project is a step towards the creation of an HTC.

“The middleware itself looks very much like a Web server on the Internet. The applications are HTML pages with some JavaScript, and they communicate via HTTP.” In addition, the platform uses XML as a data exchange format, he said.

Each of the devices in question has its own user agent (browser, interface) and additional computing capabilities that could be made available to others. It appears that Infinity presents the computing resources (processing and information) of all these devices on the net as the ability to fulfill HTTP requests. If so, that is a hallmark of the HTC. The article does not discuss the possibilities of code mobility offered by the HTC, and it still presumes the necessity of Javascript, to which the HTC offers an alternative, but it appears that the Infinity project may be offering us a proto-HTC!

We live in interesting days.