RBDC, Continuation Passing Style, Closures, Lazy Evaluation and Mobile Applets

It has recently been pointed out to me that the mechanisms underlying Request Based Distributed Computing – RBDC (see primer) are related to Continuation Passing Style (CPS), Closures, Lazy Evaluation and Mobile Applets. This is a good insight. Let’s have a look at it.

The CPS pattern is where

the caller passes the callee code which the callee runs when the callee is done with its unit of work. Return never passes to the caller, but rather to the third party designated by the caller.

CPS is a widely used programming style that addresses different issues from those addressed by RBDC. CPS does not address the question of the locus of code evaluation, whereas with RBDC there is an explicit mechanism that controls whether evaluation proceeds in the callee or the caller. Also, in CPS the callee operates with implicit trust that the caller will pass a sensible continuation; in RBDC the callee (server) does not trust the caller (client) and never receives code from the caller.
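To make the pattern concrete, here is a minimal sketch of CPS in Python (the language choice is mine; nothing about CPS is Python-specific):

    # The callee never returns to its caller; it hands its result to the
    # continuation the caller supplied.
    def add(a, b, continuation):
        continuation(a + b)

    def square(n, continuation):
        continuation(n * n)

    # Compute (2 + 3)^2: control flows forward through continuations
    # rather than back up a call stack. Prints 25.
    add(2, 3, lambda total: square(total, print))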

A closure is a mechanism that associates a function with state that persists between invocations. Closures are often used in languages (like Lua) where functions are themselves first-class objects. Closures and RBDC share at least one similarity: if functions are first-class objects then variables may hold functions (i.e. code), whereas we are used to variables holding values. With RBDC, http requests may transparently return functions (i.e. code), whereas we are used to http resources being returned. Unlike closures, RBDC does not bind functions with state.
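A quick illustrative sketch (in Python rather than Lua, purely for consistency with the other examples in this post):

    def make_counter():
        count = 0  # state captured by the closure; persists between calls

        def counter():
            nonlocal count
            count += 1
            return count

        # The function is returned as an ordinary value -- functions are
        # first-class objects, so a variable may hold code.
        return counter

    tick = make_counter()
    print(tick(), tick(), tick())  # 1 2 3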

Lazy Evaluation is a computing strategy where an expression or function remains unevaluated until the result is required for further computation. RBDC can be characterised as a Lazy Evaluation strategy – sometimes. RBDC also supports Eager Evaluation. A key point of the RBDC paradigm is that the location of execution is decided at run time based on the availability of computing resources, intellectual property and security concerns.
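For completeness, a tiny sketch of the lazy strategy, using a thunk that defers work until the value is demanded:

    def lazy(fn):
        result, evaluated = None, False

        def force():
            nonlocal result, evaluated
            if not evaluated:
                result, evaluated = fn(), True  # computed at most once
            return result

        return force

    total = lazy(lambda: sum(range(10_000_000)))  # nothing computed yet
    print(total())  # evaluation happens here, on demand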

Request Based Distributed Computing is quite similar to the movement of applets using java.net.ContentHandler.getContent, but key differences are that the RBDC mechanism is generalised, is independent of the language employed, and does not require the programmer to know where the code will execute.

Another implicit feature of RBDC (inherited from http) is that the caller (client) can cache code received from a callee (server) and can proceed autonomously while the function code cache entry remains valid.

Distributed Computing with the Browser

Recently, Subbu posted an interesting discussion of an xml analysis and presentation application – you can read it here: Distributed Computing with the Browser.

This design scenario is a good illustration of the limitations of our current programming situation: while the WWW allows a programmer to ignore the network path to an information resource, as programmers we can’t ignore where computing will be done. The programmer’s choice of technology (framework, language, etc.) carries with it an implicit choice about the location of computation (server or client).

An assumption behind Subbu’s post is that we need to decide the location of processing during the design phase. The purpose of this post is to explore how the application could be built using Request Based Distributed Computing – RBDC (see backgrounder). With the application recast as an RBDC application, the location-of-processing decisions can be made at runtime based on the availability of computing power and storage, intellectual property, and security issues.

The XML analysis and presentation application using RBDC

(This description presumes that you have read the RBDC backgrounder.)

The key distributed process in this application is the initial analysis of the source XML text, and the saving of the key features into a central database. Let’s call this “analyse-save”. With RBDC, the code that performs analyse-save may be written as mobile code that will run on the server, a proxy or the client. Analyse-save may be implemented as the code that responds to an http POST request that uploads the source file to the server. It analyses the uploaded file then POSTs the results of the analysis to a central database.

When an RBDC-compliant server receives the analyse-save request, it may perform the analysis itself on the server or otherwise return the analyse-save code to the client. If the client receives code as the response to its analyse-save POST request, it executes the code locally. In either case, the results of the analysis are POSTed to the central database using http.

Clients that have local processing capabilities signal, through an http header in the POST request, that they are able to accept mobile code as a response. Clients without processing capability make the same request without the header, signalling that they need the server to do all of the processing.
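As a rough sketch of what the client side might look like (the header names, URL and VM identifier here are my own inventions for illustration – RBDC does not fix them):

    import urllib.request

    def run_on_local_vm(code: bytes) -> None:
        pass  # stand-in for handing the analyse-save code to the local VM

    request = urllib.request.Request(
        "http://example.com/analyse-save",           # hypothetical endpoint
        data=open("source.xml", "rb").read(),
        method="POST",
        headers={
            "Content-Type": "application/xml",
            # Capable clients advertise their VM; thin clients simply omit
            # this header and receive the fully processed result instead.
            "X-Accept-CodeResource": "example-vm/1.0",
        },
    )
    with urllib.request.urlopen(request) as response:
        if response.headers.get("X-CodeResource"):  # hypothetical reply header
            run_on_local_vm(response.read())  # execute analyse-save locally
        else:
            results = response.read()         # the server did the analysis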

In this way the architecture of the solution is the same for Subbu’s cases 1 and 3, with the decision about the location of processing made at runtime, not as part of the design.

Code Mobility and Session State

Code mobility as provided for by Request Based Distributed Computing RBDC (see backgrounder) is key for delivering On-Demand computing, Distributed Computing (e.g. SETI@home) and Rich Internet Applications.

RBDC enables the mobility of code that gets its input from http sources (url, request body, cookie, and passwordless GETs). This post looks into whether session state can be made mobile as well.

How can code that relies on session state be made mobile?

In a typical scenario, http servers associate session state with client request streams through a server-unique session-id that is preserved between accesses via a cookie. An example of this is PHP’s handling of sessions. Under this scheme the server-held session state prevents the code from being mobile: the session state is only available on the server that generated the client’s requested resource.

Using RBDC, code that relies on session state can be made securely mobile. Here is one way.

[[Since writing this post, I have realised that the mechanism described here is the same mechanism that makes Google Reader Public Pages both globally available and private.]]

Firstly, the server stores the session state using a globally-unique-id (GUID) instead of a server-unique-id as the key. The key is preserved between requests in the client cookie, as is done now. Then the server makes the session state publicly available at a well-known URL. For example, an xml-serialised version of the state could be GET- and POST-able at a URL like https://www.myserver.com/sessionstate. The GUID used is sufficiently long to prevent guessing, and therefore the session state will be securely and globally available.

With session state stored in such a secure and globally available fashion, code that requires session state may also be mobile.
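A minimal sketch of the server side, assuming an http interface along the lines described above (the URL layout and names are illustrative only):

    import uuid

    session_store = {}  # stands in for durable storage

    def create_session(state: dict) -> str:
        # Two concatenated UUIDs give a key long enough to resist guessing.
        guid = uuid.uuid4().hex + uuid.uuid4().hex
        session_store[guid] = state
        return guid  # preserved in the client cookie between requests

    def get_session(guid: str) -> dict:
        # Served at, say, GET https://www.myserver.com/sessionstate?id=<GUID>.
        # Knowing the unguessable GUID is the only credential required, so
        # mobile code running anywhere can fetch the state it needs.
        return session_store[guid]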

HTC and Cloud and Grid Computing

The HyperText Computing (HTC) paradigm is not a “complete solution” to the challenges and opportunities afforded by Cloud and Grid computing; however, this post argues that the HTC is part of the solution. My angle into this question is via a recent blog post.

This is how Ian Foster, in a recent post at Grid Gurus, concludes his discussion of current and future trends of Cloud and Grid computing (emphasis mine):

In building this distributed “cloud” or “grid” (“groud”?), we will need to support on-demand provisioning and configuration of integrated “virtual systems” providing the precise capabilities needed by an end-user. We will need to define protocols that allow users and service providers to discover and hand off demands to other providers, to monitor and manage their reservations, and arrange payment. We will need tools for managing both the underlying resources and the resulting distributed computations. We will need the centralized scale of today’s cloud utilities, and the distribution and interoperability of today’s grid facilities.

The concepts that Foster highlights: “on-demand provisioning”, “configuring integrated virtual systems”, providing “precise capabilities” and a focus on the needs of the “end-user” are all addressed by the HyperText Computing (HTC) paradigm. HTC also addresses the need to view central resources through the same lens as localised ones.

HyperText Computing (or Request Based Distributed Computing – RBDC) is a small extension of http and of our conceptions of server, proxy and client. It creates a distributed computing platform that is built from an end-user perspective outwards, just as http does for information. It is built on a recognition of the equivalence between http resources and the code that, when executed, will return the resource. RBDC unifies programming models by applying browser-style sandboxed Virtual Machines (VMs) to our conception of proxies and servers.

Key benefits of RBDC are ultra-lightweight distributed computing, run-time code mobility, and backwards compatibility with http.

A fuller description of RBDC may be found here.

http offers location transparency for retrieving data; a small http extension can also provide location transparency for code execution.

The HTC and Java Remote Method Invocation

Java Remote Method Invocation – JRMI (White Paper) – is a distributed computing capability for the Java Platform. Like the HTC, it is designed to facilitate "write once run everywhere" and "code mobility". Naturally, it does so within the paradigm of Java Objects.

The purpose of this post is to give a 30 second comparison of the JRMI and the Hypertext Computer (HTC) paradigm.

The HTC is not so much an extension of a language’s Virtual Machine as a reconceptualised computer, implemented using an extension of the http protocol along with identical Virtual Machines on client, proxy and server. It is language neutral.

No doubt the JRMI has many advantages of its own; however, I would like to identify one major benefit that the HTC confers over the JRMI. It is this: the HTC does not rely on the designer choosing the locus of code execution at compile time (either on the client or on the server). To illustrate this, let’s use the following example from the JRMI white paper:

For example, you can define an interface for examining employee expense reports to see whether they conform to current company policy. When an expense report is created, an object that implements that interface can be fetched by the client from the server. When the policies change, the server will start returning a different implementation of that interface that uses the new policies. The constraints will therefore be checked on the client side – providing faster feedback to the user and less load on the server – without installing any new software on the user’s system. This gives you maximal flexibility, since changing policies requires you to write only one new Java class and install it once on the server host.

This same scenario is handled just as easily by the HTC paradigm. The user interface for examining employee expense reports is implemented in a client. To evaluate policy conformance the client requests a resource from the server with an HTTP GET. However, the GET is extended with a request header that indicates to the server that the client has a particular virtual machine and is willing to receive a coderesource (i.e. a program) instead of the result of the GET. The server may (at its option) return the current coderesource that defines the policy. The client then executes the coderesource and caches the compiled version of the code. The server sets http caching parameters when it returns the coderesource, forcing the client to update its coderesource cache according to the application’s update cycle (a sketch of the exchange follows the list below). The advantages of the HTC’s handling of this scenario are that:

  1. Thin clients may issue the same GET without offering to execute a coderesource and so would transparently be served the computed result. Alternatively, the processing could be transparently trapped by a proxy serving a network of thin clients.
  2. While any particular implementation will choose one or more computer languages, the solution itself is language agnostic. It would work equally well for the JVM as for the .NET CLI.
  3. The solution is very lightweight
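To make the exchange concrete, here is a rough sketch of the client side; the header names and URL are hypothetical, as the HTC does not prescribe them:

    import urllib.request

    def execute_on_local_vm(code: bytes) -> bytes:
        return b"ok"  # stand-in for running the policy code in the client VM

    req = urllib.request.Request(
        "http://example.com/expense-policy/check?report=1234",  # hypothetical
        headers={"X-Accept-CodeResource": "example-vm/1.0"},     # hypothetical
    )
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
        if resp.headers.get("X-CodeResource"):
            # The server returned the policy code; run it locally and let the
            # ordinary http caching headers (e.g. Cache-Control) on the
            # response govern how long the cached coderesource stays valid.
            verdict = execute_on_local_vm(body)
        else:
            verdict = body  # thin-client path: the server checked the policy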

Request Based Distributed Computing – A rough sketch

The Hypertext Computing (HTC) paradigm that I have written about in this blog is built on the following observations:

  • There is a fundamental equivalence between http resources and code that, if executed, would generate the resource.
  • It is an accident of history that the scripting models of servers and clients on the web are different.
  • We have an opportunity to apply the lessons learnt about building secure scriptable clients to the building of servers and proxies.
  • While the WWW allows a programmer to ignore the network path to an information resource, as programmers we can’t (yet) ignore where computing will be done. The programmer’s choice of technology (framework, language, etc.) carries with it an implicit choice about the location of computation (server or client).
  • Grid computing must integrate the client’s available computing power rather than assuming that ‘the cloud’ will do everything. As we anticipate processors with hundreds of cores, a bet against the computing power available at the edges of the network is a poor one.
  • The http protocol can be orthogonally extended so that instead of returning the resource at the given URL, a server may instead return code that will generate the resource when executed on a compatible virtual machine.

Doing this will enable us to:

  • Unify the programming models associated with delivering rich user experiences and satisfying http requests on clients, proxies and servers.
  • Enable the location of code execution to be determined at run time, based on criteria like availability of computing power, security and intellectual property concerns, rather than just on the choice of technology as at present. The location of code execution thus becomes transparent to the end user and to the system designer.
  • Facilitate extremely lightweight distributed computing through code mobility from a canonical source to the computing environment that executes it.

Request Based Distributed Computing

[[Update: I have added a set of graphics that illustrate the RBDC architecture.]]

An alternative name for Hypertext Computing is “Request Based Distributed Computing”; that is the name I will use for the remainder of this article. This informal sketch of the Request Based Distributed Computing paradigm involves extending the definitions of the http protocol, clients, proxies and servers.

http

In achieving the aim of request based distributed computing, this proposal does not break the power and security inherent in the request-based http model; for example, it:

  • does not assume clients may be interrogated or polled by servers and
  • never expects clients to send code to a server for execution and
  • does not imply that servers become stateful and
  • does not assume that trust can be delegated to a third party process.

The http protocol defines resources which are located using URLs. Request Based Distributed Computing is enabled by extending the definition of "resource" to include "coderesource", identified by an extension header field. Coderesources are http resources that are executable on a known Virtual Machine. After execution on a VM, the result is indistinguishable from the resource that a webserver would send in response to the same URL. If a server returns a coderesource rather than the resource itself, then a well-behaved coderesource will be written WITHOUT reference to the particular data passed via the URL and/or cookie. Internally, during one invocation, a coderesource has full use of all language features, local variables and so on. A coderesource contains a single entry point. Coderesources are wrapped in xml that contains at least the following information (a sketch of one possible wrapper follows the note below):

  • The name and version of the VM on which the code may be executed. Well-behaved VMs are always able to execute legacy-version code according to the version number in the coderesource.
  • A mark, respected by the serving VM, that controls mobility: Mobile (or not).
  • A mark, respected by the serving HTC, that controls execution: Executable (or not).

Note: a coderesource marked Not Mobile and Executable corresponds to the behaviour of today’s .php scripts.
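The sketch below shows one possible shape for the wrapper; the element and attribute names are illustrative assumptions, since this proposal does not fix a schema:

    import xml.etree.ElementTree as ET

    wrapper = (
        '<coderesource vm="example-vm" vm-version="1.0" '
        'mobile="true" executable="true">'
        "<code>...program text for the named VM...</code>"
        "</coderesource>"
    )

    root = ET.fromstring(wrapper)
    vm = (root.get("vm"), root.get("vm-version"))  # which VM can run this code
    mobile = root.get("mobile") == "true"          # may it leave this host?
    executable = root.get("executable") == "true"  # may the serving HTC run it?
    program = root.findtext("code")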

A coderesource gets its input from four sources:

  1. URL parameters
  2. A cookie
  3. GETs on public http URLs
  4. Private resources, such as a local database or a GET from a password-secured http URL

Code that gets its input from sources 1, 2 and 3 only is mobile code. A large amount of today’s web code can be written in a mobile form, especially code that facilitates Rich Internet Applications, gadgets and distributed computing projects like SETI@home.

Code that refers to http://localhost resources is not mobile; however, code that refers to http://client is mobile. While http://localhost is understood to refer to a resource local to the server on which the code is found, http://client is introduced to stand for resources on the initiator of the http request. Code containing references to http://client is NOT executable on the server (or on a proxy), since only the client has access to its state.

In addition to the extension response header field that identifies the content of a resource as a coderesource, there is an extension request header field indicating that the request is for mobile code, or that the requester is open to receiving mobile code. The absence of this request header indicates that the http request is a standard one where the resource itself is expected. An RBDC proxy server with a local VM may add this header, then trap the returned code, execute it, and return the resource to the client as expected.

Request Based Distributed Computing (RBDC) Servers

An RBDC Server is an extension of the common web server. It includes at least one sandboxed Virtual Machine (VM), similar to .NET’s CLI or a JVM. A key point is that the same virtual machines are used on servers, proxies and clients. If a VM’s primitive instructions are extensible (e.g. like PHP extensions) then the mechanism of extension is to request coderesources from canonical RBDC Servers. VMs contain a look-aside code cache that operates using the http caching mechanism.

Code executed on behalf of a client will generate an error if it refers to http://client resources. If the request that resulted in the failure indicates that the requesting client has the capacity to execute code, then the server may return the coderesource instead of the result.

RBDC compatible Proxies

RBDC compatible Proxies also include a VM. Coderesource responses that flow through the proxy may be intercepted and executed there, with the resulting resource returned to the client. Naturally, if the client has specifically requested a coderesource, a well-behaved proxy will not attempt to execute it.

Proxies can be used on the perimeter of networks to automatically perform processing on behalf of thin clients.
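A sketch of the proxy-side decision, with simple stand-in types; the header fields are hypothetical, consistent with the earlier examples:

    from dataclasses import dataclass, field

    @dataclass
    class Message:
        headers: dict = field(default_factory=dict)
        body: bytes = b""

    def run_on_proxy_vm(code: bytes) -> Message:
        # Stand-in for handing the coderesource to the proxy's sandboxed VM.
        return Message(body=b"<resource computed on the proxy>")

    def handle_response(client_request: Message, upstream: Message) -> Message:
        is_code = "X-CodeResource" in upstream.headers            # hypothetical
        client_wants_code = "X-Accept-CodeResource" in client_request.headers
        if is_code and not client_wants_code:
            # Execute on behalf of a thin client, which then receives the
            # plain resource it expected.
            return run_on_proxy_vm(upstream.body)
        # A client that specifically asked for the coderesource must receive
        # it untouched.
        return upstream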

RBDC compatible Clients

A representative example of an http client is a web browser. Clients that support Request Based Distributed Computing contain a Virtual Machine. The VM identifies and accesses ALL local resources via http://client. The local VM may satisfy requests for http://client without employing a full networking stack.
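As a sketch of what that might look like inside the client VM (the resource table here is invented for illustration):

    # Requests for http://client/... are resolved in-process; everything
    # else would go out over ordinary http.
    LOCAL_RESOURCES = {
        "http://client/dom": "<html>...the live page...</html>",
        "http://client/clipboard": "",
    }

    def vm_get(url: str) -> str:
        if url.startswith("http://client/"):
            # Only this client can see its own state, which is why code
            # referencing http://client is not executable upstream.
            return LOCAL_RESOURCES[url]
        raise NotImplementedError("remote fetch elided from this sketch")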

If a client is returned code that it can’t execute, it may re-request the URL with headers asking the server to return the resource rather than its coderesource. Such a request can be trapped by a proxy, which executes the code and returns the result, or else executed by the server and the result returned.

RBDC clients deprecate existing scripting solutions that are not compatible with RBDC. Scripts embedded in web pages become references to coderesources available on the web, or are treated as anonymous functions on the VM. These scripts refer to the DOM via http://client.

RBDC clients have default sandbox security which may be relaxed by the user.

Conclusion

The Hypertext Computer paradigm (or Request Based Distributed Computing) is a small extension of the http protocol and of the notions of server, proxy and client. Rich Internet Applications, SOA-architected applications and SETI@home-style distributed computing alike can utilise a common, unified programming model. No longer will technology dictate the locus of code execution; instead, issues like availability of computing power, intellectual property and security will dictate this at run time.

Click here for discussion of RBDC compared to current technologies.

Blood Test for Repetitive Stress

See the story here.

Guinness World Records are not interested

My Underwater Running World Record attempt took a hit today, being rejected by the people at Guinness World Records. They were very polite:

Dear Mr Pratten,

Thank you for sending us the details of your recent record proposal for ‘Underwater Running’. We are afraid to say that we are unable to accept this as a Guinness World Record.

We receive over 60,000 enquiries a year from which only a small proportion are approved by our experienced researchers to establish new categories. These are not ‘made up’ to suit an individual proposal, but rather ‘evolve’ as a result of international competition in a field, which naturally accommodates superlatives of the sort that we are interested in. We think you will appreciate that we are bound to favour those that reflect the greatest interest.

We realize that this will be disappointing to you. However, we have considered your proposal fully; in the context of the specific subject area and that of records as a whole, and our decision is final in this matter.

Once again thank you for your interest in Guinness World Records.

Yours sincerely,

(signed)
Records Management Team

There definitely is a lack of “international competition” in the Underwater Running “field” – perhaps that accounts for their lack of interest.

Pramati’s Dekoh and The Hypertext Computer

Pramati announced Dekoh this week. Dekoh is a platform that supports applications that run both over the network and on the desktop. It embodies some of the ideas of a Hypertext Computer (HTC):

Dekoh Desktop is a small footprint download that can be installed on a user’s desktop in a single click. Dekoh Desktop includes a web server on which applications written using open standards like JSP, Ajax, DHTML, Flash can be deployed and accessed thru a web browser. Applications deployed on Dekoh Desktop are automatically enabled for web 2.0 functions like tagging, sharing, commenting, rating, etc.

Dekoh Network allows controlled sharing of applications or content on the web. A user can share application/content on his or her desktop with a buddy, who can go to userID.dekoh.net and access it. The key thing to note is that the user is not required to upload different kind of content to different websites. Instead, the shared content and applications remain on the desktop and are served from there.

In particular, the presentation of the computing resources of the desktop to the world as a web server is an idea common to both Dekoh and the HTC. The biggest difference between the HTC and Dekoh is that Dekoh does not seem to address code mobility; instead, the programmer’s choice of Dekoh carries with it a choice about the locus of processing – it will be on your desktop.

20m Underwater Running World Record Attempt

As a young man, I invented the Australian style of underwater running while swimming in the Brunswick River, in Northern NSW.

I feel that this new discipline is under appreciated in the world of water sports!

Today I submitted the papers to Guinness World Records (GWR) for an attempt to set a world record for running 20 metres underwater.

Yes – it is possible to run underwater. Although, as far as I know, there is no current world record for this discipline!

Hopefully GWR will approve my attempt before the water in Sydney gets too cold! Time to get into shape!

Intel’s Teraflop chip and The Hypertext Computer

A chip with 80 processing cores and capable of more than a trillion calculations per second (teraflops) has been unveiled by Intel.

see the BBC report.

This new chip presents great challenges to the programming community. The proposed HTC may be part of solving these challenges.

The BBC report continues.

The challenge

“It’s not too difficult to find two or four independent things you can do concurrently, finding 80 or more things is more difficult, especially for desktop applications.

“It is going to require quite a revolution in software programming.

“Massive parallelism has been the preserve of the minority – a few people doing high-performance scientific computing.

“But that sort of thing is going to have to find its way into the mainstream.”

What is one of the causes of this problem?

Current programming models are built on strong assumptions about continuity of the location of processing. This is true of common programming tools and languages (e.g. Java, C, C++, PHP, Visual Basic, Perl, Delphi, Pascal, Kylix, Python, SQL, JavaScript, SAS, COBOL, IDL, Lisp, Fortran, Ada, MATLAB, RPG) but is also true of explicitly distributed projects like seti@home and the Windows Communication Foundation.

One of the challenges in “finding 80 or more things” to do at once is overcoming the assumption of continuity of the locus of processing. Doing parallel programming with current programming models is tough: the programmer is constantly fighting the assumptions that underpin the language she is programming in.

Contribution of the HTC

The HTC is, in part, an attempt to eliminate the effect of programmers implicitly making choices about where processing will be done through their choice of technology. Core concepts of the HTC are that

  1. all computing resources are presented as the ability to complete HTTP requests,
  2. HTC programs reference all input information as URLs, and
  3. the HTC depends on an extended HTTP which includes an offer of assistance along with the request for the information at a URL. The HTTP request becomes “please give me the information located in information space at this URL, and by the way, I have processing and storage available in my HTC and I am happy to help with the processing involved”. The HTC serving the request may
    • return the HTML of a page, or
    • return code that calculates it. The returned code would, of course, reference its input data in the same way – as further URLs.

The HTC brings the network right into the core of programming and completely removes any assumptions about the location of processing. If the 80-core chip were programmed as an HTC, any request for a result could be performed on the same processor, on another of the 80 cores on the chip or – for that matter – on a computer with spare capacity half a world away.

Extending the typical RPC model with an offer to help compute the results in one stroke:

  • enables code mobility,
  • removes all assumptions of continuity of the locus of processing, and
  • provides “80 or more things” to do.

Structures (or why things don’t fall down)

J.E. Gordon’s book is a great read. Did you know that arches are so popular because they have to break in 4 places before they will fall down? Amazing factoids and insights for the lay person abound in this book. The technical details occasionally got a bit heavy.

The book comes to this wonderful conclusion:

Is it not fair to ask the technologist, not only to provide artefacts that work, but also to provide beauty, even in the common street, and above all to provide fun? Otherwise technology will die of boredom. Let us have lots of ornament. … Since we have created a whole menagerie full of new artefacts, … , let us sit down and think what fun we can have in devising new kinds of decorations for them.

I did not expect to find an explanation of the success of Web 2.0 websites in a book about structures. It was fun to realise that Web 2.0, like successful architecture, owes much to skiamorphs, fakes and ornaments!

Human-Built World

Thomas P. Hughes’ “Human-Built World: How to Think about Technology and Culture” compares and contrasts current optimism about the Internet, gadgets and technology with human attitudes to earlier generations of technology. Technology has a much longer history than the popular mind may be aware of! A great read for anyone responsible for implementing technology. The book shows that our current rush to technology is part of a picture that has been drawn over a long time.

The HyperText Computer (HTC) and seti@home

SETI@home is a computing project that analyzes radio telescope data using the spare computing power available in internet-connected computers. Users who wish to offer their computers’ processor and storage to the project download and install BOINC – the Berkeley Open Infrastructure for Network Computing. BOINC accepts units of computing work from the seti@home server, does the work on your computer and then returns the results. BOINC also makes sure that the seti@home work does not interfere with the user’s other work. The paper “Designing a Runtime System for Volunteer Computing (2006)” is a very readable description of how BOINC works.

BOINC shares many similarities with the proposed HyperText Computer (HTC). Let’s look at how an HTC could be used to serve a project like seti@home.

The HTC is, in part, an attempt to eliminate the effect of programmers implicitly making choices about where processing will be done through their choice of technology. Three core concepts of the HTC are: one, all computing resources are presented as the ability to complete HTTP requests; two, HTC programs reference all input information as URLs; and three, the HTC depends on an extended HTTP which includes an offer of assistance along with the request for the information at a URL. The HTTP request becomes “please give me the information located in information space at this URL, and by the way, I have processing and storage available in my HTC and I am happy to help with the processing involved.” Webservers may return the HTML of a page, or code that calculates it. This mechanism provides an alternative to in-browser Javascript. These ideas are discussed here.

These mechanisms may also provide a generic alternative to special software like BOINC. Here is how it may work. If the computing resources of my desktop computer are managed by an HTC, and the seti@home project were also hosted on an HTC, then from a user agent (browser) I could visit the seti@home website and request a “participate in the seti@home project” page. This page would return the project’s analysis code to my HTC, which would begin executing the code, pulling radio telescope data from the seti@home server as needed using HTTP GET, and HTTP POSTing the results back to the seti@home server when complete.
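A rough sketch of the loop that the returned analysis code might run on the user’s HTC; the URLs and the trivial analyse function are invented for illustration:

    import urllib.request

    WORK_URL = "http://setiathome.example.org/workunit"   # hypothetical
    RESULT_URL = "http://setiathome.example.org/results"  # hypothetical

    def analyse(data: bytes) -> bytes:
        return data  # stand-in for the real radio-telescope analysis

    while True:
        with urllib.request.urlopen(WORK_URL) as resp:  # pull a work unit
            work = resp.read()
        if not work:
            break  # no work available; a real client would back off and retry
        result = analyse(work)
        post = urllib.request.Request(RESULT_URL, data=result, method="POST")
        urllib.request.urlopen(post)  # push the result back via HTTP POST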

To accommodate the seti@home project, and other similar projects, an HTC on an end-user’s computer would need to adjust processing priorities based on how busy the computer is, and to support long-running threads.

Today, programmers who wish to use end-user computers’ spare cycles for their projects must use the special programming model offered by tools like BOINC. The proposed HTC provides an alternative in which the programming model is the same wherever the processing takes place. The unification of the programming model makes life easy for programmers, and also for those responsible for corporate IT infrastructure. For example, in a corporate environment, a proxy server could trap the returned code and execute it on behalf of a user’s browser without the programming model or the user being affected!

The HyperText Computer (HTC) and the Windows Communication Foundation

As part of .NET 3.0, the Windows Communication Foundation (WCF) is a unified programming model from Microsoft that delivers many of the benefits motivating the proposed HyperText Computer. Some of the key benefits of WCF: it replaces 7-8 different programming models with a single one; the programmer may interact with local and remote objects using the same language constructs; and WCF is designed to be flexible – it is not confined to using only HTTP as its network protocol. WCF is also designed to connect to other web services using the WS-* standards.

However, in my reading, the features of WCF differ in a few key ways from the proposed HTC. While the programming model of WCF replaces many other models from Microsoft and therefore may be described as “unified”, the WCF “plumbing” is quite visible. Programmers still have to choose to employ WCF technology and, through their choice of technology, influence the locus of execution. The vision of the HTC is that every resource is accessed using a network protocol and that therefore the programming model is a unity. This is discussed here.

Also, the possibility of extending the unification of the programming model to the user agent (browser) level does not seem to be in view for WCF. The present situation (e.g. AJAX), where the server operates on one model and the client browser on a separate one, forces programmers, through their technology decisions, to choose the location of processing and storage. As discussed here, the proposed HTC suggests an alternative to browser-hosted languages and a mechanism for location-of-processing decisions to be made at run time. Relatedly, the HTC offers a mechanism for automatic code mobility which does not seem to be addressed by WCF.

These comments notwithstanding, WCF is a major achievement and a step towards a future where programmers truly do have a single programming model. And … it is also an implemented reality!