2 Distribution Model

The basic difference between a distributed and a centralized program is that the former is partitioned among several sites. We define a site as the basic unit of geographic distribution. In the current implementation, a site is always one operating system process on one machine. A multitasking system can host several sites. An Oz language entity has the same language semantics whether it is used on only one site or on several sites. We say that Mozart is network-transparent. If used on several sites, the language entity is implemented using a distributed protocol. This gives the language entity a particular distributed semantics in terms of network messages.

The distributed semantics defines the network communications done by the system when operations are performed on an entity. The distributed semantics of the entities depends on their type. The distribution model gives well-defined distributed semantics to all Oz language entities.

The distributed semantics has been carefully designed to give the programmer full control over network communication patterns where it matters. The distributed semantics does the right thing by default in almost all cases. For example, procedure code is transferred to sites immediately, so that sites never need ask for procedure code. For objects, the developer can specify the desired distributed semantics, e.g., mobile (cached) objects, stationary objects, and stationary single-threaded objects. Section 2.1 defines the distributed semantics for each type of language entity, Section 2.2 explains more about what happens at sites, and Section 2.3 outlines how to build distributed applications.

2.1 Language entities

2.1.1 Objects

The most critical entities in terms of network efficiency are the objects. Objects have a state that has to be updated in a globally-consistent way. The efficiency of this operation depends on the object's distributed semantics. Many distributed semantics are possible, providing a range of trade-offs for the developer. Here are some of the more useful ones:

Cached object: Objects and cells are cached by default--we also call this "mobile objects". Objects are always executed locally, in the thread that invokes the method. This means that a site attempting to execute a method will first fetch the object, which requires up to three network messages. After this, no further messages are needed as long as the object stays on the site. The object will not move as long as execution stays within a method. If many sites use the object, then it will travel among the sites, giving everyone a fair share of the object use.
The site where the object is created is called its owner site. A reference to an object on its owner site is called an owner or owner node. All other sites referencing the object are proxy sites. A remote reference to an object is called a proxy or a proxy node. A site requesting the object first sends a message to the owner site. The owner site then sends a forwarding request to the site currently hosting the object. This hosting site then sends the object's state pointer to the requesting site.
The class of a cached object is copied to each site that calls the object. This is done lazily, i.e., the class is only copied when the object is called for the first time. Once the class is on the site, no further copies are done.
Stationary object (server): A stationary object remains on the site at which it was created. Each method invocation uses one message to start the method and one message to synchronize with the caller when the method is finished. Exceptions are raised in the caller's thread. Each method executes in a new thread created for it on the object's site. This is reasonable since threads in Mozart are extremely lightweight (millions can be created on one machine).
Sequential asynchronous stationary object: In this object, each method invocation uses one message only and does not wait until the method is finished. All method invocations execute in the same thread, so the object is executed in a completely sequential way. Non-caught exceptions in a method are ignored by the caller.

Deciding between these three behaviors is done when the object is created from its class. A cached object is created with New, a stationary object is created with NewStat, and an sequential asynchronous stationary object is created with NewSASO. A stationary object is a good abstraction to build servers (see Section 3.2.3) and fault-tolerant servers. It is easy to program other distribution semantics in Oz. Chapter 3 gives some examples.

2.1.2 Other stateful entities

The other stateful language entities have the following distributed semantics:

Thread: A thread actively executes a sequence of instructions. The thread is stationary on the site it is created. Threads communicate through shared data and block when the data is unavailable, i.e., when trying to access unbound logic variables. This makes Oz a data-flow language. Threads are sited entities (see Section 2.1.5).
Port: A port is an asynchronous many-to-one channel that respects FIFO for messages sent from within the same thread. A port is stationary on the site it is created, which is called its owner site. The messages are appended to a stream on the port's site. Messages from the same thread appear in the stream in the same order in which they were sent in the thread. A port's stream is terminated by a future (see Section 2.1.3).
Sending to a local port is always asynchronous. Sending to a remote port is asynchronous except if all available memory in the network layer is in use. In that case, the send blocks. The network layer frees memory after sending data across the network. When enough memory is freed, the send is continued. This provides an end-to-end flow control.
Oz ports, which are a language concept, should not be confused with Unix ports, which are an OS concept. Mozart applications do not need to use Unix ports explicitly except to communicate with applications that have a Unix port interface.
Cell: A cell is an updatable pointer to any other entity, i.e., it is analogous to a standard updatable variable in imperative languages such as C and Java. Cells have the same distributed semantics as cached objects. Updating the pointer may need up to three network messages, but once the cell is local, then further updates do not use the network any more.
Thread-reentrant lock: A thread-reentrant lock allows only a single thread to enter a given program region. Locks can be created dynamically and nested recursively. Locks have the same distributed semantics as cached objects and cells. This implements a standard distributed mutual exclusion algorithm.

2.1.3 Single-assignment entities

An important category of language entities are those that can be assigned only to one value:

Logic variable: Logic variables have two operations: they can be bound (i.e., assigned) or read (i.e., wait until bound). A logic variable resembles a single-assignment variable, e.g., a final variable in Java. It is more than that because two logic variables can be bound together even before they are assigned, and because a variable can be assigned more than once, if always to the same value. Logic variables are important for three reasons:
- They have a more efficient protocol than cells. Often, variables are used as placeholders, that is, they will be assigned only once. It would be highly inefficient in a distributed system to create a cell for that case.
  When a logic variable is bound, the value is sent to its owner site, namely the site on which it was created. The owner site then multicasts the value to all the proxy sites, namely the sites that have the variable. The current release implements the multicast as a sequence of message sends. That is, if the variable is on n sites, then a maximum of n+1 messages are needed to bind the variable. When a variable arrives on a site for the first time, it is immediately registered with the owner site. This takes one message.
- They can be used to improve latency tolerance. A logic variable can be passed in a message or stored in a data structure before it is assigned a value. When the value is there, then it is sent to all sites that need it.
- They are the basic mechanism for synchronization and communication in concurrent execution. Data-flow execution in Oz is implemented with logic variables. Oz does not need an explicit monitor or signal concept--rather, logic variables let threads wait until data is available, which is 90% of the needs of concurrency. A further 9% is provided by reentrant locking, which is implemented by logic variables and cells. The remaining 1% are not so simply handled by these two cases and must be programmed explicitly. The reader is advised not to take the above numbers too seriously.
Future: A future is a read-only logic variable, i.e., it can only be read, not bound. Attempting to bind a future will block. A future can be created explicitly from a logic variable. Futures are useful to protect logic variables from being bound by unauthorized sites. Futures are also used to distribute constrained variables (see Section 2.1.5).
Stream: A stream is an asynchronous one-to-many communication channel. In fact, a stream is just a list whose last element is a logic variable or a future. If the stream is bound on the owner site, then the binding is sent asynchronously to all sites that have the variable. Bindings from the same thread appear in the stream in the same order that they occur in the thread.
A port together with a stream efficiently implement an asynchronous many-to-many channel that respects the order of messages sent from the same thread. No order is enforced between messages from different threads.

2.1.4 Stateless entities

Stateless entities never change, i.e., they do not have any internal state whatsoever. Their distributed semantics is very efficient: they are copied across the net in a single message. The different kinds of stateless entities differ in when the copy is done (eager or lazy) and in how many copies of the entity can exist on a site:

Records and numbers: This includes lists and strings, which are just particular kinds of records. Records and numbers are copied eagerly across the network, in the message that references them. The same record and number may occur many times on a site, once per copy (remember that integers in Mozart may have any number of digits). Since these entities are so very basic and primitive, it would be highly inefficient to manage remote references to them and to ensure that they exist only once on a site. Of course, records and lists may refer to any other kind of entity, and the distributed semantics of that entity depends on its type, not on the fact of its being inside a record or a list.
Procedures, functions, classes, functors, chunks, atoms, and names: These entities are copied eagerly across the network, but can only exist once on a given site. For example, an object's class contains the code of all the object's methods. If many objects of a given class exist on a site, then the class only exists there once.
Each instance of all the above (except atoms) is globally unique. For example, if the same source-code definition of a procedure is run more than once, then it will create a different procedure each time around. This is part of the Oz language semantics; one way to think of it is that a new Oz name is created for every procedure instance. This is true for functions, classes, functors, chunks, and of course for names too. It is not true for atoms; two atoms with the same print name are identical, even if created separately.
Object-records: An object is a composite entity consisting of an object-record that references the object's features, a cell, and an internal class. The distribution semantics of the object's internal class are different from that of a class that is referenced explicitly independent of any object. An object-record and an internal class are both chunks that are copied lazily. I.e., if an object is passed to a site, then when the object is called there, the object-record is requested if it is missing and the class is requested if it is missing. If the internal class already exists on the site, then it is not requested at all. On the other hand, a class that referenced explicitly is passed eagerly, i.e., a message referencing the class will contain the class code, even if the site already has a copy.

In terms of the language semantics, there are only two different stateless language entities: procedures and records. All other entities are derived. Functions are syntactic sugar for procedures. Chunks are a particular kind of record. Classes are chunks that contain object methods, which are themselves procedures. Functors are chunks that contain a function taking modules as arguments and returning a module, where a module is a record.

2.1.5 Sited entities

Entities that can be used only on one site are called sited. We call this site their owner site or home site. References to these entities can be passed to other sites, but they do not work there (an exception will be raised if an operation is attempted). They work only on their owner site. Entities that can be used on any site are called unsited. Because of network transparency, unsited entities have the same language semantics independent of where they are used.

In Mozart, all sited entities are modules, except for a few exceptional cases listed below. Not all modules are sited, though. A module is a record that groups related operations and that possibly has some internal state. The modules that are available in a Mozart process when it starts up are called base modules. The base modules contain all operations on all basic Oz types. There are additional modules, called system modules, that are part of the system but loaded only when needed. Furthermore, an application can define more modules by means of functors that are imported from other modules. A functor is a module specification that makes explicit the resources needed by the module.

All base modules are unsited. For example, a procedure that does additions can be used on another site, since the addition operation is part of the base module Number. Some commonly-used base modules are Number, Int, and Float (operations on numbers), Record and List (operations on records and lists), and Procedure, Port, Cell, and Lock (operations on common entities).

Due to limitations of the current release, threads, weak dictionaries, and spaces are sited even though they are in base modules.

When a reference to a constrained variable (finite domain, finite set, or free record) is passed to another site, then this reference is converted to a future (see Section 2.1.3). The future will be bound when the constrained variable becomes determined.

We call resource any module that is either a system module or that imports directly or indirectly from a system module. All resources are sited. The reason is that they contain state outside of the Oz language. This state is either part of the emulator or external to the Mozart process. Access to this state is limited to the machine hosting the Mozart process. Some commonly-used system modules are Tk and Browser (system graphics), Connection and Remote (site-specific distributed operations), Application and Module (standalone applications and dynamic linking), Search and FD (constraint programming), Open and Pickle (the file system), OS and Property (the OS and emulator), and so forth.

2.2 Sites

2.2.1 Controlled system shutdown

A site can be stopped in two ways: normally or abnormally. The normal way is a controlled shutdown initiated by {Application.exit I}, where I is the return status (see the module Application). The abnormal way is a site crash triggered by an external problem. The failure model (see Chapter 4) is used to survive site crashes. Here we explain what a controlled shutdown means in the distribution model.

All language entities, except for stateless entities that are copied immediately, have an owner site and proxy sites. The owner site is always the site on which the entity was created. A controlled shutdown has no adverse effect on any distributed entity whose owner is on another site. This is enforced by the distributed protocols. For example, if a cell's state pointer is on the shutting-down site, then the state pointer is moved to the owner site before shutting down. If the owner node is on the shutting-down site, then that entity will no longer work.

2.2.2 Distributed memory management

All memory management in Mozart is automatic; the programmer does not have to worry about when an entity is no longer referenced. Mozart implements an efficient distributed garbage collection algorithm that reclaims all unused entities except those that form a cycle of references that exists on at least two different owner sites. For example, if two sites each own an object that references the other, then they will not be reclaimed. If the objects are both owned by the same site, then they will be reclaimed.

This means that the programmer must be somewhat careful when an application references an entity on another site. For example, let's say a client references a server and vice versa. If the client wishes to disconnect from the server, then it is sufficient that the server forget all references to the client. This will ensure there are no cross-site cycles.

2.3 Bringing it all together

Does the Mozart distribution model give programmers a warm, fuzzy feeling when writing distributed applications? In short, yes it does. The distribution model has been designed in tandem with many application prototypes and numerous Gedankenexperimenten. We are confident that it is basically correct.

Developing an application is separated into two independent parts. First, the application is written without explicitly partitioning the computation among sites. One can in fact check the correctness and termination properties of the application by running it on one site.

Second, the objects are given distributed semantics to satisfy the geographic constraints (placement of resources, dependencies between sites) and the performance constraints (network bandwidth and latency, machine memory and speed). The large-scale structure of an application consists of a graph of threads and objects, which access resources. Threads are created initially and during execution to ensure that each site does the desired part of the execution. Objects exchange messages, which may refer to objects or other entities. Records and procedures, both stateless entities, are the basic data structures of the application--they are passed between sites when needed. Logic variables and locks are used to manage concurrency and data-flow execution. See Section 3.3 for more information on how to organize an application.

Functors and resources are the key players in distributed component-based programming. A functor specifies a software component. A functor is stateless, so it can be transparently copied anywhere across the net and made persistent by pickling on a file (see the module Pickle). A functor is linked on a site by evaluating it there with the site resources that it needs (see the modules Module and Remote). The result is a new resource, which can be used as is or to link more functors. Our goal is for functors to be the core technology driving an open community of developers, who contribute to a growing global pool of useful components.