14 Referring To Distributed Entities: URL

In the age of the World Wide Web, resources needed by a running system don't just reside in files; they reside at URLs. The URL module provides an interface for creating and manipulating URLs as data structures. It fully conforms to the URI syntax defined in RFC 2396, Uniform Resource Identifiers (URI): Generic Syntax, by T. Berners-Lee, R. Fielding, and L. Masinter (August 1998), and passes all 5 test suites published by Roy Fielding.

The only derogations from this specification were made to accommodate Windows-style filenames: (1) a prefix of the form C:, where C is a single character, is interpreted as Windows-style device notation rather than as a URI scheme; in practice, this is a compatible extension since there are no legal single-character schemes; (2) path segments may be separated by either / or \; this too is compatible, since forward and backward slashes that are not separators ought to be escape-encoded anyway.

There is also an experimental extension: any url may be suffixed by a string of the form "{foo=a,bar=b}". This adds an info record to the parsed representation of the url. Here, this record would be info(foo:a bar:b). Thus properties can be attached to urls. For example, we may indicate that a url denotes a native functor thus: file:/foo/bar/baz.so{native}. Here {native} is equivalent to {native=}, i.e. the info record is info(native:'').

14.1 Examples

Here are a few examples of conversions from url vstrings to url records. Return values are displayed following the function call.

{URL.make "http://www.mozart-oz.org/home-1.1.0/share/FD.ozf"}

url(
   absolute  : true  
   authority : "www.mozart-oz.org"  
   device    : unit  
   fragment  : unit  
   info      : unit  
   path      : ["home-1.1.0" "share" "FD.ozf"]  
   query     : unit  
   scheme    : "http")

The absolute feature has value true to indicate that the path is absolute, i.e. that it began with a slash. The path feature is simply the list of path components, as strings.

{URL.make "foo/bar/"}

url(
   absolute  : false  
   authority : unit  
   device    : unit  
   fragment  : unit  
   info      : unit  
   path      : ["foo" "bar" nil]  
   query     : unit  
   scheme    : unit)

The above illustrates a relative url: the absolute feature has value false. Note that the trailing slash results in the empty component nil.

{URL.make "c:\\foo\\bar"}

url(
   absolute  : true  
   authority : unit  
   device    : &c 
   fragment  : unit  
   info      : unit  
   path      : ["foo" "bar"]  
   query     : unit  
   scheme    : unit)

Here the leading c: was parsed as a Windows-style device notation and the backslashes as component separators.

{URL.make "foo.so{native}"}

url(
   absolute  : false  
   authority : unit  
   device    : unit  
   fragment  : unit  
   info      : info(native:nil)  
   path      : ["foo.so"]  
   query     : unit  
   scheme    : unit)

The {native} annotation is entered into the info feature.

14.2 Interface

URL.make

{URL.make +VR ?UrlR}

Parses virtual string VR as a url, according to the proposed URI syntax modulo Windows-motivated derogations (see above). Local filename syntax is a special case of scheme-less URI. The parsed representation of a url is a non-empty record of the form url(...) whose features hold the various parts of the url. We speak of url records and url vstrings: the former are the parsed representation of the latter. A url record must be non-empty to distinguish it from the url vstring consisting of the atom url. The empty url record can be written e.g. url(unit). VR may also be a url record, in which case it is simply returned.
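As a minimal sketch of how a parsed url record might be inspected, the following can be fed in the Oz Programming Interface, assuming the URL and System modules are available there as usual. It reuses the first url vstring from the examples above and relies only on the features shown there.

```oz
declare
% Parse a url vstring into a url record.
U = {URL.make "http://www.mozart-oz.org/home-1.1.0/share/FD.ozf"}

% The features of the record hold the parts of the url.
{System.showInfo U.scheme}
{System.showInfo U.authority}

% The path feature is a list of strings: print one component per line.
{ForAll U.path System.showInfo}

% Passing a url record to URL.make simply returns it.
{System.show ({URL.make U} == U)}
```

Features that are absent from the url, such as query or fragment here, hold unit rather than a string, so they should be tested before being printed with System.showInfo.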

URL.is

{URL.is +X ?B}

Returns true iff X is a non-empty record labeled with url.

URL.toVirtualString

{URL.toVirtualString +VR ?V}

VR may be a url record or a virtual string. The corresponding normalized vstring representation is returned. #FRAGMENT and {INFO} segments are not included (see below). This is appropriate for retrieval since fragment and info sections are meant for client-side usage.

URL.toVirtualStringExtended

{URL.toVirtualStringExtended +VR +HowR ?V}

Similar to the above, but HowR is a record with optional boolean features full, cache, and raw. full:true indicates that #FRAGMENT and {INFO} sections should be included if present. cache:true requests that cache-style syntax be used (see Chapter 15): the : following the scheme and the // preceding the authority are each replaced by a single /. raw:true indicates that no escape encoding should take place; this is useful e.g. for Windows filenames that may contain spaces or other characters illegal in URIs.
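The difference between the two conversions can be sketched as follows. The url used here is hypothetical and serves only to carry a fragment, and the record label how is an assumption: the text above specifies only the features of HowR, not its label.

```oz
declare
% Hypothetical url with a fragment, for illustration only.
U = {URL.make "http://www.mozart-oz.org/doc/index.html#intro"}

% Normalized form, suitable for retrieval (no fragment, no info).
{System.showInfo {URL.toVirtualString U}}

% Request that the #FRAGMENT and {INFO} sections be included.
% The label how is arbitrary (an assumption); only the features matter.
{System.showInfo {URL.toVirtualStringExtended U how(full:true)}}

% Request cache-style syntax.
{System.showInfo {URL.toVirtualStringExtended U how(cache:true)}}
```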

URL.toString

{URL.toString +VR ?S}

Calls URL.toVirtualString and converts the result to a string.

URL.toAtom

{URL.toAtom +VR ?A}

Calls URL.toVirtualString and converts the result to an atom.

URL.resolve

{URL.resolve +BaseVR +RelVR ?UrlR}

BaseVR and RelVR are url records or vstrings. RelVR is resolved relative to BaseVR and a new url record is returned with the appropriate fields filled in.
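A short sketch of resolution, assuming the Oz Programming Interface: the base is the url from the examples above, while the relative reference ../doc/index.html is hypothetical and chosen only to exercise the .. component.

```oz
declare
Base = "http://www.mozart-oz.org/home-1.1.0/share/FD.ozf"
Rel  = "../doc/index.html"   % hypothetical relative reference

% Both arguments may be vstrings or url records; the result is a url record.
U = {URL.resolve Base Rel}

% Convert back to a vstring to inspect the result.
{System.showInfo {URL.toVirtualString U}}
```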

URL.normalizePath

{URL.normalizePath +Xs ?Ys}

Given a list Xs of string components (see path feature of a url record), returns a list Ys that results from normalizing Xs. Normalization is the process of eliminating occurrences of path components . and .. by interpreting them relative to the stack of path components. A leading . is preserved because ./foo and foo should be treated differently: the first one is an absolute path anchored in the current directory, whereas the second one is relative.
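To illustrate, here is a minimal sketch that normalizes a hand-written list of components (the component names are arbitrary) and prints the resulting components one per line.

```oz
declare
% Components of the relative path foo/./bar/../baz, as strings.
Xs = ["foo" "." "bar" ".." "baz"]
Ys = {URL.normalizePath Xs}

% Print each remaining component on its own line.
{ForAll Ys System.showInfo}
```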

URL.isAbsolute
URL.isRelative

{URL.isAbsolute +VR ?B}

{URL.isRelative +VR ?B}

A url is considered absolute if (1) it has a scheme, or (2) it has a device, or (3) its path started with /, ~ (user home directory notation), . (current directory), or .. (parent directory). For example, ~rob/foo/baz is absolute.
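The rules above can be explored with a small sketch such as the following, which classifies a few sample vstrings. The samples are illustrative: one has a scheme, one has a device, two start with ~ and . respectively, and the last is a plain relative path.

```oz
{ForAll
 ["http://www.mozart-oz.org/"   % has a scheme
  "c:\\foo\\bar"                % has a device
  "~rob/foo/baz"                % starts with ~
  "./foo"                       % starts with .
  "foo/bar"]                    % none of the above
 proc {$ V}
    % Booleans are not virtual strings, so map them to a printable string.
    Kind = if {URL.isAbsolute V} then "absolute" else "relative" end
 in
    {System.showInfo V#" is "#Kind}
 end}
```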

URL.toBase

{URL.toBase +VR ?UrlR}

Turns a url vstring or record into a url record that can safely be used as a base for URL.resolve without losing its last component. Basically, it makes sure that there is a slash at the end.
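A sketch of the intended use, assuming the Oz Programming Interface: a directory url written without a trailing slash is resolved against once directly and once after conversion with URL.toBase, so that the two results can be compared.

```oz
declare
Dir = "http://www.mozart-oz.org/home-1.1.0/share"   % no trailing slash

% Resolving directly against Dir.
{System.showInfo {URL.toVirtualString {URL.resolve Dir "FD.ozf"}}}

% Resolving against the base form of Dir.
Base = {URL.toBase Dir}
{System.showInfo {URL.toVirtualString {URL.resolve Base "FD.ozf"}}}
```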


Denys Duchier, Leif Kornstaedt, Martin Homik, Tobias Müller, Christian Schulte and Peter Van Roy
Version 1.4.0 (20080702)