August 17, 2001
10/ Bit fields of arbitrary size

As other posters noted, there are cases where bit fields are absolutely required. LX allows bit fields, again using predefined pragmas. I used to work a lot on real-time applications, and, no, sorry, bitmasking and shifting were not really an option.

 {volatile} {align 16} {memory_width 16} {memory_space IO}
 type chip_control_register is record with
  integer command  {bit_offset 0} {bit_size 3}
  integer size   {bit_offset 4} {bit_size 4}
  boolean ready  {bit_offset 8} {bit_size 1}
  boolean error  {bit_offset 9} {bit_size 1}

 {address 16#FFFF_ED00} chip_control_register chip_A
 {address 16#FFFF_ED04} chip_control_register chip_B
 {address 16#FFFF_ED08} chip_control_register chip_C


By the way, the statement that "bitmasking is better because it generates better code" is a very x86-centric view of the universe. The PA-RISC and IA-64 have "extract" and "deposit" instructions, and the compilers are much better at finding those on bit field accesses than on bitmasking and shifting (at least at lower optimization levels).
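For comparison, here is a rough C++ counterpart of the register record above. It is only a sketch: C++ leaves the bit-field layout implementation-defined, so unlike the LX pragmas it cannot actually pin the bit offsets, and the unnamed padding field is an assumption made to mimic the unused bit 3. The names ChipControlRegister and pack_and_read_size are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Rough C++ counterpart of chip_control_register. The exact bit positions
// are implementation-defined in C++; the LX pragmas state them explicitly.
struct ChipControlRegister {
    std::uint16_t command : 3;  // 3-bit command field
    std::uint16_t         : 1;  // unnamed padding for the unused bit
    std::uint16_t size    : 4;  // 4-bit size field
    std::uint16_t ready   : 1;  // 1-bit flag
    std::uint16_t error   : 1;  // 1-bit flag
};

// Field access reads like ordinary member access; the compiler generates
// the masking and shifting (or extract/deposit instructions) itself.
inline unsigned pack_and_read_size(unsigned command, unsigned size) {
    assert(command < 8 && size < 16);  // values must fit their fields
    ChipControlRegister r{};
    r.command = command;
    r.size = size;
    r.ready = 1;
    return r.size;
}
```

Writing r.size instead of (reg >> 4) & 0xF is what leaves the compiler free to pick the best instruction for the target.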


Christophe

August 17, 2001
11/ Support for 16-bit computers

I believe what you really meant was: support for the 16-bit segmented memory model on the x86 architecture.

LX pragmas also offer a nice answer there. Should you need near/far pointers, which may happen on some 64-bit architectures (where near means 32-bit, and far 64-bit):

 {near} type near_int_pointer is pointer to integer
 {far}  type far_int_pointer is pointer to integer

As for more modern uses of {near} and {far}: on IA-64 on HP-UX, there is a 32-bit memory model (for better source compatibility with 32-bit applications). Internally, all pointers are really 64-bit; they are converted when used. And, guess what, there are cases where you need a 64-bit pointer from a 32-bit application :-) Not that HP implemented a _far keyword, for that matter...


Note that there is also a standard pragma (bit_size, seen above) that might
fit the bill:

 {bit_size 16} type near_int_pointer is pointer to integer
 {bit_size 32} type far_int_pointer is pointer to integer

but to be exhaustive, near and far convey more information than just bit size.
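To make the "converted when used" idea concrete, here is a minimal C++ sketch of a 32-bit "near" pointer that is widened to a full pointer on use. Everything in it is an assumption for illustration (NearPtr, to_near, to_far, and the choice of storing an element offset from a base); it is not how LX or HP-UX implement near pointers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical "near" pointer: a 32-bit element offset from a known base.
template <typename T>
struct NearPtr {
    std::uint32_t offset;
};

// Narrow a full ("far") pointer into a near one, relative to base.
template <typename T>
NearPtr<T> to_near(T* base, T* p) {
    assert(p >= base);  // a near pointer can only reach forward from base
    return NearPtr<T>{static_cast<std::uint32_t>(p - base)};
}

// Widen a near pointer back into a full pointer when it is used.
template <typename T>
T* to_far(T* base, NearPtr<T> n) {
    return base + n.offset;
}
```

This also shows why near and far say more than a bit size: the near form is meaningless without knowing which base it is relative to.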


Christophe

August 17, 2001
12/ RTTI

LX uses "reflection" rather than RTTI, and the full variant of it (not introspection as in Java). The LX runtime environment that supports it is independent (called Mozart). The key connection between reflection and LX sources is the LX pragmas that we already saw a couple of times. When the LX compiler parses:

 {glop} function foo() is
  blah; bloh; blih

it looks for a pragma handler for {glop}, and passes it a standardized tree representation for function foo. The pragma handler returns what foo should be replaced with. In the future, pragma handlers will possibly be in separate DLLs. In any event, the intent is that the user can add their own pragmas. Pragmas can be invoked in a number of phases of the compiler: after parsing, before or after semantics, on references to the object (for pragmas that apply to declarations), before or after expansion, before or after code generation, or as part of the optimization phase.

With reflection, you can implement: persistence, garbage collection, policy control, automatic debugging code generation, etc... See http://mozart-dev.sf.net.


Anyway, you talk somewhere about: How do I keep an enum and a char array in sync. Well, if I had a preprocessor, here is what I would do:

    // file.tbl
    ZNORT(First)
    ZNORT(Second)
    ZNORT(Third)

    // file.c
    enum Znort {
    #define ZNORT(x) x,
    #include "file.tbl"
    #undef ZNORT
        Last
    };

    const char *Znort_names[] = {
    #define ZNORT(x) #x,
    #include "file.tbl"
    #undef ZNORT
        NULL
    };

And, in case you wonder, I do use this technique extensively in the LX compiler to keep track of things like tokens, predefined strings, run-time tables, etc.
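For readers who want to try the technique in a single file, here is a self-contained C++ variant. It is a sketch, not the code used in the LX compiler: the list macro ZNORT_TABLE stands in for the separate file.tbl, and the helper names ZNORT_AS_ENUM, ZNORT_AS_NAME and znort_name are made up for this example.

```cpp
#include <cassert>
#include <cstring>

// Single list of entries; a macro parameter stands in for file.tbl.
#define ZNORT_TABLE(ZNORT) \
    ZNORT(First)           \
    ZNORT(Second)          \
    ZNORT(Third)

// Expand the list once as enumerators...
#define ZNORT_AS_ENUM(x) x,
enum Znort { ZNORT_TABLE(ZNORT_AS_ENUM) Last };
#undef ZNORT_AS_ENUM

// ...and once as strings, so the two cannot drift apart.
#define ZNORT_AS_NAME(x) #x,
const char* const Znort_names[] = { ZNORT_TABLE(ZNORT_AS_NAME) nullptr };
#undef ZNORT_AS_NAME

// Look up the printable name of an enumerator.
inline const char* znort_name(Znort z) {
    assert(z >= First && z < Last);
    return Znort_names[z];
}
```

Adding a fourth entry means touching one line in ZNORT_TABLE; the enum and the name table both pick it up.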


Now, let's assume you don't have macros. That's where reflection RULES. It makes things simpler to read for the end user, but the implementation is a bit messier.


 import CLX = LX.Compiler

 {define_pragma named_enum before_semantics}
 procedure NamedEnumPragma(CLX.context C; CLX.tree T) is

  using CLX
  {inline} procedure Error(string S) is
   CLX.Error C, Position(T), S

  with enum_type ET := T as enum_type
  if ET = nil then Error "Not an enum"

  -- Declare a temporary array, with the enum as an index
  with declaration ArrayDecl := declaration(
   name: tempname("enum"),
   type: quote(array[low..high] of string),
   initializer: nil)

  -- Replace "low" and "high" names in the above quote
  -- with the enum first and last
  with expression low := quote(X.low)
  replace low, quote(X), T  -- Turn that into <enumtype>.low
  with expression high := quote(X.high)
  replace high, quote(X), T
  replace ArrayDecl, quote(low), low
  replace ArrayDecl, quote(high), high

  -- Create the initializer. Don't use 'quote' for a change
  with string Args of expression
  with enumerator E
  for E in ET.enumerators loop
   Args &= E.name
  with initializer Init := call(
   callee: name("array"),
   arguments: Args)
  ArrayDecl.initializer := Init

  -- Now create a function that allows Name(E) and E.name
  with declaration FDecl :=
   quote(function Name(E arg) return string
     written E.name is
     return ArrayDecl[arg]);
  replace FDecl, quote(E), ET

  -- Finally, append the two declarations just after the enum type decl
  InsertDeclaration context: C, after: T, declaration: ArrayDecl
  InsertDeclaration context: C, after: ArrayDecl, declaration: FDecl

 -- OK, that was hard. Now for the fun part: see use of {named_enum}
 procedure Test() is
  with type X is {named_enum} enum(A, B, C, D)

  -- Should output AB
  IO.WriteLn Name(A), B.name



Christophe

August 17, 2001
13/ Garbage collection

LX supports it, but it is a side effect of well-defined pragmas applied to selected types, rather than built into the language. The {dynamic} pragma indicates that an object resides on the garbage collected heap. "object" and "record" differ only in that "record" can live on the stack, while "object" is always on the GC heap.

By the way, the LX compiler itself, although written in C++, uses a mark-and-sweep garbage collector that uses reflective information generated from C++ source :-)
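For readers unfamiliar with the term, here is a minimal mark-and-sweep sketch in C++. It is not the collector used in the LX compiler: the Node and Heap types are hypothetical, and the explicit children vector is an assumption standing in for the reflective information that tells a real collector where the pointers are.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A heap object. The children list stands in for reflective field info.
struct Node {
    bool marked = false;
    std::vector<Node*> children;
};

class Heap {
public:
    ~Heap() { for (Node* n : objects_) delete n; }

    Node* allocate() {
        objects_.push_back(new Node);
        return objects_.back();
    }

    // Mark everything reachable from the roots, then free the rest.
    // Returns the number of objects freed.
    std::size_t collect(const std::vector<Node*>& roots) {
        for (Node* r : roots) mark(r);
        std::size_t freed = 0;
        std::vector<Node*> survivors;
        for (Node* n : objects_) {
            if (n->marked) {
                n->marked = false;        // reset for the next cycle
                survivors.push_back(n);
            } else {
                delete n;
                ++freed;
            }
        }
        assert(survivors.size() + freed == objects_.size());
        objects_.swap(survivors);
        return freed;
    }

    std::size_t live() const { return objects_.size(); }

private:
    static void mark(Node* n) {
        if (n == nullptr || n->marked) return;  // also stops on cycles
        n->marked = true;
        for (Node* c : n->children) mark(c);
    }

    std::vector<Node*> objects_;
};
```

Because marking starts from the roots rather than counting references, cycles of unreachable objects are reclaimed like anything else.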


Christophe


August 17, 2001
14/ Declaration vs. Definition - Modularity aspect

Good modularity sometimes implies that you don't make the implementation details public. LX has clearly separate interface and implementation, with a 'body' keyword that avoids the kind of repetition required in C++.

    -- Interface, in file fact.lxs
    function factorial(integer N) return integer

    -- Body, in file fact.lxb. 'body' replaces all arguments
    function factorial body is
        ...

For completeness, let me also indicate that what LX allows you to do with any object is what the interface says, regardless of whether you see the implementation. No 'private' parts. The exception is for types, where knowing the implementation gives you access to it.

    -- Type interface
    type complex with
        real Re, Im
        out real Rho, Theta

    -- This is legal just seeing the above
    function Re(complex Z) return real is
        return Z.Re

    -- Type definition (possibly another file)
    -- Notice there are no Rho and Theta fields
    type complex body is
        real Re, Im
    -- So we need to tell the compiler what to do when the user does Z.Rho
    function Rho(complex Z) return real written Z.Rho is
        return Math.sqrt(Z.Re^2 + Z.Im^2)
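The closest C++ idiom is a class whose accessors hide whether a value is stored or computed. The sketch below is only an analogy (C++ still needs 'private', which LX does not), the class name Complex is hypothetical, and the Theta accessor via atan2 is an assumption, since the LX example above only spells out Rho.

```cpp
#include <cassert>
#include <cmath>

// Interface as the user sees it: Re, Im, Rho and Theta all look alike.
class Complex {
public:
    Complex(double re, double im) : re_(re), im_(im) {}

    double Re() const { return re_; }
    double Im() const { return im_; }

    // Computed on demand: the representation has no rho or theta field.
    double Rho() const { return std::hypot(re_, im_); }
    double Theta() const { return std::atan2(im_, re_); }

private:
    double re_, im_;  // the "body": only the Cartesian form is stored
};
```

Switching the stored representation to polar form would change the body of this class but not a single caller.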


Christophe


August 17, 2001
15/ Declaration vs. Definition - Extensibility aspect

Your model of class definition doesn't allow one to extend a base class
(without deriving from it), which in practice is a severe weakness in large
projects. Objective C is the only other compiled language I know that fixes
the problem (with categories). What LX does is give up the notion of
"member function" (although you can recreate the X.f() notation with a
written form where suitable).

Ah ah, but what about dynamic dispatch, then? What LX has is the concept of dynamic object, which is essentially a (type, reference) pair. This is introduced with the any keyword. It allows C++ style dynamic dispatch with two major extensions: a/ You can add "virtual" functions to a base type at any point in the program, not just in the class interface. Thus, any class remains extensible. b/ Dynamic dispatch can apply to more than one argument at a time.

    type shape
    type rectangle like shape
    type circle like shape

    -- "Virtual" function, and overrides for the various classes
    -- There is no single location where these have to be.
    -- Obviously, this requires a 'bind' phase to generate the vtables
    -- from the whole program.
    procedure Draw(any shape S) written S.Draw()
    procedure Draw(any rectangle S) written S.Draw()
    procedure Draw(any circle S) written S.Draw()

    -- Multi-way dynamic dispatch
    function Intersect(any shape S1, S2) return any shape
    function Intersect(any rectangle S1, S2) return any rectangle
    function Intersect(any circle S1, S2) return any shape


Again, the major point here is that the "class" shape can be extended from wherever in the program. This doesn't break encapsulation, however, because the extensions can only see whatever interface is visible to them. In particular, the body of the shape type is not visible to the extensions.
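C++ has no built-in equivalent, but the idea can be approximated with an open dispatch table keyed on the dynamic types of both arguments. The following is a sketch under stated assumptions, not a description of how LX builds its vtables: intersect_table, register_intersect and register_defaults are hypothetical names, the handlers return a string describing which overload ran instead of a shape, and the fallback rule (exact match, else shape/shape) is a simplification.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

struct Shape { virtual ~Shape() = default; };
struct Rectangle : Shape {};
struct Circle : Shape {};

using Key = std::pair<std::type_index, std::type_index>;
using Handler = std::function<std::string(const Shape&, const Shape&)>;

// Open dispatch table: any part of the program may add entries, which
// plays the role of the whole-program 'bind' phase.
inline std::map<Key, Handler>& intersect_table() {
    static std::map<Key, Handler> table;
    return table;
}

template <typename A, typename B>
void register_intersect(Handler h) {
    intersect_table()[Key(typeid(A), typeid(B))] = std::move(h);
}

// Dispatch on the dynamic type of both arguments; fall back to the
// generic shape/shape entry when no exact match is registered.
inline std::string Intersect(const Shape& a, const Shape& b) {
    auto& table = intersect_table();
    auto it = table.find(Key(typeid(a), typeid(b)));
    if (it == table.end()) it = table.find(Key(typeid(Shape), typeid(Shape)));
    assert(it != table.end());
    return it->second(a, b);
}

// These registrations could live in three unrelated source files.
inline void register_defaults() {
    register_intersect<Shape, Shape>(
        [](const Shape&, const Shape&) { return std::string("shape/shape"); });
    register_intersect<Rectangle, Rectangle>(
        [](const Shape&, const Shape&) { return std::string("rectangle/rectangle"); });
    register_intersect<Circle, Circle>(
        [](const Shape&, const Shape&) { return std::string("circle/circle"); });
}
```

Note that none of the shape classes had to be edited to add Intersect, which is the extensibility point being made above.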



Christophe

August 17, 2001
16/ Modules - Encapsulation

There are two orthogonal aspects that are commonly called "modules": 1/ encapsulating related stuff, and 2/ making declarations available across translation units.

For the first aspect, LX uses the "module" type, which is defined as:

    type module is constant record

You define a module interface as follows:

    module COMPLEX with
        type complex
        constant complex I
        function complex (real Re := 0.0, Im := 0.0) return complex
        function Add(complex A, B) return complex written A+B
        function Sub(complex A, B) return complex written A-B
        ...

You define a module implementation as follows (notice how 'body' comes in handy here to avoid repeating the whole interface):

    module COMPLEX body is
        type complex is record with
            real Re, Im
        constant complex I is complex(0.0, 1.0)
        function complex body is
            result.Re := Re
            result.Im := Im
        function Add body is
            result.Re := A.Re + B.Re
            result.Im := A.Im + B.Im
        ...


Because a module is just a constant record, there are many things that are vastly simplified, notably lookup rules. Also, a 'using' statement works both for modules and for objects, since they are essentially the same thing.

    using COMPLEX
    complex Z
    using Z
    Re := 0.0
    Im := 1.0


Christophe

August 17, 2001
17/ Modules - Importing

There are two orthogonal aspects that are commonly called "modules": 1/ encapsulating related stuff, and 2/ making declarations available across translation units.

The second aspect of modules, communication across translation units, is handled in LX as follows. Any translation unit is made of a specification (foo.lxs) and an optional body (foo.lxb). Anything declared in a specification file can be imported from another source using import. For instance, we saw above:

    import IO = LX.Text_IO

The "IO =" part is optional, but is useful to help shorten references to a specific import without making the import globally visible. Without it, you have to write LX.Text_IO.WriteLn. With it, you can write IO.WriteLn and it does the same thing.
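The nearest C++ feature is a namespace alias. The sketch below is only an analogy: the LX::Text_IO namespace and its WriteLn function are hypothetical stand-ins written for this example (taking an explicit stream so the result can be checked), not a binding to the real LX library.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace LX { namespace Text_IO {
    // Hypothetical stand-in for LX.Text_IO.WriteLn: writes one line.
    inline void WriteLn(std::ostream& out, const std::string& s) {
        out << s << '\n';
    }
} }

// Counterpart of "import IO = LX.Text_IO": a short local name, with
// nothing from the namespace dumped into the enclosing scope.
namespace IO = LX::Text_IO;

inline std::string hello() {
    std::ostringstream out;
    IO::WriteLn(out, "Hello world");  // same as LX::Text_IO::WriteLn
    return out.str();
}
```

The alias shortens references without making WriteLn itself visible unqualified.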

You can import any globally declared entity in a specification.


Christophe

August 17, 2001
18/ Modules - Shortening references

When using modules, it is also important to avoid polluting the namespace. That really matters for "millions of lines of code" projects. I see from the Sieve example that "import stdio" in D implicitly imports printf, and I believe that is bad. LX still allows you to shorten references (make them implicit) with the "using" keyword.

A "using" statement adds an entity to the current context's map (it comes last in the lookups for that context). For instance:

    function Mod(in out array A of complex) return real is
        with array.index.value I
        result := 0.0
        for I in A.range loop
            using A[I]
            -- All references are implicitly to A[I]
            result += Re * Re + Im * Im
        result := Math.sqrt(result)

Now, remember that modules are simply constant records? This allows one to say:

    using IO
    WriteLn "Hello world"



Christophe

August 17, 2001
19/ Modules - Child modules

One of the toughest aspects of Ada-style modularity is how modules can be extended safely. LX borrows from Ada the notion of child package, although it is implemented slightly differently.

Essentially, what this means is that there is a "master module" LX, which is extended (with knowledge of the internal details of module LX) by a module named LX.Text_IO, which itself is extended (with knowledge of the internal details of module LX.Text_IO) by a module named LX.Text_IO.Formatting.

This system allows you to structure your software like Russian dolls, with each inner doll visible from the outside when you open the larger doll, but also, while inside the larger doll, seeing the inside of the larger doll. In other words, you don't break encapsulation, but you preserve extensibility.



Christophe