generic algorithms, resource mgmt, and other ramblings

May 30, 2002

Posted by Patrick Down

Permalink

Patrick Down

Permalink

Lately in some posts there has been talk about D objects and the fact that you can't automatically tie object lifetime to a stack frame or to the lifetime of other objects like you could do in C++.  I think what is most lamented is the fact that you can't use this feature to do automatic deterministic resource management.

The following ideas follow from languages like Smalltalk,
Ruby, and now Python ( version 2.2 ).  These languages
provide some mechanisms that can be used as alternatives
to stack frame object lifetime management.

If we look at a very simple example with out error checking of a C++ file helper class we might write it like this.

class File
{
  FILE* f;
  public:
  File(char* name, char* mode)
  {
    f = fopen(name,mode);
  }
  ~File()
  {
    fclose(f);
  }
  operator FILE*() { return f; }
};

void foo()
{
  File fin("data.dat","r");

  // use fin to read and process data
  // from the file

  // destructor closes the file
}

If we look at this we can generalize this concept into a generic algorithm where some of the steps are predefined, opening and closing the file for example, and other steps are user defined.

1. Open file (predefined)
2. Read/Write/Process (user defined)
3. Close file (predefined)

Languages like smalltalk, ruby, and python actually have
a way to encapsulate these generic algorithms.  In Ruby for
example you can do the same file C++ trick above like this...

File.open("data.data") do |aFile|
  # do io with aFile
end

Even better you can print all the lines in a text
file like this...

File.foreach("data.txt") do |aLine|
  print aLine
end

Each of the functions above, open and foreach, get an
invisible parameter which is a reference to the code
block that follows the function call.  In Ruby open
might be written something like this...

def open(filename,mode)
  aFile = File.new(filename,mode)
  yield aFile
  aFile.close()
end

The yield statement passes control to the user defined
code block with the file object as a parameter.  When
that code block is done the file is closed.

As you might imagine this is usefull for all sorts of things.  Sorting for example.

# sort an array of objects by their name attribute
a.sort { | x,y | x.name < y.name }

The sort algorthm expects some code to do the comparison. In C you could use qsort but you would have to go someplace else in you code to define the criteria function.

It's easy to do this sort of thing with these interpreted dynamically bound languages.  Everything is an object to them, even code blocks. However, it has always seemed to me that there should be a way to at least get some of this functionality with statically bound languages, like C++... or maybe even D.

C++ actually does attempt to provide functionality like this
thru the STL and the algorithm templates.  The problem with
the STL approach comes from the user defined part of the
algorithm.  You ether need to use a fairly unreadable template
base functor language or define a separate class or function
that has at least some semantic distance from the place it
is used.

I would propose a language level feature to support this idea.  I've thought a long time about how to do this syntactically and this is the best that I've come up with so far.

Heres the file open from above.  I'm just going to use the stdio functions and leave out error checking to make the example simple.

// Define the function
void withfile(char[] name, char[] mode)
: void do(FILE*)
{
  FILE* f = fopen(toStringz(name),toStringz(mode));

  do(f);

  fclose(f);
}

// Use the function
void foo()
{
  char buffer[1024];
  int size;

  withfile("test.dat","r")
  :do(f)
  {
    size = fread(buffer,1,1024,f);
  }
}

The section after the function header defines a function prototype to be matched up with a similarly named function block at the place the function is called.  I defined it this way to allow more than one user defined code block.

void Func(int a)
: void foo(int),void bar(int)
{
  //...
  foo(b);
  //...
  bar(c);
  //...
}

Now image the power of this when combined with generic types.

void sort(type$[] array)
: bit compare(type$ a, type$ b)
{
  // ...Do sorting stuff
  if(compare(array[i],array[j])
  {
    // whatever
  }
}

int[] arr = { 23, 13, 56, 23, 19 }

sort(arr): compare(a,b) { return a < b; }

class Foo
{
  char[] GetName() { ... }
  int GetAge() { ... }
}

Foo[] arr;

// sort by name
sort(arr) : compare(a,b) { return a.GetName() < b.GetName(); }

// sort by age
sort(arr) : compare(a,b) { return a.GetAge() > b.GetAge(); }



The next question is how do you implement this functionality behind the scenes?   Simple functions are just inlined when possible.  When the function is complex or inlining is not possible then the code blocks are actually implemented as functions and passed as invisible parameters to the function that uses them.

Let's do some cfront like translations to see how this works. Take the withfile function above.  It translate to this.

void withfile(char[] name, char[] mode, void deligate(FILE*) do)
{
  FILE* f = fopen(toStringz(name),toStringz(mode));

  do(f);

  fclose(f);
}

The function call is translated like this.

// Create function for code block

void withfile_foo_do(FILE* f)
{
  // Opps, where do buffer and size come from?
  size = fread(buffer,1,1024,f);
}

void foo()
{
  char buffer[1024];
  int size;

  withfile("data.dat","r",&withfile_foo_do);
}

As you can see there is a problem with the scope of the
buffer and size variables.  We solve this by creating a
pointer that points into the callers stack to access
the local variables.  This pointer can be passed as another
invisible parameter.

struct foo_stack
{
  char buffer[1024];
  int size; // order depends on how the stack is layed out
}

void withfile(char[] name, char[] mode, foo_stack* pstk, void deligate
(FILE*) do)
{
  FILE* f = fopen(toStringz(name),toStringz(mode));

  do(f,pstk);

  fclose(f);
}

void withfile_foo_do(FILE* f,foo_stack* pstk)
{
  // Opps where do buffer and size come from?
  pstk.size = fread(pstk.buffer,1,1024,f);
}

void foo()
{
  char buffer[1024];
  int size;

  withfile("data.dat","r",cast(foo_stack*)&size,&withfile_foo_do);
}


Well thats it in a nutshell.  I'm sorry the examples are so simple.

"Patrick Down" <pat@codemoon.com> wrote in message news:Xns921EBF196284Apatcodemooncom@63.105.9.61...
>  void foo()
> {
>   char buffer[1024];
>   int size;
>
>   withfile("test.dat","r")
>   :do(f)
>   {
>     size = fread(buffer,1,1024,f);
>   }
> }
>
Well,

void foo()
{
 char buffer[1024];
  int size;
  FILE* f = fopen(toStringz(name),toStringz(mode));
  size = fread(buffer,1,1024,f);
  fclose(f);
}

seems shorter, and cleaner. But here comes the problem: Exceptions!
If fread causes an exception the file will not be closed in either case.
So modifiing your example:

void withfile(char[] name, char[] mode)
: void do(FILE*)
{
  FILE* f = fopen(toStringz(name),toStringz(mode));
  try {
    do(f);
  } finally {
    fclose(f);
  }
}

Seems better. From this point of view your syntax may decrease client code
complexity.
For the particular problem of stack lifetime, I propose a smaller syntax
sugar.

void fn()
{
  int size;
  owned FileWrapper f = new FileWrapper(name, mode);
  size = f.fread(buffer,1,1024);
  f.whatever();
  something();
}

Where 'owned' is a keyword, which would actually mean:

void fn()
{
  int size;
  FileWrapper f = new FileWrapper(name, mode);
  try {
    size = f.fread(buffer,1,1024);
    f.whatever();
    something();
  }
  finally {
    delete f;
  }
}

And then the destructor/finalizer for FileWrapper could close the file.
The finally is put to the end of the smallest block inside which f was
declared.
It doesn't seem to conflict with the garbage collector.

Yours,
Sandor

Forums