An old wart - opApply's return type is an int
March 08, 2005
No new arguments here, just an expanded, and largely new, audience. The old arguments were perfectly good anyway. I'll reiterate them now.

Consider std.openrj.Record.opApply():

  int opApply(int delegate(in char[] name, in char[] value) dg)
  {
    int result  = 0;
    foreach(Field field; m_fields)
    {
      result = dg(field.name(), field.value());
      if(0 != result)
      {
        break;
      }
    }
    return result;
  }


The result value is the vector by which the compiler-generated delegate 'dg' communicates with opApply() and with the compiler-translated code in the foreach statement.

The rule is that the implementor of opApply() either returns 0, or returns the result of the delegate. NOTHING ELSE IS ALLOWED. (And if you do _anything else_, you're likely to find your foreach client code doing all manner of weird things, as though it'd been told to break/goto/continue/next/<whatever>.)
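
To make that protocol concrete, here is a minimal sketch in present-day D syntax rather than the 2005 dialect quoted above. Names and visitedUntil are made-up stand-ins for illustration, not anything from std.openrj. The foreach body becomes the delegate, and a break in the body comes back to opApply() as a non-zero result that has to be handed on untouched.

```d
// Hypothetical stand-in for a container such as std.openrj.Record.
struct Names
{
    string[] items;

    int opApply(scope int delegate(ref string) dg)
    {
        int result = 0;
        foreach (ref item; items)
        {
            result = dg(item);   // non-zero: the loop body asked to stop
            if (result != 0)
                break;
        }
        return result;           // 0, or exactly what the delegate returned
    }
}

// Returns how many elements the loop body saw before reaching `stopAt`.
size_t visitedUntil(string[] items, string stopAt)
{
    auto names = Names(items);
    size_t visited = 0;
    foreach (ref name; names)    // this body is what opApply receives as `dg`
    {
        ++visited;
        if (name == stopAt)
            break;               // reaches opApply as a non-zero result
    }
    return visited;
}
```

With this, visitedUntil(["a", "b", "c"], "b") stops after two elements, because opApply() honours the non-zero result.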

There are four points where result is manipulated in this very basic opApply(). (FYI: it's possible to have much more complex ones.)

1.
    int result  = 0;

2.
      result = dg(field.name(), field.value());

3.
      if(0 != result)
      {
        break;
      }

4.
    return result;

Now D claims to 'make it hard for programmers to make mistakes', or some such. I can assign any value to an int, and yet the semantics of opApply() require that I only ever assign 0 or the result of calling the delegate. This is a contradiction.

It would be a trivial matter to fluff any of those four points in the function. Imagine if you're doing complex things and using 'res' for your processing and 'ret' for your return value. Nasty.
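
A sketch of how small the slip can be, again in present-day D syntax and with made-up names (LeakyNames, visitedDespiteBreak). The delegate's result lands in 'res', but 'ret' is what gets returned, and nothing ever checks 'res'. It compiles without complaint, and the client's break is silently ignored.

```d
// Hypothetical container whose opApply mixes up its two ints.
struct LeakyNames
{
    string[] items;

    int opApply(scope int delegate(ref string) dg)
    {
        int ret = 0;
        foreach (ref item; items)
        {
            int res = dg(item);  // BUG: `res` is never checked or returned
        }
        return ret;              // always 0, whatever the loop body asked for
    }
}

// Counts how often the loop body runs even though it breaks immediately.
size_t visitedDespiteBreak(string[] items)
{
    auto names = LeakyNames(items);
    size_t visited = 0;
    foreach (ref name; names)
    {
        ++visited;
        break;                   // the request to stop is lost inside opApply
    }
    return visited;
}
```

Here visitedDespiteBreak(["a", "b", "c"]) reports three visits, although the body asked to stop after the first.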

Developers should be diligent, to be sure, but we're all human.

The solution, as I proposed a long, long time ago, is to define opApply() as having a return type of OPAPPLY_RETURN_VALUE (or OpApplyReturn, or whatever). This would be an enum with *one* element:

    enum OPAPPLY_RETURN_VALUE
    {
        COMPLETE    =    0
    }

Because there's only one value - that we know about, anyway - it'd not be too difficult to make sure it was initialised correctly. The compiler would notionally cast things to and from this enum type in the foreach and delegates, but we'd never have to know or care about this. Everything would work exactly as now; the object code would likely contain exactly the same bytes as it does now. And there'd be no way to unwittingly screw with foreach/opApply().
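
Something close to this can be approximated in user code today. The sketch below uses present-day D syntax, and OpApplyReturn, Names and visitedUntil are all hypothetical names. opApply() keeps the int signature that foreach is documented to use, so the explicit cast only stands in for the conversion the compiler would do under the proposal. Inside the function, though, a stray int can no longer be assigned to the result.

```d
enum OpApplyReturn : int
{
    complete = 0
}

// An arbitrary int cannot be assigned to the typed result...
static assert(!__traits(compiles, { OpApplyReturn r = 5; }));
// ...while the one published value can, and it still converts back to int.
static assert(__traits(compiles,
    { OpApplyReturn r = OpApplyReturn.complete; int i = r; }));

// Hypothetical container, as a stand-in for std.openrj.Record.
struct Names
{
    string[] items;

    int opApply(scope int delegate(ref string) dg)
    {
        OpApplyReturn result = OpApplyReturn.complete;
        foreach (ref item; items)
        {
            // Stand-in for the conversion the compiler would perform.
            result = cast(OpApplyReturn) dg(item);
            if (result != OpApplyReturn.complete)
                break;
        }
        return result;           // an enum converts implicitly to its base int
    }
}

// Returns how many elements the loop body saw before reaching `stopAt`.
size_t visitedUntil(string[] items, string stopAt)
{
    auto names = Names(items);
    size_t visited = 0;
    foreach (ref name; names)
    {
        ++visited;
        if (name == stopAt)
            break;
    }
    return visited;
}
```

Behaviour is unchanged from the int version. The only difference is which mistakes the compiler will let through.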

Since this costs developers nothing, and Walter only some small and entirely one-off effort, and it guarantees that foreach/opApply() stuff-ups can only occur as a result of determined malicious effort, rather than all-too-easy mistake, I ask if anyone would put forth any arguments why it should not be adopted. D is, after all, making all kinds of claims to be helping developers write error free code ...

Cheers

Matthew



March 08, 2005
Presumably the delegate return-type should equate with the enum also? Otherwise, one would have to explicitly cast the assignment to 'result'.
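
A rough sketch of that point, in present-day D syntax with made-up names (OpApplyReturn, Visitor, visit, visitedUntil). It iterates by hand rather than through foreach, since the compiler generates int delegates today. Once the delegate itself returns the enum, the assignment needs no cast. The 'stop' value is an assumption, standing in for whatever non-zero value a compiler would keep to itself.

```d
enum OpApplyReturn : int { complete = 0 }

// Delegate type that returns the enum rather than a bare int.
alias Visitor = OpApplyReturn delegate(string name);

OpApplyReturn visit(string[] names, scope Visitor dg)
{
    foreach (name; names)
    {
        OpApplyReturn result = dg(name);   // types agree, no cast needed
        if (result != OpApplyReturn.complete)
            return result;
    }
    return OpApplyReturn.complete;
}

// Returns how many names the visitor saw before reaching `stopAt`.
size_t visitedUntil(string[] names, string stopAt)
{
    // Stand-in for a compiler-private "stop" value (an assumption).
    enum stop = cast(OpApplyReturn) 1;
    size_t visited = 0;
    visit(names, (string name) {
        ++visited;
        return name == stopAt ? stop : OpApplyReturn.complete;
    });
    return visited;
}
```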

From what I recall, Walter didn't comment upon why he apparently prefers the open-ended 'int' instead ... I'll go out on a limb and hazard that he /might/ prefer the obfuscation of the delegate return; i.e. if all the various 'commands' were to be spelt out in an enum declaration, some darned developer might take "advantage" of that. Or, similarly, perhaps the values are compiler-specific?

That's conjecture, of course. But it's the best I can come up with against the notion.

I'm certainly a fan of tightening up valid ranges, so would also like to hear why this remains the way that it is ...






March 08, 2005
"Kris" <Kris_member@pathlink.com> wrote in message news:d0jhf6$1geo$1@digitaldaemon.com...
> Presumably the delegate return-type should equate with the enum also?
> Otherwise, one would have to explicitly cast the assignment to 'result'.
>
> From what I recall, Walter didn't comment upon why he apparently prefers
> the open-ended 'int' instead ... I'll go out on a limb and hazard that he
> /might/ prefer the obfuscation of the delegate return; i.e. if all the
> various 'commands' were to be spelt out in an enum declaration, some
> darned developer might take "advantage" of that. Or, similarly, perhaps
> the values are compiler-specific?

But I haven't said that they'd be visible. Indeed, I specifically said that only the 0 value would be visible. All other values would be 'known' _only_ to the compiler (implementor).

> That's conjecture, of course. But it's the best I can come up with
> against the notion.

Since I haven't said that, I assume that you agree with me that there is no (good) argument against.

> I'm certainly a fan of tightening up valid ranges, so would also like
> to hear why this remains the way that it is ...

Walter???
