[Issue 4115] New: Reading few CPU flags from D code

April 21, 2010

Posted by bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=4115

           Summary: Reading few CPU flags from D code
           Product: D
           Version: future
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody@puremagic.com
        ReportedBy: bearophile_hugs@eml.cc


--- Comment #0 from bearophile_hugs@eml.cc 2010-04-21 16:52:34 PDT ---
Delphi has ranged integral types, which increase program safety by restricting a variable to a subrange. In D, a struct template can be written to implement a ranged integral value:

Ranged!(1, 1001, int) foo;
alias Ranged!('a', 'z'+1, char) Lowercase;

(The underlying type used by the struct can be omitted; for a range in [1, 1000], for example, it can choose an int.)
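A minimal sketch of what such a template might look like, assuming a half-open interval [lo, hi) as in Ranged!(1, 1001, int) and a default underlying type of int. The member names (value, get) and the choice to widen to long before testing are illustrative assumptions, not part of the report:

```d
// Hypothetical sketch of the Ranged template described above.
// Bounds form the half-open interval [lo, hi); T is an integral
// type narrower than long and defaults to int.
struct Ranged(long lo, long hi, T = int)
{
    static assert(lo < hi, "empty range");
    static assert(lo >= T.min && hi - 1 <= T.max, "bounds do not fit in T");

    private T value = cast(T) lo;

    this(T v)
    {
        opAssign(v);
    }

    // Checked assignment. The assert plays the role of the precondition
    // and disappears when compiling with -release.
    void opAssign(T v)
    {
        assert(lo <= v && v < hi, "value out of range");
        value = v;
    }

    // + and - are computed in long, so the range test also catches
    // results that would have overflowed T.
    Ranged opBinary(string op)(T rhs) const
        if (op == "+" || op == "-")
    {
        long wide = mixin("cast(long) value " ~ op ~ " cast(long) rhs");
        assert(lo <= wide && wide < hi, "result out of range");
        return Ranged(cast(T) wide);
    }

    // Read access to the underlying value.
    T get() const { return value; }
    alias get this;
}
```

With this sketch, `auto r = Ranged!(1, 1001)(10); auto s = r + 5;` leaves `s.get` equal to 15, while `r + 2000` trips the assertion in a non-release build.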

See a similar idea in C++: http://www.richherrick.com/software/herrick_library.html

Multiplication is much less common on ranged variables; +, -, == and assignment are the most common operations performed on them.

The preconditions of the methods of that struct can test for the out-of-range conditions. In release mode they get removed (or I can use a debug statement). But it's better to keep those tests when possible, so I'd like Ranged to be efficient.

Delphi ranges are also fast because the compiler can remove some unnecessary checks. I can't do this in a simple way (template expressions are overkill here). The struct has to test both for out-of-range values and for true overflows of the int/ubyte/etc. it is implemented on.

But there is no good solution in D because:
- Checking for overflow in D without inline assembly can be a little slow.
- Modern programmers know assembly less well than in the past.
- Asm is more error-prone.
- Asm is less portable than D code.
- dmd (or LDC without LDC extensions) doesn't inline functions and struct methods that contain asm code.
- The prologue and epilogue of the asm code may cancel any performance improvement gained by reading the overflow bit from asm.
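To make the first point concrete, here is a sketch of a signed 32-bit overflow test written in plain D, with no inline assembly. The function name checkedAdd is made up for illustration. The extra XOR/AND/compare work after the addition is what a direct read of the overflow flag would avoid:

```d
// Hypothetical helper: wrapping 32-bit signed add with an overflow test.
// Overflow happened exactly when both operands have the same sign
// and the result has the opposite sign.
int checkedAdd(int a, int b, out bool overflow)
{
    // Add as uint so the wraparound is explicit.
    int r = cast(int)(cast(uint) a + cast(uint) b);

    // (a ^ r) and (b ^ r) both have the sign bit set only when
    // r's sign differs from the sign of both operands.
    overflow = ((a ^ r) & (b ^ r)) < 0;
    return r;
}
```

For example, `checkedAdd(int.max, 1, o)` sets `o` to true, while `checkedAdd(int.max, -1, o)` leaves it false.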

One solution is to make the backend smarter, so that it recognizes patterns in the code and compiles them into good asm, but LLVM doesn't perform well here yet:

http://llvm.org/bugs/show_bug.cgi?id=4916
http://llvm.org/bugs/show_bug.cgi?id=4917
http://llvm.org/bugs/show_bug.cgi?id=4918

Even if/when LLVM implements those tiny optimizations, that's not a full solution, because the bad thing about compiler optimizations is that you can't rely on them.

A solution that I think is better is to add ways to read the contents of the Overflow, Zero and Carry flags to std.intrinsic. It gives good performance and is portable across many CPU types: CPUs aren't required to have all those flags, but they are common, and the compiler can map the requested semantics onto the correct asm instructions for each CPU.

A simple way to implement this is to turn them into boolean functions that the compiler handles in a special way, like the other intrinsics:

bool over = overflow_flag();
if (carry_flag()) {...} else {...}

Then the compiler has to handle them efficiently (for example, here, using a single JNO or JO instruction) and inline functions that contain such intrinsics.

Unlike the other intrinsics, I have given them semantic names instead of the names of asm instructions, so the D compiler can use the right instruction on different CPUs, which makes them more portable.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------

http://d.puremagic.com/issues/show_bug.cgi?id=4115


Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |bugzilla@digitalmars.com
         Resolution|                            |WONTFIX


--- Comment #1 from Walter Bright <bugzilla@digitalmars.com> 2010-04-28 09:18:18 PDT ---
The trouble with the idea of, say:

   a = b + 2;
   c = carry();

where carry() is a compiler intrinsic that reads the CPU carry flag, is that there's nothing that says the previous statement even sets the carry flag in any consistent manner. For example:

   a = b + 1;

may be implemented with an INC instruction, which does not set the carry flag on overflow. Many optimizations may transform the code to not set the carry flag, or to have the carry flag set by some other operation.

While this idea looks portable, it would be completely non-portable in practice. Even its behavior with one compiler can arbitrarily depend on the code mix surrounding it, and be highly sensitive to any changes in it in ways that would be impractical for the user to track.

In its proposed form, the idea is not workable.
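One way around the ordering problem is to tie the arithmetic and the flag together in a single call, so that there is no separate "previous statement" for the optimizer to rearrange. As a sketch only: druntime's core.checkedint module, which is not part of this report and was added to D later, provides functions of that shape. The wrapper addOrSaturate below is a made-up name for illustration:

```d
import core.checkedint : adds;

// Hypothetical wrapper: add two ints, clamping to int.max / int.min
// instead of wrapping when the addition overflows.
int addOrSaturate(int a, int b)
{
    bool overflow = false;

    // adds() returns the sum and sets 'overflow' if it did not fit.
    // The flag is sticky: it is never reset to false by the call.
    int r = adds(a, b, overflow);

    // Signed addition overflows only when both operands share a sign,
    // so the sign of b gives the direction of the overflow.
    if (overflow)
        return b > 0 ? int.max : int.min;
    return r;
}
```

Here `addOrSaturate(int.max, 1)` yields `int.max` and `addOrSaturate(2, 3)` yields 5. Because the operation and its overflow result come from one function, the compiler is free to implement it with whatever instruction sequence sets the flag reliably.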
