[Issue 8536] New: OPTLINK crash with large fixed-size array
August 10, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8536

           Summary: OPTLINK crash with large fixed-size array
           Product: D
           Version: D2
          Platform: x86
        OS/Version: Windows
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Optlink
        AssignedTo: nobody@puremagic.com
        ReportedBy: bearophile_hugs@eml.cc


--- Comment #0 from bearophile_hugs@eml.cc 2012-08-10 15:43:19 PDT ---
This program:

uint[1 << 24] a;
void main() {}


Gives this error:
test.d(2): Error: index 16777216 overflow for static array



While this program:

struct Foo { uint x; }
Foo[1 << 24] a;
void main() {}


Causes an OPTLINK crash.


I sometimes translate C programs to D that, for performance reasons, use large global 2D arrays. In D, a global __gshared dynamic array of dynamic arrays is an option, but it defeats some optimizations the compiler can perform when it knows the 2D matrix sizes at compile time. In my opinion, asking for 50-100 MB static 2D arrays is not much for a PC with 2+ GB of RAM.
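
For scale, each array in the two programs above holds 1 << 24 elements of 4 bytes (D's uint is 32 bits, and Foo wraps a single uint). A minimal C sketch of that arithmetic follows; the helper names array_bytes and array_mib are hypothetical and used only for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Bytes needed for an array of `count` 32-bit elements
   (assumption: each element is 4 bytes, like D's uint). */
static size_t array_bytes(size_t count) {
    return count * sizeof(uint32_t);
}

/* The same size expressed in MiB (integer division). */
static size_t array_mib(size_t count) {
    return array_bytes(count) / (1024u * 1024u);
}
```

array_mib((size_t)1 << 24) evaluates to 64, so each array is 64 MiB, which falls inside the 50-100 MB range mentioned above.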

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
August 10, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8536


Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla@digitalmars.com


--- Comment #1 from Walter Bright <bugzilla@digitalmars.com> 2012-08-10 16:01:07 PDT ---
This is a well known Optlink bug, though I don't have the bugzilla number handy.

You're wrong about it impeding optimizations compared with dynamically allocating it, for a couple reasons:

1. Static data is often accessed indirectly through a register anyway: in explicit code generated by the compiler, implicitly in how the CPU implements virtual memory, or because there is no way to address it other than as an offset from the program counter register.

2. There is no performance penalty for offsetting a base address register versus an addressing mode with just an address.

D knows the static compile time sizes of arrays if you use static arrays. That's what they're for.

August 11, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8536



--- Comment #2 from bearophile_hugs@eml.cc 2012-08-10 19:43:48 PDT ---
Created an attachment (id=1138)
Three C programs that show one effect of static 2D arrays

August 11, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8536



--- Comment #3 from bearophile_hugs@eml.cc 2012-08-10 19:49:18 PDT ---
(In reply to comment #1)

> This is a well known Optlink bug, though I don't have the bugzilla number handy.

OK.

> You're wrong about it impeding optimizations compared with dynamically allocating it, for a couple reasons:

This is a discussion better fit for the D newsgroup.

Attached are 3 nearly identical C programs that use a 2D global cache matrix to perform a certain simple (but not stupid) computation.

test0 uses a dynamically allocated "array" of pointers to "arrays", test1 uses a static array of dynamically allocated rows, and test2 uses a fully static 2D matrix. Compiled with GCC 4.7.1 using "-std=c99 -Ofast -flto -s", the run-times are 6.52, 6.07 and 4.95 seconds, respectively. The more the GCC compiler knows statically about the arrays, the more efficient the binary it produces.
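
The three layouts can be sketched roughly as below. This is not the code from the attachment: the dimensions NR and NC, the names cache0, cache1 and cache2, and the helper init_caches are all assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical dimensions; the attached programs may use different ones. */
enum { NR = 512, NC = 512 };

/* test0 style: dynamically allocated array of pointers to rows. */
static unsigned **cache0;

/* test1 style: static array of dynamically allocated rows. */
static unsigned *cache1[NR];

/* test2 style: fully static 2D matrix; both dimensions are
   compile-time constants. */
static unsigned cache2[NR][NC];

/* Allocate zeroed rows for the first two layouts.
   Returns 0 on success, -1 on allocation failure. */
static int init_caches(void) {
    cache0 = malloc(NR * sizeof *cache0);
    if (!cache0) return -1;
    for (size_t r = 0; r < NR; r++) {
        cache0[r] = calloc(NC, sizeof **cache0);
        cache1[r] = calloc(NC, sizeof *cache1[r]);
        if (!cache0[r] || !cache1[r]) return -1;
    }
    return 0;
}
```

All three are indexed as cacheN[r][c], but only cache2 lets the compiler compute the element address from constants alone; the other two need an extra pointer load per row.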

August 11, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8536



--- Comment #4 from Walter Bright <bugzilla@digitalmars.com> 2012-08-10 20:51:43 PDT ---
Your test is incorrectly written.

Use one array, not an array of arrays, and use a macro to compute the r*row+c index.
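
A minimal sketch of that layout, assuming a row-major block allocated at run time. The names cache, cache_nc and CACHE follow the ones used later in this thread; cache_init is a hypothetical helper, and none of this is taken from the attachments.

```c
#include <stdlib.h>

static unsigned *cache;   /* one contiguous block, not an array of arrays */
static size_t cache_nc;   /* number of columns, set at run time */

/* Row-major index: element (r, c) lives at offset r * cache_nc + c. */
#define CACHE(r, c) (cache[(r) * cache_nc + (c)])

/* Allocate a zeroed nr x nc matrix.
   Returns 0 on success, -1 on allocation failure. */
static int cache_init(size_t nr, size_t nc) {
    cache = calloc(nr * nc, sizeof *cache);
    cache_nc = nc;
    return cache ? 0 : -1;
}
```

After cache_init(4, 8), the expression CACHE(2, 3) refers to cache[19], reached with a single multiply-add and no second pointer load.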

August 11, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8536


bearophile_hugs@eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE


--- Comment #5 from bearophile_hugs@eml.cc 2012-08-11 05:47:52 PDT ---
*** This issue has been marked as a duplicate of issue 6678 ***

August 11, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8536



--- Comment #6 from bearophile_hugs@eml.cc 2012-08-11 05:50:58 PDT ---
Created an attachment (id=1139)
Version 4 of the C program

August 11, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8536



--- Comment #7 from bearophile_hugs@eml.cc 2012-08-11 05:58:53 PDT ---
(In reply to comment #4)
> Your test is incorrectly written.
> 
> Use one array, not an array of arrays, and use a macro to compute the r*row+c index.

Using your suggestions, the attached test3.c has a run-time of 4.84 seconds.

In D there are no macros, so I think you have to replace:

size_t cache_nc;
#define CACHE(r, c) (cache[(r)*cache_nc + (c)])

With something like:

__gshared size_t cache_nc;
ref CACHE(in size_t r, in size_t c) nothrow {
    return cache[r * cache_nc + c];
}

Or maybe use a custom matrix type with an overloaded [] and avoid global variables (but keep cache_nc global, possibly as an enum, so that loop unrolling remains possible: many static compilers don't unroll a loop unless they know its trip count statically, whereas JIT compilers such as Oracle's Java one can unroll on dynamic values too).
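
The enum idea can be illustrated in C, where the column count becomes a compile-time constant. This is only a sketch under assumed sizes: CACHE_NR, CACHE_NC and row_sum are hypothetical, and whether a given compiler actually unrolls the loop is not guaranteed.

```c
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
enum { CACHE_NR = 128, CACHE_NC = 64 };

static unsigned cache[CACHE_NR * CACHE_NC];

/* CACHE_NC is a compile-time constant, so the index arithmetic and
   the loop trip count below are known statically. */
#define CACHE(r, c) (cache[(size_t)(r) * CACHE_NC + (size_t)(c)])

/* Sum one row; the loop runs exactly CACHE_NC times. */
static unsigned long row_sum(size_t r) {
    unsigned long total = 0;
    for (size_t c = 0; c < (size_t)CACHE_NC; c++)
        total += CACHE(r, c);
    return total;
}
```

Compared with a run-time cache_nc variable, the constant gives the optimizer a fixed trip count to work with, which is the property the paragraph above is concerned about.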
