April 17, 2015
[Issue 14458] New: very slow ubyte[] assignment (dmd doesn't use memset)

https://issues.dlang.org/show_bug.cgi?id=14458

          Issue ID: 14458
           Summary: very slow ubyte[] assignment (dmd doesn't use memset)
           Product: D
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P1
         Component: DMD
          Assignee: nobody@puremagic.com
          Reporter: code@dawg.eu

Tracked down a severe performance issue in my new AA implementation, in the
code that zeroes a freshly allocated entry.
DMD generates the following code for the assignment.

----
void zero(ubyte[] ary)
{
    ary[] = 0;
}
----
        mov     rcx, rdi                ; 0008 _ 48: 89. F9
        xor     rax, rax                ; 000B _ 48: 31. C0
        mov     rdi, rsi                ; 000E _ 48: 8B. FE
        rep stosb                       ; 0011 _ F3: AA
----

This is a bytewise store of 0 and is about 4x slower than memset if sz >= 4,
where sz is the array length in bytes. It's slightly faster for sz < 4.

Not sure why `rep stosb` suddenly becomes 4x slower when sz increases from 3 to
4 bytes, but in any case the compiler should either optimize the small case to
direct assignments and the big case to memset, or always use memset.
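
Until the code generation changes, one possible source-level workaround is to call the C runtime's memset directly instead of relying on the slice assignment. The sketch below is only an illustration and is not taken from the report: the helper names `zeroSlice` and `zeroMemset` are hypothetical, and whether memset actually wins on a given machine and compiler version has to be measured.

```d
import core.stdc.string : memset;

/// Slice assignment as in the report; the report says DMD lowers this to `rep stosb`.
void zeroSlice(ubyte[] ary)
{
    ary[] = 0;
}

/// Hypothetical workaround (not from the report): call C's memset directly.
void zeroMemset(ubyte[] ary)
{
    // Skip empty slices so memset is never handed a null pointer.
    if (ary.length != 0)
        memset(ary.ptr, 0, ary.length);
}

unittest
{
    // Check both helpers on lengths around the 3-to-4 byte boundary from the report.
    immutable size_t[] lengths = [0, 1, 3, 4, 64];
    foreach (len; lengths)
    {
        auto a = new ubyte[](len);
        auto b = new ubyte[](len);
        a[] = 0xFF;
        b[] = 0xFF;
        zeroSlice(a);
        zeroMemset(b);
        assert(a == b);
        foreach (x; b)
            assert(x == 0);
    }
}
```

Saved as a single file, this can be checked with `dmd -unittest -main -run file.d`. It only verifies that both helpers produce the same zeroed buffer; it says nothing about their relative speed.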