Thread overview | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
February 05, 2012 [Issue 7442] New: ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
http://d.puremagic.com/issues/show_bug.cgi?id=7442 Summary: ctRegex!`\p{Letter}` uses a lot memory in compilation Product: D Version: D2 Platform: All OS/Version: All Status: NEW Keywords: CTFE Severity: critical Priority: P2 Component: DMD AssignedTo: nobody@puremagic.com ReportedBy: kennytm@gmail.com --- Comment #0 from kennytm@gmail.com 2012-02-05 02:30:37 PST --- Test case: ------------------------ import std.regex; enum bug7442 = ctRegex!`\p{Letter}`; ------------------------ Compile with `dmd -c test7442.d`. DMD will very quickly use up all system memory. (I'll try to reduce the test case later.) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
March 20, 2012 [Issue 7442] ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
Posted in reply to kennytm@gmail.com | http://d.puremagic.com/issues/show_bug.cgi?id=7442 Jay Norwood <jayn@prismnet.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |jayn@prismnet.com --- Comment #1 from Jay Norwood <jayn@prismnet.com> 2012-03-19 20:11:58 PDT --- I see probably a related issue when compiling either of these expressions in win7 void wcpx(string fn) { enum ctr = ctRegex!(r"\w+","g"); } void wcpx(string fn) { enum ctr = regex(r"\w+","g"); } ------ Build started: Project: a7, Configuration: Release Win32 ------ Building Release\a7.exe... Error: out of memory Building Release\a7.exe failed! -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
March 25, 2012 [Issue 7442] ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
Posted in reply to kennytm@gmail.com | http://d.puremagic.com/issues/show_bug.cgi?id=7442 josvanuden@gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |josvanuden@gmail.com --- Comment #2 from josvanuden@gmail.com 2012-03-25 12:18:47 PDT --- Same here (win7). Either of these cause out of memory error: enum r1 = ctRegex!`\w`; enum r2 = ctRegex!`\W`; -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
April 11, 2012 [Issue 7442] ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
Posted in reply to kennytm@gmail.com | http://d.puremagic.com/issues/show_bug.cgi?id=7442 Don <clugdbug@yahoo.com.au> changed: What |Removed |Added ---------------------------------------------------------------------------- Keywords|CTFE | CC| |clugdbug@yahoo.com.au Component|DMD |Phobos --- Comment #3 from Don <clugdbug@yahoo.com.au> 2012-04-11 01:06:51 PDT --- This is slow because the code makes 15000 concatenations (and I didn't check what else it does, there may be other performance issues). That would be poor performance even in runtime code. The memory usage is exacerbated by CTFE bug 1382 and bug 6498, but basically this is a Phobos bug. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
April 11, 2012 [Issue 7442] ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
Posted in reply to kennytm@gmail.com | http://d.puremagic.com/issues/show_bug.cgi?id=7442 Dmitry Olshansky <dmitry.olsh@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |dmitry.olsh@gmail.com --- Comment #4 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-04-11 03:25:39 PDT --- 15000 ? Wow. At any rate I tried hard to make it so that the thing never reallocates unless required (but CTFE has no support for assumeSafeAppend) so these should mostly be an in-place appends. Still I'll recheck if the R-T version is fast/slow maybe I missed something. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
April 13, 2012 [Issue 7442] ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
Posted in reply to kennytm@gmail.com | http://d.puremagic.com/issues/show_bug.cgi?id=7442 --- Comment #5 from Don <clugdbug@yahoo.com.au> 2012-04-13 07:35:32 PDT --- (In reply to comment #4) > 15000 ? Wow. At any rate I tried hard to make it so that the thing never reallocates unless required (but CTFE has no support for assumeSafeAppend) so these should mostly be an in-place appends. > > Still I'll recheck if the R-T version is fast/slow maybe I missed something. By the way, if you go into the compiler source, file interpret.c, line 34 is #define SHOWPERFORMANCE 0 If you change that 0 to 1, and then recompile, it will tell you some basic CTFE stats, eg how many assignments it performed. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
April 13, 2012 [Issue 7442] ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
Posted in reply to kennytm@gmail.com | http://d.puremagic.com/issues/show_bug.cgi?id=7442 --- Comment #6 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-04-13 11:34:43 PDT --- (In reply to comment #5) > > By the way, if you go into the compiler source, file interpret.c, line 34 is > #define SHOWPERFORMANCE 0 > If you change that 0 to 1, and then recompile, it will tell you some basic CTFE > stats, eg how many assignments it performed. Cool. I'm going to use this extensively. Might as well reduce ctRegex as some kind of benchmark for CTFE. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
April 17, 2012 [Issue 7442] ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
Posted in reply to kennytm@gmail.com | http://d.puremagic.com/issues/show_bug.cgi?id=7442 --- Comment #7 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-04-17 04:26:21 PDT --- I investigated this further and conclude that there are 2 factors at work. I removed few thousands of codepoints from Letter, so it doesn't run out of RAM outright.(see code below) Also separated parse from build steps. Here are collected stats on CTFE. parse only: ---- CTFE Performance ---- max call depth = 20 max stack = 44 array allocs = 2761 assignments = 430837 build: ---- CTFE Performance ---- max call depth = 20 max stack = 73 array allocs = 8264 assignments = 1293421 Parsing creates all the datastructures for unicode table machinery it takes slightly less then half of all allocs already. Another thing to notice is it fetures higher allocations per assigment. Then comes the codegen step and it's CTFE only and far more alloc happy. Frankly I see no way to reduce all of this alloc fun because of COW that will ruin any attempt to preallocate buffer for generated code. Am I right that arrays do dup on every write? --- test program --- import std.regex; void main(){ version(parse) static r = regex(set); else //build static r = ctRegex!set; } enum set = `[A-Za-z\u00AA\u00B5\u00BA\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02C1\u02C6-\u02D1\u02E0-\u02E4\u02EC\u02EE\u0370-\u0374\u0376\u0377 \u037A-\u037D\u0386\u0388-\u038A\u038C\u038E-\u03A1\u03A3-\u03F5\u03F7-\u0481\u048A-\u0527\u0531-\u0556\u0559\u0561-\u0587\u05D0-\u05EA \u05F0-\u05F2\u0620-\u064A\u066E\u066F\u0671-\u06D3\u06D5\u06E5\u06E6\u06EE\u06EF\u06FA-\u06FC\u06FF\u0710\u0712-\u072F\u074D-\u07A5 \u07B1\u07CA-\u07EA\u07F4\u07F5\u07FA\u0800-\u0815\u081A\u0824\u0828\u0840-\u0858\u08A0\u08A2-\u08AC\u0904-\u0939\u093D\u0950\u0958-\u0961 \u0971-\u0977\u0979-\u097F\u0985-\u098C\u098F\u0990\u0993-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09BD\u09CE\u09DC\u09DD\u09DF-\u09E1 \u09F0\u09F1\u0A05-\u0A0A\u0A0F\u0A10\u0A13-\u0A28\u0A2A-\u0A30\u0A32\u0A33\u0A35\u0A36\u0A38\u0A39\u0A59-\u0A5C\u0A5E\u0A72-\u0A74 \u0A85-\u0A8D\u0A8F-\u0A91\u0A93-\u0AA8\u0AAA-\u0AB0\u0AB2\u0AB3\u0AB5-\u0AB9\u0ABD\u0AD0\u0AE0\u0AE1\u0B05-\u0B0C\u0B0F\u0B10 \u0B13-\u0B28\u0B2A-\u0B30\u0B32\u0B33\u0B35-\u0B39\u0B3D\u0B5C\u0B5D\u0B5F-\u0B61\u0B71\u0B83\u0B85-\u0B8A\u0B8E-\u0B90 \u0B92-\u0B95\u0B99\u0B9A\u0B9C\u0B9E\u0B9F\u0BA3\u0BA4\u0BA8-\u0BAA\u0BAE-\u0BB9\u0BD0\u0C05-\u0C0C\u0C0E-\u0C10\u0C12-\u0C28 \u0C2A-\u0C33\u0C35-\u0C39\u0C3D\u0C58\u0C59\u0C60\u0C61\u0C85-\u0C8C\u0C8E-\u0C90\u0C92-\u0CA8\u0CAA-\u0CB3\u0CB5-\u0CB9 \u0CBD\u0CDE\u0CE0\u0CE1\u0CF1\u0CF2\u0D05-\u0D0C\u0D0E-\u0D10\u0D12-\u0D3A\u0D3D\u0D4E\u0D60\u0D61\u0D7A-\u0D7F\u0D85-\u0D96 \u0D9A-\u0DB1\u0DB3-\u0DBB\u0DBD\u0DC0-\u0DC6\u0E01-\u0E30\u0E32\u0E33\u0E40-\u0E46\u0E81\u0E82\u0E84\u0E87\u0E88\u0E8A\u0E8D \u0E94-\u0E97\u0E99-\u0E9F\u0EA1-\u0EA3\u0EA5\u0EA7\u0EAA\u0EAB\u0EAD-\u0EB0\u0EB2\u0EB3\u0EBD\u0EC0-\u0EC4\u0EC6\u0EDC-\u0EDF \u0F00\u0F40-\u0F47\u0F49-\u0F6C\u0F88-\u0F8C\u1000-\u102A\u103F\u1050-\u1055\u105A-\u105D\u1061\u1065\u1066\u106E-\u1070\u1075-\u1081 \u108E\u10A0-\u10C5\u10C7\u10CD\u10D0-\u10FA\u10FC-\u1248\u124A-\u124D\u1250-\u1256\u1258\u125A-\u125D\u1260-\u1288\u128A-\u128D \u1290-\u12B0\u12B2-\u12B5\u12B8-\u12BE\u12C0\u12C2-\u12C5\u12C8-\u12D6\u12D8-\u1310\u1312-\u1315\u1318-\u135A\u1380-\u138F\u13A0-\u13F4 \u1401-\u166C\u166F-\u167F\u1681-\u169A\u16A0-\u16EA\u1700-\u170C\u170E-\u1711\u1720-\u1731\u1740-\u1751\u1760-\u176C\u176E-\u1770\u1780-\u17B3 \u17D7\u17DC\u1820-\u1877\u1880-\u18A8\u18AA\u18B0-\u18F5\u1900-\u191C\u1950-\u196D\u1970-\u1974\u1980-\u19AB\u19C1-\u19C7\u1A00-\u1A16\u1A20-\u1A54 \u1AA7\u1B05-\u1B33\u1B45-\u1B4B\u1B83-\u1BA0\u1BAE\u1BAF\u1BBA-\u1BE5\u1C00-\u1C23\u1C4D-\u1C4F\u1C5A-\u1C7D\u1CE9-\u1CEC\u1CEE-\u1CF1\u1CF5\u1CF6 \u1D00-\u1DBF\u1E00-\u1F15\u1F18-\u1F1D\u1F20-\u1F45\u1F48-\u1F4D\u1F50-\u1F57\u1F59\u1F5B\u1F5D\u1F5F-\u1F7D\u1F80-\u1FB4\u1FB6-\u1FBC\u1FBE\u1FC2-\u1FC4 \u1FC6-\u1FCC\u1FD0-\u1FD3\u1FD6-\u1FDB\u1FE0-\u1FEC\u1FF2-\u1FF4\u1FF6-\u1FFC\u2071\u207F\u2090-\u209C\u2102\u2107\u210A-\u2113\u2115\u2119-\u211D\u2124 \u2126\u2128\u212A-\u212D\u212F-\u2139\u213C-\u213F\u2145-\u2149\u214E\u2183\u2184\u2C00-\u2C2E\u2C30-\u2C5E\u2C60-\u2CE4\u2CEB-\u2CEE\u2CF2 \u2CF3\u2D00-\u2D25\u2D27\u2D2D\u2D30-\u2D67\u2D6F\u2D80-\u2D96\u2DA0-\u2DA6\u2DA8-\u2DAE\u2DB0-\u2DB6\u2DB8-\u2DBE\u2DC0-\u2DC6\u2DC8-\u2DCE \u2DD0-\u2DD6\u2DD8-\u2DDE\u2E2F\u3005\u3006\u3031-\u3035\u303B\u303C\u3041-\u3096\u309D-\u309F\u30A1-\u30FA\u30FC-\u30FF\u3105-\u312D\u3131-\u318E \u31A0-\u31BA\u31F0-\u31FF\u3400-\u4DB5\u4E00-\u9FCC\uA000-\uA48C\uA4D0-\uA4FD\uA500-\uA60C\uA610-\uA61F\uA62A\uA62B\uA640-\uA66E\uA67F-\uA697\uA6A0-\uA6E5 \uA717-\uA71F\uA722-\uA788\uA78B-\uA78E\uA790-\uA793\uA7A0-\uA7AA\uA7F8-\uA801\uA803-\uA805\uA807-\uA80A\uA80C-\uA822\uA840-\uA873\uA882-\uA8B3\uA8F2-\uA8F7 \uA8FB\uA90A-\uA925\uA930-\uA946\uA960-\uA97C\uA984-\uA9B2\uA9CF\uAA00-\uAA28\uAA40-\uAA42\uAA44-\uAA4B\uAA60-\uAA76\uAA7A\uAA80-\uAAAF\uAAB1\uAAB5 \uAAB6\uAAB9-\uAABD\uAAC0\uAAC2\uAADB-\uAADD\uAAE0-\uAAEA\uAAF2-\uAAF4\uAB01-\uAB06\uAB09-\uAB0E\uAB11-\uAB16\uAB20-\uAB26\uAB28-\uAB2E\uABC0-\uABE2 \uAC00-\uD7A3\uD7B0-\uD7C6\uD7CB-\uD7FB\uF900-\uFA6D\uFA70-\uFAD9\uFB00-\uFB06\uFB13-\uFB17\uFB1D\uFB1F-\uFB28\uFB2A-\uFB36\uFB38-\uFB3C\uFB3E\uFB40 \uFB41\uFB43\uFB44\uFB46-\uFBB1\uFBD3-\uFD3D\uFD50-\uFD8F\uFD92-\uFDC7\uFDF0-\uFDFB\uFE70-\uFE74\uFE76-\uFEFC\uFF21-\uFF3A\uFF41-\uFF5A\uFF66-\uFFBE \uFFC2-\uFFC7\uFFCA-\uFFCF\uFFD2-\uFFD7\uFFDA-\uFFDC\U00010000-\U0001000B\U0001000D-\U00010026\U00010028-\U0001003A\U0001003C\U0001003D\U0001003F-\U0001004D \U00010050-\U0001005D\U00010080-\U000100FA\U00010280-\U0001029C\U000102A0-\U000102D0\U00010300-\U0001031E\U00010330-\U00010340\U00010342-\U00010349\U00010380-\U0001039D \U000103A0-\U000103C3\U000103C8-\U000103CF\U00010400-\U0001049D\U00010800-\U00010805\U00010808\U0001080A-\U00010835\U00010837\U00010838\U0001083C\U0001083F-\U00010855 \U00010900-\U00010915\U00010920-\U00010939\U00010980-\U000109B7\U000109BE\U000109BF\U00010A00\U00010A10-\U00010A13\U00010A15-\U00010A17\U00010A19-\U00010A33 \U00010A60-\U00010A7C\U00010B00-\U00010B35\U00010B40-\U00010B55\U00010B60-\U00010B72\U00010C00-\U00010C48\U00011003-\U00011037\U00011083-\U000110AF \U000110D0-\U000110E8\U00011103-\U00011126\U00011183-\U000111B2\U000111C1-\U000111C4\U00011680-\U000116AA\U00012000-\U0001236E\U00013000-\U0001342E\U00016800-\U00016A38 \U00016F00-\U00016F44\U00016F50\U00016F93-\U00016F9F\U0001B000\U0001B001\U0001D400-\U0001D454\U0001D456-\U0001D49C\U0001D49E\U0001D49F\U0001D4A2\U0001D4A5\U0001D4A6\U0001D4A9-\U0001D4AC \U0001D4AE-\U0001D4B9\U0001D4BB\U0001D4BD-\U0001D4C3\U0001D4C5-\U0001D505\U0001D507-\U0001D50A\U0001D50D-\U0001D514\U0001D516-\U0001D51C\U0001D51E-\U0001D539\U0001D53B-\U0001D53E \U0001D540-\U0001D544\U0001D546\U0001D54A-\U0001D550\U0001D552-\U0001D6A5\U0001D6A8-\U0001D6C0\U0001D6C2-\U0001D6DA\U0001D6DC-\U0001D6FA\U0001D6FC-\U0001D714 \U0001D716-\U0001D734\U0001D736-\U0001D74E\U0001D750-\U0001D76E\U0001D770-\U0001D788\U0001D78A-\U0001D7A8\U0001D7AA-\U0001D7C2\U0001D7C4-\U0001D7CB \U0001EE00-\U0001EE03\U0001EE05-\U0001EE1F\U0001EE21\U0001EE22\U0001EE24\U0001EE27\U0001EE29-\U0001EE32\U0001EE34-\U0001EE37\U0001EE39\U0001EE3B\U0001EE42 \U0001EEA1-\U0001EEA3\U0001EEA5-\U0001EEA9\U0001EEAB-\U0001EEBB\U00020000-\U0002A6D6\U0002A700-\U0002B734\U0002B740-\U0002B81D\U0002F800-\U0002FA1D]`; -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
April 17, 2012 [Issue 7442] ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
Posted in reply to kennytm@gmail.com | http://d.puremagic.com/issues/show_bug.cgi?id=7442 --- Comment #8 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-04-17 04:35:31 PDT --- Created an attachment (id=1096) Benchmark unicode Trie Benchmark runs core part of parse step with huge character classes. Currently it chokes on 2 iterations. Unless it can swallow at least 5 CTFE _parsing_ is almost unusable. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
April 19, 2012 [Issue 7442] ctRegex!`\p{Letter}` uses a lot memory in compilation | ||||
---|---|---|---|---|
| ||||
Posted in reply to kennytm@gmail.com | http://d.puremagic.com/issues/show_bug.cgi?id=7442 Dmitry Olshansky <dmitry.olsh@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |thornik@gmail.com --- Comment #9 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-04-19 08:53:00 PDT --- *** Issue 7928 has been marked as a duplicate of this issue. *** -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
Copyright © 1999-2021 by the D Language Foundation