January 29, 2013 New std.uni: ready for more beating

Recap:
During a couple of rounds of the informal review new std.uni had its docs happily destroyed, and later re-written based on the feedback.

Notable changes:

- Fixed a couple of latent bugs (ouch!)

- unicode.xyz helper was redesigned to have a clear path for extension to properties other than binary ones. For instance, to get all of the code points with hangul syllable type L (leading Jamo):

    auto leadingJamo = unicode.hangulSyllableType("L");

- Squeezed extra 31Kb slack from object-file size (32 bits, more on 64). Now all of the packed tables occupy around 350Kb (32bits) and

If you happen to know some tricks to reduce object file size (and in turn the executable size), please chime in.

Code & benchmark: https://github.com/blackwhale/gsoc-bench-2012

Docs: http://blackwhale.github.com/phobos/uni.html
(looks far better without the JS jump-table)

It's a standalone module at the moment. To use it in place of the current std.uni, replace 'std.uni'->'uni' in your programs and compare the results. Make sure that both the uni and unicode_tables modules are linked in; rdmd can take care of this dependency.

P.S. Time to go for the formal review?

P.P.S. Got to catch some sleep ...

--
Dmitry Olshansky
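For readers who want to try the property lookup quoted in the post, here is a minimal sketch. It assumes the API as it later appeared in Phobos (`import std.uni`); with the standalone module described above the import would be `uni` instead, and details may differ. The helper name `isLeadingJamo` is hypothetical and exists only for illustration.

```d
import std.uni;

// Hypothetical helper (not part of std.uni): true if `c` has
// Hangul_Syllable_Type = L, i.e. it is a leading Jamo.
bool isLeadingJamo(dchar c)
{
    // Look the property value up by name; the result is a set of code points.
    auto leadingJamo = unicode.hangulSyllableType("L");
    return leadingJamo[c];
}

void main()
{
    assert(isLeadingJamo('\u1100'));  // HANGUL CHOSEONG KIYEOK, a leading consonant
    assert(!isLeadingJamo('\u1161')); // HANGUL JUNGSEONG A, a vowel (type V)
    assert(!isLeadingJamo('a'));      // not Hangul at all
}
```

Rebuilding the set on every call is wasteful; real code would look it up once and reuse it.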
January 31, 2013 Re: New std.uni: ready for more beating

Posted in reply to Dmitry Olshansky

30-Jan-2013 01:52, Dmitry Olshansky wrote:
> Recap:
> During a couple of rounds of the informal review new std.uni had its
> docs happily destroyed, and later re-written based on the feedback.
>
> [snip]
> - Squeezed extra 31Kb slack from object-file size (32 bits, more on 64).
> Now all of the packed tables occupy around 350Kb (32bits) and
> If you happen to know some tricks to reduce object file size (and in
> turn the executable size), please chime in.

My post got lost in the ether apparently. And it wasn't even complete - and on 64bits it's 464Kb of tables alone. Needless to say I'm worried about these sizes getting too large given that D is pretty much statically linked ATM.

>
> Code & benchmark: https://github.com/blackwhale/gsoc-bench-2012
>
> Docs: http://blackwhale.github.com/phobos/uni.html
> (looks far better without the JS jump-table)
>
> It's a standalone module at the moment. To use in place of current
> std.uni replace 'std.uni'->'uni' in your programs and compare the
> results. Make sure that both uni and unicode_tables modules are linked
> in, rdmd can take care of this dependency.

Let me make it more explicit.

I'm looking for a review manager and anybody willing to revive the review process instead of venting steam on proper property (pun intended) design and seeking a value in requiring parens on no-arg call (or proving otherwise).

--
Dmitry Olshansky
January 31, 2013 Re: New std.uni: ready for more beating

Posted in reply to Dmitry Olshansky

On Thu, Jan 31, 2013 at 11:27:57PM +0400, Dmitry Olshansky wrote:
> 30-Jan-2013 01:52, Dmitry Olshansky wrote:
> >Recap:
> >During a couple of rounds of the informal review new std.uni had its
> >docs happily destroyed, and later re-written based on the feedback.
> >
>
> [snip]
>
> >- Squeezed extra 31Kb slack from object-file size (32 bits, more on
> >64). Now all of the packed tables occupy around 350Kb (32bits) and
> >If you happen to know some tricks to reduce object file size (and in
> >turn the executable size), please chime in.
>
> My post got lost in the ether apparently. And it even wasn't complete - and on 64bits it's 464Kb of tables alone. Needless to say I'm worried about these sizes getting too large given that D is pretty much statically linked ATM.

It didn't get lost. I saw it. I just haven't had the chance to review it yet. :)

[...]
> Let me make it more explicit.
>
> I'm looking for a review manager and anybody willing to revive the review process instead of venting steam on proper property (pun intended) design and seeking a value in requiring parens on no-arg call (or proving otherwise).
[...]

Yeah I've basically resorted to thread-deleting the entire @property thread along with its several unending sibling threads. It's not so much that I don't care about it, as that it's just gotten too long-winded and tiring. I'm ready to throw up my hands and let it all go down the pipes.

I don't think I've the time/energy to be a review manager, but I *will* try to get to reviewing the code again sometime soon. IMNSHO, getting the new std.uni into Phobos is *far* more important (and far more profitable!) than the mountain out of molehill that is the current property discussion.

T

--
I'm still trying to find a pun for "punishment"...
January 31, 2013 Re: New std.uni: ready for more beating

Posted in reply to Dmitry Olshansky

On Thursday, 31 January 2013 at 19:27:59 UTC, Dmitry Olshansky wrote:
> I'm looking for a review manager and anybody willing to revive the review process instead of venting steam on proper property (pun intended) design and seeking a value in requiring parens on no-arg call (or proving otherwise).

If nobody else steps up in the next few days, I'll do it. But I really hope somebody beats me to it, as I'd rather focus completely on getting a new 2.061-based LDC release out.

David
February 02, 2013 Re: New std.uni: ready for more beating

Posted in reply to H. S. Teoh

31-Jan-2013 23:48, H. S. Teoh wrote:
> On Thu, Jan 31, 2013 at 11:27:57PM +0400, Dmitry Olshansky wrote:
>> 30-Jan-2013 01:52, Dmitry Olshansky wrote:
>>> Recap:
>>> During a couple of rounds of the informal review new std.uni had its
>>> docs happily destroyed, and later re-written based on the feedback.
>>>
>>
>> [snip]
>>
>>> - Squeezed extra 31Kb slack from object-file size (32 bits, more on
>>> 64). Now all of the packed tables occupy around 350Kb (32bits) and
>>> If you happen to know some tricks to reduce object file size (and in
>>> turn the executable size), please chime in.
>>
>> My post got lost in the ether apparently. And it even wasn't complete
>> - and on 64bits it's 464Kb of tables alone. Needless to say I'm
>> worried about these sizes getting too large given that D is pretty
>> much statically linked ATM.
>
> It didn't get lost. I saw it. I just haven't had the chance to review it
> yet. :)
>

Great, I think I was spoiled by the great speed of the previous destructive review. I guess no news is good news :)

--
Dmitry Olshansky
February 23, 2013 Re: New std.uni: ready for more beating

Posted in reply to Dmitry Olshansky

On Wed, Jan 30, 2013 at 01:52:20AM +0400, Dmitry Olshansky wrote:
> Recap:
> During a couple of rounds of the informal review new std.uni had its
> docs happily destroyed, and later re-written based on the feedback.
>
> Notable changes:
>
> - Fixed a couple of latent bugs (ouch!)
>
> - unicode.xyz helper was redesigned to have a clear path for extension to properties other then binary ones. For instance to get all of code points with hangul syllable type L (leading Jamo):
>
> auto leadingJamo = unicode.hangulSyllableType("L");
>
> - Squeezed extra 31Kb slack from object-file size (32 bits, more on
> 64). Now all of the packed tables occupy around 350Kb (32bits) and
> If you happen to know some tricks to reduce object file size (and in
> turn the executable size), please chime in.
>
> Code & benchmark: https://github.com/blackwhale/gsoc-bench-2012
>
> Docs: http://blackwhale.github.com/phobos/uni.html
> (looks far better without the JS jump-table)
>
> It's a standalone module at the moment. To use in place of current std.uni replace 'std.uni'->'uni' in your programs and compare the results. Make sure that both uni and unicode_tables modules are linked in, rdmd can take care of this dependency.
>
> P.S. Time to go for the formal review?
[...]

Alright, I decided to just jump in and re-review std.uni. I *really* want to see this in Phobos, the sooner the better.

Here are some comments:

- In the first part of the docs, Terminology section, under "Code unit": I think you mistyped a ddoc macro, it should be ($(D char)) instead of (($D char)).

- lineSep, paraSep: are these fixed values? It would be nice to indicate what their values are.

- UnicodeDecomposition: it would be nice to document the values in this enum.

- normalize(): I think your code example has a duplicated line (NFKC example appears twice).

- allowedIn(): How about an example where a character is *not* allowed in a normalization form?

- InversionList.opBinary: I still prefer ^ instead of ~ for symmetric difference. In D, ~ means "append", and it's very confusing when x~y means symmetric difference instead of append.

- unicode.opDispatch: it would be nice to provide links to official Unicode documentation that lists all blocks/scripts, as a reference.

- combiningClass: maybe provide a link to official Unicode docs that list combining class values?

OK, a lot of this is just nitpicks... but overall, this new std.uni looks very good. Looking forward to it being merged into Phobos!

T

--
Marketing: the art of convincing people to pay for what they didn't need before which you can't deliver after.
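To make the operator complaint concrete, here is a small sketch of the set algebra in question. It assumes the `CodepointSet` alias and the operators as they appear in the std.uni that later shipped with Phobos, where `~` on two sets is symmetric difference; the proposal under review may differ in details. The helper name `symmetricDifference` is hypothetical.

```d
import std.uni;

// Hypothetical helper: spells out what `a ~ b` means for code point sets,
// i.e. everything that is in exactly one of the two sets.
CodepointSet symmetricDifference(CodepointSet a, CodepointSet b)
{
    return (a | b) - (a & b);
}

void main()
{
    // Intervals are half-open: [lo, hi)
    auto a = CodepointSet('a', 'e'); // a, b, c, d
    auto b = CodepointSet('c', 'g'); // c, d, e, f

    // For sets, ~ is symmetric difference, not concatenation.
    auto sym = a ~ b;                // a, b, e, f
    assert(sym == CodepointSet('a', 'c', 'e', 'g'));
    assert(sym == symmetricDifference(a, b));
    assert(sym['a'] && sym['f'] && !sym['c'] && !sym['d']);
}
```

With arrays, `x ~ y` would concatenate, which is the source of the confusion raised in the review.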
February 24, 2013 Re: New std.uni: ready for more beating

Posted in reply to Dmitry Olshansky

Would it make sense to incorporate tests from the ICU test suite? There are API tests and many data tests around.

One can find the tests in the current release under

icu4c-50_1_2-src\icu\source\test
February 25, 2013 Re: New std.uni: ready for more beating

Posted in reply to Dmitry Olshansky

Hi. Just a couple stupid questions:

* What is the relation between std.uni and std.utf? Why are two modules needed? Seems confusing to me. Shouldn't these be combined? If not, then please explain the distinction in the beginning of the module documentation.

* Shouldn't the module be renamed to std.unicode? We do not have std.arr, std.alg or std.cont either. To me, it is not at all obvious what std.uni contains based on the module name.
February 25, 2013 Re: New std.uni: ready for more beating

Posted in reply to tn

25-Feb-2013 22:08, tn wrote:
> Hi. Just a couple stupid questions:
>
> * What is the relation between std.uni and std.utf? Why are two modules
> needed? Seems confusing to me. Shouldn't these be combined? If not, then
> please explain the distinction in the beginning of the module
> documentation.

std.uni was the C "ctype" of Unicode. Except it failed to deliver even this starting with about Unicode 5.1.

std.utf is all about encoding/decoding UTF-8, UTF-16.
If I were designing it from scratch (and what the hell I might one day have to) I'd put these into std.encoding or even std.encoding.utf.

I'd probably put a small note that basic encoding is both:
a) built into the language (foreach)
b) to be found in std.utf

>
> * Shouldn't the module be renamed to std.unicode?

Good idea. But part of the reason was fixing the existing std.uni to:
a) let it work in the Unicode 6.1 world (and even 6.2 as of now)
b) make it faster when dealing with Unicode code points in all of the isAlpha etc. functions.
c) add a bunch of new cool tools for Unicode

Basically the API is a superset of the existing one. I didn't want to change the name.

> We do not have
> std.arr, std.alg or std.cont either. To me, it is not at all obvious
> what std.uni contains based on the module name.

What can I say, Phobos is an example of software evolution ;)

--
Dmitry Olshansky
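The division of labour described in this reply can be sketched roughly as follows: decoding lives in the language (`foreach` over `dchar`) and in std.utf, while std.uni classifies the resulting code points. This is only an illustration using `std.utf.decode` and `std.uni.isAlpha` from Phobos; the helper name `countCodePoints` is hypothetical.

```d
import std.uni : isAlpha;
import std.utf : decode;

// Hypothetical helper: counts code points using the UTF-8 decoding that
// is built into foreach when the loop variable is a dchar.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string s = "aя"; // 'я' (U+044F) takes two UTF-8 code units

    // (a) decoding built into the language
    assert(s.length == 3);           // code units
    assert(countCodePoints(s) == 2); // code points

    // (b) explicit decoding with std.utf
    size_t index = 0;
    dchar first = decode(s, index);
    assert(first == 'a' && index == 1);
    dchar second = decode(s, index);
    assert(second == 'я' && index == 3);

    // Classification of the decoded code points is std.uni's job.
    assert(isAlpha(first) && isAlpha(second));
}
```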
February 25, 2013 Re: New std.uni: ready for more beating

Posted in reply to dennis luehring

24-Feb-2013 12:32, dennis luehring wrote:
> would it make sense to incorporate tests from the ICU testsuite - there
> are api tests and many data tests around

For key algorithms I'm using the consortium's test data files plus running randomly generated stress tests against ICU.

It might make sense to incorporate some of their tests, but I'm wondering if it'll end up only as a difference in the API. That being said, the tests are already unwieldy and largely run as separate programs depending on the said data files. The unittests there are mostly sanity checks and self-agreement-between-components kind of tests.

> some can find the tests in the current release under
>
> icu4c-50_1_2-src\icu\source\test

--
Dmitry Olshansky
Copyright © 1999-2021 by the D Language Foundation