Thread overview
Breaking news: std.uni changes!
Dec 25, 2022
Dom Disc
Dec 26, 2022
Robert Schadek
Dec 27, 2022
Walter Bright
Dec 27, 2022
Dukc
Jan 02, 2023
Dukc
Jan 03, 2023
H. S. Teoh
Jan 03, 2023
Adam D Ruppe
Jan 03, 2023
Dukc
December 24, 2022

Hello one and all on this merriest of all days!

Today unfortunately I bring anything but joy. For std.uni has had a bout of work!

  • Unicode tables have been updated to 15 from 6.2 (and with that the generator is now in Phobos!).
  • The Unicode category C (aka Other) has been brought in line with the TR44 specification, e.g. unicode.C.

In both cases, if you use std.uni directly or indirectly (say, via std.regex), you may find yourself with code breakage on the next release.

If you do run into problems, first check whether you are referencing the C category. If you are, here is some code to mitigate the breakage, though it would be better not to need it.

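// Returns the pre-update meaning of the C ("Other") category;
// every other property name is forwarded to std.uni unchanged.
@property auto loadPropertyOriginal(string name)() pure
{
    import std.uni : unicode;

    static if (name == "C" || name == "c" || name == "other" || name == "Other")
    {
        // Old definition of C: Co | Lo | No | So | Po.
        auto target = unicode.Co;
        target |= unicode.Lo;
        target |= unicode.No;
        target |= unicode.So;
        target |= unicode.Po;
        return target;
    }
    else
        return unicode.opDispatch!name; // normal std.uni lookup
}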
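
For comparison, here is a minimal sketch of what the TR44 definition of C amounts to: the union of the five "Other" subcategories (Cc, Cf, Cs, Co, Cn). It builds the set by hand instead of relying on unicode.C, so it should behave the same before and after the update. The variable name `other` and the sample code points are only illustrative.

```d
import std.uni : unicode;

void main()
{
    // UAX #44 defines C (Other) as Cc | Cf | Cs | Co | Cn.
    auto other = unicode.Cc | unicode.Cf | unicode.Cs | unicode.Co | unicode.Cn;

    assert(other['\n']);      // U+000A is a control character (Cc)
    assert(other['\uE000']);  // U+E000 is private use (Co)
    assert(!other['a']);      // a lowercase letter (Ll) is not in C
    assert(!other['!']);      // punctuation (Po) is not in C
}
```

Code that expected letters, numbers, symbols or punctuation to match C was relying on the old definition and is the kind of code the shim above is meant for.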

Lastly, the table update has already brought much joy to MIR, in the form of a broken test. A character under test wasn't allocated in Unicode 6.2 but was in 7, so the results differed. If your test suite is not part of the Phobos runners, please be aware that you may see failing tests once you update. These are unavoidable, because the tables follow an external specification. In even worse news, the table generator was not kept in working condition over the last 10 years, so there is a chance that something has been missed.

In all cases, please do contact me if you need assistance. I'm available on Discord, OFTC #d and of course N.G. or even email if you really need it (firstname@lastname.co.nz).

--- Happy holidays to those that are currently enjoying them or about to!

December 25, 2022

On Saturday, 24 December 2022 at 21:26:40 UTC, Richard (Rikki) Andrew Cattermole wrote:

>   • Unicode tables have been updated to 15 from 6.2 (and with that the generator is now in Phobos!).

Hurray!
Whatever problems this may cause, it's problems in very, very outdated code that would already need an overhaul, so what.
But it's super to finally have tables that are (at least now) up to date!

December 26, 2022

Awesome work, thank you

December 26, 2022
A big thank you!
December 27, 2022

On Saturday, 24 December 2022 at 21:26:40 UTC, Richard (Rikki) Andrew Cattermole wrote:

> Hello one and all on this merriest of all days!
>
> Today unfortunately I bring anything but joy. For std.uni has had a bout of work!
>
>   • Unicode tables have been updated to 15 from 6.2 (and with that the generator is now in Phobos!).
>   • The Unicode category C (aka Other) has been brought in line with the TR44 specification, e.g. unicode.C.

This is a big service for us at Symmetry. Getting Unicode support up to date was needed, we would have had to switch libraries at some point or update it ourselves. But now, nothing to do except perhaps dealing with a bit of breakage. Thank you!

I see it's not quite Unicode 15 though. `graphemeStride` does not take Emoji sequences and prepend characters into account. I'm going to contribute a bit now since it's the holidays, and this is a good task for me. PR coming soon unless I run into issues!
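
To illustrate what `graphemeStride` reports, here is a small sketch. The combining-mark case follows the documented behaviour (lengths are in code units). The emoji case is an assumed example of a ZWJ sequence, and its result is printed rather than asserted, because it depends on the Phobos version in use.

```d
import std.stdio : writeln;
import std.uni : graphemeStride;

void main()
{
    // "A" followed by U+030A COMBINING RING ABOVE forms a single grapheme.
    string city = "A\u030Arhus";
    size_t first = graphemeStride(city, 0);
    assert(first == 3); // 1 UTF-8 code unit for 'A' + 2 for U+030A
    assert(city[0 .. first] == "A\u030A");
    assert(city[first .. $] == "rhus");

    // Man + ZWJ + woman + ZWJ + girl: one emoji sequence to a reader.
    // Whether it comes back as one cluster depends on the Phobos version,
    // so the value is only printed here.
    string family = "\U0001F468\u200D\U0001F469\u200D\U0001F467";
    writeln(graphemeStride(family, 0), " of ", family.length, " code units");
}
```

If the printed stride is smaller than the full length, the sequence is being split into several clusters.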

December 28, 2022
On 28/12/2022 12:13 AM, Dukc wrote:
> This is a big service for us at Symmetry. Getting Unicode support up to date was needed, we would have had to switch libraries at some point or update it ourselves. But now, nothing to do except perhaps dealing with a bit of breakage. Thank you!

I had no idea that this was becoming an issue for you guys. It wasn't in any of the meeting notes and I haven't seen it brought up anywhere. So if there is anything more like this, please talk about it!

> I see it's not quite Unicode 15 though. `graphemeStride` does not take Emoji sequences and prepend characters into account. I'm going to contribute a bit now since it's holiday, and this is a good task for me. PR coming soon unless I run into issues!

Yeah, there will be tons of small stuff currently missed out due to such a big jump and of course ping me @rikkimax, when you have something to review.

Loads of other work available such as culling all the version specific information out of the docs :)
January 02, 2023
(Sorry for the late answer)

On Wednesday, 28 December 2022 at 00:10:36 UTC, Richard (Rikki) Andrew Cattermole wrote:
> On 28/12/2022 12:13 AM, Dukc wrote:
>> This is a big service for us at Symmetry. Getting Unicode support up to date was needed, we would have had to switch libraries at some point or update it ourselves. But now, nothing to do except perhaps dealing with a bit of breakage. Thank you!
>
> I had no idea that this was becoming an issue for you guys. It wasn't in any of the meeting notes and I haven't seen it brought up anywhere. So if there is anything more like this, please talk about it!

Yes, I should have done that.

>
>> I see it's not quite Unicode 15 though. `graphemeStride` does not take Emoji sequences and prepend characters into account. I'm going to contribute a bit now since it's holiday, and this is a good task for me. PR coming soon unless I run into issues!
>
> Yeah, there will be tons of small stuff currently missed out due to such a big jump and of course ping me @rikkimax, when you have something to review.
>
> Loads of other work available such as culling all the version specific information out of the docs :)

Other things coming to mind: Bidirectional grapheme iteration, Word break and line break algorithms, lazy normalisation. Indeed, lots of improvement potential.


January 03, 2023
On 03/01/2023 10:24 AM, Dukc wrote:
> Other things coming to mind: Bidirectional grapheme iteration, Word break and line break algorithms, lazy normalisation. Indeed, lots of improvement potential.

I've done word break, "lazy" normalization (so can stop at any point), and lazy case insensitive comparison with normalization.

But: Bidirectional grapheme iteration makes my eye twitch lol.

My main concern for adding new features is increasing the size of Phobos binary for the tables. Most people don't need a lot of these optional algorithms, but they do need things like casing to work correctly (which makes increased size worth it).
January 02, 2023
On Tue, Jan 03, 2023 at 05:13:53PM +1300, Richard (Rikki) Andrew Cattermole via Digitalmars-d-announce wrote:
> On 03/01/2023 10:24 AM, Dukc wrote:
> > Other things coming to mind: Bidirectional grapheme iteration, Word break and line break algorithms, lazy normalisation. Indeed, lots of improvement potential.
> 
> I've done word break, "lazy" normalization (so can stop at any point), and lazy case insensitive comparison with normalization.
> 
> But: Bidirectional grapheme iteration makes my eye twitch lol.
> 
> My main concern for adding new features is increasing the size of Phobos binary for the tables. Most people don't need a lot of these optional algorithms, but they do need things like casing to work correctly (which makes increased size worth it).

Is there a way to make these tables pay-as-you-go? As in, if you never call a function that depends on a table, it would not be pulled into the binary?


T

-- 
They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill
January 03, 2023
On 03/01/2023 6:13 PM, H. S. Teoh wrote:
> Is there a way to make these tables pay-as-you-go? As in, if you never
> call a function that depends on a table, it would not be pulled into the
> binary?

This should already be the case. I saw some work from about 10 years ago involving Rainer, who helped improve it along these lines.

The main concern would be shared libraries: Phobos should be distributable as a shared library on all platforms, by all compilers.