Thread overview
Should std.conv:parse parse html entities?
Nov 13, 2019
berni44
Nov 13, 2019
Jonathan M Davis
Nov 13, 2019
Jonathan M Davis
Nov 13, 2019
Jonathan Marler
Nov 13, 2019
berni44
Nov 13, 2019
Jonathan Marler
Nov 14, 2019
Suleyman
Nov 14, 2019
berni44
Nov 15, 2019
Jonathan Marler
Aug 17, 2020
Sebastiaan Koppe
Aug 18, 2020
James Blachly
Aug 18, 2020
Sebastiaan Koppe
November 13, 2019
Concerning issue 9621 [1]: There are two things that parse currently doesn't parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if not needed at all.

As I don't think I should try to decide this on my own, I'd like to know your opinion: What is better: add the entities, or write in the docs that they are not supported? What do you think?

[1] https://issues.dlang.org/show_bug.cgi?id=9621
November 13, 2019
On Wednesday, November 13, 2019 5:17:17 AM MST berni44 via Digitalmars-d wrote:
> Concerning issue 9621 [1]: There are two things that parse currently doesn't parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if not needed at all.
>
> As I don't think I should try to decide this on my own, I'd like to know your opinion: What is better: add the entities, or write in the docs that they are not supported? What do you think?
>
> [1] https://issues.dlang.org/show_bug.cgi?id=9621

I fail to see why std.conv.to or std.conv.parse should handle either octal literals or HTML entities, and I don't know why anyone would expect them to. HTML entities are the kind of thing that I would expect an HTML parser to handle, not the standard library. The compiler does handle some of them (which, honestly, I think is kind of weird), which is the only argument I can see for supporting them in std.conv, but it's not like std.conv is designed to parse D code. Also, IIRC, octal literals were removed from the language, so that's not an argument for adding them to std.conv. They're also not all that commonly needed by anything AFAIK. parse can already parse integer values of arbitrary bases if you give it an explicit base / radix.
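
As a minimal sketch of the radix overload mentioned above: the helper name `parseInBase` is hypothetical and only wraps `std.conv.parse` for illustration; the `unittest` block shows octal and hexadecimal input going through the same call.

```d
import std.conv : parse;

/// Hypothetical helper: parses the leading digits of `text` in the given base.
int parseInBase(string text, uint radix)
{
    // parse takes its input by ref and consumes the characters it reads,
    // so it works on the local copy of the slice here.
    return parse!int(text, radix);
}

unittest
{
    assert(parseInBase("777", 8) == 511);  // octal digits, explicit radix 8
    assert(parseInBase("ff", 16) == 255);  // hexadecimal digits, radix 16
}
```

With an explicit radix the octal case needs no special literal syntax, which is the point being made about existing functionality.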

- Jonathan M Davis



November 13, 2019
On Wednesday, November 13, 2019 7:41:45 AM MST Jonathan M Davis via Digitalmars-d wrote:
> On Wednesday, November 13, 2019 5:17:17 AM MST berni44 via Digitalmars-d wrote:
> > Concerning issue 9621 [1]: There are two things that parse currently doesn't parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if not needed at all.
> >
> > As I don't think I should try to decide this on my own, I'd like to know your opinion: What is better: add the entities, or write in the docs that they are not supported? What do you think?
> >
> > [1] https://issues.dlang.org/show_bug.cgi?id=9621
>
> I fail to see why std.conv.to or std.conv.parse should handle either octal literals or HTML entities, and I don't know why anyone would expect them to. HTML entities are the kind of thing that I would expect an HTML parser to handle, not the standard library. The compiler does handle some of them (which, honestly, I think is kind of weird), which is the only argument I can see for supporting them in std.conv, but it's not like std.conv is designed to parse D code. Also, IIRC, octal literals were removed from the language, so that's not an argument for adding them to std.conv. They're also not all that commonly needed by anything AFAIK. parse can already parse integer values of arbitrary bases if you give it an explicit base / radix.

Actually, it looks like you can still have octal literals in strings even though support for octal integer literals was removed. Either way, given that the compiler is going to translate a string literal with an octal or HTML entity into what it represents rather than have it be something to parse, unless someone is constructing strings that use these rather than using string literals, there won't even be anything to parse. Personally, I don't see much reason to support either. What's the use case?
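
To illustrate the point about literals, here is a small sketch, assuming a compiler that still accepts named character entities and octal escapes in string literals. The function name `decodedAtCompileTime` is made up for the example.

```d
// The compiler resolves these escapes while lexing the literal,
// so the resulting string never contains the escape text itself.
static assert("\&amp;" == "&");   // named character entity
static assert("\101" == "A");     // octal escape: 0o101 is 65, i.e. 'A'

// A wysiwyg (r"...") literal keeps the raw characters, which is what a
// string assembled at run time would look like to a parser.
static assert(r"\&amp;".length == 6);

/// Hypothetical helper returning a literal that uses both escape forms.
string decodedAtCompileTime()
{
    return "\&amp;\101";
}
```

Only the raw form would ever reach a run-time parser, and only if a program builds such strings itself.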

- Jonathan M Davis



November 13, 2019
On Wednesday, 13 November 2019 at 12:17:17 UTC, berni44 wrote:
> Concerning issue 9621 [1]: There are two things that parse currently doesn't parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if not needed at all.
>
> As I don't think I should try to decide this on my own, I'd like to know your opinion: What is better: add the entities, or write in the docs that they are not supported? What do you think?
>
> [1] https://issues.dlang.org/show_bug.cgi?id=9621

Maybe you could put the table inside a template so it only gets compiled/included when it's used?

template HtmlEntityTable()
{
    const HtmlEntityTable = ...;
}
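
A slightly fuller sketch of that idea, with a three-entry stand-in for the real table. The `Entity` struct and `lookupEntity` are hypothetical names, and the lookup is itself a template (empty parameter list) so that nothing is compiled unless some code calls it.

```d
/// Hypothetical record type for one named entity.
struct Entity
{
    string name;
    string value;
}

/// Stand-in table; a real one would list every named entity.
template HtmlEntityTable()
{
    // Only analysed and emitted if some code instantiates the template.
    immutable Entity[] HtmlEntityTable = [
        Entity("amp", "&"),
        Entity("lt", "<"),
        Entity("gt", ">"),
    ];
}

/// Returns the replacement text for `name`, or null if it is unknown.
/// The empty template parameter list keeps this function lazy as well.
string lookupEntity()(string name)
{
    foreach (entry; HtmlEntityTable!())
        if (entry.name == name)
            return entry.value;
    return null;
}
```

A call such as `lookupEntity("amp")` instantiates both templates; a program that never calls it should not pay for the table.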
November 13, 2019
On Wednesday, 13 November 2019 at 18:31:09 UTC, Jonathan Marler wrote:
> Maybe you could put the table inside a template so it only gets compiled/included when it's used?
>
> template HtmlEntityTable()
> {
>     const HtmlEntityTable = ...;
> }

As far as I understood the discussion in the bug report, the problem with that is that most of the time you won't know whether it will be needed, but most strings parsed (I assume they are not available at compile time) presumably do not contain entities.
November 13, 2019
On Wednesday, 13 November 2019 at 18:55:42 UTC, berni44 wrote:
> On Wednesday, 13 November 2019 at 18:31:09 UTC, Jonathan Marler wrote:
>> Maybe you could put the table inside a template so it only gets compiled/included when it's used?
>>
>> template HtmlEntityTable()
>> {
>>     const HtmlEntityTable = ...;
>> }
>
> As far as I understood the discussion in the bug report, the problem with that is that most of the time you won't know whether it will be needed, but most strings parsed (I assume they are not available at compile time) presumably do not contain entities.

True, if it's reachable through a high-level generic function then it would be used most of the time. Sorry, I'm not familiar with which functions would be calling it, but for me, I've never really needed a function that escaped valid D strings, so I'm not sure which specific function would be using this.
November 14, 2019
On Wednesday, 13 November 2019 at 12:17:17 UTC, berni44 wrote:
> Concerning issue 9621 [1]: There are two things that parse currently doesn't parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if not [...]

What is the concern about the table? Is it binary size, runtime performance, or something else?


November 14, 2019
On Thursday, 14 November 2019 at 18:14:07 UTC, Suleyman wrote:
> What is the concern about the table? Is it binary size, runtime performance, or something else?

I think binary size. Runtime shouldn't be a problem, because it should be possible to implement a perfect hash table for that (or another fast lookup method).
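
A rough sketch of what such a lookup could look like, under loud assumptions: the five names are a tiny hypothetical sample, the hash is an FNV-1a variant picked for brevity, and the seed is found by brute force during CTFE rather than by any published perfect-hash algorithm.

```d
/// Hypothetical sample of entity names; a real table would be far larger.
immutable string[] names = ["amp", "lt", "gt", "quot", "apos"];

/// FNV-1a style hash mixed with a seed, reduced to a slot index.
size_t slotOf(string key, uint seed, size_t slots)
{
    uint h = 2166136261u ^ seed;
    foreach (char c; key)
    {
        h ^= c;
        h *= 16777619u;
    }
    return h % slots;
}

/// Brute-force search for a seed that gives every key its own slot.
uint findSeed(const string[] keys)
{
    foreach (uint candidate; 0 .. 100_000)
    {
        auto used = new bool[keys.length];
        bool collision = false;
        foreach (key; keys)
        {
            immutable slot = slotOf(key, candidate, keys.length);
            if (used[slot])
            {
                collision = true;
                break;
            }
            used[slot] = true;
        }
        if (!collision)
            return candidate;
    }
    assert(0, "no collision-free seed found in the search range");
}

// Evaluated by CTFE, so the search itself costs nothing at run time.
enum uint entitySeed = findSeed(names);
```

With `entitySeed` fixed, a lookup is one hash plus one string comparison against the slot's stored name. This approach is unlikely to scale to thousands of keys without a smarter construction.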
November 15, 2019
On Thursday, 14 November 2019 at 20:23:57 UTC, berni44 wrote:
> On Thursday, 14 November 2019 at 18:14:07 UTC, Suleyman wrote:
>> What is the concern about the table? Is it binary size, runtime performance, or something else?
>
> I think binary size. Runtime shouldn't be a problem, because it should be possible to implement a perfect hash table for that (or another fast lookup method).

A compile-time perfect hash generator sounds like a really nice feature. Someone should get on that.
August 17, 2020
On Friday, 15 November 2019 at 01:13:27 UTC, Jonathan Marler wrote:
> On Thursday, 14 November 2019 at 20:23:57 UTC, berni44 wrote:
>> On Thursday, 14 November 2019 at 18:14:07 UTC, Suleyman wrote:
>>> What is the concern about the table? Is it binary size, runtime performance, or something else?
>>
>> I think binary size. Runtime shouldn't be a problem, because it should be possible to implement a perfect hash table for that (or another fast lookup method).
>
> A compile-time perfect hash generator sounds like a really nice feature. Someone should get on that.

Mine gets pretty close, if only there were a compile-time random number generator: https://github.com/skoppe/perfect-hash