November 13, 2019 Should std.conv:parse parse html entities?
Concerning issue 9621 [1]: There are two things that parse doesn't currently parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if it is not needed at all.

As I don't think I should try to decide this on my own, I'd like to know your opinion: Which is better, adding the entities or stating in the docs that they are not supported? What do you think?

[1] https://issues.dlang.org/show_bug.cgi?id=9621
November 13, 2019 Re: Should std.conv:parse parse html entities?
Posted in reply to berni44 | On Wednesday, November 13, 2019 5:17:17 AM MST berni44 via Digitalmars-d wrote:
> Concerning issue 9621 [1]: There are two things that parse doesn't currently parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if it is not needed at all.
>
> As I don't think I should try to decide this on my own, I'd like to know your opinion: Which is better, adding the entities or stating in the docs that they are not supported? What do you think?
>
> [1] https://issues.dlang.org/show_bug.cgi?id=9621
I fail to see why std.conv.to or std.conv.parse should handle either octal literals or HTML entities, and I don't know why anyone would expect them to. HTML entities are the kind of thing that I would expect an HTML parser to handle, not the standard library. The compiler does handle some of them (which, honestly, I think is kind of weird), and that is the only argument I can see for supporting them in std.conv, but it's not like std.conv is designed to parse D code. Also, IIRC, octal literals were removed from the language, so that's not an argument for adding them to std.conv. They're also not all that commonly needed by anything AFAIK. parse can already parse integer values of arbitrary bases if you give it an explicit base / radix.
- Jonathan M Davis
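The radix point can be illustrated with a small sketch. It assumes the `parse` overload in std.conv that takes the input by `ref` plus a radix; `parseOctal` is a hypothetical helper name used only for this example.

```d
import std.conv : parse;

/// Hypothetical helper: parses an octal number from the front of `input`
/// by passing radix 8 to std.conv.parse.
int parseOctal(ref string input)
{
    return parse!int(input, 8);
}

unittest
{
    string s = "755 rest";
    assert(parseOctal(s) == 493); // 7*64 + 5*8 + 5
    assert(s == " rest");         // parse consumed only the digits
}
```

Because `parse` takes its input by `ref`, it advances the string past whatever it consumed. The example can be run with `rdmd -main -unittest` on the file.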
November 13, 2019 Re: Should std.conv:parse parse html entities?
On Wednesday, November 13, 2019 7:41:45 AM MST Jonathan M Davis via Digitalmars-d wrote:
> On Wednesday, November 13, 2019 5:17:17 AM MST berni44 via Digitalmars-d wrote:
> > Concerning issue 9621 [1]: There are two things that parse doesn't currently parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if it is not needed at all.
> >
> > As I don't think I should try to decide this on my own, I'd like to know your opinion: Which is better, adding the entities or stating in the docs that they are not supported? What do you think?
> >
> > [1] https://issues.dlang.org/show_bug.cgi?id=9621
>
> I fail to see why std.conv.to or std.conv.parse should handle either octal literals or HTML entities, and I don't know why anyone would expect them to. HTML entities are the kind of thing that I would expect an HTML parser to handle, not the standard library. The compiler does handle some of them (which, honestly, I think is kind of weird), and that is the only argument I can see for supporting them in std.conv, but it's not like std.conv is designed to parse D code. Also, IIRC, octal literals were removed from the language, so that's not an argument for adding them to std.conv. They're also not all that commonly needed by anything AFAIK. parse can already parse integer values of arbitrary bases if you give it an explicit base / radix.
Actually, it looks like you can still have octal escapes in string literals even though support for octal integer literals was removed. Either way, given that the compiler is going to translate a string literal containing an octal escape or HTML entity into what it represents rather than leave it as something to parse, unless someone is constructing strings that use these at run time rather than using string literals, there won't even be anything to parse. Personally, I don't see much reason to support either. What's the use case?
- Jonathan M Davis
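A short sketch of that distinction, assuming the escape forms described in the D language specification (octal escapes and `\&name;` named character entities in double-quoted literals):

```d
// In ordinary string literals the compiler resolves the escapes, so the
// string already holds the decoded characters and nothing is left to parse.
static assert("\101" == "A");          // octal escape: 101 octal is 65, 'A'
static assert("\&amp;" == "&");        // named character entity
static assert("\&copy;" == "\u00A9");  // the copyright sign

unittest
{
    // A WYSIWYG (backtick) literal keeps the sequence as plain text, which
    // is the situation a run-time parser would have to deal with.
    string raw = `\&amp;`;
    assert(raw.length == 6);
    assert(raw != "&");
}
```

Only the second case, where the backslash sequence survives into run-time data, would give std.conv anything to decode.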
November 13, 2019 Re: Should std.conv:parse parse html entities?
Posted in reply to berni44 | On Wednesday, 13 November 2019 at 12:17:17 UTC, berni44 wrote:
> Concerning issue 9621 [1]: There are two things that parse doesn't currently parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if it is not needed at all.
>
> As I don't think I should try to decide this on my own, I'd like to know your opinion: Which is better, adding the entities or stating in the docs that they are not supported? What do you think?
>
> [1] https://issues.dlang.org/show_bug.cgi?id=9621
Maybe you could put the table inside a template so it only gets compiled/included when it's used?
template HtmlEntityTable()
{
    const HtmlEntityTable = ...;
}
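A minimal sketch of how such a lazily instantiated table might look. The `Entity` struct, the four sample entries, and `lookupEntity` are hypothetical stand-ins for the elided table; nothing here is taken from Phobos.

```d
/// Hypothetical entry type: entity name without the leading '&' and trailing ';'.
struct Entity
{
    string name;
    dchar value;
}

// An eponymous zero-argument template: the array below is only instantiated
// when some code actually refers to HtmlEntityTable!().
template HtmlEntityTable()
{
    // Tiny illustrative subset, not the full entity list.
    static immutable Entity[] HtmlEntityTable = [
        Entity("amp", '&'),
        Entity("lt", '<'),
        Entity("gt", '>'),
        Entity("quot", '"'),
    ];
}

/// Returns the code point for `name`, or dchar.init if the name is unknown.
dchar lookupEntity(string name)
{
    foreach (e; HtmlEntityTable!())
        if (e.name == name)
            return e.value;
    return dchar.init;
}

unittest
{
    assert(lookupEntity("lt") == '<');
    assert(lookupEntity("nope") == dchar.init);
}
```

Note that `lookupEntity` is an ordinary function here, so compiling this module always instantiates the table. The saving only materializes if every path to the table is itself a template that the program may never instantiate.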
November 13, 2019 Re: Should std.conv:parse parse html entities?
Posted in reply to Jonathan Marler | On Wednesday, 13 November 2019 at 18:31:09 UTC, Jonathan Marler wrote:
> Maybe you could put the table inside a template so it only gets compiled/included when it's used?
>
> template HtmlEntityTable()
> {
>     const HtmlEntityTable = ...;
> }
As far as I understood the discussion in the bug report, the problem with that is that most of the time you won't know whether the table will be needed, because the strings being parsed are (I assume) not available at compile time. So the table would always be included, even though most parsed strings presumably contain no entities.
November 13, 2019 Re: Should std.conv:parse parse html entities?
Posted in reply to berni44 | On Wednesday, 13 November 2019 at 18:55:42 UTC, berni44 wrote:
> On Wednesday, 13 November 2019 at 18:31:09 UTC, Jonathan Marler wrote:
>> Maybe you could put the table inside a template so it only gets compiled/included when it's used?
>>
>> template HtmlEntityTable()
>> {
>>     const HtmlEntityTable = ...;
>> }
>
> As far as I understood the discussion in the bug report, the problem with that is that most of the time you won't know whether the table will be needed, because the strings being parsed are (I assume) not available at compile time. So the table would always be included, even though most parsed strings presumably contain no entities.
True, if it's reachable through a high-level generic function then it would be used most of the time. Sorry, I'm not familiar with which functions would be calling it, but I've never really needed a function that escaped valid D strings, so I'm not sure which specific function would be using this.
November 14, 2019 Re: Should std.conv:parse parse html entities?
Posted in reply to berni44 | On Wednesday, 13 November 2019 at 12:17:17 UTC, berni44 wrote:
> Concerning issue 9621 [1]: There are two things that parse doesn't currently parse, namely octal numbers and HTML entities. While there is no argument against the former (I actually wrote a PR to add them), there has been some discussion around the latter, because the whole table of those entities (about 3000) would end up in the code, even if not [...]
What is the concern about the table? Is it binary size, runtime performance, or something else?
November 14, 2019 Re: Should std.conv:parse parse html entities?
Posted in reply to Suleyman | On Thursday, 14 November 2019 at 18:14:07 UTC, Suleyman wrote:
> What is the concern about the table? Is it binary size, runtime performance, or something else?
I think binary size. Runtime shouldn't be a problem, because it should be possible to implement a perfect hash table for that (or another fast lookup method).
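To make the perfect-hash idea concrete, here is a rough sketch under stated assumptions: a brute-force search, run at compile time via CTFE, for a seed that maps a small key set to distinct slots. The names `Entity`, `entities`, `slotCount`, `findSeed`, `perfectSeed`, and `lookup` are all hypothetical, the four entries are a toy subset, and this is not the algorithm of any particular library.

```d
struct Entity { string name; dchar value; }

// Toy subset; a real table would be far larger.
static immutable Entity[] entities = [
    Entity("amp", '&'), Entity("lt", '<'),
    Entity("gt", '>'),  Entity("quot", '"'),
];

enum size_t slotCount = 7; // a few more slots than keys

// FNV-1a style hash with the seed mixed into the starting value.
uint hash(string s, uint seed) pure nothrow @nogc @safe
{
    uint h = 2166136261u ^ seed;
    foreach (char c; s)
    {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Tries seeds until no two names share a slot.
uint findSeed()
{
    foreach (uint seed; 0 .. 10_000)
    {
        bool[slotCount] used;
        bool ok = true;
        foreach (e; entities)
        {
            const i = hash(e.name, seed) % slotCount;
            if (used[i]) { ok = false; break; }
            used[i] = true;
        }
        if (ok)
            return seed;
    }
    assert(0, "no collision-free seed found");
}

enum uint perfectSeed = findSeed(); // evaluated at compile time

Entity[slotCount] buildTable()
{
    Entity[slotCount] t;
    foreach (e; entities)
        t[hash(e.name, perfectSeed) % slotCount] = e;
    return t;
}

static immutable Entity[slotCount] table = buildTable();

/// One hash and one string comparison per lookup.
bool lookup(string name, out dchar value)
{
    const e = table[hash(name, perfectSeed) % slotCount];
    if (name.length == 0 || e.name != name)
        return false;
    value = e.value;
    return true;
}
```

The search needs no random number generator because it simply walks the seeds in order; for thousands of keys a smarter construction would be needed, which this sketch does not attempt.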
November 15, 2019 Re: Should std.conv:parse parse html entities?
Posted in reply to berni44 | On Thursday, 14 November 2019 at 20:23:57 UTC, berni44 wrote:
> On Thursday, 14 November 2019 at 18:14:07 UTC, Suleyman wrote:
>> What is the concern about the table? Is it binary size, runtime performance, or something else?
>
> I think binary size. Runtime shouldn't be a problem, because it should be possible to implement a perfect hash table for that (or another fast lookup method).
A compile-time perfect hash generator sounds like a really nice feature. Someone should get on that.
August 17, 2020 Re: Should std.conv:parse parse html entities?
Posted in reply to Jonathan Marler | On Friday, 15 November 2019 at 01:13:27 UTC, Jonathan Marler wrote:
> On Thursday, 14 November 2019 at 20:23:57 UTC, berni44 wrote:
>> On Thursday, 14 November 2019 at 18:14:07 UTC, Suleyman wrote:
>>> What is the concern about the table? Is it binary size, runtime performance, or something else?
>>
>> I think binary size. Runtime shouldn't be a problem, because it should be possible to implement a perfect hash table for that (or another fast lookup method).
>
> A compile-time perfect hash generator sounds like a really nice feature. Someone should get on that.
Mine gets pretty close, if only there were a compile-time random number generator: https://github.com/skoppe/perfect-hash