DIP 1026---Deprecate Context-Sensitive String Literals---Community Review Round 1 (page 5) - D Programming Language Discussion Forum

December 04, 2019

Re: DIP 1026---Deprecate Context-Sensitive String Literals---Community Review Round 1

Posted by Exil
in reply to Andrei Alexandrescu

On Tuesday, 3 December 2019 at 19:42:12 UTC, Andrei Alexandrescu wrote:
> On 12/3/19 9:45 AM, Dennis wrote:
>> On Tuesday, 3 December 2019 at 12:38:29 UTC, Andrei Alexandrescu wrote:
>>> Waste of labor is sadly a common theme in our community.
>> 
>> I consider this low-hanging fruit: just deprecating a token takes little implementation effort, and reduction in language complexity is (as far as I know) always welcome
>
> These can never be the primary reasons for removing a feature. One doesn't remove a feature because it's easy to remove. One removes a feature because there are good reasons to remove it, and as perks we get simplification of the language and maybe it's easy to remove.

C++ has removed features that were almost never used, so rarely that I don't even remember what they were called. This is a D feature I never knew existed. Removing it does make the language simpler, and I'd argue for removing it entirely rather than adding replacements for it.

>> In this case, such tools would be syntax highlighters.
>
> The entire narrative of the DIP puts CFG front and center. Reader's first thought is, "wait, the author is confused about what a CFG is."
>
> FIRST sentence in the abstract: "D is intended to have a context-free grammar..."
>
> FIRST paragraph in the rationale: "Regarding language design, Walter Bright has stated: [... CFG stuff ...]"
>
> Even the "Grammar Changes" section should be a give-away: the diff proposed is in the LEXICAL definition (https://dlang.org/spec/lex.html), not in the GRAMMAR definition (https://dlang.org/spec/grammar.html).
>
> If syntax highlighters are the primary reason for the DIP, it should be the primary reason in the DIP. The entire rationale needs to be redone. There should be an enumeration of syntax highlighters along with their success/failure of implementing heredocs. (Didn't test all but far as I can tell I've never heard of difficulties with implementing heredocs for bash, perl and the like.)

As for IDE tools, I'd argue autocomplete is probably the most useful one an IDE has. You can't implement it without basically having the entire front end of the compiler, because of CTFE. It's so complicated, in fact, that there are no tools for D that support it. I've seen some incorrect syntax highlighting for D, but I think it was specifically caused by q{}, which this DIP doesn't remove anyway.

>> Maybe you don't care about syntax highlighting, but please judge this DIP by its own merits and not compared to potential other DIPs that you care more about.
>
> A DIP ought to be judged by reading the DIP. This DIP is ill informed because it is built around the CFG argument, a non-existing issue. If the DIP requires a forum post explaining how it needs to be judged, that's a problem with the DIP, not the reader.

DIP1021. If the D Foundation leadership holds itself to that kind of standard, I don't see why anyone should expect them to hold someone else to a standard above and beyond their own.

December 04, 2019

Posted by Arun Chandrasekaran
in reply to Mike Parker

On Tuesday, 3 December 2019 at 09:03:44 UTC, Mike Parker wrote:
> This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals":
>
> [...]

We use this feature. We can fix the code, but the DIP doesn't state a convincing reason to remove this from the language.

December 04, 2019

Posted by FeepingCreature
in reply to mipri

On Tuesday, 3 December 2019 at 23:35:21 UTC, mipri wrote:
> 2. D's problem is "too many features" -> let's remove any
> feature that we can -> this DIP as step #1, remove something
> that looks relatively easy to remove.
>
> How much agreement do you think there is on the first point?
>
> Consider the "remove ~= from arrays" DIP. It removed a
> feature, and removing the feature arguably materially improved
> D's options to evolve as a language, and it got a really
> incensed negative response.
>

I think this is a really questionable argument, because it implicitly presumes that all features are worth the same. The "remove ~= from arrays" DIP got, as far as I could see, basically no feedback along the lines of "whatever, we use it but we could replace it easily" or "I think D doesn't need to reduce its feature set in general." The feedback it got was, as far as I could tell, overwhelmingly "this feature is a core component of the usefulness of the D language and definitely the *wrong place* to start removing things."

Logically speaking, the more people think it is the wrong place to start removing features, the less that debate says about removing features as a whole, because people were more motivated by the specific feature rather than the general state of the language.

December 04, 2019

Posted by Dennis
in reply to mipri

Thanks for your detailed breakdown.

On Tuesday, 3 December 2019 at 23:35:21 UTC, mipri wrote:
> It makes "other people's code" slightly more annoying to consider,
> as you may have to update that code to remove since-deprecated
> features.

That's the nature of deprecation: a short-term cost for a long-term improvement.

> If a feature were to be judged a mistake, it can still be a
> mistake to remove the feature later on. Less is not always
> better.

That's true.

> 2. D's problem is "too many features" -> let's remove any
> feature that we can -> this DIP as step #1, remove something
> that looks relatively easy to remove.
>
> How much agreement do you think there is on the first point?

I don't know how much explicit agreement there is with the sentiment that D has too many features, but I do know at least Walter is always interested in reducing language complexity, and many non-actionable complaints of users (such as "D is difficult to learn") are rooted in things like this.

> 3. "Walter said a thing about D, but a StackOverflow comment
> refuted that, so the language should change so that this
> criticism is no longer true."

That is only there for the narrative / background, correcting criticism is not a goal of this DIP.

December 04, 2019

Posted by Ola Fosheim Grøstad
in reply to Dennis

On Wednesday, 4 December 2019 at 09:42:32 UTC, Dennis wrote:
> That is only there for the narrative / background, correcting criticism is not a goal of this DIP.

Suggesting a workable alternative usually is easier. Like:

replace: q"delimiter...

with Python like: """

December 04, 2019

Posted by Dennis
in reply to H. S. Teoh

On Wednesday, 4 December 2019 at 01:26:24 UTC, H. S. Teoh wrote:
> And just for a bit more perspective, Python also has heredoc syntax, so does Perl, PHP, bash, and probably many others. If heredocs were really such a bad idea, why are people putting them into so many languages, over and over again?

To me the opposite seems true. First of all:

> Python does not have here-docs. It does however have triple-quoted strings which can be used similarly.
https://rosettacode.org/wiki/Here_document#Python

Then considering which notable languages have context-sensitive string literals:
1987: Perl
1989: Bash
1995: PHP
2001: D
2011: C++11

If you know any other examples, please tell. I don't think context-sensitive string literals were ever put in a notable language created after 2001. (C++ has the most recent addition, but parsing that is already so complex they have nothing to lose)

> That DIP seems dead in the water though. The author has vanished and nobody has taken up the reins.

I was referring to Walter Bright's one:
https://github.com/dlang/DIPs/pull/165

> 1) Generating HTML snippets
> 2) Generating PovRay scene description snippets
> 3) Generating D code snippets
> 4) Generating snippets of a DSL I use for generating geometric models
> 5) Generating boilerplate for input data to an external convex hull
>    solver (has its own peculiar syntax)
> 6) Generating GLSL shader code snippets
> 7) Generating Java code snippets
> 8) Command line usage descriptions

I do believe for most of these you can use ``, q{} and q"<>" with little problems, but I understand that you prefer the q"EOS EOS" ones and would not want to rewrite your old code.

> ?!  Can't you just use a custom lexer with your PEG grammar?
>
> Then isn't the solution simply to write a self-contained heredoc parsing function, put it in a dub package, and let everyone reuse it? Then nobody will have to write it for themselves again. Problem solved.

Of course you can make it work. I'm not saying that context-sensitive string literals make or break all D lexers, it's just a little source of complexity that may not bear its weight.

And a good number of syntax highlighters support many different languages while being implemented in just one. Take for example this one written in Go:

https://github.com/alecthomas/chroma/blob/master/lexers/d/d.go

I wouldn't expect them to add a dub package for D, a cargo package for Rust, an npm package for JavaScript, etc.

> This whole debacle feels like heredocs are being singled out as a scapegoat in a misguided quest to "simplify the language".  Like we're grasping at straws because we're unable to tackle the bigger issues, so here's a convenient simple target we can shoot and kill and feel good about ourselves that we're finally making progress.  Talking about straining out the gnat and swallowing the camel.

It seems to me D has this history of removing small features with a small problem:

- Small feature: escape string literals
  Small problem: doesn't have much use
- Small feature: octal string literals
  Small problem: can be confused for decimal literal, and can be made a library feature
- Small feature: hexstring literals
  Small problem: can be better represented in a library function

Now my proposed next one is:

- Small feature: context-sensitive string literals
  Small problem: accidentally bumps the complexity class of D's lexical grammar.

Now I understand that reviewers are debating whether it is a small feature ("I actually use these a lot") and whether the small problem isn't too small ("making D lexers still isn't hard"). That's what I like to see in the review, thanks especially to WebFreak and Adam D. Ruppe for their input on their VSCode and Vim highlighters, and thanks to you for your use cases.

What I don't get is why this is called a "non-starter" by Andrei and a "debacle" / "misguided quest" by you. Is it such a ludicrous idea to deprecate this particular part of the language?

I admit that I misjudged the amount of use, breakage and complexity this feature has before writing this DIP. If this trend continues then this DIP is dead; I'm not going to push this hard or anything. But I am at least still interested in Walter and Atila's opinion.

December 04, 2019

Posted by Timon Gehr
in reply to Dennis

On 04.12.19 12:10, Dennis wrote:
> 
> 
> Now my proposed next one is:
> 
> - Small feature: context-sensitive string literals
>    Small problem: accidentally bumps the complexity class of D's lexical grammar.

A small fix for this small problem is to just say in the specification that heredoc identifiers may not exceed 1e100 characters. ;)

Another fix could be to just go over the language specification and replace all wrongly applied CS terms by a short explanation of what is actually going on. (In practice, when Walter says D's grammar is context-free, what he means is that parsing does not depend on semantic analysis of a prefix of the code. C++ parsing does have that dependence, which implies context-sensitivity and is usually abbreviated this way, and Walter's aim was to contrast D with this.)

December 04, 2019

Posted by mipri
in reply to Dennis

On Wednesday, 4 December 2019 at 11:10:45 UTC, Dennis wrote:
> I do believe for most of these you can use ``, q{} and q"<>"
> with little problems, but I understand that you prefer the
> q"EOS EOS" ones and would not want to rewrite your old code.

The big (and only) advantage of HERE docs is that you so rarely
have to think about them or revise them that this is not a
concern. "Check and see if you've broken the string literal" is
not a step that you go through every single time you have to
touch the content of the string. The most annoying part of HERE
docs for code presentation, that the ending delimiter has such
strict requirements, is precisely what makes them not annoying
at all for holding random snippets of HTML or whatever. You
just don't get collisions, or they are very obvious. The
reader's ease is repeated in the ease that tools have with
them: they don't need a stack; they can just read lines and
throw them away until they find a line that (classically) has
some exact contents, or (in D) starts with some exact prefix.
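
To make the no-stack claim concrete, here is a minimal C++
sketch of such a scanner. It assumes the D rule that the closing
line begins with the identifier immediately followed by a double
quote. The name readHeredocBody and its signature are
hypothetical, not taken from any real lexer.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads the lines that follow an opening q"DELIM line.
// Appends each line to `body` until a line starts with DELIM followed by '"'.
// Returns true if the closing line was found, false if input ran out first.
bool readHeredocBody(std::istream& in, const std::string& delimiter,
                     std::string& body) {
    assert(!delimiter.empty());
    const std::string closing = delimiter + '"';
    std::string line;
    while (std::getline(in, line)) {
        // Prefix test only: no nesting depth, no escapes, no stack.
        if (line.compare(0, closing.size(), closing) == 0) {
            return true;
        }
        body += line;
        body += '\n';
    }
    return false;
}
```

The only state carried between lines is the delimiter itself,
so quotes, brackets and parentheses in the body never matter.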

With only matching nested delimiter strings, accidental
collisions will happen. Not often. But neither " nor ])>} are
infrequent characters to find randomly in a string, and the
first time you have to change both ends of a q"( string to make
it a q"[ string because you added a URL that ended in
parentheses to some embedded HTML, you'll think: man, I should
just take all these snippets and stuff them under __EOF__ ,
then read that statically, and stuff them into a map on module
load.
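
A scanner for nesting delimiters, by comparison, has to count
depth, and an unbalanced closer in the content ends the literal
early. Below is a rough C++ model of that behavior. It assumes
the closing delimiter must be immediately followed by a double
quote; findNestedClose is a hypothetical name, and this is a
simplified sketch rather than the dmd lexer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: `src` is the text right after an opening q"( (or
// q"[, q"<, q"{). Returns the index of the delimiter that closes the
// literal, or std::string::npos if the literal is malformed.
std::size_t findNestedClose(const std::string& src, char open, char close) {
    assert(open != close);
    int depth = 1;  // the opening delimiter has already been consumed
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == open) {
            ++depth;
        } else if (src[i] == close && --depth == 0) {
            // Depth is back to zero: this must be the real end of the string.
            const bool quoteFollows = i + 1 < src.size() && src[i + 1] == '"';
            return quoteFollows ? i : std::string::npos;
        }
    }
    return std::string::npos;  // ran out of input before the depth hit zero
}
```

Balanced parentheses in the content pass through, but a lone
closing one (a smiley, say) drops the depth to zero too early
and the literal is rejected.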

And *then* you'll think: wait, people hardly ever use __EOF__
in D, so someone's definitely going to come along and deprecate
*that* code, too!

The world isn't divided only between good practices and bad
practices. Across from the Scylla of legacy-code-is-sacred
languages that never remove anything, even obviously bad
features that nobody likes (' as a module separator in Perl, or
octal literals that start with 0), there's a Charybdis of
code-is-always-bitrotting languages that jerk you around with
pointless deprecations.

> It seems to me D has this history of removing small features
> with a small problem:
>
> - Small feature: escape string literals
>   Small problem: doesn't have much use

I was surprised when \e didn't work. So it was removed for such
a reason.

> - Small feature: octal string literals
>   Small problem: can be confused for decimal literal, and can
> be made a library feature

This is a significant problem actually. The *only* reason
languages have C-style octal literals is because they can't
remove them anymore. It's not "octal literals" in general that
are bad because they could be made a library feature. 8#123 and
0o123 are octal literals that don't get confused with a nice
decimal number like 0123.

> - Small feature: hexstring literals
>   Small problem: can be better represented in a library function

What these removals all have in common is that the post-removal
experience is: you reach for the removed feature, you get an
error, you find out what to do instead, and then there are no
more problems for you. Yes, you're still moving towards
Charybdis with stuff like this, but the point of the myth isn't
"all movements in the direction of Charybdis are bad", as
those movements are still movements *away* from Scylla.

Removing HERE docs, though, makes the language permanently
more annoying to use for the task that would've benefited from
them. To the point that, rather than just use the intended
replacement, people might rather do something else entirely.
Someone might personally not like the look of \033 vs. \e, or
octal!123 vs. 0123, but the replacement doesn't make them work
any harder.

It's not a huge problem, but it's a difference between this
small deprecation and the previous ones.

> Now I understand that reviewers are debating whether it is a
> small feature ("I actually use these a lot") and whether the
> small problem isn't too small ("making D lexers still isn't
> hard"). That's what I like to see in the review, thanks
> especially to WebFreak and Adam D. Ruppe for their input on
> their VSCode and Vim highlighters, and thanks to you for your
> use cases.
>
> What I don't get is why this is called a "non-starter" by
> Andrei and a "debacle" / "misguided quest" by you. Is it such a
> ludicrous idea to deprecate this particular part of the
> language?

1. It's *because* the proposed change isn't that bad that it's
getting the responses it's getting, rather than complaints that
the proposed change is very bad and that HERE docs are
irreplaceable treasures. It wasn't until my post just now that
anyone took the time to say that HERE docs have any unique
advantages at all.

2. Andrei's response isn't just "non-starter" but also "HERE
documents have no impact on the language grammar."

December 04, 2019

Posted by mipri
in reply to Ola Fosheim Grøstad

On Wednesday, 4 December 2019 at 10:10:09 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 4 December 2019 at 09:42:32 UTC, Dennis wrote:
>> That is only there for the narrative / background, correcting criticism is not a goal of this DIP.
>
> Suggesting a workable alternative usually is easier. Like:
>
> replace: q"delimiter...
>
> with Python like: """

Or specify that q"<<< (three chars exactly) can only be matched
with >>>", along with the other matching delimiters. This is a
breaking change, though, since the current behavior is:

  $ rdmd --eval 'writeln(q"<<< hello >>>")'
  << hello >>

December 04, 2019

Posted by Kagamin
in reply to Andrei Alexandrescu

On Tuesday, 3 December 2019 at 12:38:29 UTC, Andrei Alexandrescu wrote:
> This DIP is a non-starter. Here documents are easily and effectively handled during lexing and have no impact on the language grammar.

In a compiler.

Here's an implementation for bash heredoc strings; say something nice about it:

---
class HereDocCls {	// Class to manage HERE document elements
public:
	int State;		// 0: '<<' encountered
				// 1: collect the delimiter
				// 2: here doc text (lines after the delimiter)
	int Quote;		// the char after '<<'
	bool Quoted;		// true if Quote in ('\'','"','`')
	bool Indent;		// indented delimiter (for <<-)
	int DelimiterLength;	// strlen(Delimiter)
	char *Delimiter;	// the Delimiter, 256: sizeof PL_tokenbuf
	HereDocCls() {
		State = 0;
		Quote = 0;
		Quoted = false;
		Indent = 0;
		DelimiterLength = 0;
		Delimiter = new char[HERE_DELIM_MAX];
		Delimiter[0] = '\0';
	}
	void Append(int ch) {
		Delimiter[DelimiterLength++] = static_cast<char>(ch);
		Delimiter[DelimiterLength] = '\0';
	}
	~HereDocCls() {
		delete []Delimiter;
	}
};
HereDocCls HereDoc;
---
