On Monday, 12 August 2024 at 18:30:02 UTC, monkyyy wrote:
> Ascii deprecated several marks of english grammar to fit into 7 bits, one of these features was the directional quotes and so c had to make strings with single quotes and rules about escaping. We are no longer c and its no longer the 60's.
> Imagine making a 1 char typo of escape characters when making a deeply nested strings for mixins.
In that case, use `iq{}` strings.
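A minimal sketch of what that could look like, assuming a compiler that supports interpolated expression sequences (DMD 2.108 or newer, as far as I know). The name `fieldName` is made up for illustration:

```d
import std.conv : text;

// Hypothetical name, chosen only for this illustration.
enum fieldName = "greeting";

// Inside a token string, nested string literals need no extra escaping;
// $(fieldName) is spliced in before the code is mixed in.
mixin(text(iq{
    string $(fieldName) = "hello";
}));

void main()
{
    // The mixin declared a module-level variable named `greeting`.
    assert(greeting == "hello");
}
```

Because the body is lexed as ordinary D tokens, a stray quote or brace is reported by the compiler instead of silently ending the string early.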
> I'd suggest "heavy double comma" as its visibly distinct in 3 monospace fonts I checked
> ❝ ❞
> U+275D U+275E
> I believe all directional quote schemes will require users to add custom xmodmap to type or ide plugins so I believe monospace font behavoir so be the primary concern.
> A directional quoted string should have the simplest parsing rule of it counts up on U+275D and down at U+275E and returns when its 0; all other escapes and characters are ignored.
I’m 80% sure this is trolling.
D already has delimited strings: `q"(abc(")adb")"`. It’s hard to believe you’ll ever run into a case where all four delimiter pairs, `()`, `[]`, `{}`, and `<>`, appear in the string in an unbalanced way.
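To make that concrete, here is a small sketch; the names `nested` and `unbalanced` are only for illustration:

```d
// Parentheses nest inside q"(...)", so the balanced inner pair and the
// double quotes are ordinary string content.
enum nested = q"(abc(")adb")";

// A lone ')' would end a q"(...)" string early, so switch to another pair.
enum unbalanced = q"[a ) b]";

void main()
{
    assert(nested == "abc(\")adb\"");
    assert(unbalanced == "a ) b");
}
```

Only a string that is unbalanced in all four bracket kinds at once would force you back to escapes or a heredoc-style identifier delimiter.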
But that doesn’t even convey how bad this idea is, if you think it through.
Not all fonts have U+275D and U+275E, not even close. You’d be much better off with chevrons (`«»`), as those are reasonably well supported by fonts because chevrons are standard in French. Generally, you can’t expect fonts to have more than the basic ASCII characters. Even in fonts that do, the glyphs might not be visually distinct enough. There’s a reason D only accepts `10L` and not `10l` as literals, even though in most monospace fonts `l`, `I`, and `1` are distinct enough. IMO, allowing anything non-ASCII in D code (except in comments) is an error and will trip people up. I have run into issues with C++ compilers making assumptions about what the input and output encodings are. I work for a German company and all our error messages are in German. You won’t find any literal Ää, Öö, Üü, or ß in our codebase; those are all written as escapes, such as `\u00FC` for ü, and they’re in `u8""` literals.
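The same discipline works in D. A rough sketch (the word and the variable names are just examples, not from our codebase):

```d
import std.string : representation;
import std.utf : count;

// The source file stays pure ASCII; the escape denotes U+00FC (ü).
enum word = "f\u00FCr";

void main()
{
    // D string literals are UTF-8, so ü takes two code units.
    immutable(ubyte)[] expected = [0x66, 0xC3, 0xBC, 0x72];
    assert(word.representation == expected);
    assert(word.length == 4); // UTF-8 code units
    assert(word.count == 3);  // code points
}
```

Whatever encoding the editor or the compiler assumes for the file, the bytes of the literal are fixed by the escape.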
Proponents’ best arguments are: “Why not” and “Some words look like slurs when using ASCII replacements”. Too bad. I’m confronted with BS and ASS daily (which stand for balance sheet and assets, to be clear), and it’s funny initially and then you just get used to it.
They never had to debug code because in some string literal, there was Unicode nonsense like a soft hyphen, which made it unequal to every string it was compared to. Best thing is, printing the string to Windows’s CMD removes the soft hyphen!
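A small sketch of that kind of bug, with made-up strings; the soft hyphen (U+00AD) is written as an escape here so it is actually visible:

```d
import std.array : replace;

enum plain = "balance";
enum sneaky = "bal\u00ADance"; // contains an invisible soft hyphen

void main()
{
    // The strings may print identically, yet they are not equal.
    assert(plain != sneaky);
    assert(sneaky.length == plain.length + 2); // U+00AD is 2 bytes in UTF-8

    // Stripping the soft hyphen restores equality.
    assert(sneaky.replace("\u00AD", "") == plain);
}
```

When the character is pasted in literally instead of escaped, nothing in the source hints at why the comparison fails.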
With ASCII, which strings are equal and which aren’t is obvious. With Unicode, it’s some special circle of hell:
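    // This compiles: the two names render alike but are different code point
    // sequences (e.g. precomposed U+00E4 vs. 'a' + combining U+0308).
    void main()
    {
        int ä = 0;
        int ä = 1;
    }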
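The same effect shows up in string data. A sketch using Phobos' `std.uni.normalize`, assuming the difference is precomposed versus decomposed ä (the variable names are illustrative):

```d
import std.uni : normalize, NFC;

enum precomposed = "\u00E4";  // ä as a single code point
enum decomposed = "a\u0308";  // 'a' followed by a combining diaeresis

void main()
{
    // Same rendering, different code units, so a plain comparison fails.
    assert(precomposed != decomposed);

    // Only after normalizing both sides to one form do they compare equal.
    assert(normalize!NFC(decomposed) == precomposed);
}
```

Every comparison of user-facing text would have to remember that normalization step; with ASCII-only source there is nothing to remember.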
Maybe I’m overly conservative, but I can tell you, it’s not out of spite, it’s just from real, non-hypothetical experience. Probably, people who live and work in the US have little to no experience with those kinds of issues. UK folk run into them basically only because £ (U+00A3) is non-ASCII.
Don’t get me wrong, I love typographically correct quotes. I have them on my keyboard and use them everywhere it makes sense. It makes sense for forum posts, but not for code.