January 16

They are in the ASCII table.
They are directional and balanced.
The D language should use them.

Just look:

writeln(«Hello, World!»);
January 17
They are not part of ASCII, they are part of an "extended ASCII" ISO/IEC 8859-1 aka Latin-1.

In UTF-8 they do not fit in a single byte. For example, « (U+00AB) is encoded as:

C2 AB

https://symbl.cc/en/00AB/

For us to introduce a new string syntax, it would need to do something that the existing ones cannot reasonably do.
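
To make the byte count concrete, here is a minimal sketch in D. The helper name `utf8Bytes` is made up for illustration; it only wraps `std.string.representation`, and it assumes the source file is saved as UTF-8.

```d
import std.string : representation;

/// Hypothetical helper: the UTF-8 code units (bytes) behind a D string.
immutable(ubyte)[] utf8Bytes(string s)
{
    return s.representation;
}

unittest
{
    // U+00AB takes two UTF-8 code units, so .length is 2, not 1.
    assert("«".length == 2);
    assert(utf8Bytes("«") == [0xC2, 0xAB]);

    // The closing chevron U+00BB is also two bytes.
    assert(utf8Bytes("»") == [0xC2, 0xBB]);

    // An ASCII double quote is a single byte.
    assert(utf8Bytes("\"") == [0x22]);
}
```

One way to run it is `dmd -unittest -main -run file.d`.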
January 16
On Thursday, 16 January 2025 at 21:26:29 UTC, Richard (Rikki) Andrew Cattermole wrote:
> They are not part of ASCII, they are part of an "extended ASCII" ISO/IEC 8859-1 aka Latin-1.
>
> They do not fit in a single byte.
>
> C2 AB
>
> https://symbl.cc/en/00AB/
>
> For us to introduce a new string syntax, it would need to do something that the existing ones cannot reasonably do.

The extended ASCII has 8 bits, 256 distinct characters.
January 16
On Thursday, 16 January 2025 at 21:19:34 UTC, barbosso wrote:
> They are in the ASCII table.
> They are directional and balanced.
> The D language should use them.
>
> Just look:
>
> writeln(«Hello, World!»);

https://forum.dlang.org/post/ipyynnyaszcypnzioyng@forum.dlang.org

January 17
On 17/01/2025 10:34 AM, barbosso wrote:
> On Thursday, 16 January 2025 at 21:26:29 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> They are not part of ASCII, they are part of an "extended ASCII" ISO/IEC 8859-1 aka Latin-1.
>>
>> They do not fit in a single byte.
>>
>> C2 AB
>>
>> https://symbl.cc/en/00AB/
>>
>> For us to introduce a new string syntax, it would need to do something that the existing ones cannot reasonably do.
> 
> The extended ASCII has 8 bits, 256 distinct characters.

D files are encoded as UTF-8.

Therefore it does not support extended ASCII.

January 16
On Thursday, 16 January 2025 at 21:38:50 UTC, Richard (Rikki) Andrew Cattermole wrote:
> On 17/01/2025 10:34 AM, barbosso wrote:
>> On Thursday, 16 January 2025 at 21:26:29 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>> They are not part of ASCII, they are part of an "extended ASCII" ISO/IEC 8859-1 aka Latin-1.
>>>
>>> They do not fit in a single byte.
>>>
>>> C2 AB
>>>
>>> https://symbl.cc/en/00AB/
>>>
>>> For us to introduce a new string syntax, it would need to do something that the existing ones cannot reasonably do.
>> 
>> The extended ASCII has 8 bits, 256 distinct characters.
>
> D files are encoded as UTF-8.
>
> Therefore it does not support extended ASCII.

Do you understand what you wrote?
January 17
On 17/01/2025 10:43 AM, barbosso wrote:
> On Thursday, 16 January 2025 at 21:38:50 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> On 17/01/2025 10:34 AM, barbosso wrote:
>>> On Thursday, 16 January 2025 at 21:26:29 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>>> They are not part of ASCII, they are part of an "extended ASCII" ISO/IEC 8859-1 aka Latin-1.
>>>>
>>>> They do not fit in a single byte.
>>>>
>>>> C2 AB
>>>>
>>>> https://symbl.cc/en/00AB/
>>>>
>>>> For us to introduce a new string syntax, it would need to do something that the existing ones cannot reasonably do.
>>>
>>> The extended ASCII has 8 bits, 256 distinct characters.
>>
>> D files are encoded as UTF-8.
>>
>> Therefore it does not support extended ASCII.
> 
> Do you understand what you wrote?

Yes.

Extended ASCII is both a character set and an encoding.

The character set is supported as part of Unicode. The encoding is not, because we use UTF-8, where a byte with the 8th bit set is part of a multi-byte sequence rather than a single Latin-1 character.
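
A small sketch of that distinction, in D. The helper `latin1ByteToUtf8` is hypothetical and relies only on the fact that Latin-1 byte values coincide with code points U+0000 to U+00FF; `validate` and `encode` are from `std.utf`.

```d
import std.exception : assertThrown;
import std.utf : UTFException, encode, validate;

/// Hypothetical helper: re-encode one Latin-1 byte as UTF-8.
/// Latin-1 byte values are the same numbers as code points U+0000..U+00FF.
string latin1ByteToUtf8(ubyte b)
{
    char[4] buf;
    immutable n = encode(buf, cast(dchar) b);
    return buf[0 .. n].idup;
}

unittest
{
    // The Latin-1 encoding of « is the single byte 0xAB. Read as UTF-8,
    // that byte is a stray continuation byte, so validation fails.
    char[] raw = [cast(char) 0xAB];
    assertThrown!UTFException(validate(raw));

    // The same character is fine once it is re-encoded as UTF-8 (C2 AB).
    assert(latin1ByteToUtf8(0xAB) == "«");
    assert(latin1ByteToUtf8(0xAB).length == 2);

    // ASCII bytes are unchanged, since the two encodings agree below 0x80.
    assert(latin1ByteToUtf8(0x41) == "A");
}
```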

January 16
On Thursday, 16 January 2025 at 21:45:39 UTC, Richard (Rikki) Andrew Cattermole wrote:
> On 17/01/2025 10:43 AM, barbosso wrote:
>> On Thursday, 16 January 2025 at 21:38:50 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>> On 17/01/2025 10:34 AM, barbosso wrote:
>>>> On Thursday, 16 January 2025 at 21:26:29 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>>>> [...]
>>>>
>>>> The extended ASCII has 8 bits, 256 distinct characters.
>>>
>>> D files are encoded as UTF-8.
>>>
>>> Therefore it does not support extended ASCII.
>> 
>> Do you understand what you wrote?
>
> Yes.
>
> Extended ASCII is both a character set and an encoding.
>
> The character set is supported as part of Unicode, the encoding is not supported as we use UTF-8 which conflicts on the 8th bit for the first byte in the code unit.


Now I see.
UTF-8 uses 1 byte for each of the 128 ASCII characters
and 2 to 4 bytes for other characters (2 for the «chevrons»).
So, what's the problem?
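
The byte counts can be checked directly. A minimal sketch, assuming only `std.utf.codeLength`; the wrapper name `utf8Length` is invented for the example. It shows that non-ASCII characters take between 2 and 4 bytes, with the chevrons at 2.

```d
import std.utf : codeLength;

/// Hypothetical wrapper: number of UTF-8 bytes needed for one code point.
size_t utf8Length(dchar c)
{
    return codeLength!char(c);
}

unittest
{
    assert(utf8Length('A') == 1);          // U+0041, ASCII
    assert(utf8Length('«') == 2);          // U+00AB, left chevron
    assert(utf8Length('»') == 2);          // U+00BB, right chevron
    assert(utf8Length('€') == 3);          // U+20AC, euro sign
    assert(utf8Length('\U0001F600') == 4); // outside the Basic Multilingual Plane
}
```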
January 16
On Thursday, 16 January 2025 at 22:03:25 UTC, barbosso wrote:
> On Thursday, 16 January 2025 at 21:45:39 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> On 17/01/2025 10:43 AM, barbosso wrote:
>>> On Thursday, 16 January 2025 at 21:38:50 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>>> [...]
>>> 
>>> Do you understand what you wrote?
>>
>> Yes.
>>
>> Extended ASCII is both a character set and an encoding.
>>
>> The character set is supported as part of Unicode, the encoding is not supported as we use UTF-8 which conflicts on the 8th bit for the first byte in the code unit.
>
>
> Now I see.
> UTF-8 uses 1 byte for each of the 128 ASCII characters
> and 2 to 4 bytes for other characters (2 for the «chevrons»).
> So, what's the problem?

GCC and Clang can compile identifiers with Unicode symbols.
January 17
On 17/01/2025 11:16 AM, barbosso wrote:
> On Thursday, 16 January 2025 at 22:03:25 UTC, barbosso wrote:
>> On Thursday, 16 January 2025 at 21:45:39 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>> On 17/01/2025 10:43 AM, barbosso wrote:
>>>> On Thursday, 16 January 2025 at 21:38:50 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>>>> [...]
>>>>
>>>> Do you understand what you wrote?
>>>
>>> Yes.
>>>
>>> Extended ASCII is both a character set and an encoding.
>>>
>>> The character set is supported as part of Unicode, the encoding is not supported as we use UTF-8 which conflicts on the 8th bit for the first byte in the code unit.
>>
>>
>> Now I see.
>> UTF-8 uses 1 byte for each of the 128 ASCII characters
>> and 2 to 4 bytes for other characters (2 for the «chevrons»).
>> So, what's the problem?
> 
> GCC and Clang can compile identifiers with Unicode symbols.

I know, I implemented D's UAX31 identifiers.

Better to have the right terminology for this.
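
On terminology, a rough illustration in D: letters outside ASCII can appear in identifiers, while the chevrons are classified by Unicode as punctuation, so they are a question of literal syntax rather than identifiers. The function name `naïveDouble` and its parameter are invented for the example; `isAlpha` and `isPunctuation` are from `std.uni`.

```d
import std.uni : isAlpha, isPunctuation;

/// Hypothetical function whose name and parameter use non-ASCII letters.
int naïveDouble(int größe)
{
    return größe * 2;
}

unittest
{
    // Identifiers with non-ASCII letters compile and behave normally.
    assert(naïveDouble(21) == 42);

    // ö is alphabetic; the chevrons are punctuation, not letters.
    assert(isAlpha('ö'));
    assert(!isAlpha('«'));
    assert(isPunctuation('«'));
    assert(isPunctuation('»'));
}
```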

However, the current stance is that we possibly have too many string types already. So far you have proposed new delimiters but not new behavior, which is what would be required to add a new syntax.
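
For reference, a sketch of some of the string literal forms D already has, all spelling the same text. The constant names are made up for the example. The delimited form `q"( ... )"` already uses a balanced, directional pair of delimiters, which may or may not cover what the proposal is after.

```d
// Hypothetical constants: four existing literal forms for the same text.
enum doubleQuoted = "«Hello, World!»";     // ordinary string, escapes allowed
enum wysiwyg      = r"«Hello, World!»";    // raw string, no escape processing
enum backquoted   = `«Hello, World!»`;     // raw string, backquote delimiters
enum delimited    = q"(«Hello, World!»)";  // delimited string, balanced ( )

unittest
{
    // All four forms produce identical string contents.
    assert(doubleQuoted == wysiwyg);
    assert(doubleQuoted == backquoted);
    assert(doubleQuoted == delimited);

    // 13 ASCII bytes plus 2 bytes for each chevron.
    assert(doubleQuoted.length == 17);
}
```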
