September 22, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #9 from Sobirari Muhomori <dfj1esp02@sneakemail.com> ---
(In reply to Ketmar Dark from comment #7)
> (In reply to Sobirari Muhomori from comment #6)
> > AFAIK, the standard text encoding on posix today is utf-8
> oh, and for what reason we have that strange "locale settings" then?

AFAIK, locale defines time, number formats and user language for localization. It's orthogonal to text encoding.

> also, can you point me at the exact standard part which tells that text encoding is utf-8 regardless to current locale settings?

posix is not very strict about standardization; it only roughly describes what can be done and how. After all, it's not really a standard, just de facto conventions that were written down after being established some other way.

> > Shebang is sort of brittle by design. It works only for text files
> WUT?! O_O it works perfectly for *any* type of file.

If it worked perfectly for any type of file, you wouldn't have reported this problem in the first place, as everything would just work.

> it's completely ok to
> place binary data after shebang if interpreter can cope with that.

Binary data formats are not that flexible. And if the interpreter is sufficiently smart, it can cope with various text encodings too.

> > and if the text file encoding matches that of your system.
> and the given example matches. yet dmd refuses to compile my sample. not *run*, but *compile*.

utf-8 matches koi8 only in the ascii range. If you use only ascii, it should work.
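
To illustrate the point about the ascii range, here is a minimal C++ sketch. The helper name `isPureAscii` is hypothetical and not part of dmd. It relies only on the fact that UTF-8 and KOI8-R both encode bytes below 0x80 as ASCII, so a line made of such bytes reads the same under either encoding.

```cpp
#include <string>

// Hypothetical helper (not from dmd): returns true when every byte is below
// 0x80, i.e. the text decodes identically as ASCII, UTF-8, and KOI8-R.
bool isPureAscii(const std::string& bytes) {
    for (unsigned char c : bytes) {
        if (c >= 0x80)
            return false;  // high byte: meaning depends on the encoding
    }
    return true;
}
```

A shebang line for which this returns false is the case under discussion: its bytes are meaningful to the environment but are not necessarily valid UTF-8.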

> the right shebang support in dmd must be like this: check if the first chars of the file forms shebang, and if they are, then just skipping other chars until '\n'. and skip '\n'. that's all. no validation. no martian logic. just skipping chars.

D source is a text file, and a text file has a single encoding. Having a variable encoding contradicts the usual logic of text files.

--
September 22, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #10 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
(In reply to Sobirari Muhomori from comment #9)
> posix is not very strict about standardization; it only roughly describes what can be done and how. After all, it's not really a standard, just de facto conventions that were written down after being established some other way.
WUT?!

sorry, i don't want to speak with trolls. bye.

--
September 22, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

Andrei Alexandrescu <andrei@erdani.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrei@erdani.com

--- Comment #11 from Andrei Alexandrescu <andrei@erdani.com> ---
Not sure what best to do about this. I'd say if #! is detected, the first line should be just scanned through the first \n and ignored. In a way the semantics of the shebang line is determined by the environment. Regular scanning shouldn't be affected.
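
A minimal C++ sketch of the behaviour proposed above, for illustration only. The function name `skipShebang` and its string-based interface are assumptions, not dmd's actual lexer code. It checks for a leading `#!`, skips bytes up to and including the first `\n` without validating them, and leaves everything after that for regular scanning.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not dmd's lexer): returns the offset of the first byte
// after the shebang line, or 0 if the buffer does not start with "#!".
// The skipped bytes are never decoded, so their encoding does not matter.
std::size_t skipShebang(const std::string& src) {
    if (src.size() < 2 || src[0] != '#' || src[1] != '!')
        return 0;  // no shebang: scanning starts at the beginning

    std::size_t nl = src.find('\n', 2);
    // Without a newline, the whole buffer is the shebang line.
    return nl == std::string::npos ? src.size() : nl + 1;
}
```

Regular lexing, including any UTF-8 validation, would then start at the returned offset.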

--
September 22, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #12 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
(In reply to Andrei Alexandrescu from comment #11)
> Not sure what best to do about this. I'd say if #! is detected, the first line should be just scanned through the first \n and ignored. In a way the semantics of the shebang line is determined by the environment. Regular scanning shouldn't be affected.
my attached patch does just that: it skips the shebang line if one is found and doesn't change any other lexing code. and it mostly consists of deleted lines, so we now have less code to test! ;-)

--
September 22, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #13 from Andrei Alexandrescu <andrei@erdani.com> ---
(In reply to Ketmar Dark from comment #12)
> (In reply to Andrei Alexandrescu from comment #11)
> > Not sure what best to do about this. I'd say if #! is detected, the first line should be just scanned through the first \n and ignored. In a way the semantics of the shebang line is determined by the environment. Regular scanning shouldn't be affected.
> my attached patch does just that: it skips the shebang line if one is found and doesn't change any other lexing code. and it mostly consists of deleted lines, so we now have less code to test! ;-)

Sounds good. Did you convert it to a pull request?

--
September 23, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #14 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
(In reply to Andrei Alexandrescu from comment #13)
> Sounds good. Did you convert it to a pull request?
no. i'm not using github, sorry.

--
September 23, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #15 from Sobirari Muhomori <dfj1esp02@sneakemail.com> ---
(In reply to Andrei Alexandrescu from comment #11)
> Not sure what best to do about this. I'd say if #! is detected, the first line should be just scanned through the first \n and ignored. In a way the semantics of the shebang line is determined by the environment. Regular scanning shouldn't be affected.

There were two other requests for full support for legacy encodings. If such support is introduced, it should probably extend to the entire source code. It need not be part of the language standard; it could be a vendor-specific compiler extension, maybe a compilation option in the dmd build script.

--
February 25, 2015
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #16 from Sobirari Muhomori <dfj1esp02@sneakemail.com> ---
BTW, java supports shebang like this: the runner extracts the code that follows the shebang line, compiles it and runs it. If the cached compiled code is newer than the script, the runner just runs the cached executable. Something like this could be written for D too, to support variable text encoding.

--
July 02, 2017
https://issues.dlang.org/show_bug.cgi?id=13512

Vladimir Panteleev <dlang-bugzilla@thecybershadow.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dlang-bugzilla@thecybershad
                   |                            |ow.net
            Summary|dmd cannot compile          |Allow non-UTF-8 encoding in
                   |perfectly valid code with   |shebang line
                   |shebang                     |

--
July 02, 2017
https://issues.dlang.org/show_bug.cgi?id=13512

Vladimir Panteleev <dlang-bugzilla@thecybershadow.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |pull

--- Comment #17 from Vladimir Panteleev <dlang-bugzilla@thecybershadow.net> ---
https://github.com/dlang/dmd/pull/6959

--