[Issue 13512] dmd cannot compile perfectly valid code with shebang
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

Vladimir Panteleev <thecybershadow@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |thecybershadow@gmail.com

--- Comment #1 from Vladimir Panteleev <thecybershadow@gmail.com> ---
OMG, someone still uses KOI-8?

This is probably a WONTFIX because it conflicts with the D spec. DMD does not allow invalid UTF-8 even in comments, because Unicode conversion is done as a separate step from the lexer/tokenizer.
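A minimal sketch of why a KOI8-R shebang trips a whole-file UTF-8 check, assuming the file is validated before any tokenizing happens. The path `/opt/dmd/пробы/rdmd` and the helper `is_valid_utf8` are illustrative assumptions, and Python stands in for the compiler's own code here:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the whole byte string decodes as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical shebang line containing Cyrillic path components.
line = "#!/opt/dmd/пробы/rdmd\n"

# The same text stored in two different encodings.
shebang_koi8 = line.encode("koi8_r")
shebang_utf8 = line.encode("utf-8")

# The KOI8-R bytes are rejected before the lexer ever sees a token,
# even though the kernel would happily execute the script.
print(is_valid_utf8(shebang_koi8))
print(is_valid_utf8(shebang_utf8))
```

The point is only that validation sees raw bytes and has no notion of "this part is a shebang or a comment".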

--
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

Ketmar Dark <ketmar@ketmar.no-ip.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ketmar@ketmar.no-ip.org

--- Comment #2 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
(In reply to Vladimir Panteleev from comment #1)
> OMG, someone still uses KOI-8?
yes, it's me. ;-)

> This is probably a WONTFIX because it conflicts with the D spec. DMD does not allow invalid UTF-8 even in comments, because Unicode conversion is done as a separate step from the lexer/tokenizer.
actually, it's a very simple patch to lexer (i already did that). but if dmd forbids a perfectly valid shebang… well, i'm sure that D is not in the position to change existing standards. if dmd can't compile valid code… it's not a bug in users' locale, it's THE bug in dmd. let's redefine POSIX then, POSIX is not cool. let's dictate what usernames are allowed. let's dictate what paths are allowed. and so on. D über alles, fsck standards!

--
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #3 from Vladimir Panteleev <thecybershadow@gmail.com> ---
DMD is not going to change existing standards, but it can choose to not follow them. After all, you don't expect to have a working KOI-8 shebang on a UTF-16 source file?

You can work around this issue as follows:

sudo ln -s /opt/dmd/пробы /opt/dmd/tests

and then use #!/opt/dmd/tests/rdmd as your shebang.

--
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

Vladimir Panteleev <thecybershadow@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|thecybershadow@gmail.com    |

--
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #4 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
i'm also not expecting correct EBCDIC decoding. but it's not a UTF-16 file, and adhering to the standard is easy in this case: just stop validating things that should not be validated. i.e. either kill shebang feature entirely or do it right.

and yes, trying to validate comments drives me mad too. i mean hey, this is comment, just skip it and allow me to write any BS there.

i know how i can workaround this, but i completely refuse to understand why this workaround is necessary in the first place. it's complete nonsense.

yes, it's a very minor issue, but i want this bug to be officially fixed or marked as WONTFIX to clarify some of my inner thoughts.

--
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

hsteoh@quickfur.ath.cx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hsteoh@quickfur.ath.cx

--- Comment #5 from hsteoh@quickfur.ath.cx ---
Actually, a deeper underlying issue that is being assumed, not just by dmd but by much of druntime/phobos that interfaces with the outside world, is that system-level things like filenames are UTF-8 encoded. While it's perfectly fine to do everything only in Unicode internally in D programs, this ultimately unfounded assumption can cause problems, e.g., if the filesystem uses a non-utf8 encoding, or if the program is (hypothetically) running on an EBCDIC machine, or if the D program has to interface with non-Unicode legacy programs. For example, writeln assumes the target terminal understands utf8, which may not necessarily be true.
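As a rough illustration of the alternative to assuming UTF-8 at the boundary, the sketch below keeps text as Unicode internally and transcodes only when writing out, using the locale's preferred encoding. The function name `encode_for_output` is hypothetical, and Python is used only for illustration; this is not how druntime or Phobos behave:

```python
import locale
from typing import Optional


def encode_for_output(text: str, encoding: Optional[str] = None) -> bytes:
    """Encode text for the outside world instead of assuming UTF-8.

    If no encoding is given, fall back to the locale's preferred one.
    Characters the target encoding cannot represent become '?'.
    """
    target = encoding or locale.getpreferredencoding(False)
    return text.encode(target, errors="replace")


# On a KOI8-R terminal the text survives a round trip.
koi8_bytes = encode_for_output("пробы", "koi8_r")
print(koi8_bytes.decode("koi8_r"))

# On a plain ASCII terminal the Cyrillic letters are replaced.
print(encode_for_output("пробы", "ascii"))
```

Replacing unrepresentable characters is one policy among several; failing loudly is another reasonable choice.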

--
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #6 from Sobirari Muhomori <dfj1esp02@sneakemail.com> ---
(In reply to Ketmar Dark from comment #4)
> i'm also not expecting correct EBCDIC decoding. but it's not a UTF-16 file, and adhering to the standard is easy in this case: just stop validating things that should not be validated.

AFAIK, the standard text encoding on posix today is utf-8, so D adheres to this standard.

> i.e. either kill shebang feature entirely or do it right.

Shebang is sort of brittle by design. It works only for text files (which doesn't always hold) and if the text file encoding matches that of your system. If both conditions don't hold, you should find another way, like finding executable by file extension - that works independently of file content.

--
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #7 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
(In reply to Sobirari Muhomori from comment #6)
> AFAIK, the standard text encoding on posix today is utf-8
oh, and for what reason do we have those strange "locale settings" then? also, can you point me at the exact part of the standard which says that text encoding is utf-8 regardless of current locale settings? or the part that says anything about text encoding for that matter.

and no, GNU/Linux is *not* The New Standard Maker.

> Shebang is sort of brittle by design. It works only for text files
WUT?! O_O it works perfectly for *any* type of file. it's completely ok to place binary data after shebang if interpreter can cope with that.

> and if the text file encoding matches that of your system.
and the given example matches. yet dmd refuses to compile my sample. not *run*, but *compile*.

the right shebang support in dmd must be like this: check if the first chars of the file form a shebang, and if they do, then just skip the other chars until '\n'. and skip '\n'. that's all. no validation. no martian logic. just skipping chars.
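The steps described above can be sketched roughly as follows. This is not the attached patch (its contents are not shown in the thread), only an illustration in Python; the names `skip_shebang` and `decode_source` and the sample path are assumptions:

```python
def skip_shebang(source: bytes) -> bytes:
    """Drop a leading '#!' line without looking at its bytes."""
    if not source.startswith(b"#!"):
        return source
    newline = source.find(b"\n")
    if newline == -1:
        # The file is nothing but a shebang line.
        return b""
    # Skip everything up to and including the newline.
    return source[newline + 1:]


def decode_source(source: bytes) -> str:
    """Validate and decode only what follows the shebang line."""
    return skip_shebang(source).decode("utf-8")


# Hypothetical script: KOI8-R shebang followed by ordinary source text.
script = "#!/opt/dmd/пробы/rdmd\n".encode("koi8_r") + b"void main() {}\n"
print(decode_source(script))
```

A real lexer would still have to count the skipped line so that later error messages report correct line numbers; the sketch ignores that.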

--
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

hsteoh@quickfur.ath.cx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|hsteoh@quickfur.ath.cx      |

--
September 21, 2014
https://issues.dlang.org/show_bug.cgi?id=13512

--- Comment #8 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
Created attachment 1431
  --> https://issues.dlang.org/attachment.cgi?id=1431&action=edit
proposed fix

just for completeness' sake.

--