July 18, 2010
== Quote from Jonathan M Davis (jmdavisprog@gmail.com)'s article
> On Sunday 18 July 2010 06:16:09 strtr wrote:
> > I agree with the warning. A good warning would get people to read up on
> > UTF. And if you really want to have char you'll need to cast:
> > foreach(cast(char)c; chars)
> Actually, the cast would be totally unnecessary. Putting
> foreach(char c; chars)
> would be enough. Forcing a cast would change how foreach normally works. I'm not
> even sure that you can legally put a cast there like that. What we'd want to
> disallow would be
> foreach(c; chars)
> As long as the programmer puts the element type, we can assume that they know
> what they're doing. But warning in cases where they don't put it would catch a
> large number of errors in iterating over strings and wstrings.
> In any case, I filed a bug report for it:
> http://d.puremagic.com/issues/show_bug.cgi?id=4483

As a habit I tend to put types everywhere; only recently have I started using auto.
Conceptually, it just looked so obvious foreach(char c; chars) would iterate over
characters.
And you can go on programming like that (in English) for quite a while without
getting any errors whatsoever.
The moment I finally used a single non-ASCII char I noticed something going wrong
and had to go back and fix quite a few bugs.
And the worst part is, I wasn't the only one making this mistake.
Well, what I wanted to say was that I at least won't assume the programmer knows
what he's doing only because he adds a type.

I totally agree that putting a cast there is probably not really a solution (or
legal).
Warnings for all non-dchar types.
Is there anybody using foreach(c;chars) || foreach(char c;chars) correctly (which
couldn't be done with ubytes)?



July 18, 2010
On Sunday 18 July 2010 10:59:21 strtr wrote:
> I totally agree that putting a cast there is probably not really a solution
> (or legal).
> Warnings for all non-dchar types.
> Is there anybody using foreach(c;chars) || foreach(char c;chars) correctly
> (which couldn't be done with ubytes)?

As soon as someone wants to process code units (for whatever reason) instead of code points, then using char and wchar makes sense. Now, I suppose that you could use ubyte and ushort in such circumstances, but I'm sure that _someone_ will be looking to do it, and (there's a decent chance that phobos does it) I don't think that it would go over very well to give them lots of warnings.
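To make the code unit versus code point distinction concrete, here is a minimal sketch. The helper names `countCodeUnits` and `countCodePoints` are made up for illustration, and the non-ASCII letter is written as an escape so the source file's encoding does not matter.

```d
import std.stdio;

// Hypothetical helper: iterate with char, i.e. one step per UTF-8 code unit.
size_t countCodeUnits(string s)
{
    size_t n = 0;
    foreach (char c; s)
        ++n;
    return n;
}

// Hypothetical helper: iterate with dchar, i.e. foreach decodes one code point per step.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string s = "h\u00E9llo"; // U+00E9 takes two code units in UTF-8
    writeln(countCodeUnits(s));  // prints 6
    writeln(countCodePoints(s)); // prints 5
}
```

Both loops are legal; they simply answer different questions about the same string.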

The issue, of course, is that the common case is that anything other than dchar in a foreach over string types would be a logic error in your code. D does a lot to make things safer, but I don't think that there are very many cases where things like this are special-cased in order to stop errors. The programmer is expected to have some clue as to what they're doing, and the general trend in D from what I can tell is to not use a type unless you have to, so it would be perfectly normal to expect the programmer to have really meant char or wchar if they put it explicitly.

I don't know. The truth is that on the one hand, programmers _need_ to understand how D deals with strings and unicode, or they _will_ have bugs. There's no getting around that. So, cases where someone who knows what they're doing is likely to screw up (like forgetting the type on the foreach) should have warnings associated with them if it's reasonable. However, expecting the compiler to catch each and every instance that a programmer is likely to shoot themselves in the foot with unicode and strings is not particularly reasonable. The compiler can't always save the programmer from their own ignorance or stupidity. If anything, that suggests that it would be a good idea to make errors surface _more easily_ in code written by someone who doesn't understand how D deals with unicode.

It should be the case that competent D programmers will be able to use strings easily. But it's likely better if the ones who don't know what they're doing shoot themselves in the foot sooner rather than later so that they learn what they need to learn about unicode and _become_ competent D programmers.

A competent D programmer will not put an explicit char in a foreach loop unless that's what they really mean. The only issue there is that char could be a typo for dchar. But that sort of typo would be rather hard to defend against in general. So, certainly on the surface, it would seem overkill to effectively disallow char and wchar in foreach loops and force ubyte and ushort.
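As an illustration of a case where an explicit char is arguably what the programmer means, here is a rough sketch that counts ASCII spaces. It relies on the fact that in UTF-8 a byte below 0x80 never appears inside a multi-byte sequence, so a code-unit loop is safe for this task. The function names are invented for the example; `representation` is the Phobos function in `std.string` that views a string as raw `ubyte` values, shown here as the ubyte alternative discussed above.

```d
import std.stdio;
import std.string : representation;

// Hypothetical helper: explicit char, comparing code units against an ASCII value.
size_t countSpacesAsChar(string s)
{
    size_t n = 0;
    foreach (char c; s)
    {
        if (c == ' ')
            ++n;
    }
    return n;
}

// Hypothetical helper: the same loop over the raw bytes, with no char in sight.
size_t countSpacesAsUbyte(string s)
{
    size_t n = 0;
    foreach (ubyte b; s.representation)
    {
        if (b == 0x20)
            ++n;
    }
    return n;
}

void main()
{
    string s = "na\u00EFve caf\u00E9 menu";
    writeln(countSpacesAsChar(s));  // prints 2
    writeln(countSpacesAsUbyte(s)); // prints 2
}
```

The two versions behave identically here; the char version just avoids the conversion and keeps the intent ("this is text") visible in the type.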

Still, this is an area which isn't all that hard to screw up on, so I don't know what the best solution is. When it comes down to it, you can't always hold the programmer's hand. They need to be informed and responsible. But on the other hand, you do want to make it harder for them to make stupid mistakes, since even competent programmers do make stupid mistakes at least some of the time.

A warning for a foreach loop over strings where the element type is not specified is a start. If you have a solid suggestion which would reduce errors in the common case without unduly restraining folks who really know what they're doing, then create a bug report for it with the severity of enhancement. Walter and company will decide what works best with what they intend for D. Your suggestion may or may not be implemented, but it's worth a try.

- Jonathan M Davis
July 19, 2010
== Quote from Jonathan M Davis (jmdavisprog@gmail.com)'s article
> On Sunday 18 July 2010 10:59:21 strtr wrote:
> > I totally agree that putting a cast there is probably not really a solution
> > (or legal).
> > Warnings for all non-dchar types.
> > Is there anybody using foreach(c;chars) || foreach(char c;chars) correctly
> > (which couldn't be done with ubytes)?
> As soon as someone wants to process code units (for whatever reason) instead of
> code points, then using char and wchar makes sense. Now, I suppose that you
> could use ubyte and ushort in such circumstances, but I'm sure that _someone_
> will be looking to do it, and (there's a decent chance that phobos does it) I
> don't think that it would go over very well to give them lots of warnings.
> The issue, of course, is that the common case is that anything other than dchar
> in a foreach over string types would be a logic error in your code. D does a lot
> to make things safer, but I don't think that there are very many cases where
> things like this are special-cased in order to stop errors. The programmer is
> expected to have some clue as to what they're doing, and the general trend in D
> from what I can tell is to not use a type unless you have to, so it would be
> perfectly normal to expect the programmer to have really meant char or wchar if
> they put it explicitly.
> I don't know. The truth is that on the one hand, programmers _need_ to
> understand how D deals with strings and unicode, or they _will_ have bugs.
> There's no getting around that. So, cases where someone who knows what they're
> doing is likely to screw up (like forgetting the type on the foreach) should
> have warnings associated with them if it's reasonable. However, expecting the
> compiler to catch each and every instance that a programmer is likely to shoot
> themselves in the foot with unicode and strings is not particularly reasonable.
> The compiler can't always save the programmer from their own ignorance or
> stupidity. If anything, that suggests that it would be a good idea to make
> errors surface _more easily_ in code written by someone who doesn't understand
> how D deals with unicode.
> It should be the case that competent D programmers will be able to use strings
> easily. But it's likely better if the ones who don't know what they're doing
> shoot themselves in the foot sooner rather than later so that they learn what
> they need to learn about unicode and _become_ competent D programmers.

I actually knew about unicode, but I mistakenly thought a char to be a code point
(thus variable in size).
Somehow I missed any documentation telling me otherwise.
Now that I look for it it actually says:
char | 	unsigned 8 bit UTF-8

Maybe some stronger pointers in the documentation would help.
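A small sketch of what that table entry means in practice, assuming nothing beyond the built-in types: char, wchar and dchar have fixed sizes, and it is the number of code units per character that varies. The helper names below are invented for the example.

```d
import std.stdio;

// Hypothetical helpers: .length counts code units in each encoding.
size_t utf8Units(string s)   { return s.length; }
size_t utf16Units(wstring s) { return s.length; }
size_t utf32Units(dstring s) { return s.length; }

void main()
{
    // The character types themselves are fixed-size.
    static assert(char.sizeof == 1);
    static assert(wchar.sizeof == 2);
    static assert(dchar.sizeof == 4);

    // U+00E9: two UTF-8 code units, one UTF-16, one UTF-32.
    writeln(utf8Units("\u00E9"), " ", utf16Units("\u00E9"), " ", utf32Units("\u00E9"));

    // U+1F600: four UTF-8 code units, two UTF-16 (a surrogate pair), one UTF-32.
    writeln(utf8Units("\U0001F600"), " ", utf16Units("\U0001F600"), " ", utf32Units("\U0001F600"));
}
```

So a single char can only ever hold a code point below U+0080; anything larger is spread across several chars.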

> A competent D programmer will not put an explicit char in a foreach loop unless
> that's what they really mean. The only issue there is that char could be a typo
> for dchar. But that sort of typo would be rather hard to defend against in
> general. So, certainly on the surface, it would seem overkill to effectively
> disallow char and wchar in foreach loops and force ubyte and ushort.
> Still, this is an area which isn't all that hard to screw up on, so I don't know
> what the best solution is. When it comes down to it, you can't always hold the
> programmer's hand. They need to be informed and responsible. But on the other
> hand, you do want to make it harder for them to make stupid mistakes, since even
> competent programmers do make stupid mistakes at least some of the time.
> A warning for a foreach loop over strings where the element type is not specified
> is a start. If you have a solid suggestion which would reduce errors in the
> common case without unduly restraining folks who really know what they're doing,
> then create a bug report for it with the severity of enhancement. Walter and
> company will decide what works best with what they intend for D. Your suggestion
> may or may not be implemented, but it's worth a try.
> - Jonathan M Davis

I agree with your bug-report.
July 19, 2010
On Sunday 18 July 2010 17:15:15 strtr wrote:
> 
> I actually knew about unicode, but I mistakenly thought a char to be a code
> point (thus variable in size).
> Somehow I missed any documentation telling me otherwise.
> Now that I look for it it actually says:
> char | 	unsigned 8 bit UTF-8
> 
> Maybe some stronger pointers in the documentation would help.
> 

The section in TDPL on strings is excellent. A good article on unicode on D's site would be a good addition, though. While some of the documentation is good, it does tend to be fairly sparse.

- Jonathan M Davis
July 19, 2010
== Quote from Jonathan M Davis (jmdavisprog@gmail.com)'s article
> On Sunday 18 July 2010 00:46:36 Jonathan M Davis wrote:
> > I'll file a bug report
> >
> > - Jonathan M Davis
> Wait. That's not the problem. Or at least, that's not the problem that needs to
> be reported. The problem is that we're not compiling with -w. If you compile
> with -w, then statements such as
> scope(failure) continue;
> won't compile due to being unreachable statements. But if you compile with -w,
> then the compiler flags it as an error, and the program fails to compile. So, I
> filed a bug report on the fact that such warnings aren't reported without -w
> (though they would still compile since they're warnings rather than errors):
> http://d.puremagic.com/issues/show_bug.cgi?id=4482
> Regardless, what you're trying to do is clearly an error, and compiling with -w
> will show that.
> - Jonathan M Davis

I don't agree with this bug report because of two reasons.
1. Warnings are supposed to be warnings, not errors. If you want to see those
warnings you'll use -w.
What you probably want is for the dmd to have a -!w flag instead (warnings by
default, disable with flag)
2. In this particular example, the problem is not that the warning isn't shown
without -w, but that the warning is incorrect and scope(failure) shouldn't be able
to catch the exception.

Here is a smaller example of the same problem[D1]:
----
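import std.stdio;  // writefln
import std.string; // format

void main()
{
	for(int i=0;i<10;i++)
	{
		// Runs when the loop body exits via an exception.
		scope(failure){
			writefln("continue");
			continue;
		}
		//scope(failure) writefln("fail");
		writefln(i);
		throw new Exception(format(i));
	}
}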
----

Enable warnings and you'll get the same unreachable warning, but which statement is unreachable? When you compile this without -w, it happily prints all ten i's and continues.
July 19, 2010
On Sunday 18 July 2010 17:36:58 strtr wrote:
> 
> I don't agree with this bug report because of two reasons.
> 1. Warnings are supposed to be warnings, not errors. If you want to see
> those warnings you'll use -w.
> What you probably want is for the dmd to have a -!w flag instead (warnings
> by default, disable with flag)
> 2. In this particular example, the problem is not that the warning isn't
> shown without -w, but that the warning is incorrect and scope(failure)
> shouldn't be able to catch the exception.
> 
> Here is a smaller example of the same problem[D1]:
> ----
> void main()
> {
> 	for(int i=0;i<10;i++)
> 	{
> 		scope(failure){
> 			writefln("continue");
> 			continue;
> 		}
> 		//scope(failure) writefln("fail");
> 		writefln(i);
> 		throw new Exception(format(i));
> 	}
> }
> ----
> 
> Enable warnings and you'll get the same unreachable warning, but which statement is unreachable? When you compile this without -w, it happily prints all ten i's and continues.

With any other compiler that I've ever used, it prints warnings normally. It may or may not have a way to make them errors, but it will print them normally and compile with them. dmd won't display warnings without -w, but when you use -w, it instantly makes them errors. There needs to be a middle ground where warnings are reported and not flagged as errors.

As for unreachable code being an error, that's debatable. Obviously, dmd doesn't consider it one. Personally, I hate the fact that javac does with Java. I _want_ that to be a warning. I'd like to be warned about it, and I don't want it to be in production code, but it happens often enough when developing, that I don't want to have to fix it to get code to compile. As such, a warning makes perfect sense.

However, when you combine that with the fact that dmd doesn't even report warnings unless it treats them as errors, it becomes easy to miss.

- Jonathan M Davis
July 19, 2010
== Quote from Jonathan M Davis (jmdavisprog@gmail.com)'s article
> On Sunday 18 July 2010 17:36:58 strtr wrote:
> >
> > I don't agree with this bug report because of two reasons.
> > 1. Warnings are supposed to be warnings, not errors. If you want to see
> > those warnings you'll use -w.
> > What you probably want is for the dmd to have a -!w flag instead (warnings
> > by default, disable with flag)
> > 2. In this particular example, the problem is not that the warning isn't
> > shown without -w, but that the warning is incorrect and scope(failure)
> > shouldn't be able to catch the exception.
> >
> > Here is a smaller example of the same problem[D1]:
> > ----
> > void main()
> > {
> > 	for(int i=0;i<10;i++)
> > 	{
> > 		scope(failure){
> > 			writefln("continue");
> > 			continue;
> > 		}
> > 		//scope(failure) writefln("fail");
> > 		writefln(i);
> > 		throw new Exception(format(i));
> > 	}
> > }
> > ----
> >
> > Enable warnings and you'll get the same unreachable warning, but which statement is unreachable? When you compile this without -w, it happily prints all ten i's and continues.
> With any other compiler that I've ever used, it prints warnings normally. It may or may not have a way to make them errors, but it will print them normally and compile with them. dmd won't display warnings without -w, but when you use -w, it instantly makes them errors. There needs to be a middle ground where warnings are reported and not flagged as errors.

I would use this middle ground by default, if available.

> As for unreachable code being an error, that's debatable. Obviously, dmd doesn't consider it one. Personally, I hate the fact that javac does with Java. I _want_ that to be a warning. I'd like to be warned about it, and I don't want it to be in production code, but it happens often enough when developing, that I don't want to have to fix it to get code to compile. As such, a warning makes perfect sense.
I'm not sure whether you missed my point or are simply thinking out loud about
unreachable code being a warning.
My point was that the unreachable warning was wrong as there is no unreachable code.

> However, when you combine that with the fact that dmd doesn't even report
> warnings unless it treats them as errors, it becomes easy to miss.
> - Jonathan M Davis

July 19, 2010
On Sunday 18 July 2010 19:14:11 strtr wrote:
> I'm not sure whether you missed my point or are simply thinking out loud
> about unreachable code being a warning.
> My point was that the unreachable warning was wrong as there is no
> unreachable code.

Except that there _is_. You just can't see it. scope(X) creates a try-catch block. So,

scope(exit) whatever;
/* code */

becomes

try
{
/* code */
}
finally
{
    whatever;
}


scope(success) whatever;
/* code */

becomes

/* code */
whatever;


scope(failure) whatever;
/* code */

becomes

try
{
/* code */
}
catch(Exception e)
{
    whatever;
    throw e;
}


So, something like

scope(failure) continue;
/* code */

becomes

try
{
    /* code */
}
catch(Exception e)
{
    continue;
    throw e;
}


The throw statement is then unreachable. So, the warning is correct. The problem is that it's not clear. Ideally, you would have a warning which specifically mentions the fact that you can't do that sort of thing in a scope statement. Unless the programmer is thinking about what exactly scope() becomes, the unreachable statement warning will be confusing. So, that's a problem. It is, however, correct. It probably merits its own bug report.
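For anyone who wants to see the rewrite in action, here is a minimal sketch that compares a scope(failure) guard with a hand-written try-catch following the lowering described above. The function names and the log array are invented for the example, and a real compiler's lowering may differ in detail; the point is only that the observable order of events is the same.

```d
import std.stdio;

// Hypothetical example: scope(failure) runs, then the exception keeps propagating.
string[] withScopeGuard()
{
    string[] log;
    try
    {
        scope(failure) log ~= "guard";
        log ~= "body";
        throw new Exception("boom");
    }
    catch (Exception e)
    {
        log ~= "caught: " ~ e.msg;
    }
    return log;
}

// Hypothetical example: the same thing written out as catch-and-rethrow.
string[] withTryCatch()
{
    string[] log;
    try
    {
        try
        {
            log ~= "body";
            throw new Exception("boom");
        }
        catch (Exception e)
        {
            log ~= "guard";
            throw e; // this is the statement that a `continue` in the guard would cut off
        }
    }
    catch (Exception e)
    {
        log ~= "caught: " ~ e.msg;
    }
    return log;
}

void main()
{
    writeln(withScopeGuard()); // prints ["body", "guard", "caught: boom"]
    writeln(withTryCatch());   // prints ["body", "guard", "caught: boom"]
}
```

Putting a `continue` where the guard body is would sit in front of the rethrow, which is why the compiler reports the rethrow as unreachable.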

- Jonathan M Davis
July 19, 2010
== Quote from Jonathan M Davis (jmdavisprog@gmail.com)'s article
> On Sunday 18 July 2010 19:14:11 strtr wrote:
> > I'm not sure whether you missed my point or are simply thinking out loud
> > about unreachable code being a warning.
> > My point was that the unreachable warning was wrong as there is no
> > unreachable code.
> Except that there _is_. You just can't see it. scope(X) creates a try-catch
> block. So,
> scope(exit) whatever;
> /* code */
> becomes
> try
> {
> /* code */
> }
> finally
> {
>     whatever;
> }
> scope(success) whatever;
> /* code */
> becomes
> /* code */
> whatever;
> scope(failure) whatever;
> /* code */
> becomes
> try
> {
> /* code */
> }
> catch(Exception e)
> {
>     whatever;
>     throw e;
> }
> So, something like
> scope(failure) continue;
> /* code */
> becomes
> try
> {
>     /* code */
> }
> catch(Exception e)
> {
>     continue;
>     throw e;
> }
> The throw statement is then unreachable. So, the warning is correct. The problem
> is that it's not clear. Ideally, you would have a warning which specifically
> mentions the fact that you can't do that sort of thing in a scope statement.
> Unless the programmer is thinking about what exactly scope() becomes, the
> unreachable statement warning will be confusing. So, that's a problem. It is,
> however, correct. It probably merits its own bug report.
> - Jonathan M Davis

Thanks for the explanation!
But what you are talking about is implementation, nowhere in the spec does it say
anything like this (or did I just miss it :).
I could find only this about scope(failure):
"scope(failure) executes NonEmptyOrScopeBlockStatement  when the scope exits due
to exception unwinding."
So at the very least it is a documentation bug:
It should say something about catching the exception and then re-throwing it, or
explain that scope guards are sugar for re-throwing try statements.
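Until the documentation spells it out, a small experiment shows the observable behaviour of the three guards. This is only a sketch: the function `run` and its log are invented for the example, and it demonstrates what the guards do, not how the compiler implements them.

```d
import std.stdio;

// Hypothetical example: record which guards fire on a normal exit and on an exception.
string[] run(bool fail)
{
    string[] log;
    try
    {
        scope(exit)    log ~= "exit";
        scope(success) log ~= "success";
        scope(failure) log ~= "failure";

        if (fail)
            throw new Exception("boom");
        log ~= "done";
    }
    catch (Exception e)
    {
        // The exception still arrives here after scope(failure) has run.
        log ~= "caught";
    }
    return log;
}

void main()
{
    writeln(run(false)); // prints ["done", "success", "exit"]
    writeln(run(true));  // prints ["failure", "exit", "caught"]
}
```

The guards run in reverse order of declaration, and scope(failure) does not swallow the exception: the outer catch still sees it.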
July 19, 2010
On Sunday 18 July 2010 19:47:37 strtr wrote:
> Thanks for the explanation!
> But what you are talking about is implementation, nowhere in the spec does
> it say anything like this (or did I just miss it :).
> I could find only this about scope(failure):
> "scope(failure) executes NonEmptyOrScopeBlockStatement  when the scope
> exits due to exception unwinding."
> So at the very least it is a documentation bug:
> It should say something about catching the exception and then re-throwing
> it, or explain that scope guards are sugar for re-throwing try statements

Bug report created: http://d.puremagic.com/issues/show_bug.cgi?id=4484