February 15, 2011
"Walter Bright" <newshound2@digitalmars.com> wrote in message news:ijerk4$2u3a$1@digitalmars.com...
> Nick Sabalausky wrote:
>> "Walter Bright" <newshound2@digitalmars.com> wrote in message news:ijeil4$2aso$3@digitalmars.com...
>>> spir wrote:
>>>> Having to constantly explain that "_t" means type, that "size" does not mean size, what this type is supposed to mean instead, what it is used for in core and stdlib functionality, and what programmers are supposed to use it for... isn't this a waste of our time? This, only because the name is mindless?
>>> No, because there is a vast body of work that uses size_t and a vast body of programmers who know what it is and are totally used to it.
>>>
>>
>> And there's a vast body who don't.
>>
>> And there's a vast body who are used to C++, so let's just abandon D and make it an implementation of C++ instead.
>
> I would agree that D is a complete waste of time if all it consisted of was renaming things.

And since D *does* force C++ users to learn far bigger differences, learning a different name for something is trivial.


February 15, 2011
On Monday, February 14, 2011 18:11:10 Don wrote:
> Nick Sabalausky wrote:
> > "Jonathan M Davis" <jmdavisProg@gmx.com> wrote in message news:mailman.1650.1297733226.4748.digitalmars-d@puremagic.com...
> > 
> >> On Monday, February 14, 2011 17:06:43 spir wrote:
> >>> Rename size-t, or rather introduce a meaningful standard alias? (would
> >>> vote
> >>> for Natural)
> >> 
> >> Why? size_t is what's used in C++. It's well known and what lots of
> >> programmers would expect. What would you gain by renaming it?
> > 
> > Although I fully realize how much this sounds like making a big deal out of nothing, to me, using "size_t" has always felt really clumsy and awkward. I think it's partly because of using an underscore in such an otherwise short identifier, and partly because I've been aware of size_t for years and still don't have the slightest clue WTF that "t" means. Something like "wordsize" would make a lot more sense and frankly feel much nicer.
> > 
> > And, of course, there's a lot of well-known things in C++ that D deliberately destroys. D is a different language, it may as well do things better.
> 
> To my mind, a bigger problem is that size_t is WRONG: it should be a signed integer, NOT unsigned.

Why exactly should it be signed? You're not going to index an array with a negative value (bounds checking would blow up on that I would think, though IIRC you can do that in C/C++ - which is a fairly stupid thing to do IMHO). You lose half the possible length of arrays if you have a signed size_t (less of a problem in 64-bit land than 32-bit land). I don't see any benefit to it being signed other than you can have a for loop do something like this:

for(size_t i = a.length - 1; i >= 0; --i)

And while that can be annoying at times, it's not like it's all that hard to code around.

Is there some low level reason why size_t should be signed or something I'm completely missing?

- Jonathan M Davis
February 15, 2011
On 02/15/2011 10:45 PM, Nick Sabalausky wrote:
> "Adam Ruppe"<destructionator@gmail.com>  wrote in message
> news:ije0gi$18vo$1@digitalmars.com...
>> Sometimes I think we should troll the users a little and make
>> a release with names like so:
>>
>> alias size_t
>> TypeUsedForArraySizes_Indexes_AndOtherRelatedTasksThatNeedAnUnsignedMachineSizeWord;
>>
>> alias ptrdiff_t
>> TypeUsedForDifferencesBetweenPointers_ThatIs_ASignedMachineSizeWordAlsoUsableForOffsets;
>>
>> alias iota lazyRangeThatGoesFromStartToFinishByTheGivenStepAmount;
>>
>>
>> Cash money says everyone would be demanding an emergency release with
>> shorter names. We'd argue for months about it... and probably settle
>> back where we started.
>
> A small software company I once worked for, Main Sequence Technologies, had
> their heads so far up their asses it was trivial for me to get posted on
> TheDailyWTF's Code Snippet of the Day (This company had a
> rather...interesting...way of creating their "else" clauses).
>
> One of the many "Programming 101, Chapter 1" things they had a habit of
> screwing up was "Use meaningful variable names!". Throughout the codebase
> (VB6 - yea, that tells you a lot about their level of competence), there
> were variables like "aaa", "staaa", "bbb", "stbbb", "ccc", etc. Those are
> actual names they used. (I even found a file-loading function named "save".)
>
> Needless to say, trying to understand the twisted codebase enough to
> actually do anything with it was...well, you can imagine. So I would try to
> clean things up when I could, in large part just so I could actually keep it
> all straight in my own mind.
>
> Anyway, to bring this all back around to what you said above, there were
> times when I understood enough about a variable to know it wasn't relevant
> to whatever my main task was, and therefore didn't strictly need to go
> wasting even *more* time trying to figure out what the hell the variable
> actually did. So I ended up in the habit of just renaming those variables to
> things like:
>
> bbb
> ->
> thisVariableNeedsAMuchMoreMeaningfulNameThan_bbb

Did you actually type this yourself, Nick, or do you have a secret prototype of camel-case automaton, based on an English language lexing DFA?

denis
-- 
_________________
vita es estrany
spir.wikidot.com

February 15, 2011
On 02/15/2011 10:40 PM, Daniel Gibson wrote:
> On 15.02.2011 20:15, Rainer Schuetze wrote:
>>
>> I think David has raised a good point here that seems to have been lost in the
>> discussion about naming.
>>
>> Please note that in C the machine-word integer type was usually "int". The C
>> standard only specifies a minimum bit-size for the different types
>> (see for example http://www.ericgiguere.com/articles/ansi-c-summary.html). Most
>> current C++ implementations have identical "int" sizes, but "long" now
>> differs. This approach has failed and has caused many headaches when porting
>> software from one platform to another. D has recognized this and has explicitly
>> defined the bit-size of the various integer types. That's good!
>>
>> Now, with size_t the distinction between platforms creeps back into the
>> language. It is everywhere across Phobos, be it as the length of ranges or the
>> size of containers. This can spread virally, as everything that comes into
>> contact with these values might have to stick to size_t. Is this really desired?
>>
>> Consider saving an array to disk, trying to read it on another platform. How
>> many bits should be written for the size of that array?
>>
>
> This can indeed be a problem, one that actually exists in Phobos: std.stream's
> OutputStream has a write(char[]) method - and similar methods for wchar and
> dchar - that do exactly this: write a size_t first and then the data... In many
> places they used uint instead of size_t, but at the one method where this is a
> bad idea they used size_t ;-) (see also
> http://d.puremagic.com/issues/show_bug.cgi?id=5001 )
>
> In general I think that you just have to define how you serialize data to
> disk/net/whatever (what endianness, what exact types) and you won't have
> problems. Just dumping the data to disk isn't portable anyway.

How do you, in general, cope with the issue that, when using machine-size types, programs or (program+data) combinations will work on some machines and not on others? This disturbs me a lot. I prefer having a consistent range of applicability, even if it is artificially reduced on some machines.
A similar reflection applies to "infinite"-size numbers.

Note this is different from using machine-size (unsigned) integers on the implementation side, for implementation reasons. This could be done, I guess, without language-side issues. Meaning int, for instance, could on the implementation side be the same thing as long on a 64-bit machine, but still be semantically limited to 32 bits, so that the code works the same way on all machines.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

February 15, 2011
On 02/15/2011 10:49 PM, Michel Fortin wrote:
> On 2011-02-15 16:33:33 -0500, Walter Bright <newshound2@digitalmars.com> said:
>
>> Nick Sabalausky wrote:
>>> "Walter Bright" <newshound2@digitalmars.com> wrote in message
>>> news:ijeil4$2aso$3@digitalmars.com...
>>>> spir wrote:
>>>>> Having to constantly explain that "_t" means type, that "size" does not
>>>>> mean size, what this type is supposed to mean instead, what it is used for
>>>>> in core and stdlib functionality, and what programmers are supposed to use
>>>>> it for... isn't this a waste of our time? This, only because the name is
>>>>> mindless?
>>>> No, because there is a vast body of work that uses size_t and a vast body
>>>> of programmers who know what it is and are totally used to it.
>>>
>>> And there's a vast body who don't.
>>>
>>> And there's a vast body who are used to C++, so let's just abandon D and
>>> make it an implementation of C++ instead.
>>
>> I would agree that D is a complete waste of time if all it consisted of was
>> renaming things.
>
> I'm just wondering whether 'size_t', because it is named after its C
> counterpart, doesn't feel too alien in D, causing people to prefer 'uint' or
> 'ulong' instead even when they should not. We're seeing a lot of code failing
> on 64-bit because authors used the fixed-size types which are more D-like in
> naming. Wouldn't more D-like names that don't look like relics from C --
> something like 'word' and 'uword' -- have helped prevent those bugs by making
> the word-sized type look worth consideration?

Exactly :-)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

February 15, 2011
Nick Sabalausky wrote:

> "Nick Sabalausky" <a@a.a> wrote in message news:ijesem$brd$1@digitalmars.com...
>> "Steven Schveighoffer" <schveiguy@yahoo.com> wrote in message news:op.vqx78nkceav7ka@steve-laptop...
>>>
>>> size_t works, it has a precedent, it's already *there*, just use it, or alias it if you don't like it.
>>>
>>
>> One could make much the same argument about the whole of C++. It works, it has a precedent, it's already *there*, just use it.
>>
> 
> The whole reason I came to D was because, at the time, D was more interested in fixing C++'s idiocy than merely aping C++, as the theme seems to be now.

I don't see any difference; D has always kept a strong link to its C++ heritage. It's just a matter of what you define as idiocy.
February 15, 2011
On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
> Is there some low level reason why size_t should be signed or something I'm
> completely missing?

My personal issue with unsigned ints in general, as implemented in C-like languages, is that the range of non-negative signed integers is half the range of the corresponding unsigned integers (for the same size).
* practically: known issues, and bugs if not checked by the language
* conceptually: it contradicts the "obvious" idea that the unsigned values (the naturals) form a subset of the signed ones (the integers)

denis
-- 
_________________
vita es estrany
spir.wikidot.com

February 15, 2011
Steven Schveighoffer wrote:
> 
> In addition size_t isn't actually defined by the compiler.  So the library controls the size of size_t, not the compiler.  This should make it extremely portable.
> 

I do not consider the language and the runtime as completely separate when it comes to writing code. BTW, though defined in object.di, size_t is tied to some compiler internals:

	alias typeof(int.sizeof) size_t;

and the compiler will make assumptions about this when creating array literals.

>> Consider saving an array to disk, trying to read it on another platform. How many bits should be written for the size of that array?
> 
> It depends on the protocol or file format definition.  It should be irrelevant what platform/architecture you are on.  Any format or protocol worth its salt will define what size integers you should store.

Agreed, the example probably was not the best one.

>> I don't have a perfect solution, but maybe builtin arrays could be limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless signed/unsigned conversions), so the normal type to be used is still "int". Ranges should adopt the type sizes of the underlying objects.
> 
> No, this is too limiting.  If I have 64GB of memory (not out of the question), and I want to have a 5GB array, I think I should be allowed to.  This is one of the main reasons to go to 64-bit in the first place.

Yes, that's the imperfect part of the proposal. An array of ints could still use up to 16 GB, though.

What bothers me is that you have to deal with these "portability issues" from the very moment you store the length of an array elsewhere. Not a really big deal, and I don't think it will change, but still feels a bit awkward.
February 15, 2011
On 16.02.2011 00:03, spir wrote:
> On 02/15/2011 10:40 PM, Daniel Gibson wrote:
>> In general I think that you just have to define how you serialize data to disk/net/whatever (what endianness, what exact types) and you won't have problems. Just dumping the data to disk isn't portable anyway.
> 
> How do you, in general, cope with the issue that, when using machine-size types,
> programs or (program+data) combinations will work on some machines and not on
> others? This disturbs me a lot. I prefer having a consistent range of
> applicability, even if it is artificially reduced on some machines.
> A similar reflection applies to "infinite"-size numbers.
> 

I'm not sure I understand your question correctly...
1. You can't always deal with it, there may always be platforms (e.g. 16bit
platforms) that just can't execute your code and can't handle your types.
2. When handling data that is exchanged between programs (that may run on
different platforms) you just have to agree on a format for that data. You could
for example serialize it to XML or JSON or use a binary protocol that defines
exactly what types (what size, what endianness, what encoding) are used and how.
You can then decide for your application's data things like "this array will
*never* exceed 65k elements, so I can store its size as ushort" and so on.
You should enforce these constraints on all platforms, of course (e.g.
assert(arr.length <= ushort.max); )
This also means that you can decide that you'll never have any arrays longer
than uint.max so they can be read and written on any platform - you just need to
make sure that, when reading it from disk/net/..., you read the length in the
right format (and analogously for writing).

Or would you prefer D to behave like a 32-bit language on all platforms?
That would mean arrays *never* have more than uint.max elements etc.?
Such constraints are not acceptable for a system programming language.
(The alternative - using ulong for array indexes on 32-bit platforms - is
unacceptable as well because it'd slow things down too much).

> Note this is different from using machine-size (unsigned) integers on the implementation side, for implementation reasons. This could be done, I guess, without language-side issues. Meaning int, for instance, could on the implementation side be the same thing as long on a 64-bit machine, but still be semantically limited to 32 bits, so that the code works the same way on all machines.
> 
> Denis

Cheers,
- Daniel
February 16, 2011
"spir" <denis.spir@gmail.com> wrote in message news:mailman.1709.1297810216.4748.digitalmars-d@puremagic.com...
> On 02/15/2011 10:45 PM, Nick Sabalausky wrote:
>> "Adam Ruppe"<destructionator@gmail.com>  wrote in message news:ije0gi$18vo$1@digitalmars.com...
>>> Sometimes I think we should troll the users a little and make a release with names like so:
>>>
>>> alias size_t TypeUsedForArraySizes_Indexes_AndOtherRelatedTasksThatNeedAnUnsignedMachineSizeWord;
>>>
>>> alias ptrdiff_t TypeUsedForDifferencesBetweenPointers_ThatIs_ASignedMachineSizeWordAlsoUsableForOffsets;
>>>
>>> alias iota lazyRangeThatGoesFromStartToFinishByTheGivenStepAmount;
>>>
>>>
>>> Cash money says everyone would be demanding an emergency release with shorter names. We'd argue for months about it... and probably settle back where we started.
>>
>> A small software company I once worked for, Main Sequence Technologies,
>> had
>> their heads so far up their asses it was trivial for me to get posted on
>> TheDailyWTF's Code Snippet of the Day (This company had a
>> rather...interesting...way of creating their "else" clauses).
>>
>> One of the many "Programming 101, Chapter 1" things they had a habit of screwing up was "Use meaningful variable names!". Throughout the codebase (VB6 - yea, that tells you a lot about their level of competence), there were variables like "aaa", "staaa", "bbb", "stbbb", "ccc", etc. Those are actual names they used. (I even found a file-loading function named "save".)
>>
>> Needless to say, trying to understand the twisted codebase enough to
>> actually do anything with it was...well, you can imagine. So I would try
>> to
>> clean things up when I could, in large part just so I could actually keep
>> it
>> all straight in my own mind.
>>
>> Anyway, to bring this all back around to what you said above, there were
>> times when I understood enough about a variable to know it wasn't
>> relevant
>> to whatever my main task was, and therefore didn't strictly need to go
>> wasting even *more* time trying to figure out what the hell the variable
>> actually did. So I ended up in the habit of just renaming those variables
>> to
>> things like:
>>
>> bbb
>> ->
>> thisVariableNeedsAMuchMoreMeaningfulNameThan_bbb
>
> Did you actually type this yourself, Nick, or do you have a secret prototype of camel-case automaton, based on an English language lexing DFA?
>

With all the coding I do, holding 'shift' between words is almost as natural to me as hitting 'space' between words.

An automated English -> camel-case tool wouldn't need anything fancy, though. Just toUpper() the first character after each space and then remove the spaces.

I may be missing what you meant, though.