February 16, 2011
Don wrote:
> [1] What was size_t on the 286 ?

16 bits

> Note that in the small memory model (all pointers 16 bits) it really was possible to have an object of size 0xFFFF_FFFF, because the code was in a different address space.

Not really. I think the 286 had a hard limit of 16 MB.

There was a so-called "huge" memory model which attempted (badly) to fake a linear address space across the segmented model. It never worked very well (such as having wacky problems when an object straddled a segment boundary), and applications built with it sucked in the performance dept. I never supported it for that reason.

A lot of the effort in 16 bit programming went to breaking up data structures so that no individual part of them spanned more than 64K.
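
A minimal sketch of the idea (in D with made-up names; the real thing would have been C with far pointers): a large logical array kept as separately allocated blocks of at most 64K each, with the index split into a block number and an offset.

// A large logical array stored as blocks of at most 64K each, so no
// single allocation crosses a 16-bit segment boundary.
enum blockInts = 64 * 1024 / int.sizeof;   // ints per 64K block

struct HugeArray
{
    int[][] blocks;

    this(size_t length)
    {
        foreach (b; 0 .. (length + blockInts - 1) / blockInts)
        {
            auto n = (b + 1) * blockInts <= length ? blockInts
                                                   : length % blockInts;
            blocks ~= new int[n];
        }
    }

    // Split the logical index into (block, offset) -- on a real 286 these
    // would become the segment and offset halves of a far pointer.
    ref int opIndex(size_t i)
    {
        return blocks[i / blockInts][i % blockInts];
    }
}

The extra work on every access hints at why the huge model, which had to normalize every pointer behind your back, was slow.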
February 16, 2011
Walter Bright wrote:
> Don wrote:
>> [1] What was size_t on the 286 ?

> 
> 16 bits
> 
>> Note that in the small memory model (all pointers 16 bits) it really was possible to have an object of size 0xFFFF_FFFF, because the code was in a different address space.
> 
> Not really. I think the 286 had a hard limit of 16 MB.

I mean, you can have a 16 bit code pointer, and a 16 bit data pointer. So, you can conceivably have a 64K data item, using the full range of size_t.
That isn't possible in a modern, linear address space, because the code has to go somewhere...

> 
> There was a so-called "huge" memory model which attempted (badly) to fake a linear address space across the segmented model. It never worked very well (such as having wacky problems when an object straddled a segment boundary), and applications built with it sucked in the performance dept. I never supported it for that reason.
> 
> A lot of the effort in 16 bit programming went to breaking up data structures so that no individual part of them spanned more than 64K.

Yuck.
I just caught the very last of that era. I wrote a couple of 16-bit DLLs. From memory, you couldn't assume the stack was in the data segment, and you got horrific memory corruption if you did.
I've got no nostalgia for those days...

February 17, 2011
Don wrote:
> Walter Bright wrote:
>> Don wrote:
>>> [1] What was size_t on the 286 ?
> 
>>
>> 16 bits
>>
>>> Note that in the small memory model (all pointers 16 bits) it really was possible to have an object of size 0xFFFF_FFFF, because the code was in a different address space.
>>
>> Not really. I think the 286 had a hard limit of 16 MB.
> 
> I mean, you can have a 16 bit code pointer, and a 16 bit data pointer. So, you can conceivably have a 64K data item, using the full range of size_t.
> That isn't possible in a modern, linear address space, because the code has to go somewhere...

Actually, you can have a segmented model on a 32 bit machine rather than a flat model, with separate segments for code, data, and stack. The Digital Mars DOS Extender actually does this. The advantage is that you cannot execute data on the stack.

>> There was a so-called "huge" memory model which attempted (badly) to fake a linear address space across the segmented model. It never worked very well (such as having wacky problems when an object straddled a segment boundary), and applications built with it sucked in the performance dept. I never supported it for that reason.
>>
>> A lot of the effort in 16 bit programming went to breaking up data structures so that no individual part of them spanned more than 64K.
> 
> Yuck.
> I just caught the very last of that era. I wrote a couple of 16-bit DLLs. From memory, you couldn't assume the stack was in the data segment, and you got horrific memory corruption if you did.
> I've got no nostalgia for those days...

I rather enjoyed it, and the pay was good <g>.
February 17, 2011
This whole conversation makes me feel like The Naive Noob for complaining about how much 32-bit address space limitations suck and how much we need 64-bit support.

On 2/16/2011 8:52 PM, Walter Bright wrote:
> Don wrote:
>> Walter Bright wrote:
>>> Don wrote:
>>>> [1] What was size_t on the 286 ?
>>
>>>
>>> 16 bits
>>>
>>>> Note that in the small memory model (all pointers 16 bits) it really
>>>> was possible to have an object of size 0xFFFF_FFFF, because the code
>>>> was in a different address space.
>>>
>>> Not really. I think the 286 had a hard limit of 16 MB.
>>
>> I mean, you can have a 16 bit code pointer, and a 16 bit data pointer.
>> So, you can conceivably have a 64K data item, using the full range of
>> size_t.
>> That isn't possible in a modern, linear address space, because the
>> code has to go somewhere...
>
> Actually, you can have a segmented model on a 32 bit machine rather than
> a flat model, with separate segments for code, data, and stack. The
> Digital Mars DOS Extender actually does this. The advantage is that you
> cannot execute data on the stack.
>
>>> There was a so-called "huge" memory model which attempted (badly) to
>>> fake a linear address space across the segmented model. It never
>>> worked very well (such as having wacky problems when an object
>>> straddled a segment boundary), and applications built with it sucked
>>> in the performance dept. I never supported it for that reason.
>>>
>>> A lot of the effort in 16 bit programming went to breaking up data
>>> structures so that no individual part of them spanned more than 64K.
>>
>> Yuck.
>> I just caught the very last of that era. I wrote a couple of 16-bit
>> DLLs. From memory, you couldn't assume the stack was in the data
>> segment, and you got horrific memory corruption if you did.
>> I've got no nostalgia for those days...
>
> I rather enjoyed it, and the pay was good <g>.

February 17, 2011
== Quote from spir (denis.spir@gmail.com)'s article
> On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
> > On Tuesday, February 15, 2011 15:13:33 spir wrote:
> >> On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
> >>> Is there some low level reason why size_t should be signed or something I'm completely missing?
> >>
> >> My personal issue with unsigned ints in general as implemented in C-like
> >> languages is that the range of non-negative signed integers is half of the
> >> range of corresponding unsigned integers (for same size).
> >> * practically: known issues, and bugs if not checked by the language
> >> * conceptually: contradicts the "obvious" idea that unsigned (aka naturals)
> >> is a subset of signed (aka integers)
> >
> > It's inevitable in any systems language. What are you going to do, throw away a bit for unsigned integers? That's not acceptable for a systems language. On some level, you must live with the fact that you're running code on a specific machine with a specific set of constraints. Trying to do otherwise will pretty much always harm efficiency. True, there are common bugs that might be better prevented, but part of it ultimately comes down to the programmer having some clue as to what they're doing. On some level, we want to prevent common bugs, but the programmer can't have their hand held all the time either.
> I cannot prove it, but I really think you're wrong on that.
> First, the question of 1 bit. Think about this -- speaking of 64 bit size:
> * 99.999% of all uses of unsigned fit under 2^63
> * To benefit from the last bit, you must need to store a value 2^63 <=
> v < 2^64
> * Not only this, you must hit a case where /any/ possible value for v
> (depending on execution data) could be >= 2^63, but /all/ possible values for v
> are guaranteed < 2^64
> This can only be a very small fraction of the cases where your value does not fit
> in 63 bits, don't you think? Has it ever happened to you (even in 32 bits)?
> Something like: "what luck! this value would not (always) fit in 31 bits, but
> (due to this constraint), I can be sure it will fit in 32 bits (always,
> whatever input data it depends on)."
> In fact, n bits do the job because (1) nearly all unsigned values are very
> small, and (2) the size used at a time covers the memory range at the same time.
> As for efficiency, if unsigned is not a subset of signed, then at a low level you
> may be forced to add checks in numerous utility routines, the kind constantly
> used, everywhere one type may play with the other. I'm not sure where the gain is.
> As for correctness, intuitively I guess (just a wild guess indeed) that if unsigned
> values form a subset of signed ones, programmers will more easily reason
> correctly about them.
> Now, I perfectly understand the "sacrifice" of one bit sounds like a sacrilege ;-)
> (*)
> Denis
> (*) But you know, when as a young guy you have coded for 8 & 16-bit machines,
> having 63 or 64...

If you write low level code, it happens all the time.  For example, you can copy memory areas quickly on some machines by treating them as arrays of "long" and copying the values -- which requires the upper bit to be preserved.

Or you compute a 64 bit hash value using an algorithm that is part of some standard protocol.  Oops -- it requires an unsigned 64 bit number; the signed version would produce the wrong result.  And since the standard expects normally behaving int64s, you are stuck -- you'd have to write a little class to simulate unsigned 64 bit math.  E.g. a library that computes MD5 sums.
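
To make that concrete, here is a minimal sketch (not tied to any particular protocol, using the published 64-bit FNV-1a constants): the algorithm is specified on unsigned 64-bit values with a wrapping multiply. XOR and a wrapping multiply happen to be bit-identical in two's complement, so with a signed type the trouble starts the moment you compare, divide, shift right, or print the result.

import std.stdio : writefln;
import std.string : representation;

// 64-bit FNV-1a: specified on unsigned 64-bit arithmetic (mod 2^64).
ulong fnv1a64(const(ubyte)[] data)
{
    ulong h = 0xcbf2_9ce4_8422_2325;   // offset basis
    foreach (b; data)
    {
        h ^= b;
        h *= 0x100_0000_01b3;          // FNV prime; the multiply must wrap
    }
    return h;
}

void main()
{
    writefln("%016x", fnv1a64("hello".representation));

    // Where signedness actually bites: same bits, different answers.
    ulong u = 0x8000_0000_0000_0000;
    long  s = cast(long) u;
    assert(u / 2 ==  0x4000_0000_0000_0000);   // unsigned: 2^62
    assert(s / 2 == -0x4000_0000_0000_0000);   // signed: -2^62
}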

Not to mention all the code that uses 64 bit numbers as bit fields where the different bits or sets of bits are really subfields of the total range of values.
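
A sketch of that last case, with a made-up field layout purely for illustration -- subfields packed into one ulong with shifts and masks. With a signed 64-bit type, the arithmetic right shift would sign-extend the top field's high bit into everything extracted below it.

// Hypothetical layout: 8-bit opcode | 8-bit flags | 48-bit address.
ulong pack(ulong opcode, ulong flags, ulong addr)
{
    return (opcode << 56) | (flags << 48) | (addr & ((1UL << 48) - 1));
}

ulong opcodeOf(ulong w) { return w >> 56; }    // unsigned >>: zero-fill

unittest
{
    auto w = pack(0x9A, 0x01, 0x7FFF_FFFF);
    assert(opcodeOf(w) == 0x9A);               // clean extraction
    assert((cast(long) w) >> 56 == -0x66);     // signed >>: 0x9A sign-extends
    // D and Java both offer >>> to paper over this, at the cost of
    // masks sprinkled through every extraction.
}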

What you are saying is true of high level code that models real life -- if the value is someone's salary or the number of toasters they are buying from a store, you are probably fine -- but a lot of low level software (IPv4 stacks, video encoders, databases, etc.) is based on designs that require numbers to behave a certain way, and losing a bit is going to be a pain.

I've run into this with Java, which lacks unsigned types, and once you run into a case that needs that extra bit it gets annoying right quick.

Kevin
February 17, 2011
On 17.02.2011 05:19, Kevin Bealer wrote:
> == Quote from spir (denis.spir@gmail.com)'s article
>> On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
>>> On Tuesday, February 15, 2011 15:13:33 spir wrote:
>>>> On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
>>>>> Is there some low level reason why size_t should be signed or something I'm completely missing?
>>>>
>>>> My personal issue with unsigned ints in general as implemented in C-like
>>>> languages is that the range of non-negative signed integers is half of the
>>>> range of corresponding unsigned integers (for same size).
>>>> * practically: known issues, and bugs if not checked by the language
>>>> * conceptually: contradicts the "obvious" idea that unsigned (aka naturals)
>>>> is a subset of signed (aka integers)
>>>
>>> It's inevitable in any systems language. What are you going to do, throw away a bit for unsigned integers? That's not acceptable for a systems language. On some level, you must live with the fact that you're running code on a specific machine with a specific set of constraints. Trying to do otherwise will pretty much always harm efficiency. True, there are common bugs that might be better prevented, but part of it ultimately comes down to the programmer having some clue as to what they're doing. On some level, we want to prevent common bugs, but the programmer can't have their hand held all the time either.
>> I cannot prove it, but I really think you're wrong on that.
>> First, the question of 1 bit. Think about this -- speaking of 64 bit size:
>> * 99.999% of all uses of unsigned fit under 2^63
>> * To benefit from the last bit, you must need to store a value 2^63 <=
>> v < 2^64
>> * Not only this, you must hit a case where /any/ possible value for v
>> (depending on execution data) could be >= 2^63, but /all/ possible values for v
>> are guaranteed < 2^64
>> This can only be a very small fraction of the cases where your value does not fit
>> in 63 bits, don't you think? Has it ever happened to you (even in 32 bits)?
>> Something like: "what luck! this value would not (always) fit in 31 bits, but
>> (due to this constraint), I can be sure it will fit in 32 bits (always,
>> whatever input data it depends on)."
>> In fact, n bits do the job because (1) nearly all unsigned values are very
>> small, and (2) the size used at a time covers the memory range at the same time.
>> As for efficiency, if unsigned is not a subset of signed, then at a low level you
>> may be forced to add checks in numerous utility routines, the kind constantly
>> used, everywhere one type may play with the other. I'm not sure where the gain is.
>> As for correctness, intuitively I guess (just a wild guess indeed) that if unsigned
>> values form a subset of signed ones, programmers will more easily reason
>> correctly about them.
>> Now, I perfectly understand the "sacrifice" of one bit sounds like a sacrilege ;-)
>> (*)
>> Denis
>> (*) But you know, when as a young guy you have coded for 8 & 16-bit machines,
>> having 63 or 64...
> 
> If you write low level code, it happens all the time.  For example, you can copy memory areas quickly on some machines by treating them as arrays of "long" and copying the values -- which requires the upper bit to be preserved.
> 
> Or you compute a 64 bit hash value using an algorithm that is part of some standard protocol.  Oops -- it requires an unsigned 64 bit number; the signed version would produce the wrong result.  And since the standard expects normally behaving int64s, you are stuck -- you'd have to write a little class to simulate unsigned 64 bit math.  E.g. a library that computes MD5 sums.
> 
> Not to mention all the code that uses 64 bit numbers as bit fields where the different bits or sets of bits are really subfields of the total range of values.
> 
> What you are saying is true of high level code that models real life -- if the value is someone's salary or the number of toasters they are buying from a store, you are probably fine -- but a lot of low level software (IPv4 stacks, video encoders, databases, etc.) is based on designs that require numbers to behave a certain way, and losing a bit is going to be a pain.
> 
> I've run into this with Java, which lacks unsigned types, and once you run into a case that needs that extra bit it gets annoying right quick.
> 
> Kevin

It was not proposed to alter ulong (int64), only a size_t equivalent. ;)
And I agree that not having unsigned types (like in Java) just sucks.
Wasn't Java even advertised as a programming language for network stuff? Quite
ridiculous without unsigned types...
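
A tiny sketch of the network case (made-up two-byte field): pulling a big-endian 16-bit port out of a raw packet. With D's ubyte it's direct; with Java's signed bytes, any byte >= 0x80 sign-extends, so every use needs an "& 0xFF" mask.

// Read a big-endian 16-bit port number out of raw packet bytes.
ushort portAt(const(ubyte)[] packet, size_t offset)
{
    return cast(ushort)((packet[offset] << 8) | packet[offset + 1]);
}

unittest
{
    ubyte[] packet = [0x01, 0xBB];     // 443 in network byte order
    assert(portAt(packet, 0) == 443);
    // With a signed byte type, 0xBB reads back as -69 and sign-extends
    // to 0xFFFFFFBB, corrupting the OR unless masked with 0xFF first.
}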

Cheers,
- Daniel
February 17, 2011
"KennyTM~" <kennytm@gmail.com> wrote in message news:ijghne$ts1$1@digitalmars.com...
> On Feb 16, 11 11:49, Michel Fortin wrote:
>> On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a@a.a> said:
>>
>>> I like "nint".
>>
>> But is it unsigned or signed? Do we need 'unint' too?
>>
>> I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that 'size_t' feels out of place in D code because of its name.
>>
>>
>
> 'word' may be confusing to Windows programmers because in WinAPI a 'WORD' means an unsigned 16-bit integer (aka 'ushort').
>
> http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx

That's just a legacy issue from when Windows was mainly on 16-bit machines. "Word" means native size.


February 17, 2011
On 17.02.2011 9:09, Nick Sabalausky wrote:
> "KennyTM~"<kennytm@gmail.com>  wrote in message
> news:ijghne$ts1$1@digitalmars.com...
>> On Feb 16, 11 11:49, Michel Fortin wrote:
>>> On 2011-02-15 22:41:32 -0500, "Nick Sabalausky"<a@a.a>  said:
>>>
>>>> I like "nint".
>>> But is it unsigned or signed? Do we need 'unint' too?
>>>
>>> I think 'word' & 'uword' would be a better choice. I can't say I'm too
>>> displeased with 'size_t', but it's true that 'size_t' feels out of
>>> place in D code because of its name.
>>>
>>>
>> 'word' may be confusing to Windows programmers because in WinAPI a 'WORD'
>> means an unsigned 16-bit integer (aka 'ushort').
>>
>> http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx
> That's just a legacy issue from when Windows was mainly on 16-bit machines.
> "Word" means native size.
>
Tell that to the Intel guys; their assembler syntax (read: most x86 assemblers) uses the size prefixes word (2 bytes!), dword (4 bytes), qword (8 bytes), etc.
And if only it were just an assembler syntax issue...

-- 
Dmitry Olshansky

February 17, 2011
On Wed, 16 Feb 2011 06:49:26 +0300, Michel Fortin <michel.fortin@michelf.com> wrote:

> On 2011-02-15 22:41:32 -0500, "Nick Sabalausky" <a@a.a> said:
>
>> I like "nint".
>
> But is it unsigned or signed? Do we need 'unint' too?
>
> I think 'word' & 'uword' would be a better choice. I can't say I'm too displeased with 'size_t', but it's true that 'size_t' feels out of place in D code because of its name.
>
>

I second that. word/uword are shorter than ssize_t/size_t and more in line with other type names.

I like it.
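
For what it's worth, nothing about the renaming needs compiler support; a minimal sketch (module placement left open) is just a pair of aliases:

// word/uword as proposed: pointer-sized signed and unsigned integers.
alias ptrdiff_t word;    // signed
alias size_t    uword;   // unsigned

// Both match the platform's pointer width by definition.
static assert(word.sizeof  == (void*).sizeof);
static assert(uword.sizeof == (void*).sizeof);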
February 17, 2011
On 2/17/11 8:56 AM, Denis Koroskin wrote:
> I second that. word/uword are shorter than ssize_t/size_t and more in
> line with other type names.
>
> I like it.

I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them with fire, but when I read about »word«, I intuitively associated it with »two bytes« first – blame Intel or whoever else, but the potential for confusion is definitely not negligible.

David