Basic question about size_t and ulong


March 18, 2022

Posted by WhatMeWorry

Quoting the D documentation:

> size_t is an alias to one of the unsigned integral basic types, and represents a type that is large enough to represent an offset into all addressable memory.

And I have a line of code:

size_t huge = ulong.max;

dmd GC.d
GC.d(29): Error: cannot implicitly convert expression 18446744073709551615LU of type ulong to uint

Isn't ulong an integer? And aren't memory addresses 64 bits long?

size_t huge = uint.max; // compiles

works, but now I'm just curious. I was just trying to see what the largest dynamic array I could create is.

March 18, 2022

Re: Basic question about size_t and ulong

Posted by Adam Ruppe
in reply to WhatMeWorry

On Friday, 18 March 2022 at 21:54:55 UTC, WhatMeWorry wrote:

> Isn't ulong an integer? And aren't memory addresses 64 bits long?

Only if you are doing a 64-bit build. Try using -m64.
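
A minimal sketch of what the build flag changes, assuming the file is compiled once with `dmd -m32` and once with `dmd -m64`. The name `pointerBits` is made up for illustration. Using `size_t.max` instead of `ulong.max` lets the original line compile on either build.

```d
import std.stdio : writeln;

// Hypothetical constant: pointer width of the current build, in bits.
enum pointerBits = size_t.sizeof * 8;

void main()
{
    // size_t is only an alias, so it can be compared with the basic types.
    static if (is(size_t == ulong))
        writeln("size_t is ulong (", pointerBits, "-bit build)");
    else static if (is(size_t == uint))
        writeln("size_t is uint (", pointerBits, "-bit build)");

    // size_t.max always fits in a size_t, whichever build this is.
    size_t huge = size_t.max;
    writeln("largest size_t: ", huge);
}
```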

March 18, 2022

Re: Basic question about size_t and ulong

Posted by Ali Çehreli
in reply to WhatMeWorry

On 3/18/22 14:54, WhatMeWorry wrote:

> size_t is an alias to one of the unsigned integral basic types, and
> represents a type that is large enough to represent an offset into all
> addressable memory.

In practice, that general description means "size_t is either ulong or uint" depending on your platform (or build e.g. -m32 as Adam said).

> size_t huge = uint.max;  // compiles

That means size_t is uint on that build.

Ali

P.S. On a related note, I used to make the mistake of using size_t for file offsets as well. That is a mistake because even on a 32-bit system (or build), file sizes can be larger than uint.max. So, the correct type is long for seek() so that we can seek() to an earlier place and ulong for tell().
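
A small sketch of that P.S., assuming Phobos' `std.stdio.File`, whose `seek` takes a `long` offset and whose `tell` returns a `ulong`. The helper name `offsetFromEnd` is hypothetical, and a temporary file stands in for a real one.

```d
import std.stdio : File, writeln;
import core.stdc.stdio : SEEK_END;

// Hypothetical helper: move `back` bytes before the end of the file
// and report the resulting absolute position.
ulong offsetFromEnd(ref File f, long back)
{
    f.seek(-back, SEEK_END); // the offset is a signed long, so it can be negative
    return f.tell;           // the position is a ulong, even on a 32-bit build
}

void main()
{
    auto f = File.tmpfile();
    f.write("0123456789");   // ten bytes
    writeln(offsetFromEnd(f, 3));
}
```

Neither type depends on size_t, so the same code handles files larger than uint.max on a 32-bit build.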

March 21, 2022

Re: Basic question about size_t and ulong

Posted by Steven Schveighoffer
in reply to Ali Çehreli

On 3/18/22 7:01 PM, Ali Çehreli wrote:

> On 3/18/22 14:54, WhatMeWorry wrote:
>
> > size_t huge = uint.max; // compiles
>
> That means size_t is uint on that build.

Not that Ali is wrong in the full sense, but this line alone will compile on both 64 and 32-bit systems, so it is not informative. However, the fact that assigning it to ulong.max doesn't work coupled with this means that it's 32-bit.

You can use code like this to tell you what your platform bits are:

pragma(msg, cast(int)(size_t.sizeof * 8), " bit");

-Steve

March 22, 2022

Re: Basic question about size_t and ulong

Posted by Era Scarecrow
in reply to Ali Çehreli

On Friday, 18 March 2022 at 23:01:05 UTC, Ali Çehreli wrote:
> P.S. On a related note, I used to make the mistake of using size_t for file offsets as well. That is a mistake because even on a 32-bit system (or build), file sizes can be larger than uint.max. So, the correct type is long for seek() so that we can seek() to an earlier place and ulong for tell().

 Perhaps we should back up and ask a different question. I've been working on an adaptation of Reed-Solomon codes, and i keep getting tripped up by casting errors, to the point where i just want to make everything size_t to make the errors go away.

 So when should you use size_t? Is it better to use int, long, size_t? Or is it better to try to use the smallest type you need that will fulfill the function's needs and just add code to handle issues due to downcasting?

March 22, 2022

Re: Basic question about size_t and ulong

Posted by Ali Çehreli
in reply to Era Scarecrow

On 3/22/22 11:28, Era Scarecrow wrote:

>   So when should you use size_t?

I use size_t for anything that is related to count, index, etc. However, this is a contested topic because size_t is unsigned. As soon as you use it in an expression, the whole expression becomes unsigned as well. (Related: Usual Arithmetic Conversions at the link below.)

For that reason, during at least one "ask us anything" session at a C++ conference where Andrei was on the panel, Herb Sutter and others agreed that choosing an unsigned type for size_t was a mistake.

So far, I haven't had much trouble from that decision. I am always careful when subtracting two size_ts.
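
A minimal sketch of why that subtraction needs care. The helper `signedDiff` is hypothetical. It casts to `ptrdiff_t`, the signed counterpart of size_t, which assumes both values fit in the signed range.

```d
import std.stdio : writeln;

// Hypothetical helper: difference of two sizes as a signed value.
// Assumes both arguments are no larger than ptrdiff_t.max.
ptrdiff_t signedDiff(size_t a, size_t b)
{
    return cast(ptrdiff_t) a - cast(ptrdiff_t) b;
}

void main()
{
    size_t a = 2, b = 5;

    // The expression stays unsigned, so the "negative" result wraps around.
    auto wrapped = a - b;
    static assert(is(typeof(wrapped) == size_t));
    assert(wrapped == size_t.max - 2);

    writeln(signedDiff(a, b));
}
```

Comparing the operands before subtracting (`a >= b`) is another way to avoid the wraparound without any cast.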

> Is it better to use int, long, size_t?

D uses size_t for automatic indexes during foreach, and as I said, it makes sense to me.
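
As a small illustration of the foreach point, the sketch below checks the index type at compile time. `indexOfFirst` is a made-up helper, not a Phobos function.

```d
// Hypothetical helper: position of the first element equal to `needle`,
// or arr.length if there is none.
size_t indexOfFirst(const int[] arr, int needle)
{
    foreach (i, x; arr)
    {
        // The automatic index of a foreach over an array is a size_t.
        static assert(is(typeof(i) == size_t));
        if (x == needle)
            return i;
    }
    return arr.length;
}

void main()
{
    assert(indexOfFirst([10, 20, 30], 20) == 1);
    assert(indexOfFirst([10, 20, 30], 99) == 3);
}
```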

Otherwise, I think the go-to type should be int for small values. long, if we know it won't fit in an int.

> Or is it better to try to use the smallest type you need that will
> fulfill the function's needs and just add to handle issues due to
> downcasting?

That may be annoying, misleading, or error-prone because smaller types are converted at least to int in expressions anyway:

  https://dlang.org/spec/type.html#integer-promotions

(Every D programmer should know the whole section 6.4 there.)
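
A short sketch of those promotions in practice. `addBytes` is a hypothetical helper; the point is that byte + byte already has type int, so storing the result back into a byte needs an explicit narrowing step.

```d
// Hypothetical helper: add two bytes and narrow the result explicitly.
byte addBytes(byte a, byte b)
{
    // Both operands are promoted, so the sum is an int.
    static assert(is(typeof(a + b) == int));

    // return a + b;          // would not compile: int does not implicitly convert to byte
    return cast(byte)(a + b); // explicit narrowing; silently wraps when out of range
}

void main()
{
    assert(addBytes(100, 27) == 127);  // fits in a byte
    assert(addBytes(100, 100) == -56); // 200 wraps around to -56
}
```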

But yeah, if your function works on a byte, sure, it should take a byte.

Expect wild disagreements on this whole topic. :)

Ali

March 22, 2022

Re: Basic question about size_t and ulong

Posted by Era Scarecrow
in reply to Ali Çehreli

On Tuesday, 22 March 2022 at 18:47:19 UTC, Ali Çehreli wrote:

> On 3/22/22 11:28, Era Scarecrow wrote:
>
> > So when should you use size_t?
>
> I use size_t for anything that is related to count, index, etc. However, this is a contested topic because size_t is unsigned.

I don't see a problem with that. It's not like you can access address -300 or index -300 (although technically you could if you wrote your own index function). I'm actually surprised signed is the default rather than unsigned. Were negative numbers really that important in 16bit MS-DOS that C had to have signed as the default?

This question is probably going off topic, but it would still be interesting to know if there's an answer.

> > Is it better to use int, long, size_t?
>
> D uses size_t for automatic indexes during foreach, and as I said, it makes sense to me.
>
> Otherwise, I think the go-to type should be int for small values. long, if we know it won't fit in an int.

Mhmm. More or less this is what i would think. I'm just getting sick of returning numbers that i then have to feed into indexes, only for the compiler to complain they're too big, or of putting what i know will be a small number into an array and having it rejected because the array's element type is too small. Casting or using masks may resolve the issue, but it may crop up again when i make a change or try to compile on a different architecture.

These days i do most of my work on a 64bit laptop, but sometimes i run the 32bit dmd version on Windows on a different computer, and i check ldc/gdc against dmd to see whether the code complains. I see more and more why different versions of compilers/OSes are a pain in the ass.

I'd almost wish D had a more lenient mode and would do automatic down-casting, then complain if it would have failed to downcast data at runtime.

> > Or is it better to try to use the smallest type you need that will fulfill the function's needs and just add to handle issues due to downcasting?
>
> That may be annoying, misleading, or error-prone because smaller types are converted at least to int in expressions anyway:

Yeah, and i remember reading about optimization in GCC where using smaller types can actually be slower, much like how on some architectures non-aligned memory offsets carry a speed penalty.

> But yeah, if your function works on a byte, sure, it should take a byte.
>
> Expect wild disagreements on this whole topic. :)

Though internally it may be an int...

March 22, 2022

Re: Basic question about size_t and ulong

Posted by H. S. Teoh
in reply to Era Scarecrow

On Tue, Mar 22, 2022 at 09:11:00PM +0000, Era Scarecrow via Digitalmars-d-learn wrote: [...]
> I'd almost wish D had a more lenient mode and would do automatic down-casting, then complain if it *would* have failed to downcast data at runtime.
[...]

We already have this:

	import std.conv : to;
	int x;
	long y;
	y = x.to!long;	// equivalent to straight assignment / cast
	x = y.to!int;	// throws if out of range for int
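
A runnable sketch of the difference between the checked conversion and a plain cast. The wrapper `checkedNarrow` is hypothetical; `std.conv.to` and `ConvOverflowException` are the Phobos names.

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

// Hypothetical wrapper: narrow a long to an int, throwing if it does not fit.
int checkedNarrow(long v)
{
    return v.to!int;
}

void main()
{
    assert(checkedNarrow(42) == 42);

    long big = int.max + 1L;

    // The checked conversion refuses the out-of-range value at run time...
    assertThrown!ConvOverflowException(checkedNarrow(big));

    // ...while a plain cast silently keeps only the low 32 bits.
    assert(cast(int) big == int.min);
}
```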


T

-- 
Gone Chopin. Bach in a minuet.
I see that you JS got Bach.

March 23, 2022

Re: Basic question about size_t and ulong

Posted by Era Scarecrow
in reply to H. S. Teoh

On Tuesday, 22 March 2022 at 21:23:43 UTC, H. S. Teoh wrote:

> On Tue, Mar 22, 2022 at 09:11 PM, Era Scarecrow wrote:
>
> > [...]
> > I'd almost wish D had a more lenient mode and would do automatic down-casting, then complain if it would have failed to downcast data at runtime.
> > [...]
>
> We already have this:
>
>     import std.conv : to;
>     int x;
>     long y;
>     y = x.to!long; // equivalent to straight assignment / cast
>     x = y.to!int;  // throws if out of range for int

At which point I might as well just do cast(int) on everything regardless BECAUSE the point of it is NOT having to add a bunch of conversions or extra bits to it.

This particular usage can be useful, just not in the automatic sense i was meaning.

March 23, 2022

Re: Basic question about size_t and ulong

Posted by Era Scarecrow
in reply to Era Scarecrow

On Wednesday, 23 March 2022 at 00:51:42 UTC, Era Scarecrow wrote:

> On Tuesday, 22 March 2022 at 21:23:43 UTC, H. S. Teoh wrote:
>
> > We already have this:
> >
> >     import std.conv : to;
> >     int x;
> >     long y;
> >     y = x.to!long; // equivalent to straight assignment / cast
> >     x = y.to!int;  // throws if out of range for int
>
> This particular usage can be useful, just not in the automatic sense i was meaning.

Forgot to add this: for the more automatic mode, maybe add a new tag, say @autodowncast, which would insert the .to!passingtype conversion for you, keeping said checks without needing to throw casts in a dozen places.

Though i doubt Walter or Andrei would go for it.
