Make size_t its own type

Jan 23

Quirin Schroll

Jan 28

Dukc

Jan 30

IchorDev

Feb 02

Walter Bright

January 23

Posted by Quirin Schroll

The title says it: Make size_t be a type that’s distinguished from all currently existing unsigned types.

The rationale is that conversions to and from size_t succeed or fail depending on the compile target. On a 32-bit target, size_t is an alias of uint, and on a 64-bit target it’s an alias of ulong (I know it’s not defined that way, but it ends up being that). That means size_t implicitly converts to uint on 32-bit and ulong implicitly converts to size_t on 64-bit, but flip the architectures and the conversions fail. This makes writing portable code needlessly hard.
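The target dependence described here can be observed with D's `is(From : To)` test for implicit convertibility. This is a minimal sketch, assuming a target where size_t is either 32 or 64 bits wide; the names `sizeToUint` and `ulongToSize` are made up for illustration.

```d
import std.stdio : writeln;

// `is(From : To)` is true when From implicitly converts to To.
enum bool sizeToUint  = is(size_t : uint);   // holds only where size_t is 32 bits
enum bool ulongToSize = is(ulong : size_t);  // holds only where size_t is 64 bits

// Assumption: size_t is 4 or 8 bytes wide on the target.
static assert(sizeToUint  == (size_t.sizeof == 4));
static assert(ulongToSize == (size_t.sizeof == 8));

// These two directions hold on both 32- and 64-bit targets.
static assert(is(uint : size_t));
static assert(is(size_t : ulong));

void main()
{
    writeln("size_t is ", size_t.sizeof * 8, " bits wide");
    writeln("size_t -> uint implicit:   ", sizeToUint);
    writeln("ulong  -> size_t implicit: ", ulongToSize);
}
```

Compiling the same file for a 32-bit and a 64-bit target should flip the last two printed values, which is exactly the inconsistency the post complains about.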

If size_t were its own type, the language could reject any implicit conversion that would fail on any target supported by the compiler / the language.

There are already unsigned and signed types and character types. Speaking of x86 assembly, there is no difference between uint, int, and dchar, i.e. they all use the same 32-bit registers; that’s contrary to float which does have dedicated floating point registers. Yet, most languages distinguish uint, int, and dchar. They are conceptually different enough to warrant being different types. IMO, that applies to size_t even more because it isn’t even consistent across targets.

AFAIK, D only supports 32- and 64-bit targets. So size_t could be initialized by a uint (and anything that implicitly converts to uint), and implicitly converts to ulong.
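One way to read the rule above is "accept a conversion only if it would be accepted under both possible definitions of size_t". The sketch below models that reading with two hypothetical helper templates, `portablyToSizeT` and `portablyFromSizeT`; neither exists in druntime or Phobos, and the proposal itself would be a compiler change, not a library one.

```d
// Hypothetical helpers for illustration only.
// A type may portably initialize size_t if it converts to size_t under
// both candidate definitions (uint on 32-bit, ulong on 64-bit).
enum bool portablyToSizeT(T)   = is(T : uint) && is(T : ulong);
// size_t may portably convert to T if both uint and ulong would.
enum bool portablyFromSizeT(T) = is(uint : T) && is(ulong : T);

static assert( portablyToSizeT!uint);
static assert( portablyToSizeT!ushort);
static assert(!portablyToSizeT!ulong);   // would fail where size_t is uint

static assert( portablyFromSizeT!ulong);
static assert(!portablyFromSizeT!uint);  // would fail where size_t is ulong

void main() {}
```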

What could be added is allowing the target-specific implicit conversions (e.g. size_t to uint or ulong to size_t) if the code is under a version statement that the compiler knows determines the width of size_t.
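As a rough illustration of the version-statement idea: the sketch below assumes `D_LP64` (the predefined identifier for 64-bit pointers) is the kind of version the compiler could treat as fixing the width of size_t. The post does not name a specific identifier, and the function `toSize` is hypothetical.

```d
// Assumption: D_LP64 stands in for "a version that determines size_t's width".
version (D_LP64)
{
    // size_t is 64 bits wide in this branch, so ulong -> size_t loses nothing
    // and the implicit conversion could stay legal under the proposal.
    size_t toSize(ulong x) { return x; }
}
else
{
    // size_t is narrower than ulong here: check the range, then cast explicitly.
    size_t toSize(ulong x)
    {
        assert(x <= size_t.max, "value does not fit in size_t on this target");
        return cast(size_t) x;
    }
}

void main()
{
    assert(toSize(5) == 5);
}
```

With today's compilers this already builds on either target, because only one branch is compiled; the proposal would keep the first branch legal even with size_t as a distinct type.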

My experience comes from working on a C++ project where we still have to support 32-bit Windows and had to support 32-bit Linux up to last year. We have many warnings on and treat all of them as errors across 5 compilers, including loss of precision on implicit casts. D could make it easier for its users not to run into these all the time.

On Thursday, 23 January 2025 at 16:00:20 UTC, Quirin Schroll wrote:

If size_t were its own type, the language could reject any implicit conversion that would fail on any target supported by the compiler / the language.

In general, having to cross-compile your code to make sure that it works sucks and is something I would expect from a non-portability-friendly language like C. Making size_t its own type would greatly benefit code portability for people who aren’t actively testing their code on 32-bit platforms (personally I don’t own any 32-bit computers, so I can’t test for issues like that). I wholly agree that the same logic should be applied to ptrdiff_t, as dukc said.
Technically this change would cause a lot of existing code to no longer compile, but fixing it would be as simple as adding explicit casts where necessary; and that code would already fail to compile for either 32- or 64-bit machines anyway.
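For the "add explicit casts" fix, a plain `cast(uint)` compiles everywhere but truncates silently on 64-bit. A checked alternative (swapped in here, not something the post prescribes) is `std.conv.to`, which throws `ConvOverflowException` when the value does not fit. A minimal sketch, with `lengthAsUint` as a hypothetical helper:

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

// Hypothetical helper: narrow an array length to uint on any target.
uint lengthAsUint(const(int)[] data)
{
    // `cast(uint) data.length` would also compile, but drops high bits silently.
    // `to!uint` checks the range at run time instead.
    return data.length.to!uint;
}

void main()
{
    assert(lengthAsUint([1, 2, 3]) == 3);

    // A value outside uint's range is reported rather than truncated.
    ulong tooBig = ulong.max;
    assertThrown!ConvOverflowException(tooBig.to!uint);
}
```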

I share your pain working with diverse C compilers with their diverse ideas of what warnings should be emitted, and how they often contradict each other. Things get worse when using C code checkers. This is why I've tried hard to not even have warnings in D.

Anyhow, I have a lot of experience moving code from 16 to 32 to 64 bits, and the effects of size_t. The error messages come from misusing size_t. D does it better by flagging implicit casts that lose bits (casting from 64 bit size_t to 32 bit). My problems porting D code 32<=>64 have vanished because of this. There might be an error now and then compiling 64 bit code to 32 bit code, but at least the error happens at compile time.

The biggest problem I had with size_t is printf format string mismatches. When the format string checker was added, that problem thankfully disappeared.

I understand you want to flag the error even when compiling 64 bit code. The problems with defining a new type:

1. break any code that sets its own size_t type
2. break any code that relies on the name mangling
3. added complexity to function overloading
4. new overloading rules, not at all simple
5. existing code that has cases for each type will miss the new size_t

Regarding dchar: That came about at the creation of D. I thought at the time that dchar would be how people process text. That, of course, never happened, and dchar doesn't have much of a role these days.

You can still create a unique type for size_t, as size_t is not set by the compiler, but by druntime.

```
struct size_t { ... }
```

and give it the desired behavior. Perhaps that can help with determining how useful it would be.

D has had success with deprecating the builtin imaginary/complex floating point types and replacing them with structs. I have also implemented the `struct halffloat` which provides support for 16 bit floating point.
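To make the library-struct suggestion concrete, here is a minimal sketch of what such a type might look like. It is only an experiment under stated assumptions: the name `SizeT` is hypothetical (chosen so it does not clash with the size_t alias druntime already defines), and the conversion rules are the ones proposed earlier in the thread (initialize from uint, widen implicitly to ulong).

```d
// Hypothetical library type for experimenting with the proposed rules.
struct SizeT
{
    private size_t payload; // native-width storage

    // Accepts uint and anything that implicitly converts to uint.
    this(uint v) { payload = v; }

    // Explicit, checked entry point for wider values.
    static SizeT checked(ulong v)
    {
        assert(v <= size_t.max, "value does not fit in size_t on this target");
        SizeT s;
        s.payload = cast(size_t) v;
        return s;
    }

    // Widening to ulong is lossless on every target, so allow it implicitly.
    @property ulong toUlong() const { return payload; }
    alias toUlong this;
}

// Portable directions compile on every target...
static assert( __traits(compiles, (uint u)  { SizeT s = u; }));
static assert( __traits(compiles, (SizeT s) { ulong u = s; }));
// ...while target-dependent ones are rejected on every target.
static assert(!__traits(compiles, (ulong u) { SizeT s = u; }));
static assert(!__traits(compiles, (SizeT s) { uint u = s; }));

void main()
{
    SizeT n = 3u;
    ulong wide = n;          // implicit, via alias this
    assert(wide == 3);
    assert(SizeT.checked(7).toUlong == 7);
}
```

A wrapper like this cannot change what `.length` or `.sizeof` return, so it only helps at API boundaries where code opts in; that limitation is part of what such an experiment would reveal.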
