November 11, 2021

On Thursday, 11 November 2021 at 09:15:54 UTC, Atila Neves wrote:

> I have not yet encountered...
> I think that...
> I don't think this is a problem.
> I wouldn't care about it either.
> Me, ~99.9% of the time. I definitely don't miss...

This all amounts to "I don't need it, therefore nobody should either". That's not a very good metric.

It's 2021. Your footnote on "nobody" simply does not apply. Everybody should be writing code designed to run on multiple processors simultaneously, because that's what the machines are. A large portion of that "everybody" should be writing code where a large number of those processors are GPUs, and I'm not even talking about computer games. In either case, memory becomes key: at least today, it is your slow resource on the CPU and your rather limited resource on the GPU.

> My algorithm:
>
>   1. Write code
>   2. Did I notice it being slow or too big? Go to 4.
>   3. Move on to the next task.
>   4. Profile and optimise, go to 2.

That's an interesting way of doing science. How would you notice anything without a baseline, which you haven't even set because you're not at (4) yet? You won't. You just wrote a slow piece of software, and you don't even know it, because you never looked; it didn't appear slow. Eh?

People after you, who use your software, might notice, MAYBE, because they set a baseline from the start. But by then it's too late: your software is already part of the OS, or the CRT, or whatever other dependency the application must rely on. And not because you didn't optimize; because you pessimized and never bothered to look.

...and that is exactly why Visual Studio (a text editor) takes 10 seconds to load a few small text files, while the compiler itself loads and compiles those same files in under a second; why Photoshop (an image editor) greets you with a nice splash screen (containing an image) "almost" immediately, but only lets you actually work on images a dozen seconds later; why Libre- (and every other) Office shows you a progress bar while it's "starting up"; and so on, and so forth... And then we have Andrei on stage lamenting how awesome and lucrative it was to squeeze out that meager extra 1%...

>

having to make things fit into 48k of RAM

??? You DO have to make things fit into 32k, or 64k, or (insert your cache size here), with the constraint modulated over time by the speed of that cache. Which means you have to take care with how, and where, your data is laid out. Which, eventually, brings you back to how you allocate. Moreover, you DO have to make things that fit into a single cache line, or are bounded by the size of that cache line. Which, again, brings you back to how you allocate.

None of that will be done for you. `malloc` or `new` won't do it (well, `malloc` may try, but it will inevitably fail). `vector` and `map` won't. `unique_ptr`, `shared_ptr`, and the other smart pointers won't.
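
To make the point concrete, here is a minimal D sketch of doing that layout work yourself. The 64-byte line size and the `PaddedCounter` name are my assumptions for illustration, not anything from the discussion above:

```d
enum cacheLineSize = 64; // assumption; query the target CPU in real code

// Give each hot counter its own cache line, so two threads bumping
// adjacent counters don't fight over the same line (false sharing).
struct PaddedCounter
{
    shared size_t value;
    ubyte[cacheLineSize - size_t.sizeof] pad; // fill out the line
}

static assert(PaddedCounter.sizeof == cacheLineSize);

// An array of these keeps each counter on its own line, provided the
// allocation itself is line-aligned.
PaddedCounter[16] counters;
```

No standard container will add that padding for you; it has to be baked into the type or into the allocator.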

November 11, 2021

On Thursday, 11 November 2021 at 16:09:46 UTC, Stanislav Blinov wrote:

> That's an interesting way of doing science.

Most computer programming is not science. Programming is a trade. There is, of course, such a thing as computer science, but it is not what most programmers do most of the time.

On Thursday, 11 November 2021 at 16:09:46 UTC, Stanislav Blinov wrote:

> How would you notice anything without a baseline, which you haven't even set because you're not at (4) yet? You won't. You just wrote a slow piece of software, and you don't even know it, because you never looked; it didn't appear slow. Eh?

The simple fact is that for roughly 99% of code (in terms of lines of code, not complete programs), performance is basically irrelevant.

November 11, 2021

On Thursday, 11 November 2021 at 17:03:42 UTC, Greg Strong wrote:

> Most computer programming is not science. Programming is a trade. There is, of course, such a thing as computer science, but it is not what most programmers do most of the time.

Call it whatever you like. "I don't think it's slow" is not a measurement, so drawing any conclusions from it is pointless.

> The simple fact is that for roughly 99% of code (in terms of lines of code, not complete programs), performance is basically irrelevant.

Running fast and simply not running slow are two very different things. The "simple fact" here is that most people either don't care that their code is unnecessarily slow, or don't know that it is, because they never even bothered to look.

Man, this is so off-topic... I'm sorry guys.

November 11, 2021

On Thursday, 11 November 2021 at 16:09:46 UTC, Stanislav Blinov wrote:

> On Thursday, 11 November 2021 at 09:15:54 UTC, Atila Neves wrote:
>
>> My algorithm:
>>
>>   1. Write code
>>   2. Did I notice it being slow or too big? Go to 4.
>>   3. Move on to the next task.
>>   4. Profile and optimise, go to 2.
>
> That's an interesting way of doing science. How would you notice anything without a baseline, which you haven't even set because you're not at (4) yet?

Maybe it needs to hit a certain FPS, or maybe it's "a lot of people are complaining that it's slow", or "hmm, I thought that would be a lot faster". You have users, expectations, requirements, etc. There could be a whole bunch of reasons to jump to 4; it doesn't require an absolute "this is measurably slower than version 1.03".

November 11, 2021
On 2021-11-08 19:12, H. S. Teoh wrote:
> On Mon, Nov 08, 2021 at 10:15:09PM +0000, deadalnix via Digitalmars-d wrote:
> [...]
>> I think however, you missed several massive problems:
>> 4. Type qualifier transitivity. const RCSlice!T -> const
>> RCSlice!(const T) conversion needs to happen transparently.
> 
> The only way this can happen is via a language change.  The only way
> arrays get to enjoy such benefits is because the necessary implicit
> conversion rules are baked into the language.  User-defined types do not
> have such privileges, and there is currently no combination of language
> constructs that can express such a thing.

I keep thinking of proposing opFunCall(), which would be called whenever an object is passed by value into a function. The lowering for such objects would be:

fun(obj);

===>

fun(obj.opFunCall());
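
To illustrate with a sketch (opFunCall does not exist in the language today; the struct and its internals here are hypothetical), a reference-counted type could use the hook to hand out a head-mutable, count-bumped copy whenever it is passed by value:

```d
struct RCSlice(T)
{
    private T[] payload;
    private size_t* count;

    // Hypothetical hook: under the proposed lowering, fun(s) becomes
    // fun(s.opFunCall()), so even a const RCSlice gets a chance to
    // produce a head-mutable copy and bump its reference count.
    RCSlice!(const T) opFunCall() const @trusted
    {
        auto copy = RCSlice!(const T)(payload, cast(size_t*) count);
        ++*copy.count; // casting away const here is exactly what the
                       // language currently has no blessed way to do
        return copy;
    }
}
```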



November 11, 2021
On 2021-11-08 19:44, deadalnix wrote:
> On Monday, 8 November 2021 at 22:38:27 UTC, Andrei Alexandrescu wrote:
>>> I believe 4 and 5 to be impossible in D right now; 6 can be solved using a ton of runtime checks.
>>
>> To which I say, stop posting, start coding.
> 
> As I said, I believe 4 and 5 to be impossible at the moment.

Do 1, 2, and 3.

November 11, 2021
On 2021-11-08 20:04, deadalnix wrote:
> On Monday, 8 November 2021 at 22:40:03 UTC, Andrei Alexandrescu wrote:
>> On 2021-11-08 17:26, rikki cattermole wrote:
>>> a reference counted struct should never be const
>>
>> So then we're toast. If that's the case we're looking at the most important challenge to the D language right now.
> 
> Not necessarily, but this is in fact the same problem as the head mutable problem.
> 
> `const RCSlice!(const T)` should convert to `RCSlice!(const T)`, in order to have the same semantics as slices.
> 
> The reference counting problem goes away if you can mutate the head.

I very much wish that were the case. From what I remember working on the code, there were multiple other challenges.
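
For reference, built-in slices already enjoy the conversion deadalnix describes; a small sketch of the analogy (illustrative only):

```d
void example()
{
    const(int[]) a = [1, 2, 3];
    const(int)[] b = a; // fine today: the head sheds const, elements keep it

    // The ask is the same head-mutable conversion for the user-defined type,
    //     const RCSlice!(const int)  ->  RCSlice!(const int)
    // so that a copy can bump its reference count.
}
```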
November 11, 2021
On 2021-11-08 20:12, deadalnix wrote:
> On Monday, 8 November 2021 at 22:38:27 UTC, Andrei Alexandrescu wrote:
>>> shared_ptr does atomic operations all the time. The reality is that on modern machines, atomic operations are cheap *granted there is no contention*. They will certainly limit what the optimizer can do, but all in all, it's almost certainly better than keeping the info around and doing a runtime check.
>>
>> In my measurements, uncontested atomic increments are 2.5x or more slower than the equivalent ordinary increment.
>>
> 
> Do you mind sharing this?

Quick and dirty code that has long since been overwritten. Just redo it. Use C++ as a baseline.

> I find that curious, because loads/stores on x86 are almost sequentially consistent by default, and you don't even need sequential consistency to increment the counter to begin with, so a good old `inc` instruction is enough.
> 
> I'd like to look at what the compiler is doing here, because maybe we are trying to fix the wrong problem.

The overhead comes from the bus `lock` operation, which both gcc and clang emit: https://godbolt.org/z/zx4cMYE39
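
Anyone wanting to redo the measurement could start from a rough sketch like this (single-threaded, so the atomic increment is never actually contested; results will vary with machine and compiler flags):

```d
import core.atomic : atomicOp;
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

enum N = 100_000_000;

size_t plainCount;
shared size_t atomicCount;

void plainInc()  { foreach (i; 0 .. N) ++plainCount; }
void atomicInc() { foreach (i; 0 .. N) atomicOp!"+="(atomicCount, 1); }

void main()
{
    auto times = benchmark!(plainInc, atomicInc)(1);
    writeln("plain:  ", times[0]);
    writeln("atomic: ", times[1]);
    // Print the counters so the optimizer can't drop the loops entirely.
    writeln(plainCount, " ", atomicCount);
}
```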
November 11, 2021
On 2021-11-08 20:14, tsbockman wrote:
> On Monday, 8 November 2021 at 21:42:12 UTC, Andrei Alexandrescu wrote:
>> Eduard Staniloiu (at the time a student I was mentoring) tried really hard to define a built-in slice RCSlice(T) from the following spec:
> 
> Your spec is very focused on homogenizing the API for GC and RC slices as much as (or rather, more than) possible.
> 
> But, it isn't possible to make a truly `@safe`, general-purpose RC slice in D *at all* right now, even without all the additional constraints that you are placing on the problem.
> 
> Borrowing is required for a general-purpose RC type, so that the payload can actually be used without a reference to the payload escaping outside the lifetime of the counting reference. But, the effective lifetime of the counting reference is not visible to the `scope` borrow checker, because at any point the reference's destructor may be manually called, potentially `free`ing the payload while there is still an extant borrowed reference.

Walter would argue that his work on `scope` makes that possible too. The short of it is that getting an answer to "can it be done" is important in and of itself, because it gives us hints on what needs to be done.

> With current language semantics, the destructor (and any other similar operations, such as reassignment) of the reference type must be `@system` to prevent misuse of the destructor in `@safe` code.
>      https://issues.dlang.org/show_bug.cgi?id=21981
> 
> The solution to this problem is to introduce some way of telling the compiler, "this destructor is `@safe` to call automatically at the end of the object's scope, but `@system` to call early or manually."

This I'm not worried about. `@trusted` should be fine in low-level code, no? `destroy` is not safe. What am I missing?
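
For concreteness, the hole described in the linked issue looks roughly like this, with a hypothetical `RC!T` whose destructor is `@safe` (all names illustrative):

```d
@safe void oops()
{
    auto rc = RC!int(42);        // hypothetical counted reference
    scope int* p = rc.borrow();  // borrow tracked under -dip1000
    destroy(rc);                 // @safe because the destructor is @safe;
                                 // count hits zero, payload is freed...
    *p = 13;                     // ...yet the borrow still compiles:
                                 // use-after-free in @safe code
}
```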

> Also, the DIP1000 implementation is very buggy and incomplete; it doesn't work right for several kinds of indirections, including slices and `class` objects:
> https://forum.dlang.org/thread/zsjqqeftthxtfkytrnwp@forum.dlang.org?page=1

<nod>

>> - work in pure code just like T[] does
>> - work with qualifiers like T[] does, both RCSlice!(qual T) and qual(RCSlice!T)
> 
> I wrote a semi-`@safe` reference counting system for D recently, which includes slices, weak references, shared references with atomic counting, etc. It works well enough to be useful to me (way better than manual `malloc` and `free`), but not well enough to be a candidate for the standard library due to the above compiler bugs.

Interesting, have you published the code?

> `RCSlice!(qual T)` is no problem; the reference count and the payload do not need to use the same qualifiers. Whether the count is `shared` or not can be tracked statically as part of the RC type, so the "const might be immutable, and therefore have a shared reference count, and therefore require expensive atomic operations" issue that you raise is easy enough to solve.
> 
> On the other hand, `pure` compatibility and a usable `immutable(RCSlice!T)` are mutually exclusive, I think:
> 
> **Either** the reference count is part of the target of the RC type's internal indirection, in which case it can be mutated in `pure` code, but is frozen by an outer, transitive `immutable`, **or** the reference count is conceptualized as an entry in a separate global data structure which can be located by using the address of the payload as a key, meaning that incrementing it does not mutate the target, but is im`pure`.

Yah, my thoughts exactly. My money is on the latter option. The reference counter is metadata, not data, and should not be treated as part of the data even though implementation-wise it is.

Eduard and I stopped short of being able to formalize this.
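
A minimal sketch of that second option, with the count kept in a module-level side table keyed by payload address (names illustrative; not thread-safe, and visibly im`pure` as written, which is exactly the trade-off described above):

```d
private size_t[const(void)*] refCounts; // conceptually global metadata

// Bumping the count mutates the side table, never the payload itself,
// so an immutable payload stays untouched...
void incRef(const(void)* payload)
{
    ++refCounts.require(payload, 0);
}

// ...but touching module-level state is what keeps these impure.
size_t decRef(const(void)* payload)
{
    return --refCounts[payload];
}
```

Making such functions usable from `pure` code would then require the same kind of special-case blessing that `pure` already grants to memory allocation.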

> I believe the former solution (compatible with `pure`, but not outer `immutable`) is preferable since it is the most honest, least weird solution, and therefore least likely to trip up either the compiler developers or the users somehow.
> 
> What you seem to be asking for instead is a way to trick the type system into agreeing that mutating a reference count doesn't actually mutate anything, which is nonsense. If that's really necessary for some reason, it needs to be special cased into the language spec, like how `pure` explicitly permits memory allocation.

It's not nonsense. Or if it is, a lot of other things can be categorized as nonsense too, such as the GC recycling memory of one type and presenting it as a different type, etc.

November 11, 2021
On 2021-11-08 20:22, rikki cattermole wrote:
> 
> On 09/11/2021 2:14 PM, tsbockman wrote:
>> What you seem to be asking for instead is a way to trick the type system into agreeing that mutating a reference count doesn't actually mutate anything, which is nonsense. If that's really necessary for some reason, it needs to be special cased into the language spec, like how `pure` explicitly permits memory allocation.
> 
> Imagine saying to someone:
> 
> Yes you have made this struct immutable.
> 
> Yes you have set this bit of memory containing that immutable struct to read-only.
> 
> Yes you ended up with a crashed program, because that immutable struct went ahead and tried to write to that read-only memory.
> 
> And yes, I understand that you couldn't have known that a field whose implementation you didn't write used an escape hatch to write to const data.
> 
> It doesn't make sense.

It makes perfect sense.

Yes you have made this struct immutable.

Yes you have set this bit of memory containing that immutable struct to read-only.

Yes the immutable structure has metadata associated with it that is mutable.

Yes that metadata can in fact lie right next to the data itself (for optimization purposes), even though it's conceptually global.

No you cannot end up with a crashed program because the mutable metadata and the immutable part are carefully handled so as to not interfere.

And yes, I understand that you may find that odd.