October 13, 2019
On Sun, Oct 13, 2019 at 1:10 AM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Saturday, 12 October 2019 at 23:36:55 UTC, Manu wrote:
> > Here's a rough draft of one such sort of tool I use all the
> > time in shared-intensive code:
> > https://gist.github.com/TurkeyMan/c16db7a0be312e9a0a2083f5f4a6efec
>
> Thanks! That looks quite low level, but now I understand more what you are looking for.

Well, the implementation is low-level, obviously, but the tool is not. It's quite approachable at the user level; the only real risk is deadlock. If you want to systematically avoid deadlocks, then higher-level tools appear.
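For readers without the gist handy, the shape of that kind of tool is roughly the following. This is a minimal sketch of mine, not the gist's actual code; the names and details are assumptions:

```d
import core.sync.mutex : Mutex;

// Sketch: a lock-guarded wrapper. The shared payload is only ever
// exposed as a thread-local reference while the lock is held.
struct Guarded(T)
{
    private shared T payload;
    private Mutex lock;

    this(T init)
    {
        payload = cast(shared) init;
        lock = new Mutex;
    }

    // The delegate receives a non-shared view; `scope` asks the
    // caller not to let it escape the duration of the call.
    auto access(Result)(scope Result delegate(ref T) dg)
    {
        lock.lock();
        scope (exit) lock.unlock();
        // Casting away `shared` is safe here because the mutex
        // guarantees exclusive access for the duration of the call.
        return dg(*cast(T*)&payload);
    }
}
```

The deadlock risk mentioned above is exactly the case where the delegate tries to `access` the same `Guarded` object again.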

> What I had in mind was writing APIs that allow ordinary programmers to do parallel programming safely.

That doesn't exist. I would call that a 'framework', and that's a very
high level library suite.
What we're doing here is making it possible to write such a thing safely.

'Ordinary' programmers don't do concurrency on a low level. Period.

> Like taking the single threaded code they have written for processing a single array or merging two arrays with each other and then use a library for speeding it up.

I'm working towards a safe parallel-for... this needs to land first before I can attempt to argue for the next steps to make that work.

> Anyway, my core argument is to put more meta-programming power in hands of library authors like you so that people who have the shoes on can define the semantics they are after. I really don't think this has to be done at the compiler level, given the right meta programming tools, based on the proposed semantics. And I don't think providing those meta programming tools are more work than hardwiring more stuff into the compiler. Clearly that is just my opinion. Others might feel differently.

I've said this a few times now, and I'll say it again.
Your insistence that shared should be a library thing applies equally
to const. It behaves in an identical manner to const, and if you can't
make a compelling case for moving const to a library, you can't
possibly sell me on that idea with respect to shared.

1. `shared` defines D's thread-local by default semantics... without
shared, there is no thread-local by default. That decision is so deep,
there is absolutely no way to move away from that.
2. We must be able to overload on `shared`. We can't 'overload' on
template wrappers; we can try and specialise and stuff and abuse
clever inference and type deduction, but things break down quickly. I
have tried to simulate type qualifiers with template wrappers before.
It always goes *very* badly, and it's a mess to work with.
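For context, overloading on the qualifier looks like the following. This is an illustrative sketch of mine, not code from the discussion:

```d
// The qualifier participates in overload resolution, so one type can
// offer a cheap thread-local path and a synchronised shared path.
struct Counter
{
    private int n;

    // Thread-local overload: no synchronisation needed.
    void increment() { ++n; }

    // Shared overload: must synchronise.
    void increment() shared
    {
        import core.atomic : atomicOp;
        atomicOp!"+="(n, 1);
    }
}

void use(ref Counter local, ref shared Counter global)
{
    local.increment();   // resolves to the thread-local overload
    global.increment();  // resolves to the shared overload
}
```

A template wrapper like `Shared!Counter` is a distinct type, so it cannot make an existing call site pick a different method body this way.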
3. Most importantly, there is a future (not shown in this discussion)
where shared should allow some implicit promotions safely; for
instance:

void fun(scope ref shared T arg);
T x;
fun(x); // OK; promoted to shared, strictly bound to the
        // lifetime of this call tree

This will allow some of the simple deployment of algorithms you refer to above.
Interaction with scope and other escape analysis tools is very
low-level and quite complex, and we need shared to have a solid
definition before we can experiment with that stuff.
This is the key that will unlock parallel-for and other algorithms.
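Concretely, the kind of algorithm such scope-bound promotion would unlock might look like this. The API is hypothetical and this is not valid D today; the implicit promotion is precisely the future feature being described:

```d
// Hypothetical: `data` would be implicitly promoted to shared only for
// the duration of the call, so workers may partition it, and it would
// be thread-local again afterwards.
void parallelFor(alias fun, T)(scope shared(T)[] data)
{
    /* ...partition `data` and run `fun` across worker threads... */
}

void example()
{
    int[] values = new int[1024];        // ordinary thread-local data
    parallelFor!(x => 2 * x)(values);    // implicit, call-bound
                                         // promotion to shared
    // after the call, `values` is provably thread-local again
}
```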

October 13, 2019
On Sunday, 13 October 2019 at 19:23:45 UTC, Manu wrote:
> Your insistence that shared should be a library thing applies equally
> to const. It behaves in an identical manner to const, and if you can't
> make a compelling case for moving const to a library, you can't
> possibly sell me on that idea with respect to shared.

I'm not trying to sell anything, but you cannot easily implement D's const using metaprogramming as metaprogramming is realized in D.

You can easily extend the metaprogramming capabilities of D to implement shared.

Apples and Oranges.


> 1. `shared` defines D's thread-local by default semantics... without
> shared, there is no thread-local by default. That decision is so deep,
> there is absolutely no way to move away from that.

That argument makes no sense to me. Whether something is thread-local or not is essentially a matter of social agreement when you have syntactical means to get around it. It is predominantly a social contract.

It is a matter of agreement, interpretation, conceptualization, mindset... That is what defines what is thread local in D. Not machinery.

That is not how it is in a stronger type system like Pony's, where it is strictly enforced and fully defined semantically in the machinery, with no means to break the type system.

In that regard, D is not much different from C++: it is predominantly social and syntactical. Except that in D you might have stronger agreement, a stronger social contract, and more visible syntactical markers. So it appears to be stronger.

But as long as D is going to interface tightly with C/C++ it will always rely on social contracts.

> 2. We must be able to overload on `shared`. We can't 'overload' on
> template wrappers; we can try and specialise and stuff and abuse
> clever inference and type deduction, but things break down quickly. I
> have tried to simulate type qualifiers with template wrappers before.
> It always goes *very* badly, and it's a mess to work with.

You have two alternatives:

1. You can do a simple one with template wrappers.

2. You can extend the metaprogramming capabilities and have a more advanced solution that is more in line with the current syntax.

But it certainly is possible, and not very hard to envision. After all, making something completely unavailable is much easier than making only some aspects of an object unavailable. That is why "shared" is relatively easy to do at the syntactical level in D, but "const" is harder. If meta-programming in D were more deductive then const would be fair game too, but that is not the case, so... it becomes less obvious and probably not worth the effort.


> 3. Most importantly, there is a future (not shown in this discussion)
> where shared should allow some implicit promotions safely; for
> instance:

Ok, I am not sure whether that really impacts whether it can be done in a library or not, but it certainly is VERY important to list such use-cases in sufficient detail when designing foundational mechanisms that are supposed to enable good interfaces for parallel programming libraries!

October 13, 2019
On Sun, Oct 13, 2019 at 2:10 PM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Sunday, 13 October 2019 at 19:23:45 UTC, Manu wrote:
> > Your insistence that shared should be a library thing applies equally
> > to const. It behaves in an identical manner to const, and if you can't
> > make a compelling case for moving const to a library, you can't
> > possibly sell me on that idea with respect to shared.
>
> I'm not trying to sell anything, but you cannot easily implement D's const using metaprogramming as metaprogramming is realized in D.
>
> You can easily extend the metaprogramming capabilities of D to implement shared.
>
> Apples and Oranges.

How so?
`shared` is exactly like const, but with an additional restriction
that you can't read either.

> > 1. `shared` defines D's thread-local by default semantics... without
> > shared, there is no thread-local by default. That decision is so deep,
> > there is absolutely no way to move away from that.
>
> That argument makes no sense to me. Thread local or not thread local is essentially only by social agreement if you have syntactical means to get around it.  It is predominantly a social contract.

"if you have syntactical means to get around it" <- this is `shared`; if we lose shared, we lose thread-local.

> It is a matter of agreement, interpretation, conceptualization, mindset... That is what defines what is thread local in D. Not machinery.

Yeah, but if there's no way to describe not-thread-local, then
thread-local doesn't really mean anything.
Assuming there were no shared, and "data is thread-local" were
specified, then you'd be invoking UB any time you spawned a second
thread... the language would just have to tolerate that, and accept
that so-called thread-local data may or may not actually be
thread-local (and conservatively assume not thread-local), because
there would be no way to distinguish the two.
I mean, I understand your point of view, but in practical terms, it's
not correct. Thread-local by default exists because `shared` exists.
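That relationship is visible in the storage classes themselves; a quick illustration:

```d
int perThread;        // thread-local by default: each thread sees its own copy
shared int everyone;  // a single instance, visible to all threads
__gshared int legacy; // C-style global: shared storage, no checking

// `shared` is the syntactic means of opting out of thread-locality;
// remove it, and there is no way to even name cross-thread data, so
// "thread-local by default" would be an unverifiable convention.
```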

> > 2. We must be able to overload on `shared`. We can't 'overload' on
> > template wrappers; we can try and specialise and stuff and abuse
> > clever inference and type deduction, but things break down quickly. I
> > have tried to simulate type qualifiers with template wrappers before.
> > It always goes *very* badly, and it's a mess to work with.
>
> You have two alternatives:
>
> 1.  you can do a simple one with template wrappers.

It goes very badly every time I've tried this.

> 2. you can extend the meta programming capabilities and have a more advanced  solution that is more in line with the current syntax

That's a huge development, and much uncertainty.
And it's not like we're introducing shared here... it's been in D
forever; we're just making it mean something.

> But it certainly is possible, and not very hard to envision. After all making something completely unavailable, is much easier than only making some aspects of an object unavailable. Which is why "shared" is relatively easy to do on the syntactical level in D, but "const" is harder.

I don't understand... I think they're of almost identical hardness.

> > 3. Most importantly, there is a future (not shown in this discussion)
> > where shared should allow some implicit promotions safely; for
> > instance:
>
> Ok, not sure if that really impacts whether it can be done in a library or not, but it certainly is VERY important to list such use-cases with sufficient detail when designing foundational mechanisms that is supposed to enable good interfaces to parallell programming libraries!

We're not designing foundational mechanisms, Walter designed shared
decades ago and it's firmly embedded in the language. We're just
making it not completely broken right now.
Designing a better shared is future work, which needs to start from
stable ground.

October 13, 2019
On Sunday, 13 October 2019 at 21:44:45 UTC, Manu wrote:
> On Sun, Oct 13, 2019 at 2:10 PM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> How so?
> `shared` is exactly like const, but with an additional restriction
> that you can't read either.

Shared only provides identity at the top level.
Const recursively provides methods for reading.

> "if you have syntactical means to get around it" <- this is `shared`; if we lose shared, we lose thread-local.

Nah, C++ memory was local... Threads came through a library/syscall. That neither lost nor won the concept of local memory... It is semiotic if not strictly enforced; a matter of interpretation.
A matter of intersubjectivity, i.e. people sharing the same interpretation.

> Assuming there were no shared, and "data is thread-local" is
> specified, then you'd be doing UB anytime you spawn a second thread...

No, they could have disjoint stores. Like JavaScript, where you transfer unique ownership.

Conceptually it is close to using a mutex... Except that in the mutex case there is no formal representation; the connection between mutex and object is in the heads of the programmers.

> not correct. thread-local by default exists because `shared` exists.

Nah, I'd rather say that you can transfer ownership. Shared just means that it is up for grabs, that it is in transition. Just like in JavaScript. At some point the mediating system is in control.

No conflict.


> It goes very badly every time I've tried this.

I'll have to take your word for it... Something serious?

>> 2. you can extend the meta programming capabilities and have a more advanced  solution that is more in line with the current syntax
>
> That's a huge development, and much uncertainty.
> And it's not like we're introducing shared here... he's been in D
> forever, we're just making it mean something.

Yes, there might be a cultural barrier to changing the syntax/expected features.


> I don't understand... I think they're almost identical hard-ness.

You can write a simple shared wrapper in 10 lines, so I don't understand the "identical" part...
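A minimal version of the wrapper being alluded to might look like this. It is my sketch, not anything posted in the thread, and it shows why "ten lines" only covers the easy part:

```d
import core.atomic : atomicLoad, atomicStore;

// Sketch of a library `Shared`: it can forbid direct access and expose
// atomic entry points, but it cannot participate in overload resolution
// or implicit qualifier conversions the way the built-in `shared` does.
// T is assumed to be an atomically accessible type (e.g. int).
struct Shared(T)
{
    private shared T payload;

    T load()        { return atomicLoad(payload); }
    void store(T v) { atomicStore(payload, v); }
}
```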


> We're not designing foundational mechanisms, Walter designed shared
> decades ago and it's firmly embedded in the language.

Designed? If it was, there would be no DIP.

October 14, 2019
On Sunday, 13 October 2019 at 19:08:00 UTC, Manu wrote:
> On Sun, Oct 13, 2019 at 12:55 AM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

>> That said, if you had a formalization of the threads and what states different threads are in then you could get some performance benefits by eliding shared overhead when it can be proven that no other threads are accessing a shared variable.
>
> What does that even mean? You're just making things up.

(Drop the ad hominems... Especially when you're in an area that is not your field.)


> Threads don't have 'states'; that's literally the point of threads!

All program execution is in a state at any given point in time.

So yes, you transition between states. That is the premise for building an advanced type system. That's what makes it possible to implement Rust's borrow checker.

Anything that moves in a discrete fashion moves between states; that is a basic computer-science conceptualization of computing. It is the foundation that theories about computing are built on.

> Maybe there's a 10 year research project there, but that idea is so
> ridiculous and unlikely to work that nobody would ever try.

Oh, people do it. Sure, it is hard in the most general case, but you do have languages with high-level concurrency constructs whose type systems make it possible to write provably correct concurrent programs.

It doesn't make much sense to claim that nobody would ever try to build something that actually exists... Does it?


> I think what you're talking about is something more like a framework;

No. Languages/compilers that check concurrency states at compile time.


>> But that is not on the table...
>
> Definitely not. Not at the language level at least.

That's right. So those benefits are not available... which was my point. :-)

October 14, 2019
On Sun, Oct 13, 2019 at 4:30 PM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Sunday, 13 October 2019 at 21:44:45 UTC, Manu wrote:
> > On Sun, Oct 13, 2019 at 2:10 PM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> >
> > How so?
> > `shared` is exactly like const, but with an additional restriction
> > that you can't read either.
>
> Shared only provides identity at the top level.
> Const recursively provides methods for reading.
>
> > "if you have syntactical means to get around it" <- this is `shared`; if we lose shared, we lose thread-local.
>
> Nah, C++ memory was local... Threads came through a
> library/syscall. That neither lost nor won the concept of local
> memory... It is semiotic if not strictly enforced; a matter of
> interpretation.
> A matter of intersubjectivity, i.e. people sharing the same
> interpretation.
>
> > Assuming there were no shared, and "data is thread-local" is specified, then you'd be doing UB anytime you spawn a second thread...
>
> No, they could have disjunct stores. Like JavaScript where you transfer unique ownership.

They could, but they don't.
You're talking about a lot of hypotheticals; I'm talking about how D
is, and how it's always been.

> > It goes very badly every time I've tried this.
>
> I'll have to take your word for it... Something serious?

It just doesn't scale, and it relies extremely heavily on template constraints.
You lose: function pointers, virtual functions, binary libs, DLLs,
etc... You also lose auto-complete-style features in the editor.
Templates should be the solution to generic-code problems, not to
problems like this.

> >> 2. you can extend the meta programming capabilities and have a more advanced  solution that is more in line with the current syntax
> >
> > That's a huge development, and much uncertainty.
> > And it's not like we're introducing shared here... it's been in D
> > forever; we're just making it mean something.
>
> Yes, there might be a cultural barrier to changing the syntax/expected features.

What's the point of that tangent, though?
We're not seriously going to consider changing the way const is
defined in D, or make a proposal for user-defined type constructors,
in a conversation about making shared work properly.

> > We're not designing foundational mechanisms, Walter designed shared
> > decades ago and it's firmly embedded in the language.
>
> Designed? If it was, there would be no DIP.

I agree it should have been like this from the start, but it exists, it's the tool we have, and we need to make it work properly.

October 14, 2019
On Mon, Oct 14, 2019 at 5:50 AM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Sunday, 13 October 2019 at 19:08:00 UTC, Manu wrote:
> > On Sun, Oct 13, 2019 at 12:55 AM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> >> That said, if you had a formalization of the threads and what states different threads are in then you could get some performance benefits by eliding shared overhead when it can be proven that no other threads are accessing a shared variable.
> >
> > What does that even mean? You're just making things up.
>
> (Drop the ad hominems... Especially when your in an area that is
> not your field.)

Threading and synchronisation is exactly my field.

And it's not ad hominem; I really do think you're just making stuff up
(although I'm sure informed by ideas in competing ecosystems)... I'm
sure you mean well and have some great idea, but you haven't presented
anything that I can think on.
The closest thing you've said to a real implementation proposal so far
is strong-ownership-based message passing like Rust's, but that's not
really in contention with this DIP, and I can't imagine how `shared`
would have any influence on that toolbox.

> > Threads don't have 'states'; that's literally the point of threads!
>
> All program execution is in a state at any given point in time.
>
> So yes, you transition between states. That is the premise for building an advanced type system. That's what makes it possible to implement Rusts borrow checker.

Those are all thread-local concepts. There's nothing reasonable you can
determine between threads without synchronising the threads.
D is flirting with escape analysis: we have `scope` and `return` and
things like that, and those are the things that will eventually allow
us to do in-language transitions to/from shared without unsafe casts.

> Anything that moves in a discrete fashion will move between states, that is a basic computer science conceptualization of computing. It is the foundation that theories about computing is built on.

The only reasonable CS model of threading I've ever seen requires you
to start talking about strong ownership and transferring ownership
between threads.
While I agree that strong ownership and message passing are robust and
worth having in any async ecosystem, they're also tremendously
inefficient for certain workloads, and if that were all we had baked
into the language, it would exclude other (useful) possibilities.

> > Maybe there's a 10 year research project there, but that idea is so
> > ridiculous and unlikely to work that nobody would ever try.
>
> Oh. People do it. Sure, it is hard in the most general case, but you do have languages with high level concurrency constructs that provide type systems that makes it possible to write provably correct concurrent programs.

That's not so much a language at that point; it becomes a blessed
library, or even a 'framework'.
Point me to some examples which are more than in-language message passing?
I don't think we're excluding those sorts of solutions from D.

> It doesn't make much sense to claim that nobody would ever try to build something that actually exists... Does it?

The way you initially described the premise was something along the lines of "the language knows the *state* of interacting threads and does magic thread-safe stuff", which I'm skeptical exists anywhere without a very elaborate framework to manage any such orchestration. I think now that you might be talking about ownership and message passing... but it's still not clear. What are you actually talking about? Do you have references?

Incidentally, my code that I linked to back up this thread is effectively an implementation of a rust-like shared object where it maintains a 'borrow' ref count, and guards access safely... it's easy to write those tools in D.
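For reference, the core of that pattern is roughly the following. This is a simplified sketch of mine, not the linked gist; in particular, a real implementation must make the check-and-enter steps atomic:

```d
import core.atomic : atomicOp, atomicLoad;

// Sketch: a shared object with a borrow count guarding access.
struct Borrowed(T)
{
    private shared T payload;
    private shared int readers;

    // Counted, scoped read borrow of a const view.
    auto read(Result)(scope Result delegate(ref const T) dg)
    {
        atomicOp!"+="(readers, 1);
        scope (exit) atomicOp!"-="(readers, 1);
        return dg(*cast(const(T)*)&payload);
    }

    // Exclusive access. Simplified: the check and the entry must be a
    // single atomic step (e.g. a CAS loop) to be actually thread-safe.
    auto write(Result)(scope Result delegate(ref T) dg)
    {
        assert(atomicLoad(readers) == 0, "outstanding borrows");
        return dg(*cast(T*)&payload);
    }
}
```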

> > I think what you're talking about is something more like a framework;
>
> No. Languages/compilers that check concurrency states at compile time.

Show me. That is exactly the opposite of what threads are for :/
I can't imagine it, and if a _language_ did that, it almost certainly
comes at a high cost, and it's almost certainly not what I want it to
do... but I'm willing to be surprised.

> >> But that is not on the table...
> >
> > Definitely not. Not at the language level at least.
>
> That's right. So those benefits are not available... which was my point. :-)

I can't predict what you're talking about, and you haven't shown anything substantial. I suspect if you show what you mean, I may find the term 'benefits' to be problematic ;)

October 14, 2019
On Monday, 14 October 2019 at 18:04:28 UTC, Manu wrote:
> And it's not at hominem, I really do think you're just making stuff up
> (although I'm sure informed by ideas in competing ecosystems)...

If you go after the person and not the statement then that is "ad hominem", isn't it?  Claiming that people are making stuff up does not compel me to expand on the topic, but I'll try to give you an idea of the landscape as I see it, anyway.

I'm not sure what you mean by «competing eco systems», but I am talking about theoretical and applied research. And it is not a field I am highly interested in, but I have a passing interest in it. It makes a lot of sense to use such approaches if you are designing communication systems or critical control systems. It might be wasted on games.

Anyway, there are various ways of reasoning formally about concurrency, and it has been done for a long time... And there are many approaches.  Ranging from dedicated tools for programming in various niches where it is beneficial, to formal methods and tools for analyzing distributed protocols. It is a big field, and I think the latter theme is easier to get into, and you'll find plenty on this if you search on keywords like: formal methods, formally verified, distributed computing, distributed protocols...

> The closest thing you've said to a real implementation proposal so far
> is strong-ownership based message passing like rust, but that's not
> really in contest with this DIP, and I can't imagine how `shared`
> would have any influence on that toolbox.

My argument was that D would not have it, so it would not be possible to make significant optimization based on modelling concurrency. Therefore "shared" as a library type ought to be sufficient.  That was my line of argument.

I do not think D should model concurrency formally.

I believe such languages should be created from a clean slate. Though some appear to do stuff with annotated C, it does not sound very compelling to just slap this onto an existing language.


> That's all thread-local concepts. There's nothing reasonable you can
> determine between threads without synchronising the threads.

You might need synchronization, but you can potentially have less of it if you have a proof that thread 1 is always in state B when thread 2 is in state A, or that thread 1 is never in state B if thread 2 is not in state D, etc.

It is not uncommon to model processes as state machines and analyze them formally. When you have that model, you can implement it at a lower level if you want, and make a correspondence proof that ties each chunk of low-level code to the high-level model. Then there are languages where everything is handled for you in a "regular" programming language, but they will have a particular focus, with a particular set of trade-offs.


> The only reasonable CS model of threading I've ever seen requires you
> to start talking about strong ownership and transfering ownership
> between threads.

There are many ways to model and reason about concurrency. One approach was actually inspired by the modelling of chemical reactions using Petri nets.

One "impractical" but interesting low-level, instruction-by-instruction approach uses Hoare logic. While possible, you might be limited to very few instructions. Anyway, with this approach you might in theory be able to avoid explicit synchronization in some cases, as you can then prove whether or not all possible interleavings of instructions during execution comply with the specification! Still, the combinatorial explosion makes this very limiting and hard to reason about. Hoare logic is a very low-level approach that was researched in the 70s and 80s, so as you can see there is a history of various approaches. Maybe it is applicable to a tight low-level scope like a simple high-performance hardware driver.

Many problems are only computationally prohibitively expensive if you always want the optimal solution, so researchers sometimes find ways to solve problems that appear to be impossible due to computational complexity, but with the caveat that you sometimes get the result "I don't know". In optimization, "I don't know" is good enough, because then you just fall back on the conservative implementation. A solver that is insufficient for unguided verification might be good enough for optimization. If "impossible" problems can be dealt with "sometimes", then you will get optimizations "sometimes".

Anyway, applied solutions seem to focus more on high level representations and high level mechanisms, but I would not be surprised if you find applied solutions working on low level representations using low level mechanisms.


> While I agree that strong ownership and message passing is robust and
> worth having in any async ecosystem, it's also tremendously
> inefficient for certain workloads, and if that were all we had baked
> into the language, it excludes other (useful) possibilities.

Ok, but I didn't argue for it being built into the language...

I am arguing that if it is built into the language then there should be a semantic reason for doing so. And static analysis of locking patterns might be one such reason.

But in general I think D would be better off by providing generic mechanisms rather than adding more and more limited features.

So maybe do not add borrowing, but perhaps add some sort of limited linear/affine typing mechanism that can be used for implementing libraries that provide it to some extent. Then people can write libraries that statically check that files which are opened are also closed, or that you follow specific patterns when setting up hardware (register protocols or something like that).

If meta-programming and library-authoring is D2's main selling point then that ought to be front and center.

>> Oh. People do it. Sure, it is hard in the most general case, but you do have languages with high level concurrency constructs that provide type systems that makes it possible to write provably correct concurrent programs.
>
> That's not so much language at that point, it becomes blessed library,
> or a 'framework' even.

No, you cannot run a solver from a library. Well, I guess you could, but that is not how it is usually done.

> The way you initially described the premise was something along the lines of "the language knows the *state* of interacting threads and does magic thread-safe stuff", which I'm skeptical exists anywhere without a very elaborate framework to manage any such orchestration.

Not sure what you mean by elaborate framework. You need a language for expressing it. Then you need a solver for solving the constraints (analyzing/checking the model).


> I think now that you might be talking about ownership and message passing...

It could be, but I am talking about formal verification of concurrency models, or languages that are designed in such a way that they translate into models that can be verified.

What you are talking about might be the other direction where you have languages that are designed in such a way that some of the issues provably cannot arise (e.g. Pony, maybe Go).

But that is not what I am referring to. I am referring to inference using a solver.


> Incidentally, my code that I linked to back up this thread is effectively an implementation of a rust-like shared object where it maintains a 'borrow' ref count, and guards access safely... it's easy to write those tools in D.

Alright, but if you had access to more advanced typing machinery in D then you could go further down that road yourself. Maybe not to the extent that it could keep up with Rust, but closer.


> if a _language_ did that, it almost certainly
> comes at a high cost, and it's almost certainly not what I want it to
> do... but I'm willing to be surprised.

I don't think you want to do it. You'd have to change how you approach it, and it is not really needed for more mundane tasks.


>> That's right. So those benefits are not avilable... which was my point. :-)
>
> I can't predict what you're talking about, and you haven't shown anything substantial. I suspect if you show what you mean, I may find the term 'benefits' to be problematic ;)

What do you mean by substantial?

The benefit is that you prove that the code follows the model, which in turn can be fed into the optimizer and assumed to always hold.

It is not realistic that D will ever go there.

It is more realistic that D would expand the type system or meta-programming mechanisms in ways that are useful to library authors and that they could create concurrent frameworks.

October 14, 2019
On Mon, Oct 14, 2019 at 2:50 PM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Monday, 14 October 2019 at 18:04:28 UTC, Manu wrote:
> > And it's not ad hominem, I really do think you're just making stuff up
> > (although I'm sure informed by ideas in competing ecosystems)...
>
> If you go after the person and not the statement then that is "ad hominem", isn't it?  Claiming that people are making stuff up does not compel me to expand on the topic, but I'll try to give you an idea of the landscape as I see it, anyway.
>
> I'm not sure what you mean by «competing eco systems», but I am talking about theoretical and applied research. And it is not a field I am highly interested in, but I have a passing interest in it. It makes a lot of sense to use such approaches if you are designing communication systems or critical control systems. It might be wasted on games.
>
> Anyway, there are various ways of reasoning formally about concurrency, and it has been done for a long time... And there are many approaches.  Ranging from dedicated tools for programming in various niches where it is beneficial, to formal methods and tools for analyzing distributed protocols. It is a big field, and I think the latter theme is easier to get into, and you'll find plenty on this if you search on keywords like: formal methods, formally verified, distributed computing, distributed protocols...

I'm aware of a lot of work in this area, but I haven't seen how it
influences a language at the lowest level (beyond ownership-based
solutions, where transfer of unique ownership is used to model
thread-safe data transitions); it's usually in the realm of what I
would call 'framework', although there are many examples of such being
blessed by a language.
I don't think D has a great history with blessing an implementation as
sacred and installing it into the syntax, and D tends to be expressive
enough to make library solutions for this sort of thing quite
satisfying.

> > The closest thing you've said to a real implementation proposal
> > so far
> > is strong-ownership based message passing like rust, but that's
> > not
> > really in contest with this DIP, and I can't imagine how
> > `shared`
> > would have any influence on that toolbox.
>
> My argument was that D would not have it, so it would not be possible to make significant optimization based on modelling concurrency. Therefore "shared" as a library type ought to be sufficient.  That was my line of argument.
>
> I do not think D should model concurrency formally.

Then how do you define thread-local by default, and how do you
mark/identify data that violates that axiom?
D doesn't "model concurrency", that's not what `shared` does, it
simply establishes thread-locality.
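
To be concrete, the distinction is just this at module scope:

```d
int perThread;     // thread-local by default: each thread gets its own copy
shared int global; // the marker: one instance, visible to all threads
__gshared int raw; // also one instance, but unmarked and unchecked
```

`shared` is the marker; without it (or `__gshared`), the language is free to assume no other thread can see the data.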

> I believe such languages should be created from a clean slate. Though, some appear to do stuff with annotated C, but it does not sound very compelling to just slap it onto an existing language.

Are you talking about D here, or just a hypothetical application of your POV?

> > That's all thread-local concepts. There's nothing reasonable
> > you can
> > determine between threads without synchronising the threads.
>
> You might need synchronization, but you can potentially have less of it if you have a proof that thread 1 is always in state B when thread 2 is in state A, or that thread 1 is never in state B if thread 2 is not in state D, etc.

This is well into 'framework' territory; a thread is associated with
static knowledge of a state machine, known threads are collected into
a registry, and then state transitions can assert validity against the
other threads in the pool.
This sounds super complex, hard to express, and not worthy of language
syntax. It also has nothing to do with `shared`.

> It is not uncommon to model processes as state machines and analyze them formally.

Sure, but we don't do that, or anything that's even remotely in that ballpark. It's not a functional language, and we don't have *code* or data access introspection like that.

> When you have that model, you can
> implement it at a lower level if you want, and make a
> correspondence proof that tie each chunk of low level code to the
> high level model. Then there are languages where everything is
> handled for you in a "regular" programming language, but they
> will have a particular focus, with a particular set of trade-offs.

That sort of language goes all in on one very restrictive programming
model. It's not anything like D, and I don't think it's a
consideration when talking about `shared`.
If you want to resist `shared`'s existence, then I don't think ideas
like this will move us forward.

> > The only reasonable CS model of threading I've ever seen
> > requires you
> > to start talking about strong ownership and transfering
> > ownership
> > between threads.
>
> There are many ways to model and reason about concurrency. One approach was actually inspired by the modelling of chemical reactions using Petri nets.
>
> One "impractical", but interesting, low level instruction-by-instruction approach uses Hoare logic. While possible, you might be limited to very few instructions. Anyway, with this approach you might in theory be able to avoid explicit synchronization in some cases as you can then prove whether or not all possible interleaving of instructions during execution comply with the specification! Still, the combinatorial explosion makes this very limiting and hard to reason about.  Anyway, Hoare logic is a very low level approach that has been researched in the 70s or 80s, so as you can see there is a history of various approaches.  Maybe applicable to a tight low level scope like a simple high performance hardware driver.

I've read about ideas like this before, but like, it's strictly fantasy. I'm interested in what's real, where we are, what we have, and where we need to be in practical terms.

> > While I agree that strong ownership and message passing is
> > robust and
> > worth having in any async ecosystem, it's also tremendously
> > inefficient for certain workloads, and if that were all we had
> > baked
> > into the language, it excludes other (useful) possibilities.
>
> Ok, but I didn't argue for it being built into the language...

There's gotta be something at the bottom of the stack, or you can't write anything above.

> I am arguing that if it is built into the language then there should be a semantic reason for doing so. And static analysis of locking patterns might be one such reason.

There is; D specifies thread-local by default, and that absolutely requires a marker (and associated semantics) to attribute the data that violates that principle.

This has nothing to do with locking, that's a much higher-level
problem and completely outside the language.
Locking is this:
  * I have this shared data, I want to access it...
  * I must create a context where I have an exclusive window on the data.
  * One such implementation involves a mutex of some form, where I
assert that I am the unique lease-holder, and then in effect the data
becomes thread-local for the duration of that lease (i.e., cast `shared`
away, because the data is thread-local for that duration).

In D, we can implement semantics like this by scope-guarded lock tools
of various kinds. It is also possible to achieve thread-locality by
deferring to a scheduling infrastructure with knowledge of
access-control. Such frameworks will cast to thread-local when issuing
access controlled data to the user.
We should make use of D's recent escape analysis tricks to guarantee
that thread-local leases remain properly scoped, and mutually
exclusive with other leases.
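
A rough sketch of the kind of scope-guarded lock tool I mean (the `SharedBox`/`lease` names are invented for illustration; this isn't the gist I linked):

```d
import core.sync.mutex : Mutex;

// Hypothetical wrapper: the payload is only reachable through a lease
// that holds the lock for its whole duration.
struct SharedBox(T)
{
    private shared T payload;
    private Mutex mtx;

    this(T initial)
    {
        payload = cast(shared) initial;
        mtx = new Mutex;
    }

    // While fn runs we hold the mutex, so the payload is effectively
    // thread-local, and casting `shared` away is legitimate.
    void lease(scope void delegate(ref T) fn)
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        fn(*cast(T*) &payload);
    }
}
```

Used as `box.lease((ref int n) { ++n; });`, the cast never leaks past the locked region, which is exactly the property escape analysis should enforce.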

> But in general I think D would be better off by providing generic mechanisms rather than adding more and more limited features.

What does this mean to you? We can write very rich libraries with `shared` as a safety mechanism. Nothing more is needed from the language, except perhaps some future implicit cast rules.

> So maybe not add borrowing, but perhaps add some sort of limited linear/affine typing mechanism that can be used for implementing libraries that provide it to some extent. So people can write libraries that statically check that files that are opened also are closed. Or that you follow specific patterns when setting up hardware (register protocols or something like that).

What does this have to do with `shared`, and why is the DIP at odds with this?

> If meta-programming and library-authoring is D2's main selling point then that ought to be front and center.

I agree. I wrote that borrow-inspired guarded access type which I linked above.

> >> Oh. People do it. Sure, it is hard in the most general case, but you do have languages with high level concurrency constructs that provide type systems that makes it possible to write provably correct concurrent programs.
> >
> > That's not so much language at that point, it becomes blessed
> > library,
> > or a 'framework' even.
>
> No, you cannot run a solver from a library. Well, I guess you could, but that is not how it is usually done.

If this relates to the ideas you mentioned at the top; it's dependent
on threads with rich statically-associated data.
D could do this sort of thing in library; we have user-defined
attributes, and heavy-weight thread definition objects could carry
that around and be consumed by solvers.

In some ways it's a bit like Andrei's Big-O library, which carries metadata around with definitions and uses it for static analysis.
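
Something in that spirit can be sketched with UDAs; every name below is invented purely for illustration:

```d
import std.traits : getUDAs;

// Hypothetical metadata a library "solver" could consume.
struct StateMachine { string[] states; }

@StateMachine(["Idle", "Writing", "Done"])
struct WriterThread { /* thread body elided */ }

// The annotation is recoverable at compile time:
enum sm = getUDAs!(WriterThread, StateMachine)[0];
static assert(sm.states.length == 3);
```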

> > The way you initially described the premise was something along the lines of "the language knows the *state* of interacting threads and does magic thread-safe stuff", which I'm skeptical exists anywhere without a very elaborate framework to manage any such orchestration.
>
> Not sure what you mean by elaborate framework. You need a language for expressing it. Then you need a solver for solving the constraints (analyzing/checking the model).

This is exactly what I would call a framework library. An elaborated and opinionated implementation of a particular model.

> > I think now that you might be talking about ownership and message passing...
>
> It could be, but I am talking about formal verification of concurrency models. Or languages that are designed in such a way that they translates into such models that can be verified.
>
> What you are talking about might be the other direction where you have languages that are designed in such a way that some of the issues provably cannot arise (e.g. Pony, maybe Go).
>
> But that is not what I am referring to. I am referring to inference using a solver.

Your ideas rely on intensive attribution or extremely sophisticated
introspection. We can easily do such attribution in D, but not
introspection of code; and, to be practical, like you say, the
low-level set of operations would need to be radically simplified into
a small and reasonable set.
You're not talking about D at that point.

> > Incidentally, my code that I linked to back up this thread is effectively an implementation of a rust-like shared object where it maintains a 'borrow' ref count, and guards access safely... it's easy to write those tools in D.
>
> Alright, but if you had access to more advanced typing machinery in D then you could go further down that road yourself. Maybe not to the extent that it could keep up with Rust, but closer.

I'm not sure what rust offers to my little lib up there that I can't
do; I mean, rust basically does exactly that, in basically the same
way.
The only advantage of rust is that it has some language to track
lifetime knowledge through function calls which can be used for escape
analysis of the scoped lease object.

> > if a _language_ did that, it almost certainly
> > comes at a high cost, and it's almost certainly not what I want
> > it to
> > do... but I'm willing to be surprised.
>
> I don't think you want to do it. You'd have to change how you approach it, and it is not really needed for more mundane tasks.

Is it needed for *any* task? I can't think of a more complex
interactive environment in software than simulation/video games.
Nowhere else are so many completely different distinct solutions all
interacting in an asynchronous (and realtime) environment.
Maybe you want a provable solution that guarantees progress, and are
willing to sacrifice efficiency for it... but in that world, why do
you need threads at all? If such an environment exists, I think it's
very niche.

> >> That's right. So those benefits are not available... which was my point. :-)
> >
> > I can't predict what you're talking about, and you haven't shown anything substantial. I suspect if you show what you mean, I may find the term 'benefits' to be problematic ;)
>
> What do you mean by substantial?

By 'substantial' I mean 'at all'. You didn't reference examples or specifics, or describe your vision until now.

> The benefit is that you prove the code follows the model, which in turn can be fed into the optimizer and assumed to always hold.
>
> It is not realistic that D will ever go there.
>
> It is more realistic that D would expand the type system or meta-programming mechanisms in ways that are useful to library authors and that they could create concurrent frameworks.

Right. That's exactly what we're doing. I'm glad you agree :)

October 15, 2019
On Tuesday, 15 October 2019 at 04:09:52 UTC, Manu wrote:
> On Mon, Oct 14, 2019 at 2:50 PM Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> I'm aware of a lot of work in this area, but I haven't seen how it
> influences a language at the lowest level (beyond ownership based
> solutions, where transfer of unique ownership is used to model
> thread-safe data transitions);

Well, that's because the research topic is «verification». So that is where the culture is, but most programmers have no interest in verification, because then they would have to think twice about what they are doing... They probably should!! But that takes time, and we have a global programming culture of «fudging» things together :-).  That doesn't mean the ideas cannot be leveraged for more mundane purposes with more limited scope than verification, though.

Rust is the outlier, and as such is quite impressive in the sense that they have managed to establish a culture where programmers accept having to do extra (unnecessary) work to make their programs run in exchange for higher confidence of correctness. But as you can see in any discussion about Rust vs C++, a large number of programmers reject that and hold it against Rust. So establishing such a culture is not an everyday phenomenon.

You seemed to suggest that I use Rust, but I personally have no need for Rust, as I currently don't have issues that Rust can solve.  I might pick it up for mundane reasons like tight webasm code generation.

What I find interesting about Rust is the culture that builds up around the language semantics. Which certainly is a shift from programming cultures that preceded it as far as I can tell.


> it's usually in the realm of  what I
> would call 'framework', although there are many examples of such being blessed by the language.

Ok, so basically anything that is unfamiliar in the D context is a «framework», including type systems. Fair enough. («framework» is usually used for denoting a set of library types and possibly runtime with tight coupling that aims to provide a foundation for writing applications in a limited category.)


> Then how do you define thread-local by default, and how do you
> mark/identify data that violates that axiom?

The only reason you need to mark TLS is because it is referenced using a different instruction sequence.

So to repeat myself:

Other than that, D does not provide anything beyond atomics/barriers. Does it? Everything else is purely syntactical. Slam a @-tag around your code and you get to reinterpret and thwart the type system. That's a syntactical approach in my mind. Not a sound type system approach. Therefore it offers nothing over a library type.
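
To illustrate what I mean, such a library marker could look something like this (hypothetical; not an existing type):

```d
import core.atomic : atomicLoad, atomicStore;

// A library-level "shared" marker: the raw value is private, and the
// only operations exposed are atomic ones.
struct LibShared(T)
{
    private shared T value;

    T load() shared { return atomicLoad(value); }
    void store(T v) shared { atomicStore(value, v); }
}
```

If this cannot express what `shared` expresses, then that gap points at a weakness in the meta-programming capabilities.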

> static knowledge of a state machine, known threads are collected into
> a registry, and then state transitions can assert validity against the
> other threads in the pool.

You could do it dynamically, but also statically. Simple example: if you can establish that all Reader threads are spawned after the Writer thread has ended, then you can elide all the locking in the Reader threads (without the programmer knowing about it).
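
To make the example concrete (hypothetical code; `__gshared` is used so the illustration isn't about `shared` semantics):

```d
import core.thread : Thread;

__gshared int[] results; // one unguarded global, for illustration only

void run()
{
    auto writer = new Thread({ results = [1, 2, 3]; }).start();
    writer.join(); // the writer provably finishes before any reader exists

    // Readers are spawned only after join(), so no reader can overlap
    // the writer; a checker that sees this ordering can prove the
    // read-side locks are dead and elide them.
    foreach (i; 0 .. 4)
        new Thread({ auto n = results.length; }).start();
}
```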


> that ballpark. It's not a functional language, and we don't have *code* or data access introspection like that.

It doesn't have to be an FPL; people do interesting things at the IR level. But I am not saying D should; the manpower and know-how isn't here.


> That sort of language goes all in on one very restrictive programming model.

It might, but it doesn't have to; programmers can restrict themselves to satisfy the verifier. It appears to work for Rust.


> If you want to resist `shared`'s existence, then I don't think ideas like this will move us forward.

I am not resisting anything.

I am trying to figure out why you cannot do «shared» equally well as a library type and what is missing in the language that prevents that from being possible.

Because if that is not possible then I sense that there must be some glaring flaws in the D meta-programming capabilities that ought to be fixed.

For me this is an exercise in figuring out what works and does not work in D's meta programming model.


> I've read about ideas like this before, but like, it's strictly fantasy. I'm interested in what's real, where we are, what we have, and where we need to be in practical terms.

Ok, so I presented two ends of a spectrum that have a long history: one from the 1939/1960s (Petri nets) and one from 1980 (concurrent Hoare logic).  In between those ends you have many different approaches.

You wanted to discuss what is possible/not possible, and these two end points do say something about what people have been researching and therefore what might/might not be possible.

There are probably future applications for concurrent Hoare logic in software synthesis (letting the computer find instruction sequences that satisfy a specification). At least that seems plausible, so I wouldn't call it fantasy or useless.


> There is; D specifies thread-local by default, and that absolutely requires a marker (and associated semantics) to attribute the data that violates that principle.

That's just a claim. No, you only need lower level constructs like barriers.

You should be able to implement the marker in a library, if not then someone ought to take a hard look at the meta-programming capabilities.


> What does this mean to you? We can write very rich libraries with `shared` as a safety mechanism. Nothing more is needed from the language, except perhaps some future implicit cast rules.

That is also not true. If you are allowed to turn "shared" into proper "local" then you've got a type system that is so weak that there is no need for it to be in the language. It is not sound.

To get a proper and sound type system that requires language support you need something stronger than «D shared». Pony's take on this is to use more types to mark the legal transitions. Too late for D, so you might as well just figure out WHY the meta programming capabilities prevent you from implementing «shared» as a library type.

I have not been able to figure that out, because nobody has showed me why it cannot be done as a library type.


GOLDEN RULE OF THUMB FOR PROGRAMMING LANGUAGE DESIGN:

Before you implement a feature:

1. Implement it as a library solution.

2. Use it and describe why it does not feel right/works right.

3. Figure out what weaknesses the language has that prevents it from being done properly as a library solution.

4. Figure out how that can be improved then go back to 1.

5. Give up, evaluate the situation and start planning a language feature.


>> It is more realistic that D would expand the type system or meta-programming mechanisms in ways that are useful to library authors and that they could create concurrent frameworks.
>
> Right. That's exactly what we're doing. I'm glad you agree :)

I agree with myself, but that is NOT exactly what you are doing... :) See the «golden rule of thumb» above.

Before adding a feature, provide an in-depth analysis for why a library solution is insufficient.

I don't see how any skilled language designer could disagree with that statement.