July 13, 2018
On Wednesday, 11 July 2018 at 12:45:40 UTC, crimaniak wrote:
> This error-handling policy makes D unsuitable for creating web applications and long-running services in general.

You use process isolation so that it is easy to restart one part without disrupting the others. Then a part can crash without bringing the whole system down. This is doable with segfaults and range errors, same as with exceptions.

This is one of the most important systems engineering principles: expect failure from any part, but keep the system as a whole running anyway.
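
A minimal sketch of the idea using std.process (untested; the "./worker" executable name and the always-restart policy are just placeholders):

    // Supervisor loop: run the worker, restart it whenever it dies.
    import std.process : spawnProcess, wait;
    import std.stdio : writeln;

    void main()
    {
        for (;;)
        {
            auto pid = spawnProcess(["./worker"]);
            const status = wait(pid);   // blocks until the worker exits
            if (status == 0)
                break;                  // clean shutdown, stop supervising
            writeln("worker died (status ", status, "), restarting");
        }
    }

A crash from a segfault or range error just shows up here as a nonzero (on Posix, negative) exit status, and the rest of the system keeps going.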
July 13, 2018
On 7/13/18 8:55 AM, Adam D. Ruppe wrote:
> On Wednesday, 11 July 2018 at 12:45:40 UTC, crimaniak wrote:
>> This error-handling policy makes D unsuitable for creating web applications and long-running services in general.
> 
> You use process isolation so that it is easy to restart one part without disrupting the others. Then a part can crash without bringing the whole system down. This is doable with segfaults and range errors, same as with exceptions.
> 
> This is one of the most important systems engineering principles: expect failure from any part, but keep the system as a whole running anyway.

But it doesn't scale if you use OS processes; they're too heavyweight. Of course, it depends on the application. If you only need 100 concurrent connections, processes might be OK.

-Steve
July 13, 2018
On Friday, 13 July 2018 at 13:15:39 UTC, Steven Schveighoffer wrote:
> On 7/13/18 8:55 AM, Adam D. Ruppe wrote:
>> On Wednesday, 11 July 2018 at 12:45:40 UTC, crimaniak wrote:
>>> This error-handling policy makes D unsuitable for creating web applications and long-running services in general.
>> 
>> You use process isolation so that it is easy to restart one part without disrupting the others. Then a part can crash without bringing the whole system down. This is doable with segfaults and range errors, same as with exceptions.
>> 
>> This is one of the most important systems engineering principles: expect failure from any part, but keep the system as a whole running anyway.
>
> But it doesn't scale if you use OS processes; they're too heavyweight. Of course, it depends on the application. If you only need 100 concurrent connections, processes might be OK.
>
> -Steve

Come on, Steve... 100 concurrent connections?

/P
July 13, 2018
On 7/13/18 3:53 PM, Paolo Invernizzi wrote:
> On Friday, 13 July 2018 at 13:15:39 UTC, Steven Schveighoffer wrote:
>> On 7/13/18 8:55 AM, Adam D. Ruppe wrote:
>>> On Wednesday, 11 July 2018 at 12:45:40 UTC, crimaniak wrote:
>>>> This error-handling policy makes D unsuitable for creating web applications and long-running services in general.
>>>
>>> You use process isolation so that it is easy to restart one part without disrupting the others. Then a part can crash without bringing the whole system down. This is doable with segfaults and range errors, same as with exceptions.
>>>
>>> This is one of the most important systems engineering principles: expect failure from any part, but keep the system as a whole running anyway.
>>
>> But it doesn't scale if you use OS processes; they're too heavyweight. Of course, it depends on the application. If you only need 100 concurrent connections, processes might be OK.
> 
> Come on, Steve... 100 concurrent connections?

Huh? What'd I say?

-Steve
July 13, 2018
On Friday, 13 July 2018 at 20:12:36 UTC, Steven Schveighoffer wrote:
> On 7/13/18 3:53 PM, Paolo Invernizzi wrote:
>> On Friday, 13 July 2018 at 13:15:39 UTC, Steven Schveighoffer wrote:
>>> On 7/13/18 8:55 AM, Adam D. Ruppe wrote:
>>>> [...]
>>>
>>> But it doesn't scale if you use OS processes; they're too heavyweight. Of course, it depends on the application. If you only need 100 concurrent connections, processes might be OK.
>> 
>> Come on, Steve... 100 concurrent connections?
>
> Huh? What'd I say?
>
Orders of magnitude too small. You can handle 100 concurrent connections with a sleeping Arduino... :-)

July 14, 2018
On 7/13/18 4:47 PM, Patrick Schluter wrote:
> On Friday, 13 July 2018 at 20:12:36 UTC, Steven Schveighoffer wrote:
>> On 7/13/18 3:53 PM, Paolo Invernizzi wrote:
>>> On Friday, 13 July 2018 at 13:15:39 UTC, Steven Schveighoffer wrote:
>>>> On 7/13/18 8:55 AM, Adam D. Ruppe wrote:
>>>>> [...]
>>>>
>>>> But it doesn't scale if you use OS processes; they're too heavyweight. Of course, it depends on the application. If you only need 100 concurrent connections, processes might be OK.
>>>
>>> Come on, Steve... 100 concurrent connections?
>>
>> Huh? What'd I say?
>>
> Orders of magnitude too small. You can handle 100 concurrent connections with a sleeping Arduino... :-)
> 

Meh, I admit I don't know the specifics. I just know that there is a reason async I/O is used for web services.

Let's say there is some number N of concurrent connections for which processes are OK. Then you can scale to 100×N if you use something better.
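
For example, the usual vibe.d hello world handles each request in a fiber instead of a process (sketch from memory, untested):

    import vibe.vibe;

    void main()
    {
        auto settings = new HTTPServerSettings;
        settings.port = 8080;
        listenHTTP(settings, (req, res) {
            res.writeBody("hello");  // each request runs in its own fiber
        });
        runApplication();
    }

One OS thread can multiplex thousands of such fibers, which is roughly where the extra couple of orders of magnitude come from.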

-Steve
July 15, 2018
On Wednesday, 11 July 2018 at 22:35:06 UTC, crimaniak wrote:
> The people who developed Erlang definitely have a lot of experience developing services.

Yes, it was created for telephone exchanges. You don't want a phone exchange to go completely dead just because there is a bug in the code. That would be a disaster, and very dangerous too (think emergency calls).

>> The crucial point is whether you can depend on the error being isolated, as in Erlang's lightweight processes. I guess D assumes it isn't.
> I think if we have a task with only safe code and communication via message passing, it's isolated well enough that an error kills just that task. In any case, I can still drop the whole application myself if I think that will be the safer way to deal with errors. So the paranoid lose nothing with this approach.

Yup, keep critical code that rarely changes, such as committing transactions, in a completely separate service, and keep the constantly changing code where bugs will be present separate from it.
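
Something along these lines with std.concurrency (illustrative and untested; whether it is legitimate to survive an Error this way is of course exactly what this thread is arguing about):

    // Supervisor-style task isolation via message passing.
    import std.concurrency;
    import std.stdio;

    void work()
    {
        // @safe task logic that talks to the rest of the
        // program only through send/receive...
    }

    void main()
    {
        auto tid = spawnLinked(&work);
        for (;;)
        {
            try
            {
                receive((string msg) { writeln("got: ", msg); });
            }
            catch (LinkTerminated)
            {
                writeln("task died, restarting it");
                tid = spawnLinked(&work);  // only this task is lost
            }
        }
    }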

Anyway, it is completely idiotic to terminate a productivity application because an optional editing function (like a filter in a sound editor) generates a division by zero. End users would be very unhappy.

If people want access to a low-level programming language, then they should also be able to control error handling and make tradeoffs between denial-of-service attack vectors and 100% correctness (think servers for entertainment services like game servers).

What people completely fail to understand is that if an assert trips, then it isn't sufficient to reboot the program. If an assert should always lead to shutdown, then by the same line of argument it should also prevent the program from being run again: the bug is still there. That means every bug leads to complete service shutdown until the bug has been fixed, and that would make for a very shitty entertainment experience and many lost customers.



July 15, 2018
On Friday, 13 July 2018 at 12:55:33 UTC, Adam D. Ruppe wrote:
> You use process isolation so it is easy to restart part of it without disrupting others. Then it can crash without bringing the system down. This is doable with segfaults and range errors, same as with exceptions.
>
> This is one of the most important systems engineering principles: expect failure from any part, but keep the system as a whole running anyway.

If we are talking about something application-specific and in probabilistic terms, then yes, certainly.

But that is not the absolutist position, where any failure should lead to a shutdown (and, consequently, a ban on reboot, since the failed assert might happen hours after the actual buggy code executed).

The absolutist position would also have to assume that all communicated state is corrupted, so a separate process does not improve the situation. Since you don't know with 100% certainty what the bug consists of, you should not retain any state from any source after the _earliest_ time at which the buggy logic could in theory have been involved. All databases should be assumed corrupted, no messages should be accepted, etc. (messages and databases are no different from memory in this regard).

In reality, absolutist positions are usually not possible to uphold, so you have to move to a probabilistic position. And the compiler cannot make probabilistic assumptions; you need a lot of contextual understanding to make those probabilistic assessments (e.g. the architect or programmer has to be involved).

Fully reactive systems do not retain state, of course, and those would change the argument somewhat, but they are very rare... mostly limited to control systems (cars, airplanes, etc.).

The idea behind actor-based programming (e.g. Erlang) isn't that bugs don't occur or that the overall system will exhibit correct behaviour, but that it should be able to correct or adapt to situations despite bugs being present. But that kind of adaptability is largely not available to us with the very "crisp" logic we use in current languages (true/false, all or nothing). Maybe something better will come out of probabilistic programming paradigms and software synthesis some time in the future. Within the current paradigms we are stuck with the judgment of the humans involved.

Interestingly, biological systems are much better at robustness, fault tolerance, and self-healing, but that involves a lot of overhead and also assumes that some failures are acceptable as long as the overall system can recover from them. Actor programming is based on the same assumption: the health of the overall (big) system.
July 18, 2018
On Friday, 13 July 2018 at 13:15:39 UTC, Steven Schveighoffer wrote:

> But it doesn't scale if you use OS processes, it's too heavyweight. Of course, it depends on the application. If you only need 100 concurrent connections, processes might be OK.

I think you may have fallen for Microsoft FUD.

In the early days of Windows, Microsoft was appallingly bad at multiple processes...

Rather than fix their OS, they cranked up their marketing machine and hyped threads as "lightweight processes".

Unixy land has had COW (copy-on-write) page handling for years and years, so process creation is cheap and processes are lightweight.

There are very, very few good reasons for threads, and threads being "lightweight processes" is definitely not one of them.
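
For the record, the classic fork-per-connection accept loop is tiny (Posix-only sketch; error handling and reaping of children via SIGCHLD/waitpid are omitted):

    import core.sys.posix.sys.socket : accept;
    import core.sys.posix.unistd : close, fork, _exit;

    void handle(int clientFd) { /* serve the connection */ }

    void acceptLoop(int listenFd)
    {
        for (;;)
        {
            const fd = accept(listenFd, null, null);
            if (fd < 0) continue;
            if (fork() == 0)      // child: cheap thanks to COW pages
            {
                handle(fd);
                _exit(0);
            }
            close(fd);            // parent: keep accepting
        }
    }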



July 19, 2018
On 7/18/18 1:58 AM, John Carter wrote:
> On Friday, 13 July 2018 at 13:15:39 UTC, Steven Schveighoffer wrote:
> 
>> But it doesn't scale if you use OS processes; they're too heavyweight. Of course, it depends on the application. If you only need 100 concurrent connections, processes might be OK.
> 
> I think you may have fallen for Microsoft FUD.
> 
> In the early days of Windows, Microsoft was appallingly bad at multiple processes...
> 
> Rather than fix their OS, they cranked up their marketing machine and hyped threads as "lightweight processes".

Wikipedia [1] seems to have a lot of references to "light-weight processes" from Unixy sources. That seems more like a good definition of a thread than FUD.

> Unixy land has had COW (copy-on-write) page handling for years and years, so process creation is cheap and processes are lightweight.

That depends on how much memory has to be marked as COW. It's definitely more heavyweight than thread creation, which does none of that.

> There are very, very few good reasons for threads, and threads being "lightweight processes" is definitely not one of them.

Interesting, but I wasn't talking about using threads. vibe.d uses fibers, which scale much better than processes or threads alone.

See the DConf presentations from Vladimir Panteleev [2] and Ali Çehreli [3] to see why I was drawn to this conclusion.

Besides, using processes makes some sense if you are ONLY going to read from the shared state, but as soon as you need to change that state, you have to devise some mechanism to communicate the change back to the main process. With threads/fibers, it's trivial.
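
To illustrate the contrast (names are made up):

    // With threads/fibers, updating shared state is one line:
    import core.atomic : atomicOp;

    shared long hits;

    void onRequest() { atomicOp!"+="(hits, 1); }  // visible to all threads

With separate processes, the same update needs a pipe, socket, or shared-memory segment, plus a protocol on top of it.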

With web services, most of the time you want the shared state elsewhere anyway (to make it persistent), so they are a better fit for processes than most program domains.

-Steve

[1] https://en.wikipedia.org/wiki/Light-weight_process
[2] https://www.youtube.com/watch?v=Zs8O7MVmlfw
[3] https://www.youtube.com/watch?v=7FJYc0Ydewo