July 25, 2013 A proper language comparison...
Once in a while, a thread pops up in the newsgroups pitting D against some other language. More often than not, these comparisons are flawed, non-encompassing, and uninformative. Most recently with the article comparing D with Go and Rust, the community pointed out a few flaws involving a late addition of one of the D compilers, build configurations (-noboundscheck?), and the random number generator used.
Then when I think about how web browsers are compared, there are conventional measures and standard benchmarking tools (e.g. sunspider). They measure performance for javascript, rendering, HTML5, etc. They also measure startup times (hot/cold boot), memory usage, etc. Finally, there are feature comparisons, such as what HTML5 features each browser supports.
These are the type of comparisons I'd like to see with programming languages. For starters, there should be standard "challenges" (algorithms and such) implemented in each language designed to measure various aspects of the language, such as sorting, number crunching, and string processing. However, rather than leave it to a single individual to implement the algorithm in several different languages, it should be left to the community to collaborate and produce an "ideal" implementation of the algorithm in their language. We could analyze factors other than performance, such as the ease of implementation (how many lines? does it use safe/unsafe features? Was it optimized using unsafe / difficult features?).
What can we do about it? I propose we come together as a community, design challenges that are actually relevant and informative, and release the first implementations in D. Then we let the battle commence and invite other communities to contribute their own implementations in other languages.
I think we should give it a try; start off small with just a few moderate challenges (not too simple or complex) and see where it goes from there.

July 25, 2013 Re: A proper language comparison...
Posted in reply to Xinok
On Thursday, 25 July 2013 at 18:23:19 UTC, Xinok wrote:
> Once in a while, a thread pops up in the newsgroups pitting D against some other language. More often than not, these comparisons are flawed, non-encompassing, and uninformative. Most recently with the article comparing D with Go and Rust, the community pointed out a few flaws involving a late addition of one of the D compilers, build configurations (-noboundscheck?), and the random number generator used.
>
> Then when I think about how web browsers are compared, there are conventional measures and standard benchmarking tools (e.g. sunspider). They measure performance for javascript, rendering, HTML5, etc. They also measure startup times (hot/cold boot), memory usage, etc. Finally, there are feature comparisons, such as what HTML5 features each browser supports.
>
> These are the type of comparisons I'd like to see with programming languages. For starters, there should be standard "challenges" (algorithms and such) implemented in each language designed to measure various aspects of the language, such as sorting, number crunching, and string processing. However, rather than leave it to a single individual to implement the algorithm in several different languages, it should be left to the community to collaborate and produce an "ideal" implementation of the algorithm in their language. We could analyze factors other than performance, such as the ease of implementation (how many lines? does it use safe/unsafe features? Was it optimized using unsafe / difficult features?).
>
> What can we do about it? I propose we come together as a community, design challenges that are actually relevant and informative, and release the first implementations in D. Then we let the battle commence and invite other communities to contribute their own implementations in other languages.
>
> I think we should give it a try; start off small with just a few moderate challenges (not too simple or complex) and see where it goes from there.
Sounds somewhat like Rosetta Code.
http://rosettacode.org/wiki/Rosetta_Code
Bearophile spends a lot of time adding D entries there.

July 25, 2013 Re: A proper language comparison...
Posted in reply to Xinok
On Thursday, 25 July 2013 at 18:23:19 UTC, Xinok wrote:
> Once in a while, a thread pops up in the newsgroups pitting D against some other language. More often than not, these comparisons are flawed, non-encompassing, and uninformative. Most recently with the article comparing D with Go and Rust, the community pointed out a few flaws involving a late addition of one of the D compilers, build configurations (-noboundscheck?), and the random number generator used.
>
> Then when I think about how web browsers are compared, there are conventional measures and standard benchmarking tools (e.g. sunspider). They measure performance for javascript, rendering, HTML5, etc. They also measure startup times (hot/cold boot), memory usage, etc. Finally, there are feature comparisons, such as what HTML5 features each browser supports.
>
> These are the type of comparisons I'd like to see with programming languages. For starters, there should be standard "challenges" (algorithms and such) implemented in each language designed to measure various aspects of the language, such as sorting, number crunching, and string processing. However, rather than leave it to a single individual to implement the algorithm in several different languages, it should be left to the community to collaborate and produce an "ideal" implementation of the algorithm in their language. We could analyze factors other than performance, such as the ease of implementation (how many lines? does it use safe/unsafe features? Was it optimized using unsafe / difficult features?).
Sounds very much like this:
http://benchmarksgame.alioth.debian.org/
You can compare code size, memory need, execution time for various programs and lots of languages. Safety is not considered though, but how would you measure that?
It is called a "game", because you can adapt the weights until your favorite language is the winner. ;)
D entries were provided, but removed at some point, because they looked like the C code.
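The remark about adapting the weights can be made concrete with a small sketch in D. Everything here is an assumption made for illustration: the `Entry` struct, the `cost` and `winner` helpers, the names "LangA" and "LangB", and above all the numbers, which are invented and are not results from the benchmarks game.

```d
/// Hypothetical measurements, invented purely for illustration.
struct Entry
{
    string name;
    double time;   // normalised run time, lower is better
    double size;   // normalised code size, lower is better
}

/// Weighted cost of one entry; lower is better.
double cost(Entry e, double timeWeight, double sizeWeight)
{
    return timeWeight * e.time + sizeWeight * e.size;
}

/// Name of the entry with the lowest weighted cost.
string winner(Entry[] entries, double timeWeight, double sizeWeight)
{
    assert(entries.length > 0);
    Entry best = entries[0];
    foreach (e; entries[1 .. $])
        if (cost(e, timeWeight, sizeWeight) < cost(best, timeWeight, sizeWeight))
            best = e;
    return best.name;
}

void main()
{
    auto entries = [Entry("LangA", 1.0, 3.0), Entry("LangB", 2.0, 1.0)];
    assert(winner(entries, 1.0, 0.0) == "LangA");   // only speed counts
    assert(winner(entries, 0.5, 0.5) == "LangB");   // size counts as much as speed
}
```

With the same made-up measurements, the ranking flips as soon as code size is given any real weight, which is the point of calling it a "game".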

July 25, 2013 Re: A proper language comparison...
Posted in reply to Xinok
On Thursday, 25 July 2013 at 18:23:19 UTC, Xinok wrote:
> These are the type of comparisons I'd like to see with programming languages. For starters, there should be standard "challenges" (algorithms and such) implemented in each language designed to measure various aspects of the language, such as sorting, number crunching, and string processing. However, rather than leave it to a single individual to implement the algorithm in several different languages, it should be left to the community to collaborate and produce an "ideal" implementation of the algorithm in their language. We could analyze factors other than performance, such as the ease of implementation (how many lines? does it use safe/unsafe features? Was it optimized using unsafe / difficult features?).
The problem is all those last bits:
- Line counts aren't a good measure of anything.
- What's safe and unsafe is very subjective.
- What's difficult is even more subjective.
There are also other variables:
- How idiomatic is the code?
- How well does it scale?
- How much headroom is there for more optimisation?
- How predictable is the performance?
- How well do you need to know the language's implementation to do optimisations?
For example, optimised Haskell will often add strict evaluation hints and strict type hints, but these are very non-idiomatic and work against the lazy nature of the language.
D on the other hand does quite well with all these variables: idiomatic D code is generally quite fast, and its abstractions scale quite well. Performance is predictable and the language has plenty of features allowing you to do unsafe things for more performance if you so desire.
The problem with measuring all this stuff is that it's very subjective, so I don't think there can be any standardised way of assessing language performance.

July 25, 2013 Re: A proper language comparison...
Posted in reply to Peter Alexander
Peter Alexander:
> - What's safe and unsafe is very subjective.
There are large bodies of people that count bugs in code, and correlate them with coding practices. They have created language subsets like C for the automotive industry, C++ for aviation, code for space missions, the Ada language and its successive refinements like Ada2012, and the SPARK subset of Ada. There are a lot of people trying sideways solutions, at Microsoft (Spec#, Liquid typing, etc), dependent typing (ATS language), and so on and on, even Haskell variants. A lot of this stuff is not based on statistical data, but there is also some hard data that has shaped some of those very strict coding guidelines. There are several serious studies in the field of coding safety. Dismissing all that decades-old work with a 'very subjective' is unjust.
As usual D code safety is mostly correlated to the coding style you are using, how you write your unittests and code contracts, how good your code reviews are, how careful your programmers are, etc. But the language design is also a factor. To me D safety looks about intermediate between C and Ada-SPARK. D code normally has undetected integral overflows, it doesn't help a lot against null pointers (Nullable is not so good yet), there is no significant stack overflow protection, no variable-sized stack-allocated arrays that help a bit to create bounded collections, the management of reference escaping is planned but not yet implemented (scope), and so on. Overall to me D coding seems significantly safer than C coding, and perhaps it's a little safer than C++11 coding too. I know no studies about the safety of D code compared to C++11 code or Ada2012 code, or compared to other languages.
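To illustrate the point about undetected integral overflows, here is a minimal sketch. It assumes the `core.checkedint` module from current druntime, which is not mentioned in the post; `checkedAdd` is a hypothetical helper name introduced only for this example.

```d
import core.checkedint : adds;

/// Adds two ints and reports overflow instead of wrapping silently.
/// `checkedAdd` is a hypothetical helper name, not a library function.
int checkedAdd(int x, int y, out bool overflowed)
{
    // `out` resets the flag to false on entry; `adds` sets it on overflow.
    return adds(x, y, overflowed);
}

void main()
{
    int big = int.max;
    int wrapped = big + 1;           // plain addition wraps around silently
    assert(wrapped == int.min);

    bool overflowed;
    checkedAdd(int.max, 1, overflowed);
    assert(overflowed);              // the checked version reports it

    checkedAdd(1, 2, overflowed);
    assert(!overflowed);
}
```

The plain `+` gives no hint that anything went wrong; detection has to be requested explicitly, call by call.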
Bye,
bearophile

July 25, 2013 Re: A proper language comparison...
Posted in reply to bearophile
On Thursday, 25 July 2013 at 20:28:54 UTC, bearophile wrote:
> Peter Alexander:
>
>> - What's safe and unsafe is very subjective.
>
> There are large bodies of people that count bugs in code, and correlate them with coding practices. They have created language subsets like C for the automotive industry, C++ for aviation, code for space missions, the Ada language and its successive refinements like Ada2012, and the SPARK subset of Ada. There are a lot of people trying sideways solutions, at Microsoft (Spec#, Liquid typing, etc), dependent typing (ATS language), and so on and on, even Haskell variants. A lot of this stuff is not based on statistical data, but there is also some hard data that has shaped some of those very strict coding guidelines. There are several serious studies in the field of coding safety. Dismissing all that decades-old work with a 'very subjective' is unjust.
Allow me to put it another way by way of analogy: health. We know from medical studies what kinds of things are healthy, and what things are unhealthy. However, if I were to present 10 people, and witness their actions for a week, would anyone be able to accurately order them on their "healthiness"? Would every medical expert arrive at the same ordering?
Maybe subjective is the wrong word to use. Maybe what I meant was "difficult to quantify".

July 25, 2013 Re: A proper language comparison...
Posted in reply to bearophile
On 7/25/2013 1:28 PM, bearophile wrote:
> there is no significant stack overflow protection,
It's done by the hardware (putting a "no-access" page at the end of the stack). There's nothing unsafe about it.
> no variable-sized stack-allocated arrays that help a bit
> created bounded collections,
I don't see how that is a safety issue.

July 26, 2013 Re: A proper language comparison...
Posted in reply to qznc
"qznc" <qznc@web.de> wrote in news:buxislkfauizlnrvoyzv@forum.dlang.org:
> until your favorite language is the winner.
This is a wrong path. The winner should be found by weighting the results in such a way that all tested languages are at least close to equal. If such equality can be achieved by some canonical weighting, then they are all equal in respect to the test battery, because all fell into the same equivalence class. Only if there is more than one equivalence class can at least one extreme participant be declared, which might be the winner, or the loser as well.
-manfred

July 26, 2013 Re: A proper language comparison...
Posted in reply to Walter Bright
Walter Bright:
> It's done by the hardware (putting a "no-access" page at the end of the stack). There's nothing unsafe about it.
>
>
>> no variable-sized stack-allocated arrays that help a bit
>> created bounded collections,
>
> I don't see how that is a safety issue.
In my opinion where you allocate your data influences the "safety" of your program, but it's not easy for me to tell exactly in what direction such influence is.
If you allocate too much data on the stack this could cause stack overflow. As you say a stack overflow is memory safe, but if your program is doing something important, a sudden crash could be regarded as dangerous for the user. You don't want a stack overflow in the code that controls your car brakes (this is not a totally invented example).
Having variable-sized stack-allocated arrays encourages you to put more data on the stack, increasing the risk of stack overflows.
On the other hand, if you only have fixed-sized stack-allocated arrays, you could allocate a fixed size array on the stack and then use only part of it. This waste of stack space increases the probability of stack overflow. A variable-sized stack-allocated array allows you to waste no stack space, and avoid those stack overallocations.
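The fixed-size pattern described above can be sketched in D as follows. The names `sumOfSquares` and `capacity`, and the capacity of 64, are assumptions chosen for illustration only.

```d
/// Sums the squares 0^2 .. (n-1)^2 using a fixed-size stack buffer.
/// `capacity` and `sumOfSquares` are hypothetical names for illustration.
int sumOfSquares(size_t n)
{
    enum capacity = 64;
    assert(n <= capacity, "n exceeds the fixed stack buffer");

    int[capacity] buffer;            // always reserves 64 ints on the stack
    int[] used = buffer[0 .. n];     // only this part is actually needed

    foreach (i, ref x; used)
        x = cast(int)(i * i);

    int total = 0;
    foreach (x; used)
        total += x;
    return total;
}

void main()
{
    assert(sumOfSquares(4) == 14);   // 0 + 1 + 4 + 9
    assert(sumOfSquares(0) == 0);
}
```

For `n == 4` the function still reserves space for 64 ints, which is the stack overallocation being discussed; a variable-sized stack array would reserve only what is used.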
If you are using a segmented stack as Rust does, stack overflows become less probable (it becomes more like a malloc failure), because the stack is able to become very large when needed. I think Rust needs that elastic stack because in such a language it's easy to allocate all kinds of stuff on the stack (unlike in D).
- - - - - - - - - -
Ada is safer compared to D for other things. One of them is the little mess of integer division that D has inherited from C.
This is how the Ada compiler forces you to write a certain division:
procedure Strong_Typing is
   Alpha  : Integer := 1;
   Beta   : Integer := 10;
   Result : Float;
begin
   Result := Float (Alpha) / Float (Beta);
end Strong_Typing;
In D you could write:
int a = 1;
int b = 10;
double r = a / b;
and in r you will not see the 0.1 value. This is a C design mistake that I have seen bite even programmers with more than 2 years of programming experience with C-like languages. Perhaps having "/" and "div" for floating point and integer divisions in D could avoid those bugs.
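A runnable version of the snippet above, together with one way to get the intended result in today's D. The helper name `ratio` is hypothetical; the explicit cast is ordinary D, not the "div" operator proposed here.

```d
/// `ratio` is a hypothetical helper showing the explicit conversion.
double ratio(int a, int b)
{
    // `a` is converted first, so this is a floating-point division.
    return cast(double) a / b;
}

void main()
{
    int a = 1;
    int b = 10;

    double r = a / b;            // integer division happens before the conversion
    assert(r == 0.0);

    double fixed = ratio(a, b);
    assert(fixed > 0.09 && fixed < 0.11);
}
```

The compiler accepts both forms without complaint, which is why the mistake is easy to miss.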
Another mistake is D inheriting the C99 semantics of % that is suboptimal and bug-prone.
(Both mistakes are fixed in Python3, by the way, although I don't fully like the Python3 division).
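As a sketch of the % issue: in D, as in C99, the remainder takes the sign of the dividend, while Python's % takes the sign of the divisor. The helper `floorMod` below is a hypothetical name written for this example, not a standard library function.

```d
/// Floored modulo: the result takes the sign of the divisor, as in Python.
/// `floorMod` is a hypothetical helper written for this example.
int floorMod(int a, int n)
{
    int r = a % n;               // D/C99: result takes the sign of the dividend
    if (r != 0 && ((r < 0) != (n < 0)))
        r += n;
    return r;
}

void main()
{
    assert(-7 % 3 == -1);        // truncated remainder, as in C99
    assert(floorMod(-7, 3) == 2);
    assert(floorMod(7, -3) == -2);
    assert(floorMod(7, 3) == 1);
}
```

A typical place where this bites is wrapping a possibly negative index into a range with `i % n`, which can yield a negative result.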
- - - - - - - - - -
In Ada there is 'others' to define the value of the array items that you are not specifying, this removes the bugs discussed in Issue 3849:
declare
   type Arr_Type is array (Integer range <>) of Integer;
   A1 : Arr_Type := (1, 2, 3, 4, 5, 6, 7, 8, 9);
begin
   A1 := (1, 2, 3, others => 10);
end;
'others' is usable even for struct literals, when you don't want to specify all fields:
type R is record
   A, B : Integer := 0;
   C    : Float   := 0.0;
end record;
V3 : R := (C => 1.0, others => <>);
For D I suggested array syntax like:
int[$] a1 = [1, 2, 3];
int[10] a2 = [1, 2, 3, ...];
void main() {}
where the "..." tells the compiler the programmer wants it to fill those missing values with their default init.
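The syntax above is a proposal, not existing D. As a sketch under that caveat, one way to spell out the same intent explicitly in D is to let the array take its default (or a chosen fill value) and then assign only the leading items through a slice; the variable names are illustrative.

```d
void main()
{
    int[10] a2;                  // every element starts as int.init, which is 0
    a2[0 .. 3] = [1, 2, 3];      // set only the leading items

    assert(a2[0 .. 3] == [1, 2, 3]);
    foreach (x; a2[3 .. $])
        assert(x == 0);          // the rest keep their default value

    int[10] tens = 10;           // a single value fills the whole array
    tens[0 .. 3] = [1, 2, 3];    // similar in effect to Ada's (1, 2, 3, others => 10)
    assert(tens[3] == 10 && tens[9] == 10);
}
```

This is more verbose than the proposed "..." form, but it states the intent to leave the remaining items at a known value.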
Ada Concurrency is quite refined, and it's kind of safe.
Bye,
bearophile

July 26, 2013 Re: A proper language comparison...
Posted in reply to bearophile
On 7/25/2013 7:19 PM, bearophile wrote:
> If you allocate too much data on the stack this could cause stack overflow. As
> you say a stack overflow is memory safe, but if your program is doing something
> important, a sudden crash could be regarded as dangerous for the user. You don't
> want a stack overflow in the code that controls your car brakes (this is not a
> totally invented example).
If you are writing a program that, if it fails will cause your car to crash, then you are a bad engineer and you need to report to the woodshed.
As I've written before, imagining you can write a program that cannot fail, coupled with coming up with a requirement that a program cannot fail, is BAD ENGINEERING.
ALL COMPONENTS FAIL.
The way you make a system safe is design it so that it can withstand failure BECAUSE THE FAILURE IS GOING TO HAPPEN. I cannot emphasize this enough.
> Having variable-sized stack-allocated arrays encourages you to put more data on
> the stack, increasing the risk of stack overflows.
>
> On the other hand, if you only have fixed-sized stack-allocated arrays, you
> could allocate a fixed size array on the stack and then use only part of it.
> This waste of stack space increases the probability of stack overflow. A
> variable-sized stack-allocated array allows you to waste no stack space, and
> avoid those stack overallocations.
On the other hand, fixed size stack allocations are more predictable and hence a stack overflow is more likely to be detected during testing.
> If you are using a segmented stack as Rust, stack overflows become less probable
> (it becomes more like a malloc failure), because the stack is able to become
> very large when needed. I think Rust needs that elastic stack because in such
> language it's easy to allocate all kind of stuff on the stack (unlike in D).
Segmented stacks are a great idea for 20 years ago. 64 bit code has rendered the idea irrelevant - you can allocate 4 billion byte stacks for each of 4 billion threads. You've got other problems that'll happen centuries before that limit is reached. (Segmented stacks are also a performance problem, and don't interact well with compiled C code.)
Copyright © 1999-2021 by the D Language Foundation