March 27, 2015
On Friday, 27 March 2015 at 10:10:37 UTC, Dominikus Dittes Scherkl wrote:
> On Thursday, 26 March 2015 at 19:32:53 UTC, Idan Arye wrote:
>
>>
>> But when it comes to heavily templated functions - understanding the signature is HARD. It's hard enough for the top programmers that can handle the complex D features - it's much harder for the average programmers that could have easily used these functions if they could just understand the documentation.
> I think the documentation should simply contain the unittests - they show quite well how to call the template, from minimal cases to the complex ones.
> Ok, they tend to show a lot of edge-cases, but even the very simplest usages of a function should be unit-tested, so at least those have to be part of the documentation.

Unittests are helpful, but I think examples aimed at demonstrating
usage are of more value if possible. Unittests and examples have
subtly different purposes.  When writing a Unittest the programmer
will likely take whatever shortcuts they can to get data into the
tested function in the correct format, with the least amount of
code/effort legally possible. Whereas examples (hopefully) will try
to present inputs in 'real life' scenarios.

If the function inputs are trivial (i.e. it takes an integer or basic
string) then this isn't an issue. But for functions taking complex
inputs it can be a bit baffling for someone new to the language.

I must admit when I was new to phobos I struggled with the use of
unittests as examples, especially for functions taking

Anyway, unittests are better than nothing - if that is the other
option.

March 27, 2015
On Friday, 27 March 2015 at 14:02:55 UTC, CraigDillabaugh wrote:
clip
>
> If the function inputs are trivial (i.e. it takes an integer or basic
> string) then this isn't an issue. But for functions taking complex
> inputs it can be a bit baffling for someone new to the language.
>
> I must admit when I was new to phobos I struggled with the use of
> unittests as examples, especially for functions taking
for functions taking ranges or other non-trivial inputs.

>
> Anyway, unittests are better than nothing - if that is the other
> option.
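
For anyone in the same position, here is a minimal sketch of what "taking a range" means when calling a function such as startsWith. It is not taken from the Phobos docs and the values are purely illustrative: an array, a string, and a lazy pipeline can all be passed as the range argument.

```d
import std.algorithm : filter, startsWith;
import std.range : iota;

void main()
{
    // An array is a range of its elements.
    assert([1, 2, 3].startsWith([1, 2]));

    // A string is a range of characters.
    assert("phobos".startsWith("pho"));

    // A lazy pipeline is a range too: the odd numbers below 10.
    auto odds = iota(1, 10).filter!(n => n % 2 == 1);
    assert(odds.startsWith([1, 3]));
}
```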

March 27, 2015
> [skip]
> I'm not a native English speaker, but a range can start with a needle? Where is the haystack? :)

Yes, the output is awful, but that does not imply that concepts of some kind are needed to make the documentation easier to understand. And Python's startswith makes no use of a protocol. See http://www.rafekettler.com/magicmethods.html for an explanation.

> Now that we're talking about creating your own sequences in Python, it's time to talk about protocols. Protocols are somewhat similar to interfaces in other languages in that they give you a set of methods you must define. However, in Python protocols are totally informal and require no explicit declarations to implement. Rather, they're more like guidelines.

It's similar to what we have, just with even less help from the language. (We can at least statically check that an argument is, e.g., a range.)
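
A rough sketch of that static check, assuming nothing beyond `isInputRange` from std.range. The helper `countItems` is hypothetical and exists only to show a constraint rejecting a non-range argument at compile time.

```d
import std.range : isInputRange;

// Hypothetical helper, used only to illustrate a template constraint.
size_t countItems(R)(R items)
if (isInputRange!R)
{
    size_t n = 0;
    foreach (item; items)
        ++n;
    return n;
}

void main()
{
    static assert(isInputRange!(int[]));  // arrays qualify
    static assert(isInputRange!string);   // so do strings
    static assert(!isInputRange!int);     // a plain int does not

    assert(countItems([10, 20, 30]) == 3);

    // Passing a non-range is rejected by the compiler, not at run time.
    static assert(!__traits(compiles, countItems(42)));
}
```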

> Anyway, in the Python built-in lib I didn't find any levenshtheinDistance
Please calculate the Levenshtein distance of "aaabc" and "ababc" in both Python and D.

Then come back and tell me it was too hard in D compared to Python.
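
For reference, a sketch of the D side of that exercise, using levenshteinDistance from std.algorithm. The two strings differ by a single substitution, so the distance is 1.

```d
import std.algorithm : levenshteinDistance;
import std.stdio : writeln;

void main()
{
    // Replacing the second 'a' with 'b' turns "aaabc" into "ababc".
    auto distance = levenshteinDistance("aaabc", "ababc");
    writeln(distance);
    assert(distance == 1);
}
```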

> , boyermooreFinder or schschchcscshshscscshshscscssscsshcswarzSort.

Both are well-known and useful algorithms. You might not know them, but that is no argument against including them in phobos. Their presence makes neither the documentation harder to understand nor a startsWith function harder to find.

I don't buy this "there is a function I don't know, so it's hard to understand" thing. That's like saying: I only need XML, don't put JSON in the stdlib because I will get confused! How should I anticipate that I don't need std.json when I just want to parse XML? I cannot even google that myself.




March 27, 2015
> Looked also in the source code to find out that startsWith is locale sensitive, something ignored in phobos.

Why would I need a locale for startsWith? Please file a bug if that's actually needed for Unicode strings.
March 27, 2015
On 2015-03-26 19:32:51 +0000, Idan Arye said:

> There is a discussion about D vs Go going on in several threads(yey for multithreading!), and one thread is about an article by Gary Willoughby that claims that Go is not suitable for sophisticated programmers(http://forum.dlang.org/thread/mev7ll$mqr$1@digitalmars.com). What's interesting about this one is the reddit comments, which turned into an argument between simple languages that average programmers can use and complex languages that only the top 1% of intelligent programmers can use, but they can extract more out of them.
> 
> But the thing is - the world of the top programmers is not really separate from that of average programmers. Professional development teams can have a few top programmers and many average ones, all working on the same project. Open source projects can have top programmers working on the core while allowing average programmers to contribute some simple features. Top programmers can write libraries that can be used by average programmers.
> 
> To allow these things, top programmers and average programmers should be able to work on the same language. Of course, any language that average programmers can master should be easy for a top programmer to master - but the thing is, we also want the top programmer to be able to bring more out of the language, without limiting them by its over-simplicity. This will also benefit the average programmers, since they also improve the quality of the libraries and modules they are using.
> 
> This idea is nothing new, and was mentioned in the main(=biggest) current D vs Go thread(http://forum.dlang.org/thread/mdtago$em9$1@digitalmars.com?page=3#post-jeuhtlocousxtezoaqqh:40forum.dlang.org). What I want to talk about here is the seams. The hurdles that in practice make this duality harder.
> 
> Let's compare it to another duality that D(and many other languages, mainly modern systems languages) promotes - the duality between high-level and low-level. Between write-code-fast and write-fast-code.
> 
> The transition between high-level and low-level code in D consists of a change in the things one uses - which language constructs, which idioms, which functions. But there aren't any visible seams. You don't need to use FFI or to dynamically load a library file written in another language or anything like that - you simply write the high-level parts like you would write high-level code and the low-level parts like you would write low-level code, and they just work together.
> 
> The duality between high-level D and low-level D is seamless. The duality between simple D and complex D - not so much.
> 
> The seams here exist mainly in understanding how to use complex code from simple code. Let's take std.algorithm(.*) for example. The algorithms' implementations there are complex and use advanced D features, but using them is very simple. Provided, of course, that you know how to use them(and no - not everything that you know becomes simple. I know how to solve regular differential equations, but it's still very complex to do so).
> 
> The problem, as Andrei Alexandrescu pointed out(http://forum.dlang.org/thread/mdtago$em9$1@digitalmars.com?page=6#post-mduv1i:242169:241:40digitalmars.com), is learning how to use them. Ideally you'd want to be able to look at a function's signature and learn from that how to use it. Its name and return type should tell you what it does and its argument names and types should tell you what to send to it. The documentation is only there for a more thorough description and to warn you about pitfalls and edge cases.
> 
> But when it comes to heavily templated functions - understanding the signature is HARD. It's hard enough for the top programmers that can handle the complex D features - it's much harder for the average programmers that could have easily used these functions if they could just understand the documentation.
> 
> Compare it, for example, to Java. Even if a library doesn't contain a single documentation comment, the auto-generated javadoc that contains just the class tree and method signatures is usually enough to get an idea of what's going where. In D, unless the author has provided some actual examples, you are going to have a hard time trying to sort out these complex templated signatures...
> 
> That's quite a hurdle to go through when wanting to use complex code from simple code(or even from other complex code). That's the ugly seam I'm talking about.
> 
> Now, if you are working on a big project(be it commercial or open-source), you can find lots of examples of how to use these complex functions, and that's probably how you'd tackle the problem. When you are using some library you usually don't have that luxury - but these libraries usually have the generated ddoc at their website. Of course - that generated ddoc is full of complex templated signatures, so that's not very helpful...
> 
> So, what can be done? Maybe the ddoc generator, instead of writing the whole signature as-is, can emit a more human-readable version of it?
> 
> Let's look at the example Andrei mentioned - startsWith. Let's take a look at the first overloaded signature:
> 
> uint startsWith(alias pred = "a == b", Range, Needles...)(Range doesThisStart, Needles withOneOfThese) if (isInputRange!Range && Needles.length > 1 && is(typeof(.startsWith!pred(doesThisStart, withOneOfThese[0])) : bool) && is(typeof(.startsWith!pred(doesThisStart, withOneOfThese[1..$])) : uint));
> 
> Let's break it down and see what the user needs in order to use the function:
> 
> `uint` - the return type. Needed.
> `startsWith` - the function name. Needed.
> `(alias pred = "a == b",` - a template argument that the user might want to supply - Needed.
> `Range, Needles...)` - template arguments that should usually be inferable. The function won't work without them, but since the user doesn't actually supply them - I'll mark them as not needed.
> `(Range doesThisStart, Needles withOneOfThese)` - the function's arguments. Needed.
> 
> The rest are constraints that check the template arguments. They aren't needed when you try to use the function - though they might be helpful at figuring out why the compiler yells at you when you use it wrong.
> 
> So, if we take only the needed parts, we get this signature:
> 
> uint startsWith(alias pred = "a == b")(Range doesThisStart, Needles withOneOfThese);
> 
> Well, doesn't this look much easier to grasp? Of course, it omits some very critical information. It doesn't tell you what `Range` and `Needles` are - you can look for these types in the docs and find nothing. It also doesn't tell you that `Needles` is variadic.
> 
> Well - what's stopping us from adding this information *below* the signature? What if ddoc would generate something like this:
> 
> uint startsWith(alias pred = "a == b")(Range doesThisStart, Needles... withOneOfThese);
>    where:
>      Range is an inferred template argument
>      Needles is a variadic inferred template argument
>      isInputRange!Range
>      Needles.length > 1
>      is(typeof(.startsWith!pred(doesThisStart, withOneOfThese[0])) : bool)
>      is(typeof(.startsWith!pred(doesThisStart, withOneOfThese[1..$])) : uint)
> 
> We've broken the signature into the parts required to use the function and the parts required to FULLY understand the previous parts. The motivation is that the second group of parts is also important, so it needs to be there, but it creates a lot of unneeded noise so it shouldn't be a direct part of the signature(at least not in the doc). It's similar to the docs of other types used in the signature - it's important to have these docs somewhere accessible, but you don't want to add them in the middle of the signature because it'll make it unreadable.
> 
> 
> 
> This idea, of course, is not a fully cooked proposal yet. We need a way to tell ddoc which template arguments are supposed to be inferred(can this always be done automatically?) and the last two entries in my example are not super-trivial to grok(I can rewrite them by-hand to make them super-simple - but can ddoc do it automatically? and how?). The point of this thread is to start a discussion about making ddoc generate documentation that is more... well... human readable.

This is very good. I consider myself well-versed in D and able to write these sorts of functions. However, reading the signatures is next to impossible.

-Shammah
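
For contrast with the signature quoted above, a short sketch of what calling startsWith actually looks like. The strings and the case-insensitive predicate are illustrative choices, not something taken from the thread; `toLower` comes from std.uni.

```d
import std.algorithm : startsWith;
import std.uni : toLower;

void main()
{
    // One needle: the result is a plain bool.
    assert("abc".startsWith("a"));
    assert(!"abc".startsWith("b"));

    // Several needles: the result is the 1-based position of the
    // needle that matched, or 0 if none did.
    assert("abc".startsWith("x", "a", "b") == 2);
    assert("abc".startsWith("x", "y") == 0);

    // The `pred` template argument replaces the default "a == b".
    assert("Hello".startsWith!((a, b) => toLower(a) == toLower(b))("he"));
}
```

None of the constraints from the full signature have to be spelled out by the caller; `Range` and `Needles` are inferred from the arguments.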

March 27, 2015
On Friday, 27 March 2015 at 14:39:48 UTC, Tobias Pankrath wrote:
>> Looked also in the source code to find out that startsWith is locale sensitive, something ignored in phobos.
>
> Why would I need a locale for startsWith? Please file a bug if that's actually needed for Unicode strings.

Because in German, "Köln" can start with "Kö", but also with "Koe" and "Ko\u0308"?


Regarding the scscscshghshshshhswarzThing, here we discuss the readability and accessibility of the documentation, not the power of the library. Every other language will use a variation of  "sortBy" instead of the scscshwcwscThing. I'm happy that D has in the default lib functions like levenshteinDistance, but this will not attract the average or "just starting to learn" developer. On the contrary, sorting correctly some names is a more common task than calculating the Levenshtein distance, but there is no function for it in phobos.
March 27, 2015
> Regarding the scscscshghshshshhswarzThing, here we discuss the readability and accessibility of the documentation, not the power of the library. Every other language will use a variation of  "sortBy" instead of the scscshwcwscThing. I'm happy that D has in the default lib functions like levenshteinDistance, but this will not attract the average or "just starting to learn" developer. On the contrary, sorting correctly some names is a more common task than calculating the Levenshtein distance, but there is no function for it in phobos.

You may have a point that schwartzSort has a bad name (I disagree), but adding another algorithm does not make the documentation worse per se. I don't know what problem you have with levenshteinDistance.

> On the contrary, sorting correctly some names is a more common task than calculating the Levenshtein distance, but there is no function for it in phobos.

What do you mean by correct? http://unicode.org/reports/tr10/? "We even have something obscure like levenshteinDistance but no implementation for the Unicode collation algorithm, which all newcomers are looking for!" is a) a questionable comparison between a relatively simple algorithm and a monster and b) wrong, because 99% of programmers don't even know about the algorithm itself, thus they aren't looking for it.

BTW, Python's startswith gets the Köln example wrong: Kö and Ko\u0308 don't match.
March 27, 2015
On Thursday, 26 March 2015 at 19:45:19 UTC, Alex Parrill wrote:
> On Thursday, 26 March 2015 at 19:32:53 UTC, Idan Arye wrote:
>> ...snip...
>
> So tl;dr; make the template constraints in ddoc less prominent?
>
> The "new library reference preview" under Resources seems to already have this (example: http://dlang.org/library/std/algorithm/searching/starts_with.html)

https://w0rp.com/project/dstruct/dstruct/weak_reference/

I've got a "Toggle Constraints" button on my site for showing and hiding them dynamically. It kind of works.
March 27, 2015
On Friday, 27 March 2015 at 16:55:34 UTC, Tobias Pankrath wrote:
>> Regarding the scscscshghshshshhswarzThing, here we discuss the readability and accessibility of the documentation, not the power of the library. Every other language will use a variation of  "sortBy" instead of the scscshwcwscThing. I'm happy that D has in the default lib functions like levenshteinDistance, but this will not attract the average or "just starting to learn" developer. On the contrary, sorting correctly some names is a more common task than calculating the Levenshtein distance, but there is no function for it in phobos.
>
> You may have a point that schwartzSort has a bad name (I disagree), but adding another algorithm does not make the documentation worse per se. I don't know what problem you have with levenshteinDistance.

schwartzSort is a nice name only if you are German. You got me wrong: I have nothing against the function itself, nor against its presence in the stdlib, but I am playing devil's advocate and trying to walk in the shoes of a new D user, who probably expects something more readable than 4 stuffed consonants. Personally, I know what a Schwartz transform is and I speak some German :)
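
For readers who have not met the name, a minimal sketch of what schwartzSort does: it sorts by a computed key, evaluating the key once per element and caching it, which is roughly what a "sortBy" does elsewhere. The names and the length key are illustrative only.

```d
import std.algorithm : schwartzSort;

void main()
{
    auto names = ["Christina", "Al", "Bob"];

    // Sort by a computed key, here the length of each name.
    names.schwartzSort!(n => n.length)();

    assert(names == ["Al", "Bob", "Christina"]);
}
```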

>
>> On the contrary, sorting correctly some names is a more common task than calculating the Levenshtein distance, but there is no function for it in phobos.
>
> What do you mean by correct? http://unicode.org/reports/tr10/?

From the link you posted:

"Collation is not aligned with character sets or repertoires of characters.
Swedish and German share most of the same characters, for example, but have *very different* sorting orders."


> "We even have something obscure like levenshteinDistance but no implementation for the Unicode collation algorithm, which all newcomers are looking for!" is a) a questionable comparison between a relatively simple algorithm and a monster and b) wrong, because 99% of programmers don't even know about the algorithm itself, thus they aren't looking for it.

I explained this in another topic regarding the portability obsession: it's probably a monster on Linux, but on Windows/Mac it's exactly one line of code, which I have already written:

https://github.com/rumbu13/sharp/blob/master/src/internals/locale.d#L518

>
> BTW, Python's startswith gets the Köln example wrong: Kö and Ko\u0308 don't match.

Today was my first contact with Python, so I cannot comment on this; maybe Python is not aware of normalization forms and is not combining o with the diaeresis (U+0308).
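
On the D side, a sketch of how the combining-character case could be handled with `normalize` from std.uni, whose default form is NFC. The helper `startsWithNormalized` is hypothetical, not a Phobos function. The "Koe" case is a language-specific equivalence rather than a normalization form, so this approach does not cover it.

```d
import std.algorithm : startsWith;
import std.uni : normalize;

// Hypothetical helper: compare after normalizing both sides to NFC.
bool startsWithNormalized(string text, string prefix)
{
    return normalize(text).startsWith(normalize(prefix));
}

void main()
{
    // Precomposed ö (U+00F6) versus o + combining diaeresis (U+0308):
    // a plain comparison sees different code points.
    assert(!"K\u00F6ln".startsWith("Ko\u0308"));

    // After normalization both spellings compare equal.
    assert(startsWithNormalized("K\u00F6ln", "Ko\u0308"));
    assert(startsWithNormalized("Ko\u0308ln", "K\u00F6"));

    // Normalization knows nothing about the German "oe" spelling.
    assert(!startsWithNormalized("K\u00F6ln", "Koe"));
}
```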
March 27, 2015
On 27/03/2015 15:02, CraigDillabaugh wrote:
> On Friday, 27 March 2015 at 10:10:37 UTC, Dominikus Dittes Scherkl wrote:
>> On Thursday, 26 March 2015 at 19:32:53 UTC, Idan Arye wrote:
>>
>>>
>>> But when it comes to heavily templated functions - understanding the
>>> signature is HARD. It's hard enough for the top programmers that can
>>> handle the complex D features - it's much harder for the average
>>> programmers that could have easily used these functions if they could
>>> just understand the documentation.
>> I think the documentation should simply contain the unittests - they
>> show quite well how to call the template, from minimal cases to the
>> complex ones.
>> Ok, they tend to show a lot of edge-cases, but even the very simplest
>> usages of a function should be unit-tested, so at least those have to
>> be part of the documentation.
>
> Unittests are helpful, but I think examples aimed at demonstrating
> usage are of more value if possible. Unittests and examples have
> subtly different purposes.  When writing a Unittest the programmer
> will likely take whatever shortcuts they can to get data into the
> tested function in the correct format, with the least amount of
> code/effort legally possible. Whereas examples (hopefully) will try
> to present inputs in 'real life' scenarios.
>
> If the function inputs are trivial (i.e. it takes an integer or basic
> string) then this isn't an issue. But for functions taking complex
> inputs it can be a bit baffling for someone new to the language.
>
> I must admit when I was new to phobos I struggled with the use of
> unittests as examples, especially for functions taking
>
> Anyway, unittests are better than nothing - if that is the other
> option.
>

Just a small point:
I find so many untested examples on the Internet that, to me, examples have to be run by the compiler at the same time as the unittests. Maybe that needs a new keyword?

But it's true that unittests don't have exactly the same purpose, and they risk being hard to read for a beginner.
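
For what it's worth, D's documented unittests already work this way without a new keyword: a unittest block that carries a doc comment and directly follows a documented declaration is included in the generated ddoc as an example, and is compiled and run with the other tests. A minimal sketch, where the function `total` is hypothetical:

```d
import std.stdio : writeln;

/// Returns the sum of all elements of `values`.
int total(const int[] values)
{
    int sum = 0;
    foreach (v; values)
        sum += v;
    return sum;
}

/// The doc comment makes ddoc attach this block to `total` as an example.
unittest
{
    assert(total([1, 2, 3]) == 6);
    assert(total([-1, 1]) == 0);
}

void main()
{
    writeln(total([4, 5])); // prints 9
}
```

Building with `dmd -unittest` runs the block before `main`, and `dmd -D` emits it in the generated documentation, so an example that stops compiling or fails its assertions is caught.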