May 27, 2013
On Sunday, May 26, 2013 22:10:58 Andrei Alexandrescu wrote:
> On 5/26/13 9:04 PM, Borden wrote:
> > Before we get too off topic in this thread, is there demand for an xhtml5.ddoc file? If so, I'd like to make some changes to the other DDoc files so as to minimise code duplication and ambiguity in 'inherited' macro definitions. I'm willing to put in the time but I can't do it alone.
> > 
> > If there's no demand, that's OK, too, and I'll put the matter to rest.
> 
> I think it would be great. In particular, an ebook format would be good.
> 
> You may want to wait until https://github.com/D-Programming-Language/dlang.org/pull/271 is in. It systematizes macros a lot and it may offer answers to many of your questions.

What's required for that to be merged? Someone to review it? I actually don't have commit privileges to dlang.org (even though all of the newer Phobos committers seem to), so the most that I can do is look it over. But I've generally ignored the dlang.org repo, since I don't have commit rights, and these days I do a poor enough job of reviewing druntime and Phobos pull requests as it is.

- Jonathan M Davis
May 27, 2013
On Monday, 27 May 2013 at 02:11:00 UTC, Andrei Alexandrescu wrote:
> I think it would be great. In particular, an ebook format would be good.
>
> You may want to wait until https://github.com/D-Programming-Language/dlang.org/pull/271 is in. It systematizes macros a lot and it may offer answers to many of your questions.
>
> Andrei

I appreciate the direct answer to my question, Professor. I would start anyway, in my own source copy, checking the existing .ddoc files and updating, in the few places necessary, the tags from HTML4 to HTML5 - most of these changes are to the HEAD section, anyway, and shouldn't require changes.

There are two problems that I've already run into, which I'll need experienced help with:
1) doc.ddoc and html.ddoc define many of the macros that I need, but some of them I'll need to redefine for HTML5. Walter's response to how dmd resolves 'macro inheritance' doesn't clarify for me whether I should override the non-HTML5-compliant macros or rewrite the whole file. I hope it's not the latter.

Also, I don't understand the difference between doc.ddoc and html.ddoc - what is each file supposed to do, exactly?

2) Once I have my xhtml5.ddoc, it won't compile the .dd sources correctly because many of the .dd files aren't written in a manner where simple macro expansion will generate HTML5-compliant code. To solve this, I'll need guidance on how to change the .dd files to get xhtml.ddoc to work without breaking the other files.

To this end it would be most helpful to develop a standard list of macros to use in the DLang spec sources and edit the non-conforming .dd files to follow it. It seems right now that the source files define whatever macros they like and leave the onus of figuring out what each one means to the .ddoc files.
May 27, 2013
On 5/26/13 10:45 PM, Borden wrote:
> 1) doc.ddoc and html.ddoc define many of the macros that I need, but
> some of them I'll need to redefine for HTML5. Walter's response to how
> dmd resolves 'macro inheritance' doesn't clarify for me whether I should
> override the non-HTML5-compliant macros or rewrite the whole file. I
> hope it's not the latter.

Just define the macros that differ and when compiling docs do this:

dmd $FLAGS doc.ddoc html.ddoc html5.ddoc myfile.dd

That way the macros defined in html5.ddoc will override those in the previous files.
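
As a rough illustration of that "later file wins" rule, here is a minimal Python sketch. It is only a model of the behaviour described above, not dmd's actual parser: the helper names are hypothetical, it handles only single-line NAME = value definitions, and the macro values are made up for the example.

```python
def parse_ddoc_macros(text):
    """Parse single-line 'NAME = value' definitions (a simplification)."""
    macros = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip().isidentifier():
            macros[name.strip()] = value.strip()
    return macros


def merge_macro_files(*file_texts):
    """Merge macro files in command-line order; later definitions override."""
    merged = {}
    for text in file_texts:
        merged.update(parse_ddoc_macros(text))
    return merged


# Hypothetical file contents, for illustration only.
html_ddoc = "BIG = <big>$0</big>\nB = <b>$0</b>"
html5_ddoc = 'BIG = <span style="font-size:larger">$0</span>'

macros = merge_macro_files(html_ddoc, html5_ddoc)
print(macros["BIG"])  # the html5.ddoc definition
print(macros["B"])    # untouched, still from html.ddoc
```

So html5.ddoc only needs to list the macros that differ; everything else falls through to the earlier files.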

> Also, I don't understand the difference between doc.ddoc and html.ddoc -
> what is each file supposed to do, exactly?

doc.ddoc is the general skeleton file for defining the online documentation. html.ddoc contains HTML-specific macros only, without having anything to do with our site's specific format.

> 2) Once I have my xhtml5.ddoc, it won't compile the .dd sources correctly
> because many of the .dd files aren't written in a manner where simple
> macro expansion will generate HTML5-compliant code. To solve this, I'll
> need guidance on how to change the .dd files to get xhtml.ddoc to work
> without breaking the other files.
>
> To this end it would be most helpful to develop a standard list of
> macros to use in the DLang spec sources and edit the non-conforming .dd
> files to follow it. It seems right now that the source files define
> whatever macros they like and leave the onus of figuring out what each
> one means to the .ddoc files.

Yup, you've got your work cut out for you. Then again, wait until that diff is merged. It fixes a bunch of problems.


Andrei
May 27, 2013
On Monday, 27 May 2013 at 03:32:54 UTC, Andrei Alexandrescu wrote:
> Yup, you've got your work cut out for you. Then again, wait until that diff is merged. It fixes a bunch of problems.

That's OK. As long as I have some guidance on what to do I should manage. This effort isn't entirely selfless - part of tidying up the DLang spec is to help me learn D, too.
May 27, 2013
On Monday, 27 May 2013 at 03:32:54 UTC, Andrei Alexandrescu wrote:
> doc.ddoc is the general skeleton file for defining the online documentation. html.ddoc contains HTML-specific macros only, without having anything to do with our site's specific format.

For greater clarity, html.ddoc will produce a generic, HTML-compliant file. In contrast, doc.ddoc will add all of the dlang.org-specific decorations and boilerplate?

That being the case, would it make more sense for me to upgrade html.ddoc to HTML5 (since it's in candidate rec status over at W3C)?
May 27, 2013
On 5/27/13 12:09 AM, Borden wrote:
> On Monday, 27 May 2013 at 03:32:54 UTC, Andrei Alexandrescu wrote:
>> doc.ddoc is the general skeleton file for defining the online
>> documentation. html.ddoc contains HTML-specific macros only, without
>> having anything to do with our site's specific format.
>
> For greater clarity, html.ddoc will produce a generic, HTML-compliant
> file. In contrast, doc.ddoc will add all of the dlang.org-specific
> decorations and boilerplate?

No. Think of html.ddoc as a library of macros for HTML. They lack the "main" file and other things.

Andrei


May 27, 2013
Oh, and another thing: XHTML adopts the XML practice of only defining the lt, gt and amp entities and no others (like nbsp, mdash, accented, or non-Latin characters).

Since Unicode is, by and large, universal, I've read that the recommended practice for including characters not on a standard US keyboard is to copy them from a character map and save the file in a Unicode encoding. I intend to follow this guidance in writing the (x)html.ddoc template.

As such, should I keep the existing 'entity' macros or use the Unicode characters in the DLang spec source files? I imagine that Andrei will immediately comment that .tex files are supposed to be in ASCII. Suggestions?
May 28, 2013
On Tuesday, May 28, 2013 00:48:02 Borden wrote:
> Oh, and another thing: XHTML adopts the XML practice of only defining the lt, gt and amp entities and no others (like nbsp, mdash, accented, or non-Latin characters).
> 
> Since Unicode is, by and large, universal, I've read that the recommended practice for including characters not on a standard US keyboard is to copy them from a character map and save the file in a Unicode encoding. I intend to follow this guidance in writing the (x)html.ddoc template.
> 
> As such, should I keep the existing 'entity' macros or use the Unicode characters in the DLang spec source files? I imagine that Andrei will immediately comment that .tex files are supposed to be in ASCII. Suggestions?

Well, it's more user-friendly to have macros for Unicode characters than to have to figure out how to input the actual character (since it's not on the keyboard), and it's trivial to have the macro expand to the actual character, so I'd think that it would be better to just use the macros, especially if we're already using them. And if LaTeX has to be ASCII (I don't know whether it does), then that's all the more reason not to use Unicode directly. But regardless, if we're already using macros, why bother changing that? Just change what the macros expand to in the XHTML generation.
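
To make "change what the macros convert to" concrete, here is a small Python sketch that generates macro definitions for an XHTML template from the standard HTML entity table. The macro naming scheme (upper-cased entity name) and the function name are assumptions for illustration, not what the dlang.org files actually use. Numeric character references are used because they work in XML without a DTD.

```python
from html.entities import name2codepoint


def entity_macro_lines(entity_names, numeric=True):
    """Build 'NAME = value' lines mapping entity names to XML-safe text.

    numeric=True emits numeric character references such as &#160;
    numeric=False emits the literal Unicode character instead.
    """
    lines = []
    for name in entity_names:
        codepoint = name2codepoint[name]
        value = f"&#{codepoint};" if numeric else chr(codepoint)
        # Hypothetical naming scheme: upper-cased entity name.
        lines.append(f"{name.upper()} = {value}")
    return lines


lines = entity_macro_lines(["nbsp", "mdash", "eacute"])
print("\n".join(lines))
```

One caveat with the upper-casing used here: entity names are case-sensitive (eacute vs. Eacute), so a real naming scheme would need to keep those apart.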

- Jonathan M Davis
May 28, 2013
On Mon, May 27, 2013 at 05:30:27PM -0700, Jonathan M Davis wrote: [...]
> Well, it's more user-friendly to have macros for Unicode characters than to have to figure out how to input the actual character (since it's not on the keyboard), and it's trivial to have the macro expand to the actual character, so I'd think that it would be better to just use the macros, especially if we're already using them. And if LaTeX has to be ASCII (I don't know whether it does), then that's all the more reason not to use Unicode directly. But regardless, if we're already using macros, why bother changing that? Just change what the macros expand to in the XHTML generation.
[...]

Plain vanilla LaTeX assumes ASCII input, and will do odd things if fed 8-bit data (much less UTF-8). I think macros for HTML entities are the way to go, given the current setup.

However, it is not a straightforward 1-to-1 mapping between &entity; and macro; to truly support LaTeX properly, one should be aware of some of its idiosyncrasies. For example, in Unicode, a character like ẃ can be represented by w *followed* by a combining diacritic; in LaTeX, however, the combining diacritic must *precede* the modified character (that is, \'w). So such characters should be represented by a single macro, say $(WACUTE), rather than w followed by a general $(ACUTE), which will be impossible to translate to LaTeX correctly.
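
The ordering difference can be seen in a short Python sketch. It decomposes each character into base letter plus combining mark (Unicode NFD) and then emits the LaTeX accent command in front of the base letter. The accent table is deliberately small and the function name is made up; special cases such as dotless i, or several marks stacked on one letter, are not handled properly.

```python
import unicodedata

# Combining mark -> LaTeX accent command (a small, incomplete table).
LATEX_ACCENTS = {
    "\u0301": "\\'",  # combining acute
    "\u0300": "\\`",  # combining grave
    "\u0302": "\\^",  # combining circumflex
    "\u0303": "\\~",  # combining tilde
    "\u0308": '\\"',  # combining diaeresis
}


def to_latex_accents(text):
    """Rewrite accented letters so the accent command precedes the base."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch in LATEX_ACCENTS and out:
            base = out.pop()  # the letter the mark applies to
            out.append(f"{LATEX_ACCENTS[ch]}{{{base}}}")
        else:
            out.append(ch)
    return "".join(out)


print(to_latex_accents("\u1e83"))  # w with acute accent
```

The braces around the base letter are optional for a single letter; \'{w} and \'w mean the same thing.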

LaTeX also has some special sequences for different kinds of spacings: an abbreviation like "Mr." requires the interspersing space to be escaped, i.e., "Mr.\ X", otherwise it will treat the "." as a sentence terminator and give it an overly-wide space in the output. This may make it a bit annoying to write in Ddoc, though, 'cos you'll need a macro of some sort to indicate this non-terminating ".".

The correct way to represent quotation marks in LaTeX is `` and '' for double quotes, and ` and ' for single quotes. Writing " or ' will still work, but it will just be ugly in the output.
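
As a sketch of what a converter would have to do for double quotes, the Python below simply alternates between opening and closing marks. That alternation is a naive assumption (it breaks on unbalanced or nested quotes), and the function name is hypothetical.

```python
def latex_double_quotes(text):
    """Replace straight double quotes with LaTeX `` and '' alternately."""
    out = []
    opening = True  # the next straight quote is treated as an opening quote
    for ch in text:
        if ch == '"':
            out.append("``" if opening else "''")
            opening = not opening
        else:
            out.append(ch)
    return "".join(out)


print(latex_double_quotes('He said "hi" twice.'))
```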

If there are math formulae involved, then they need to be enclosed with $, for example: "This sentence contains $2+2=4$ words." Inside math formulae, a slightly different syntax is used, but for the purposes of Ddoc, I think that can probably be ignored for now.

A bunch of metacharacters need to be escaped; I can't recall the list off the top of my head, but they include at the very least:

	~ # $ % ^ & { } _ \

The escape sequences required for these metacharacters are not all obvious; for example, \\ is NOT an escaped backslash, it's a linebreak. I forgot what a literal backslash is... And \^ is NOT a literal caret; it's a circumflex accent on the next letter; ditto with \~. Though IIRC \$ does represent a literal $. So, some care is required to make things work correctly. :)
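
For reference, a minimal Python sketch of escaping those metacharacters in plain text. It uses the standard LaTeX text-mode commands (\textbackslash, \textasciitilde and \textasciicircum for the three that cannot simply be backslash-escaped). The function name is made up, and the text is processed one character at a time so that already-emitted backslashes are never escaped twice.

```python
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
    "#": r"\#",
    "$": r"\$",
    "%": r"\%",
    "&": r"\&",
    "{": r"\{",
    "}": r"\}",
    "_": r"\_",
}


def escape_latex(text):
    """Escape LaTeX metacharacters in plain (non-math) text."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


print(escape_latex("100% & $5"))
print(escape_latex("a\\b"))
```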


T

-- 
It is impossible to make anything foolproof because fools are so ingenious. -- Sammy
May 28, 2013
On 5/27/13 6:48 PM, Borden wrote:
> Oh, and another thing: XHTML adopts the XML practice of only defining
> the lt, gt and amp entities and no others (like nbsp, mdash, accented,
> or non-Latin characters).
>
> Since Unicode is, by and large, universal, I've read that the
> recommended practice for including characters not on a standard US
> keyboard is to copy them from a character map and save the file in a
> Unicode encoding. I intend to follow this guidance in writing the
> (x)html.ddoc template.
>
> As such, should I keep the existing 'entity' macros or use the Unicode
> characters in the DLang spec source files? I imagine that Andrei will
> immediately comment that .tex files are supposed to be in ASCII.
> Suggestions?

The LaTeX configuration won't use your ddoc template. Knock yourself out.

Andrei