diet-ng 1.8.3: source/diet/internal/html.d:185: data modification (truncation) in filterHTMLEscape?
kdevel
April 27

The actual modification happens in the line marked as (1). But this is not the location of the root cause.

The variable ch is of type dchar. Does anybody see the problem?

    switch (ch) {
		default:
			if (flags & HTMLEscapeFlags.escapeUnknown) {
				dst.put("&#");
				dst.put(to!string(cast(uint)ch));
				dst.put(';');
			} else dst.put(ch);
			break;
		case '"':
			if (flags & HTMLEscapeFlags.escapeQuotes) dst.put("&quot;");
			else dst.put('"');
			break;
		case '\'':
			if (flags & HTMLEscapeFlags.escapeQuotes) dst.put("&#39;");
			else dst.put('\'');
			break;
		case '\r', '\n':
			if (flags & HTMLEscapeFlags.escapeNewline) {
				dst.put("&#");
				dst.put(to!string(cast(uint)ch));
				dst.put(';');
			} else dst.put(ch);
			break;
		case 'a': .. case 'z': goto case;
		case 'A': .. case 'Z': goto case;
		case '0': .. case '9': goto case;
		case ' ', '\t', '-', '_', '.', ':', ',', ';',
			 '#', '+', '*', '?', '=', '(', ')', '/', '!',
			 '%' , '{', '}', '[', ']', '`', '´', '$', '^', '~':
			dst.put(cast(char)ch); // <<----- (1)
			break;
		case '<': dst.put("&lt;"); break;
		case '>': dst.put("&gt;"); break;
		case '&': dst.put("&amp;"); break;
	}

[1] https://github.com/rejectedsoftware/diet-ng/blob/master/source/diet/internal/html.d

April 28
On 27.04.2025 at 23:01, kdevel wrote:
> The actual modification happens in the line marked as `(1)`. But this is not the location of the root cause.
>
> The variable `ch` is of type `dchar`. Does anybody see the problem?
>
> ```d
>          case 'a': .. case 'z': goto case;
>          case 'A': .. case 'Z': goto case;
>          case '0': .. case '9': goto case;
>          case ' ', '\t', '-', '_', '.', ':', ',', ';',
>               '#', '+', '*', '?', '=', '(', ')', '/', '!',
>               '%' , '{', '}', '[', ']', '`', '´', '$', '^', '~':
>              dst.put(cast(char)ch); // <<----- (1)
>              break;
> ```

I was going to say that these are all ASCII characters, but it appears that the '´' has slipped in there erroneously (0xB4, outside of the ASCII range).
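
To make the truncation concrete, here is a minimal sketch. It is not taken from diet-ng: `putWithCast` and `putDirect` are hypothetical helper names, and `std.array.appender` is assumed as a stand-in for the real output range. It compares casting the `dchar` to `char` before `put` (as line `(1)` does) with passing the `dchar` through unchanged:

```d
import std.array : appender;
import std.exception : assertThrown;
import std.utf : UTFException, encode, validate;

// Hypothetical helper mirroring line (1): cast first, then put.
string putWithCast(dchar ch)
{
    auto dst = appender!string();
    dst.put(cast(char) ch); // keeps only the low 8 bits of the code point
    return dst.data;
}

// Hypothetical helper: hand the dchar to the output range unchanged.
string putDirect(dchar ch)
{
    auto dst = appender!string();
    dst.put(ch); // Appender encodes the dchar as UTF-8
    return dst.data;
}

void main()
{
    dchar ch = '´'; // U+00B4 ACUTE ACCENT, outside the ASCII range

    // The cast yields the single byte 0xB4, which is not valid UTF-8 on its own.
    string bad = putWithCast(ch);
    assert(bad.length == 1 && bad[0] == 0xB4);
    assertThrown!UTFException(validate(bad));

    // The correct UTF-8 encoding of U+00B4 is the two bytes 0xC2 0xB4.
    char[4] buf;
    assert(encode(buf, ch) == 2);
    assert(buf[0] == 0xC2 && buf[1] == 0xB4);

    string good = putDirect(ch);
    assert(good == "´");

    // For ASCII input both paths agree, which makes the bug easy to miss.
    assert(putWithCast('a') == putDirect('a'));
}
```

For every other label in that case list the two paths produce the same byte, so the cast only misbehaves for a character above 0x7F such as '´'.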
April 28
https://github.com/vibe-d/vibe-inet/pull/12