SAOC LLDB D integration: 2nd Weekly Update
October 01, 2021
Hi D community!

Sorry for being late this week.

I'm here again to describe what I've done during the second week of Symmetry
Autumn of Code.

## Finalizing the LLDB integration

Some pieces left over from last week, in the test suite and on the LLDB side,
were finished at the beginning of this week, and I put everything together.

## Restructure the code to be more C++ish

After successfully integrating the `libiberty` D demangler into LLVM, and
before sending the patches to LLVM, I had to bring the code in line with
LLVM's `clang-format` style. I took the opportunity to make the code more
C++-like:

- Move functions that take a `struct dlang_info` context into a struct, making
  them member functions so the context is passed implicitly
- Make string handling in the demangler a bit more C++ish (class
  `OutputString`)
- Fix the code style to conform with clang formatting: variable names, spaces
  between identifiers, etc.
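
As a rough illustration of the first two points, here is a minimal sketch. It is not the code from the patches: `ContextSketch`, `DemanglerSketch`, the `skipPrefix` helpers and the members of `OutputString` are all hypothetical names chosen for this example, and the only thing taken from the real format is that D mangled names start with `_D`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a C++-style output buffer; the real class in
// the patches may look quite different.
class OutputString {
  std::string Buffer;

public:
  // Append a single character.
  void append(char C) { Buffer.push_back(C); }
  // Append a run of characters of known length.
  void append(const char *Str, std::size_t Len) { Buffer.append(Str, Len); }
  const std::string &str() const { return Buffer; }
};

// C style: a simplified context struct that every helper receives explicitly.
struct ContextSketch {
  const char *Mangled; // start of the NUL-terminated mangled symbol
};

// Returns a pointer past the "_D" prefix, or nullptr if it is missing.
static const char *skipPrefixC(const ContextSketch *Info) {
  const char *S = Info->Mangled;
  return (S[0] == '_' && S[1] == 'D') ? S + 2 : nullptr;
}

// C++ style: the same helper as a member function, so the context is `this`.
struct DemanglerSketch {
  const char *Mangled;

  const char *skipPrefix() const {
    return (Mangled[0] == '_' && Mangled[1] == 'D') ? Mangled + 2 : nullptr;
  }
};
```

The behaviour is the same in both versions; the member function just removes the need to thread the context pointer through every call.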

I also ended up documenting everything inside the string and demangler
structs, to make the code easier to understand in the future.

## Send patches to the LLVM review platform

Right after finishing the code style work, I submitted the patches to the
LLVM review platform. In the meantime, I'm working towards acceptance and
proactively updating the patches to comply with the LLVM maintainers'
requests.

The first patch introduces the demangler codebase with the ported code,
available [here](https://reviews.llvm.org/D110578). The second patch enables
support in the `llvm-cxxfilt` tool, similar to `c++filt` from GNU binutils,
available [here](https://reviews.llvm.org/D110576). Finally, the last patch
enables the most important part for users, the LLDB integration, available
[here](https://reviews.llvm.org/D110577).

## Reflected GCC changes

Meanwhile, I found some things to improve on the `libiberty` side, which I
also applied to my LLVM patches:

- Use `appendc` for single-character appends:
  [patch](https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580512.html).
- Remove parentheses where they are not needed:
  [patch](https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580525.html).
- Rename function symbols to be more consistent:
  [patch](https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580542.html).
- Use switch instead of if-else:
  [patch](https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580545.html).

I also sent the following patch, which fixes the test suite that I previously
broke with the security patches:

- Add missing format on d-demangle-expected:
  [patch](https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580544.html).
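
To give an idea of what the switch-instead-of-if-else kind of cleanup looks like, here is a small sketch. It is not taken from any of the patches above: `basicTypeName` is a hypothetical helper that maps a handful of single-character basic type codes from the D name mangling scheme to their type names.

```cpp
#include <string>

// Maps a few single-character D basic type codes to their type names.
// Returns an empty string for codes this sketch does not cover.
std::string basicTypeName(char Code) {
  switch (Code) {
  case 'v':
    return "void";
  case 'b':
    return "bool";
  case 'a':
    return "char";
  case 'i':
    return "int";
  case 'k':
    return "uint";
  case 'l':
    return "long";
  case 'm':
    return "ulong";
  case 'f':
    return "float";
  case 'd':
    return "double";
  default:
    return "";
  }
}
```

Compared with a chain of `if (Code == ...) else if ...` tests, the switch keeps one case per code and makes the fallback explicit.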

## About the security issues

I started a thread on the GCC mailing list to encourage more fuzzing.
Currently the demangler is being fuzzed without any heuristics, which makes it
inefficient at finding real security vulnerabilities. Instead, AFL and
libFuzzer should be taken into consideration. My idea is to also add
GCC/libiberty support to OSS-Fuzz. You can check the thread
[here](https://gcc.gnu.org/pipermail/gcc/2021-September/237442.html) and
participate if you have any questions or suggestions on the topic.
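
For readers unfamiliar with libFuzzer, a coverage-guided harness is only a few lines. The sketch below is an assumption about how such a harness could look, not something from the thread: `demangleUnderTest` is a hypothetical placeholder that a real harness would replace with a call into the actual demangler.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical placeholder for the demangler entry point. It only checks
// the "_D" prefix and returns a malloc'ed copy of the input, or nullptr.
// A real harness would call the real demangler here instead.
static char *demangleUnderTest(const char *Mangled) {
  if (std::strncmp(Mangled, "_D", 2) != 0)
    return nullptr;
  std::size_t Len = std::strlen(Mangled) + 1;
  char *Copy = static_cast<char *>(std::malloc(Len));
  if (Copy)
    std::memcpy(Copy, Mangled, Len);
  return Copy;
}

// Entry point that libFuzzer calls once per generated input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // The input is not NUL-terminated, so copy it into a std::string first.
  std::string Input(reinterpret_cast<const char *>(Data), Size);
  char *Result = demangleUnderTest(Input.c_str());
  std::free(Result); // free(nullptr) is a no-op
  return 0;
}
```

Building it with something like `clang++ -g -fsanitize=fuzzer,address harness.cpp` links in the fuzzing engine, which then mutates inputs based on the coverage it observes.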

About the exponential time complexity issue, I don't have any news, since I
still don't have the full picture of it. I'm probably not going to dedicate
much time to it, since it's kinda out of the scope of this project. That said,
if anyone wants to have a look and share hints or suggestions to improve the
current demangler, I'd appreciate it.
[Here](http://ipfs.io/ipfs/bafybeihw6bk46r7gnkp6estkwk7ucilxb2swlwzzi2izpytaclypxeu2wq/)
are the blobs generated by the fuzzer for timeout and slow-unit triggers.

## What's next?

For now, I'm going to proactively address the requested changes in the LLVM
patches. The reviewers seem to want smaller patches, so next week will
probably be dedicated to that.

You can also read this on my blog, since my email client doesn't handle text hard-wrapped at 80 columns well: https://lsferreira.net/posts/d-saoc-2021-02

-- 
Sincerely,
Luís Ferreira @ lsferreira.net



September 30, 2021
On 9/30/21 7:05 PM, Luís Ferreira wrote:
> The first patch introduces the demangler codebase with the ported code,
> available [here](https://reviews.llvm.org/D110578). The second patch enables
> support for `llvm-cxxfilt` tool, similar to `c++filt` from GNU binutils,
> available [here](https://reviews.llvm.org/D110576). Finally, the last patch
> enables the most important part for the users, the LLDB part. The patch is
> available [here](https://reviews.llvm.org/D110577).
> 

Congratulations on getting patches accepted to LLVM!

Your LLDB work is incredibly important and exciting -- thank you and keep going!
October 01, 2021
On Thursday, 30 September 2021 at 23:05:21 UTC, Luís Ferreira wrote:
> Hi D community!
>
> Sorry for being late this week.
>
> I'm here again, to describe what I've done during the second week of
> Symmetry Autumn of Code.
>
> [...]

Awesome! I had looked at trying to implement this before, but haven't really gotten further than seeing where to add the enum entry. Great to see you tackle this, I think this is already making LLDB the best D debugger on Linux.

BTW I have made pretty printers for LLDB in the past (https://github.com/Pure-D/dlang-debug/) to print objects, arrays, strings, etc. much better. If you want to implement something like that, might be worth looking at that.
October 01, 2021
On 9/30/21 7:05 PM, Luís Ferreira wrote:
> ## What's next?
> 
> For now, I'm going to proactively fix the requested changes in the LLVM
> patches. They seem to require smaller patches and probably the next week
> will be dedicated to that.

Luís:

I think it's a little bit surprising and disappointing that they want such granular breakdown, but they required the same of the recent Rust demangler [0] as well, so at least they are applying their rule fairly consistently.

There is also the licensing issue which raised their hackles and you'll have to deal with.

One potential strategy to sidestep the licensing issue AND to make the breakdown task much easier, is to abandon the libiberty code and literally take each of the Rust patches [1] and straight port (adding and subtracting cases as needed) for D demangling.

Rust Demangler:
[0] https://github.com/llvm/llvm-project/blob/main/llvm/lib/Demangle/RustDemangle.cpp

Consecutive patch history:
[1] https://github.com/llvm/llvm-project/commits/main/llvm/lib/Demangle/RustDemangle.cpp
October 03, 2021
On Thu, 2021-09-30 at 20:38 -0400, James Blachly via Digitalmars-d wrote:
> On 9/30/21 7:05 PM, Luís Ferreira wrote:
> > The first patch introduces the demangler codebase with the ported code,
> > available [here](https://reviews.llvm.org/D110578). The second patch
> > enables support for `llvm-cxxfilt` tool, similar to `c++filt` from GNU
> > binutils, available [here](https://reviews.llvm.org/D110576). Finally,
> > the last patch enables the most important part for the users, the LLDB
> > part. The patch is available [here](https://reviews.llvm.org/D110577).
> > 
> 
> Congratulations on getting patches accepted to LLVM!
> 
> Your LLDB work is incredibly important and exciting -- thank you and keep going!

Thanks for your inspiring words!

-- 
Sincerely,
Luís Ferreira @ lsferreira.net



October 03, 2021
On Fri, 2021-10-01 at 07:15 +0000, WebFreak001 via Digitalmars-d wrote:
> On Thursday, 30 September 2021 at 23:05:21 UTC, Luís Ferreira wrote:
> > Hi D community!
> > 
> > Sorry for being late this week.
> > 
> > I'm here again, to describe what I've done during the second week of
> > Symmetry Autumn of Code.
> > 
> > [...]
> 
> Awesome! I had looked at trying to implement this before, but haven't really gotten further than seeing where to add the enum entry. Great to see you tackle this, I think this is already making LLDB the best D debugger on Linux.
> 
> BTW I have made pretty printers for LLDB in the past (https://github.com/Pure-D/dlang-debug/) to print objects, arrays, strings, etc. much better. If you want to implement something like that, might be worth looking at that.

Thanks for your words and valuable resources! I already took a quick look at it. This is what I plan to implement in the second milestone. The thing I'm kinda skeptical about is some ABI assumptions that are currently not standardized, such as associative arrays and the symbol names exported in DWARF by DMD. Currently LDC uses fully qualified names for array types (in fact, for every symbol), but DMD uses `_Array_<primitive_type>`, or `_Array_struct` for custom aggregate types, and so on. I'm studying this and similar issues, and I intend to push for more standardization in that regard or, at least, consistency among the existing compilers.

My idea is to at least support pretty-printing of structs and strings. I can start a discussion about standardizing the AA ABI. I'm also investigating whether anything can be done to read vtables correctly, but I don't know how that is handled by DWARF and don't have a full picture of the ABI structure for that. Anyway, that is definitely something to tackle next.
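
For context on why strings need a pretty printer at all: a D `string` is a slice, laid out as a length followed by a pointer, and it is not NUL-terminated. The sketch below mirrors that layout in C++ for illustration only. `DString` and `formatDString` are hypothetical names, the example does no escaping, and a real formatter would read the bytes from the debuggee's memory through the debugger rather than dereference the pointer directly.

```cpp
#include <cstddef>
#include <string>

// Mirror of the in-memory layout of a D `string` (immutable(char)[]):
// the length comes first, then the pointer to the character data.
struct DString {
  std::size_t Length;
  const char *Ptr;
};

// Renders the slice as a quoted string, using Length instead of searching
// for a NUL terminator.
std::string formatDString(const DString &S) {
  std::string Out = "\"";
  Out.append(S.Ptr, S.Length);
  Out += '"';
  return Out;
}
```

A printer that treated the pointer as a C string would run past the end of the slice, which is why the length has to drive the read.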

-- 
Sincerely,
Luís Ferreira @ lsferreira.net



October 03, 2021
On Fri, 2021-10-01 at 14:02 -0400, James Blachly via Digitalmars-d wrote:
> On 9/30/21 7:05 PM, Luís Ferreira wrote:
> > ## What's next?
> > 
> > For now, I'm going to proactively fix the requested changes in the LLVM
> > patches. They seem to require smaller patches and probably the next week
> > will be dedicated to that.
> 
> Luís:
> 
> I think it's a little bit surprising and disappointing that they want such
> granular breakdown, but they required the same of the recent Rust demangler
> [0] as well, so at least they are applying their rule fairly consistently.

I kinda understand their point, since huge changes can be difficult to review. On my side this is also time consuming, but I guess I have no choice. It is at least a reasonable rationale, since LLVM is a big project and code introduced to it should be treated with a bit of caution.

> There is also the licensing issue which raised their hackles and you'll have
> to deal with.
> 
> One potential strategy to sidestep the licensing issue AND to make the
> breakdown task much easier, is to abandon the libiberty code and literally
> take each of the Rust patches [1] and straight port (adding and subtracting
> cases as needed) for D demangling.
> 
> Rust Demangler:
> [0] https://github.com/llvm/llvm-project/blob/main/llvm/lib/Demangle/RustDemangle.cpp
> 
> Consecutive patch history:
> [1] https://github.com/llvm/llvm-project/commits/main/llvm/lib/Demangle/RustDemangle.cpp

They are not against relicensing, which is good, and since Iain is also on our side, I hope this is no longer a problem. I also have half of a demangler written in D, kinda from scratch, meant to replace the current core.demangle, which I could use as an ultimate plan Z, but that option is also very time consuming and probably not feasible in the proposed time range.

-- 
Sincerely,
Luís Ferreira @ lsferreira.net