[SAOC 2024] SARIF Library and Outputs - Weekly Update #3
Oct 07
ryuukk_
October 06

Summary of Progress (September 30 – October 6)

In the third week of Milestone 1, I worked on building a library to represent SARIF and serialize it into JSON, following my mentor’s suggestion. I created multiple drafts of SARIF outputs, refining the structure and improving the details with each iteration. These drafts have helped me fine-tune the SARIF format for DMD’s error reporting, and I ran unit tests to validate the structure along the way.

What I Worked On:

1. Building a SARIF Library for JSON Serialization

  • I created an initial version of the SARIF template that includes key components to represent and serialize error information.
  • Key Components:
    • LogicalLocation and PhysicalLocation structs: Represent logical (e.g., function or method) and physical (e.g., file, line, column) locations in code.
    • Result struct: Stores rule violation details and uses SumType (from Phobos's std.sumtype) so that a result's location can be either a PhysicalLocation or a LogicalLocation.
    • JSON Serialization: Implemented toJson methods for all structs, enabling easy conversion of SARIF objects into JSON format.
  • Unit Tests: I created unit tests to ensure robustness, covering various scenarios like unusual URIs, empty values, and different combinations of logical and physical locations.
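
The components listed above might fit together roughly as follows. This is only a minimal sketch, not the library's actual code: the field names (uri, line, column, name, kind, ruleId, message), the Location alias, and the sample values in main are assumptions made for illustration. It relies on Phobos (std.json and std.sumtype), and the JSON property names follow the SARIF 2.1.0 location object.

```d
import std.json : JSONValue;
import std.stdio : writeln;
import std.sumtype : SumType, match;

/// Physical location: a file URI plus line and column (field names assumed).
struct PhysicalLocation
{
    string uri;
    int line;
    int column;

    JSONValue toJson()
    {
        JSONValue artifact = ["uri": uri];
        JSONValue region = ["startLine": line, "startColumn": column];
        JSONValue physical = ["artifactLocation": artifact, "region": region];
        return JSONValue(["physicalLocation": physical]);
    }
}

/// Logical location: for example a function or method (field names assumed).
struct LogicalLocation
{
    string name;
    string kind;

    JSONValue toJson()
    {
        JSONValue logical = ["name": name, "kind": kind];
        return JSONValue(["logicalLocations": [logical]]);
    }
}

/// Hypothetical alias: a location is exactly one of the two kinds.
alias Location = SumType!(PhysicalLocation, LogicalLocation);

/// A single rule violation with one location.
struct Result
{
    string ruleId;
    string message;
    Location location;

    JSONValue toJson()
    {
        JSONValue text = ["text": message];
        JSONValue j = ["ruleId": JSONValue(ruleId), "message": text];
        // match dispatches on whichever type the SumType currently holds.
        JSONValue loc = location.match!(
            (PhysicalLocation p) => p.toJson(),
            (LogicalLocation l) => l.toJson()
        );
        j["locations"] = JSONValue([loc]);
        return j;
    }
}

void main()
{
    // Sample values are made up for the example.
    auto r = Result("E0001", "undefined identifier",
        Location(PhysicalLocation("file:///src/app.d", 12, 5)));
    writeln(r.toJson().toPrettyString());
}
```

Because match has to cover every member type of the SumType, adding a third kind of location would produce a compile error until each toJson call site handles it.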

2. Refining SARIF Outputs Over Four Drafts

3. Early Returns in the Main Function

  • I applied early returns in the main function to simplify the code and reduce unnecessary nesting, following my mentor's earlier suggestion. This keeps the code cleaner and more maintainable.
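
For readers unfamiliar with the pattern, here is a small sketch of early returns. The run function, its argument checks, and the exit codes are hypothetical and are not taken from the project's main function.

```d
import std.path : extension;
import std.stdio : writeln;

/// Hypothetical helper: each failed precondition returns immediately,
/// so the main path stays at the top indentation level.
int run(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: tool <file.d>");
        return 1;
    }

    if (extension(args[1]) != ".d")
    {
        writeln("error: expected a D source file");
        return 2;
    }

    writeln("processing ", args[1]);
    return 0;
}

int main(string[] args)
{
    return run(args);
}
```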

Challenges:

  • Feasibility of suggestedFix: While experimenting with the suggestedFix section in the third draft, I found that it wasn’t feasible without a deeper integration with DMD’s existing error-handling system. This led to the decision to remove it in the fourth draft.
  • Balancing SARIF Details: Finding the right level of detail in the SARIF outputs was challenging, but after refining the structure across four drafts, I achieved a balance between comprehensive reporting and simplicity.

Next Week’s Plan:

  • Begin integrating the SARIF library with the DMD codebase, following my mentor’s guidance. This will involve mapping DMD’s error reporting system to the SARIF schema using the library I built.
  • Continue refining the SARIF integration based on real test cases from DMD.

This week was focused on building a library for SARIF representation and refining the output structure. I’m looking forward to integrating the SARIF library with the DMD codebase and making further progress on this exciting task!

October 07
On 07/10/2024 10:34 AM, Royal Simpson Pinto wrote:
> Result struct: Stores rule violation details and uses SumType

If this is going inside of dmd, you cannot use Phobos.

But if you know what you need it to do, it won't be too hard to replicate.

I assume this has already been covered, but worth mentioning!
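
As a rough idea of what such a replacement could look like, here is a minimal hand-rolled tagged union that imports nothing from Phobos (only core.stdc.stdio from druntime, for printing). All names here (Location, Kind, lineOrZero, the field names) are assumptions for illustration, not existing dmd or library code.

```d
import core.stdc.stdio : printf;

struct PhysicalLocation { string uri; int line; int column; }
struct LogicalLocation { string name; string kind; }

/// Hand-written tagged union: a tag plus an anonymous union of the payloads.
struct Location
{
    enum Kind : ubyte { physical, logical }

    Kind kind_;
    union
    {
        PhysicalLocation physical_;
        LogicalLocation logical_;
    }

    this(PhysicalLocation p) { kind_ = Kind.physical; physical_ = p; }
    this(LogicalLocation l) { kind_ = Kind.logical; logical_ = l; }

    Kind kind() const { return kind_; }

    /// Accessors assert that the tag matches before reading the payload.
    const(PhysicalLocation) physical() const
    {
        assert(kind_ == Kind.physical);
        return physical_;
    }

    const(LogicalLocation) logical() const
    {
        assert(kind_ == Kind.logical);
        return logical_;
    }
}

/// final switch forces every Kind member to be handled.
int lineOrZero(ref const(Location) loc)
{
    final switch (loc.kind)
    {
        case Location.Kind.physical: return loc.physical.line;
        case Location.Kind.logical: return 0;
    }
}

void main()
{
    auto a = Location(PhysicalLocation("file:///src/app.d", 12, 5));
    auto b = Location(LogicalLocation("main", "function"));
    printf("%d %d\n", lineOrZero(a), lineOrZero(b));
}
```

Unlike SumType, nothing here stops code from reading the wrong union member except the assertions, so the tag checks are the part that needs care.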
October 07
On Sunday, 6 October 2024 at 23:05:54 UTC, Richard (Rikki) Andrew Cattermole wrote:
> If this is going inside of dmd, you cannot use Phobos.

Indeed, I think the best way to do this is to make SARIF a new `MessageStyle`.
October 07
On Sunday, 6 October 2024 at 23:05:54 UTC, Richard (Rikki) Andrew Cattermole wrote:

> I assume this has already been covered, but worth mentioning!

We've discussed it. I'd rather have a working prototype that can be de-Phobosed later than have to make excessive decisions at the start of the project.

As for SumType in particular, presumably we could also vendor a copy into the compiler. I don't really see why not; it's very good, after all.
October 07
On Monday, 7 October 2024 at 11:04:00 UTC, max haughton wrote:
> On Sunday, 6 October 2024 at 23:05:54 UTC, Richard (Rikki) Andrew Cattermole wrote:
>
>> I assume this has already been covered, but worth mentioning!
>
> We've discussed it. I'd rather have a prototype that works that can be de-phobosed than having to make excessive decisions from the start of the project.
>
> On sumtype in particular presumably we could also vendor a copy into the compiler. Don't really see why not, it's very good after all.

Good thing multiple people wrote tagged union DIPs; one would have been handy here.