Want to help DMD bugfixing? Write a simple utility. (page 2) - D Programming Language Discussion Forum

March 20, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Kai Meyer
in reply to Don

On 03/19/2011 06:11 PM, Don wrote:
> Here's the task:
> Given a .d source file, strip out all of the unittest {} blocks,
> including everything inside them.
> Strip out all comments as well.
> Print out the resulting file.
>
> Motivation: Bug reports frequently come with very large test cases.
> Even ones which look small often import from Phobos.
> Reducing the test case is the first step in fixing the bug, and it's
> frequently ~30% of the total time required. Stripping out the unit tests
> is the most time-consuming and error-prone part of reducing the test case.
>
> This should be a good task if you're relatively new to D but would like
> to do something really useful.
> -Don

Is there a copy of the official D grammar somewhere online? I wrote a lexer for my Compiler class and would love to try and apply it to another grammar.

-Kai Meyer

March 20, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Zirneklis
in reply to Kai Meyer

On 20/03/2011 19:55, Kai Meyer wrote:
> Is there a copy of the official D grammar somewhere online? I wrote a
> lexer for my Compiler class and would love to try and apply it to
> another grammar.
>
> -Kai Meyer

As far as I know, the documentation /is/ the official grammar:
http://digitalmars.com/d/2.0/lex.html

March 21, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Ary Manzana
in reply to Don

Can it be done in Ruby? Or do you need it in D?

March 21, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Simen kjaeraas
in reply to Ary Manzana

On Mon, 21 Mar 2011 01:52:45 +0100, Ary Manzana <ary@esperanto.org.ar> wrote:

> Can it be done in Ruby? Or you need it in D?

Part of the idea was that someone could use it to learn D. However, the
important part is that it gets done. Doing it in D would be preferable, but
it is not a requirement.


-- 
Simen

March 23, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Regan Heath
in reply to Jonathan M Davis

On Sun, 20 Mar 2011 07:50:10 -0000, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
>> Jonathan M Davis wrote:
>> > On Saturday 19 March 2011 18:04:57 Don wrote:
>> >> Jonathan M Davis wrote:
>> >>> On Saturday 19 March 2011 17:11:56 Don wrote:
>> >>>> Here's the task:
>> >>>> Given a .d source file, strip out all of the unittest {} blocks,
>> >>>> including everything inside them.
>> >>>> Strip out all comments as well.
>> >>>> Print out the resulting file.
>> >>>>
>> >>>> Motivation: Bug reports frequently come with very large test cases.
>> >>>> Even ones which look small often import from Phobos.
>> >>>> Reducing the test case is the first step in fixing the bug, and it's
>> >>>> frequently ~30% of the total time required. Stripping out the unit
>> >>>> tests is the most time-consuming and error-prone part of reducing the
>> >>>> test case.
>> >>>>
>> >>>> This should be a good task if you're relatively new to D but would
>> >>>> like to do something really useful.
>> >>>
>> >>> Unfortunately, to do that 100% correctly, you need to actually have a
>> >>> working D lexer (and possibly parser). You might be able to get
>> >>> something close enough to work in most cases, but it doesn't take all
>> >>> that much to throw off a basic implementation of this sort of thing if
>> >>> you don't lex/parse it with something which properly understands D.
>> >>>
>> >>> - Jonathan M Davis
>> >>
>> >> I didn't say it needs 100% accuracy. You can assume, for example, that
>> >> "unittest" always occurs at the start of a line. The only other things
>> >> you need to lex are {}, string literals, and comments.
>> >>
>> >> BTW, the immediate motivation for this is std.datetime in Phobos. The
>> >> sheer number of unittests in there is an absolute catastrophe for
>> >> tracking down bugs. It makes a tool like this MANDATORY.
>> >
>> > I tried to create a similar tool before and gave up because I couldn't
>> > make it 100% accurate and was running into problems with it. If someone
>> > wants to take a shot at it though, that's fine.
>> >
>> > As for the unit tests in std.datetime making it hard to track down bugs,
>> > that only makes sense to me if you're trying to look at the whole thing
>> > at once and track down a compiler bug which happens _somewhere_ in the
>> > code, but you don't know where. Other than a problem like that, I don't
>> > really see how the unit tests get in the way of tracking down bugs. Is
>> > it that you need to compile in a version of std.datetime which doesn't
>> > have any unit tests compiled in but you still need to compile with
>> > -unittest for other stuff?
>>
>> No. All you know there's a bug that's being triggered somewhere in
>> Phobos (with -unittest). It's probably not in std.datetime.
>> But Phobos is a horrible ball of mud where everything imports everything
>> else, and std.datetime is near the centre of that ball. What you have to
>> do is reduce the amount of code, and especially the number of modules,
>> as rapidly as possible; this means getting rid of imports.
>>
>> To do this, you need to remove large chunks of code from the files. This
>> is pretty simple; comment out half of the file, if it still works, then
>> delete it. Normally this works well because typically only about a dozen
>> lines are actually being used. After doing this about three or four
>> times it's small enough that you can usually get rid of most of the
>> imports. Unittests foul this up because they use functions/classes from
>> inside the file.
>>
>> In the case of std.datetime it's even worse because the signal-to-noise
>> ratio is so incredibly poor; it's really difficult to find the few lines
>> of code that are actually being used by other Phobos modules.
>>
>> My experience (obviously only over the last month or so) has been that
>> if the reduction of a bug is non-obvious, more than 10% of the total
>> time taken to fix that bug is the time taken to cut down std.datetime.
>
> Hmmm. I really don't know what could be done to fix that (other than making it
> easier to rip out the unittest blocks). And enough of std.datetime depends on
> other parts of std.datetime that trimming it down isn't (and can't be) exactly
> easy. In general, SysTime is the most likely type to be used, and it depends
> on Date, TimeOfDay, and DateTime, and all 4 of those depend on most of the
> free functions in the module. It's not exactly designed in a manner which
> allows you to cut out large chunks and still have it compile. And I don't
> think that it _could_ be designed that way and still have the functionality
> that it has.
>
> I guess that this sort of problem is one that would pop up mainly when dealing
> with compiler bugs. I have a hard time seeing it popping up with your typical
> bug in Phobos itself. So, I guess that this is the sort of thing that you'd
> run into and I likely wouldn't.
>
> I really don't know how the situation could be improved though other than
> making it easier to cut out the unit tests.

I was just thinking: if we get a list of the symbols the linker is including, then write an app to take that list and strip everything else out of the source, would that work? The questions are how hard it is to get the symbols from the linker, and then how hard it is to match those to the source. IIRC there are functions in Phobos to convert to/from symbol names, so if the app had sufficient lexing and parsing capability it could match on those.

R
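
A rough sketch of the manual reduction loop Don describes in the quoted text above (remove a chunk of the file, and keep the deletion if the bug still shows up). It is written in Python purely for illustration and is not a tool anyone in this thread wrote. The names `reduce_lines` and `still_fails` are hypothetical; in real use `still_fails` would presumably write the candidate to disk and run the compiler on it, whereas here it is just a callable.

```python
from typing import Callable, List


def reduce_lines(lines: List[str],
                 still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Shrink `lines` while `still_fails` keeps returning True.

    Starts by trying to delete half of the file at a time, then halves
    the chunk size whenever a full pass removes nothing.
    """
    chunk = max(1, len(lines) // 2)
    while True:
        i = 0
        removed_any = False
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and still_fails(candidate):
                lines = candidate   # this chunk was irrelevant: drop it
                removed_any = True
            else:
                i += chunk          # this chunk is needed: step past it
        if not removed_any:
            if chunk == 1:
                break               # nothing more can be removed
            chunk //= 2
    return lines
```

Unit tests get in the way of this loop for the reason given in the quote: they reference functions and classes in the same file, so deleting a chunk that only the tests use makes the candidate stop compiling for an unrelated reason.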


March 23, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Jonathan M Davis
in reply to Regan Heath

> I was just thinking .. if we get a list of the symbols the linker is including, then write an app to take that list, and strip everything else out of the source .. would that work.  The Q's are how hard is it to get the symbols from the linker and then how hard is it to match those to source.  IIRC there are functions in phobos to convert to/from symbol names, so if the app had sufficient lexing and parsing capability it could match on those.

That would require a full-blown D lexer and parser.

- Jonathan M Davis

March 23, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Regan Heath
in reply to Jonathan M Davis

On Wed, 23 Mar 2011 15:16:46 -0000, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
>> I was just thinking .. if we get a list of the symbols the linker is
>> including, then write an app to take that list, and strip everything else
>> out of the source .. would that work.  The Q's are how hard is it to get
>> the symbols from the linker and then how hard is it to match those to
>> source.  IIRC there are functions in phobos to convert to/from symbol
>> names, so if the app had sufficient lexing and parsing capability it could
>> match on those.
>
> That would require a full-blown D lexer and parser.
>
> - Jonathan M Davis

Yeah, I thought as much.  I wonder if the new guy "Ilya" who just posted on digitalmars.D would find this interesting..


March 23, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Kai Meyer
in reply to Jonathan M Davis

On 03/23/2011 09:16 AM, Jonathan M Davis wrote:
>> I was just thinking .. if we get a list of the symbols the linker is
>> including, then write an app to take that list, and strip everything else
>> out of the source .. would that work.  The Q's are how hard is it to get
>> the symbols from the linker and then how hard is it to match those to
>> source.  IIRC there are functions in phobos to convert to/from symbol
>> names, so if the app had sufficient lexing and parsing capability it could
>> match on those.
>
> That would require a full-blown D lexer and parser.
>
> - Jonathan M Davis

Why are we talking about having to recreate a full-blown lexer and parser when there has to be one that exists for D anyway? This is sounding more and more like you're asking the wrong crowd to solve a problem. To do it right, the people who have access to the real D lexer and parser would need to write this utility, and in some ways it's already written, since compiling without the -unittest flag already omits all the unittests.

So I'm a bit confused about two things.

1) Why ask the wrong people to write the tool in the first place?
2) Why are we the wrong people anyway?

-Kai Meyer

March 23, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Andrej Mitrovic

On 3/23/11, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> That would require a full-blown D lexer and parser.
>
> - Jonathan M Davis
>
Isn't DDMD written in D? I'm not sure about how finished it is though.

March 23, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Jonathan M Davis
in reply to Kai Meyer

> Why are we talking about having to recreate a full-blown lexer and parser when there has to be one that exists for D anyway? This is sounding more and more like you're asking the wrong crowd to solve a problem. To do it right, the people who have access to the real D lexer and parser would need to write this utility, and in some ways, it's already written since compiling with out a -unittest flag already omits all the unittests.
> 
> So I'm a bit confused about two things.
> 
> 1) Why ask the wrong people to write the tool in the first place?
> 2) Why are we the wrong people any way?

There are tasks for which you need to be able to lex and parse D code. To 100% correctly remove unit tests would be one such task. Another would be if you want a program to be able to syntax highlight some D code. Currently, as far as I know, there are only two lexers and two parsers for D: the C++ front end which dmd, gdc, and ldc use and the D front end which ddmd uses and which is based on the C++ front end. Both of those are under the GPL (which makes them useless for a lot of stuff) and both of them are tied to compilers. Being able to lex D code and get the list of tokens in a D program and being able to parse D code and get the resultant abstract syntax tree would be very useful for a number of programs.

So, while your average program may not care about being able to lex and parse D code, there _are_ programs that do, and being able to do so in D would be highly valuable for such programs. Previously Walter asked for a volunteer to port the lexer from the C++ front end to D under the Boost license to be put into Phobos (I volunteered for that and have been working on it off and on, slowly making progress on it). Andrei's reaction was that we should have a generic lexer which uses generic programming and is not tied to D at all, and _that_ is what someone may be working on for the GSoC (there are still solid arguments for having a D-specific lexer though, so hopefully we end up with both).

Now, for this particular problem, in order to track down certain types of compiler bugs, he needs to be able to build with -unittest but not have irrelevant code compiled in. So, for instance, if he's testing a bug related to compiling std.file with -unittest and it imported std.datetime, he would want to strip out as much of std.datetime as std.file doesn't need in order to minimize the code that he has to deal with to find the bug. std.datetime's unit tests are a prime example of code that would be unnecessary. So, he wants a tool to strip the unit tests from a file. You can't use the compiler's lexer or parser to do that without a lot of changes. To do it 100% correctly, he needs a lexer (and possibly a parser) which can be used by a utility other than the compiler to read in a source file, strip out the unit tests, and then write out the file again. However, he's willing to settle for a utility that _mostly_ works, and you can do that without a full-blown D lexer or parser.

- Jonathan M Davis
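
A rough sketch of the kind of "mostly works" utility discussed in this thread, written in Python since the thread says D is preferred but not required. It lexes only what Don lists (braces, string literals, and comments) plus the `unittest` keyword, which it matches at a word boundary rather than only at the start of a line. All function names (`strip_d_source`, `_skip`, `_skip_blank`, `_match_brace`, `_is_ident`) are made up for illustration. Known gaps, by assumption: delimited strings such as `q"(...)"` are not understood, and attributes written before `unittest` are left behind.

```python
def _is_ident(src, k):
    """True if src[k] exists and could be part of an identifier."""
    return 0 <= k < len(src) and (src[k].isalnum() or src[k] == "_")


def _skip(src, i):
    """If a comment or literal starts at i, return (end, kind); else (i, None)."""
    two = src[i:i + 2]
    if two == "//":                       # line comment: stop before the newline
        end = src.find("\n", i)
        return (len(src) if end < 0 else end), "comment"
    if two == "/*":                       # block comment, does not nest
        end = src.find("*/", i + 2)
        return (len(src) if end < 0 else end + 2), "comment"
    if two == "/+":                       # nesting comment
        depth, j = 1, i + 2
        while j < len(src) and depth:
            if src.startswith("/+", j):
                depth, j = depth + 1, j + 2
            elif src.startswith("+/", j):
                depth, j = depth - 1, j + 2
            else:
                j += 1
        return j, "comment"
    c = src[i]
    if c == "`" or two == 'r"':           # WYSIWYG strings: no escapes
        quote, start = ("`", i + 1) if c == "`" else ('"', i + 2)
        end = src.find(quote, start)
        return (len(src) if end < 0 else end + 1), "literal"
    if c == '"' or c == "'":              # escaped string or char literal
        j = i + 1
        while j < len(src) and src[j] != c:
            j += 2 if src[j] == "\\" else 1
        return min(j + 1, len(src)), "literal"
    return i, None


def _skip_blank(src, i):
    """Advance past whitespace and comments."""
    while i < len(src):
        if src[i].isspace():
            i += 1
            continue
        end, kind = _skip(src, i)
        if kind != "comment":
            break
        i = end
    return i


def _match_brace(src, j):
    """Given src[j] == '{', return the index just past the matching '}'."""
    depth = 0
    while j < len(src):
        end, kind = _skip(src, j)
        if kind:                          # braces in comments/literals don't count
            j = end
            continue
        if src[j] == "{":
            depth += 1
        elif src[j] == "}":
            depth -= 1
            if depth == 0:
                return j + 1
        j += 1
    return j


def strip_d_source(src):
    """Return src with comments and `unittest { ... }` blocks removed."""
    out, i, n = [], 0, len(src)
    while i < n:
        end, kind = _skip(src, i)
        if kind == "comment":
            i = end
            continue
        if kind == "literal":
            out.append(src[i:end])
            i = end
            continue
        if (src.startswith("unittest", i)
                and not _is_ident(src, i - 1) and not _is_ident(src, i + 8)):
            j = _skip_blank(src, i + 8)
            if j < n and src[j] == "{":   # skips e.g. version(unittest)
                i = _match_brace(src, j)
                continue
        out.append(src[i])
        i += 1
    return "".join(out)
```

A script could read a file, pass its text to `strip_d_source`, and print the result, which matches the task as stated. Requiring a `{` after the keyword is what keeps `version(unittest)` intact, and skipping literals and comments while counting braces is what keeps a stray `}` inside a string from ending the block early.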
