Thread overview
HTML :p, XML :D
Aug 17, 2001  John Carney
Aug 18, 2001  Walter
Aug 18, 2001  John Carney
Aug 21, 2001  John Fletcher
Aug 21, 2001  Jan Knepper
Aug 22, 2001  John Fletcher
Aug 23, 2001  Walter
Aug 24, 2001  John Carney
August 17, 2001
Why stop at HTML? Why not extract code from XML?

Cheers,
John Carney.


August 18, 2001
Because I don't understand XML. A dumb reason, but the truth. <g>

John Carney wrote in message <9ljd0p$1j67$1@digitaldaemon.com>...
>
>Why stop at HTML? Why not extract code from XML?
>
>Cheers,
>John Carney.
>
>


August 18, 2001
Yeah, I must admit it took me a while to get my head around XML, and even then I've really only scratched the surface :)

However, I do think that while a compiler that can extract code from HTML is a great idea, extraction from XML would be far, far better.

With XML, you wouldn't necessarily write a compiler that strips the contents of <code> tags out of an HTML stream. Instead, you would feed the XML stream through an XSLT processor to deliver plain D to the compiler. Trust me, it would be bigger than Ben Hur. For one thing, it means that developers can use their own tag conventions. For another, it means you can use multiple XSL sheets to do different things with the code.
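
To make the "own tag conventions" point concrete, here is a minimal sketch in Python. It is not XSLT: it uses the standard library's ElementTree as a simple stand-in, and the element names <doc>, <note> and <dcode> are invented for illustration only. The idea it shows is that the name of the tag holding the code is a parameter rather than something fixed by the compiler.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <doc>, <note> and <dcode> are made-up element
# names for this sketch, not an established convention.
SAMPLE = """<doc>
  <note>Declares a counter.</note>
  <dcode>int counter = 0;</dcode>
  <note>Then bumps it.</note>
  <dcode>counter++;</dcode>
</doc>"""


def extract_code(xml_text, code_tag="dcode"):
    """Return the text of every <code_tag> element, in document order."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nesting depth does not matter.
    return "\n".join(elem.text or "" for elem in root.iter(code_tag))


if __name__ == "__main__":
    # Print only the code, ready to hand to a compiler.
    print(extract_code(SAMPLE))
```

A different tagging scheme only needs a different code_tag argument. A real XSLT sheet could go further, for example by reordering or filtering fragments, which this sketch does not attempt.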

I'm busy right at the moment, but give me a week or two and I'll send you some examples of the sort of things that could be done. Do you use, or have access to, a Windows 98/ME/NT box with IE 5.01 or later?

Regards,
John Carney.


"Walter" <walter@digitalmars.com> wrote in message news:9lknn3$2sig$3@digitaldaemon.com...
> Because I don't understand XML. A dumb reason, but the truth. <g>
>
> John Carney wrote in message <9ljd0p$1j67$1@digitaldaemon.com>...
> >
> >Why stop at HTML? Why not extract code from XML?
> >
> >Cheers,
> >John Carney.



August 21, 2001

Walter wrote:

> Because I don't understand XML. A dumb reason, but the truth. <g>
>
> John Carney wrote in message <9ljd0p$1j67$1@digitaldaemon.com>...
> >
> >Why stop at HTML? Why not extract code from XML?
> >
> >Cheers,
> >John Carney.
> >
> >

Walter,

If it is helpful, there is a parser for XML called expat which is available and written in C.  I have a static library of it compiled with DM C++.  There are also other things which are in C++.

John

August 21, 2001
Hi John!

Is that a PD library?
I would be *very* interested in it...
Jan



> If it is helpful, there is a parser for XML called expat which is available and written in C.  I have a static library of it compiled with DM C++.  There are also other things which are in C++.

August 22, 2001

Jan Knepper wrote:

> Hi John!
>
> Is that a PD library?
> I would be *very* interested in it...
> Jan
>

It is called expat. See http://www.jclark.com/xml/expat.html for details.

John
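
For a feel of how an expat-style parser is driven, here is a small sketch using Python's xml.parsers.expat module, which is a binding to the same expat library. It is only an illustration under assumptions: the function name extract_with_expat and the choice of <code> as the tag to collect are invented here, and the C API differs in detail. The pattern is the same, though: register callbacks for start tags, end tags and character data, then feed the parser the document.

```python
import xml.parsers.expat


def extract_with_expat(xml_text, code_tag="code"):
    """Collect the character data found inside <code_tag> elements."""
    chunks = []
    depth = 0  # greater than zero while inside a code_tag element

    def start(name, attrs):
        nonlocal depth
        if name == code_tag:
            depth += 1

    def end(name):
        nonlocal depth
        if name == code_tag:
            depth -= 1
            chunks.append("\n")  # separate consecutive fragments

    def chars(data):
        # expat may deliver one text run in several pieces; joining
        # the pieces afterwards restores the original text.
        if depth > 0:
            chunks.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return "".join(chunks)


if __name__ == "__main__":
    print(extract_with_expat("<d><p>prose</p><code>int a;</code></d>"))
```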


August 23, 2001
Surprisingly, the code to extract source code from HTML is trivial. Is XML as easy? -Walter
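
As a rough illustration of how little code the HTML case needs, here is a sketch built on Python's standard html.parser module. It is not the compiler's actual extraction code; the class name CodeExtractor and the decision to keep only text inside <code> tags are assumptions made for this example.

```python
from html.parser import HTMLParser


class CodeExtractor(HTMLParser):
    """Keep only the text that appears inside <code> elements."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &lt; into "<".
        super().__init__(convert_charrefs=True)
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "code" and self.depth > 0:
            self.depth -= 1
            self.chunks.append("\n")  # separate consecutive fragments

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_from_html(html_text):
    parser = CodeExtractor()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.chunks)


if __name__ == "__main__":
    page = "<p>Intro</p><code>int x = 1 &lt; 2;</code>"
    print(extract_from_html(page))
```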

"John Fletcher" <J.P.Fletcher@aston.ac.uk> wrote in message news:3B8222FB.2CE3B77D@aston.ac.uk...
>
>
> Walter wrote:
>
> > Because I don't understand XML. A dumb reason, but the truth. <g>
> >
> > John Carney wrote in message <9ljd0p$1j67$1@digitaldaemon.com>...
> > >
> > >Why stop at HTML? Why not extract code from XML?
> > >
> > >Cheers,
> > >John Carney.
> > >
> > >
>
> Walter,
>
> If it is helpful, there is a parser for XML called expat which is available and written in C.  I have a static library of it compiled with DM C++.  There are also other things which are in C++.
>
> John
>


August 24, 2001
I can't see how it would be any harder. XML is a simplified subset of SGML, and HTML is *supposed* to be an SGML application. If you're doing the HTML extraction properly, then extracting from XML should be easier because it is a much stricter language (*far* fewer special cases).
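
The strictness is easy to see with a small sketch. The input below, with an unclosed <code> tag, is made up for this example, and the helper names xml_accepts and html_accepts are hypothetical. Python's XML parser rejects the input outright, while its HTML parser is lenient and processes it without complaint, which is the kind of special case an HTML extractor has to decide how to handle.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

BROKEN = "<doc><code>int x;</doc>"  # <code> is never closed


def xml_accepts(text):
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def html_accepts(text):
    """Return True if html.parser gets through the text without raising."""
    parser = HTMLParser()  # the base class ignores every event
    parser.feed(text)
    parser.close()
    return True


if __name__ == "__main__":
    print("XML parser accepts it: ", xml_accepts(BROKEN))
    print("HTML parser accepts it:", html_accepts(BROKEN))
```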

However, I wouldn't recommend locking yourself into a simplistic "just grab everything in <code> tags" approach. Leave yourself open to XSL processing so that people can develop their own tagging schemes. As I said, when I get some time I'll work up a couple of examples of things that could be done with XSL in the picture.

Regards,
John Carney.

"Walter" <walter@digitalmars.com> wrote in message news:9m4169$2uv8$1@digitaldaemon.com...
> Surprisingly, the code to extract source code from HTML is trivial. Is XML as easy? -Walter
>
> "John Fletcher" <J.P.Fletcher@aston.ac.uk> wrote in message news:3B8222FB.2CE3B77D@aston.ac.uk...
> >
> >
> > Walter wrote:
> >
> > > Because I don't understand XML. A dumb reason, but the truth. <g>
> > >
> > > John Carney wrote in message <9ljd0p$1j67$1@digitaldaemon.com>...
> > > >
> > > >Why stop at HTML? Why not extract code from XML?
> > > >
> > > >Cheers,
> > > >John Carney.
> > > >
> > > >
> >
> > Walter,
> >
> > If it is helpful, there is a parser for XML called expat which is available and written in C.  I have a static library of it compiled with DM C++.  There are also other things which are in C++.
> >
> > John
> >
>
>