Thread overview
XMLSTL progress: Testers / users wanted in a week or two; some opinions wanted now
Nov 28, 2005
Matthew
Nov 29, 2005
Jan Knepper
Nov 29, 2005
Matthew
Nov 30, 2005
Eric Beyeler
Nov 30, 2005
Matthew
Nov 30, 2005
Eric Beyeler
Nov 30, 2005
Matthew
Dec 01, 2005
Eric Beyeler
November 28, 2005
Eric Beyeler's recent post has had a positive effect.

I made some good progress last night; in particular, the node class is now bristling with methods and properties corresponding to those available on the IXMLDOMNode interface. And, yes, I did say properties. It's using my technique for C++ Properties (described in Chapter 35 of Imperfect C++), facilitating code such as the following:

        try
        {
            using namespace xmlstl::msxml::dom;

            IXMLDOMDocument_ptr     doc_ptr     =   DOM_load_document(::comstl::ex::bstr(xmlFile));

            document                doc(doc_ptr);

            // The following are all property invocations, i.e. they call into the
            // property methods get_text(), get_nodeName(), get_nodeValue()

            string_t                text        =   doc.text;
            string_t                nodename    =   doc.nodeName;
            variant_t               nodeValue   =   doc.nodeValue;
        }
        catch(xmlstl::msxml::dom::parse_error &x)
        {
            .. // report
        }
        catch(xmlstl::msxml::dom::exception &x)
        {
            .. // report
        }

I'm interested in whether anyone would like to volunteer to be a tester for
the early implementations, in a week or two? At the moment, the code's using
some Synesis libraries - this will not be the case down the line (when
XMLSTL 1.0 would be released), but I'm focusing on what's new/unknown
first - so it'd mean having a few DLLs on your system. Don't worry, there's
no spyware here. :-)

For now, I'm interested in some input on naming conventions:

1. Type names. The Synesis classes from which this stuff derives use CamelCase naming, representative of the MSXML interfaces from which they derive. For example, the wrapper for the IXMLDOMDocument is called XMLDOMDocument. In the initial XMLSTL implementation I'm using nested namespaces to express the type "family", e.g. the document class is called

  xmlstl::msxml::dom::document

and resides in the file

 #include <xmlstl/msxml/dom/document.hpp>

The other types currently implemented are:

    - node
    - named_node_map
    - parse_error

and so on.

2. Property names. The Synesis classes from which this stuff derives don't have properties (since they were developed largely for VC++ 6 consumption). In XMLSTL I'm currently following the naming convention of the COM properties from the MSXML types, i.e. the node value property is called nodeValue, rather than NodeValue or node_value.

Since the properties are method properties, i.e. they are implemented in terms of methods on the class, the naming convention must allow for the get_ and/or put_ methods to co-exist. For example, here's an extract from xmlstl::msxml::dom::node, for the property nodeValue:

    variant_t get_nodeValue() const;
    void  put_nodeValue(VARIANT const &);

    STLSOFT_METHOD_PROPERTY_GETSET(variant_t, variant_t, VARIANT const &, class_type, get_nodeValue, put_nodeValue, nodeValue);
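
As a rough illustration of what a "method property" means, here is a minimal sketch in which a public member object forwards reads and writes to get_/put_ methods on its owner. This is not the STLSOFT_METHOD_PROPERTY_GETSET implementation: it stores a back-pointer to the owner purely for simplicity, and the names method_property, node (a stand-in, not xmlstl::msxml::dom::node) and property_demo are hypothetical. It uses std::string where the real class uses variant_t.

```cpp
#include <cassert>
#include <string>

// Hypothetical property helper: reading converts via the getter,
// assigning forwards to the setter. Holds a pointer to its owner.
template <typename C, typename V,
          V (C::*Get)() const,
          void (C::*Put)(V const&)>
class method_property
{
public:
    explicit method_property(C* owner) : m_owner(owner) {}

    // Copying a property would copy the owner pointer, so forbid it.
    method_property(method_property const&) = delete;

    // Reading the property calls the getter on the owner.
    operator V() const { return (m_owner->*Get)(); }

    // Assigning a value calls the setter on the owner.
    method_property& operator=(V const& value)
    {
        (m_owner->*Put)(value);
        return *this;
    }

    // Property-to-property assignment copies the value, not the owner.
    method_property& operator=(method_property const& rhs)
    {
        return *this = static_cast<V>(rhs);
    }

private:
    C* m_owner;
};

// Stand-in for a wrapper class exposing a nodeValue property.
class node
{
public:
    explicit node(std::string const& value)
        : nodeValue(this), m_value(value) {}

    std::string get_nodeValue() const { return m_value; }
    void        put_nodeValue(std::string const& v) { m_value = v; }

    method_property<node, std::string,
                    &node::get_nodeValue, &node::put_nodeValue> nodeValue;

private:
    std::string m_value;
};

// Exercises both directions of the property.
inline std::string property_demo()
{
    node n("before");
    std::string old_value = n.nodeValue;     // calls get_nodeValue()
    n.nodeValue = std::string("after");      // calls put_nodeValue()
    return old_value + "/" + n.get_nodeValue();
}
```

Because the property and its accessor methods live side by side on the class, whatever naming scheme is chosen has to leave room for both nodeValue and get_nodeValue()/put_nodeValue().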

Naturally, classes have methods too, which are also (currently) following the MSXML naming conventions, as in:

    SynesisCom::BStr transformNode(IXMLDOMNode_ptr styleSheet);

The examples I presented in Imperfect C++ used ThisCase for properties, and I wrote at the time that that's my preferred convention, which it is. Also, since I've been doing a lot of .NET recently, I'm liking that case convention for (MS)XML more as well. However, both ThisCase for properties, and get_thisCase or get_ThisCase for property accessor methods conflict with the STLSoft convention of this_case().

So, I guess I'm looking for input from interested parties. There are thus four issues:

1. How are classes to be named: named_node_map or NamedNodeMap? (The XMLDOM bit is redundant, due to the namespace.)
2. How are "normal" methods to be named: method_name(), methodName(), or MethodName()?
3. How are properties to be named: property_name, propertyName, or PropertyName?
4. How are property methods to be named: get_property_name(), get_propertyName(), or get_PropertyName()?

I think property methods should follow properties, i.e. if it's PropertyName then it should be get_PropertyName().

I _think_ - though I'm wide open to offers - that I might prefer one of the following two schemes:

A. class_name, methodName(), PropertyName, get_PropertyName()
B. class_name, method_name(), PropertyName, get_PropertyName()

The current scheme - class_name, methodName(), propertyName,
get_propertyName() - is not too bad, but it doesn't feel quite right.

(One thing I've currently got reservations about is how the naming of methods and properties on collections, such as named_node_map, might affect these conventions, but I've yet to do one of those.)

So, what are your thoughts?

Cheers

Matthew


November 29, 2005
Matthew wrote:
> So, what are your thoughts?

Matthew, I hope you are not trying to write your own XML parser. That has already been done. Just do a search for expat, which is used by quite a few major players in the Unix market.
On top of that there are several DOMs already out there. Apache (www.apache.org) seems to have a very good one.

Jan

-- 
ManiaC++
Jan Knepper

But as for me and my household, we shall use Mozilla...
www.mozilla.org
November 29, 2005
"Jan Knepper" <jan@smartsoft.us> wrote in message news:dmiih8$2969$1@digitaldaemon.com...
> Matthew, I hope you are not trying to write your own XML parser.

Heavens no! That would be insanity.

> That
> already has been done. Just do a search on expat used by quite a few
> major players in the Unix market.
> On top of that there are several DOM's already out there. Apache
> www.apache.org seems to have a very good one.

The library, like all STLSoft sub-projects, is a wrapper for existing functionality, with the purposes of:

1. Improving ease of use. I doubt there'd be many C++ programmers who'd contend that MSXML is easy to use.
2. Unifying the syntax between libraries. Once MSXML is done, I'll be wrapping other libs, including Xerces (Apache) and maybe Expat (though AFAIK that's SAX, and this first effort is wrapping DOM). Just this morning I've been amending a little XML editor that I wrote a few years ago to compile and run with XMLSTL as well as with my original Synesis libs, and also with Xerces. XMLSTL currently contains only the xmlstl::msxml::dom namespace, but I plan xmlstl::xerces::dom, and so on.
3. STL-ifying the wrapped libraries. I've got collections such as child_node_sequence already written, which works for nodes and attributes.
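
To make the second goal concrete, here is a small sketch of what a unified syntax buys the client: one piece of code that compiles against either wrapper family. Everything in it is hypothetical. The demo namespace stands in for xmlstl, the two node classes are toy stand-ins rather than real wrappers (xmlstl::xerces::dom is only planned above), and describe and unified_demo are made-up names.

```cpp
#include <cassert>
#include <string>

namespace demo {   // hypothetical stand-in for xmlstl

namespace msxml { namespace dom {
    // Toy stand-in for an MSXML-backed node wrapper.
    class node
    {
    public:
        explicit node(std::string const& name) : m_name(name) {}
        std::string get_nodeName() const { return m_name; }
    private:
        std::string m_name;
    };
}} // namespace msxml::dom

namespace xerces { namespace dom {
    // Toy stand-in for a Xerces-backed node wrapper with the same syntax.
    class node
    {
    public:
        explicit node(std::string const& name) : m_name(name) {}
        std::string get_nodeName() const { return m_name; }
    private:
        std::string m_name;
    };
}} // namespace xerces::dom

} // namespace demo

// Client code written once, relying only on the shared member names.
template <typename Node>
std::string describe(Node const& n)
{
    return "<" + n.get_nodeName() + ">";
}

// Calls the same template with a node from each family.
inline std::string unified_demo()
{
    demo::msxml::dom::node  a("book");
    demo::xerces::dom::node b("book");
    return describe(a) + describe(b);
}
```

The two families share no base class here; the template only requires that both spell get_nodeName() the same way, which is why consistent naming across the wrapped libraries matters.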

Cheers

Matthew


November 30, 2005
"Matthew" <matthew@hat.stlsoft.dot.org> wrote in message news:dmft3a$2vkf$1@digitaldaemon.com...
> So, what are your thoughts?

Just my thoughts, especially if you will be extending this to other libraries and one of your goals is to "STL-ize" this functionality (as you indicate in another post)... I am beginning to really like the naming_convention() of the STL. Lower case and underscores.

I may be willing to test the alphas, but it depends on how much effort it will be to get a baseline to compile (which compiler? VC6 or VC7 / 8?)  I am using MSXML4 as the parser. I won't have much time to test until January, but should be able to get a little bit in this month.

Eric


November 30, 2005
>> (One thing I've currently got reservations about is how the naming
>> methods
>> and properties on collections, such as named_node_map, might affect these
>> conventions, but I've yet to do one of those.)
>>
>> So, what are your thoughts?
>>
>> Matthew
>
> Just my thoughts, especially if you will be extending this to other libraries and one of your goals is to "STL-ize" this functionality (as you indicate in another post)... I am beginning to really like the naming_convention() of the STL. Lower case and underscores.

I know. It's addictive, isn't it? ;-)

The thing is, I think that you/we like this simply because we're using it. I know I started out using function_naming in C many moons ago, and then detested MethodName when I started using C++. I then detested method_name when I started using STL, and then detested methodName in Java and Ruby.

I think one benefit of an underscore-free/limited form is that the names stand out, which is a good thing for properties.

Perhaps, if my contention that all conventions are equally bad holds, it might be best to follow whatever convention the W3C uses in the DOM specification? I'll look into this ...

... they're defined in http://www.w3.org/TR/DOM-Level-2-Core/core.html as "previousSibling", "nodeValue" for the properties, and "getNamedItem()", "removeNamedItem()" for the methods. Both MSXML DOM and Xerces DOM already use this naming convention, so I'm inclined to go with it (in part because that means I don't have to change anything). The property methods will therefore be named get_previousSibling(), put_nodeValue().

This would also mean that I'd have to change the class names, e.g. entity_reference => EntityReference, which is a bit of work. But I think conformance to the W3C std outweighs the STL convention, unless someone can persuade me otherwise.

> I won't have much time to test until January,
> but should be able to get a little bit in this month.

Today I've rewritten a horrid but non-trivial app - XmlEd - such that it now compiles with three configurations: with Synesis XML libraries; with XMLSTL (MSXML DOM); with Xerces. That's proved a fair amount of the MSXML DOM wrapping, although it's by no means full coverage. I've also just written a small and relatively simple program that reads in an XML file and outputs nodes and attributes.

I'm going to have to move off XMLSTL for a while soon, so maybe the best thing is to do a couple more simple tests over the next few days, and then release an alpha lib for people to play with.

> I may be willing to test the alphas, but it depends on how much effort it
> will be to get a baseline to compile (which compiler? VC6 or VC7 / 8?)  I
> am
> using MSXML4 as the parser.

Regarding the compiler support, the properties are enabled for VC++ 7.1 and other compilers that support my C++ Properties technique. For other compilers, the properties are not enabled, and therefore client code would have to use the property methods. Hence, the following two forms are equivalent:

    xmlstl::msxml::dom::node n = . . .

    // VC++ 6.0 and equivalent

    string_t name = n.get_nodeName();

    // VC++ 7.1 and equivalent

    string_t name = n.nodeName;

The latter (.nodeName) simply invokes the former (.get_nodeName()).


Cheers

Matthew


November 30, 2005
"Matthew" <matthew@stlsoft.com> wrote in message news:dmkc8j$n0g$1@digitaldaemon.com...
> >> (One thing I've currently got reservations about is how the naming
> >> methods
> >> and properties on collections, such as named_node_map, might affect these
> >> conventions, but I've yet to do one of those.)
> >>
> >> So, what are your thoughts?
> >>
> >> Matthew
> >
> > Just my thoughts, especially if you will be extending this to other libraries and one of your goals is to "STL-ize" this functionality (as you
> > indicate in another post)... I am beginning to really like the
> > naming_convention() of the STL. Lower case and underscores.
>
> I know. It's addictive, isn't it? ;-)
>
> The thing is, I think that you/we like this simply because we're using it. I
> know I start out using function_naming in C many moons ago, and then detested MethodName when I started using C++. I then detested method_name when I started using STL, and the detested methodName in Java and Ruby.
>
> I think one benefit with an underscore free/limited form is that the names stand out, which is a good thing for properties.
>
> Perhaps, if my contention that all conventions are equally bad, it might be
> best to follow whatever the convention that W3C use in the DOM specification? I'll look into this ...
>
> ... they're defined in http://www.w3.org/TR/DOM-Level-2-Core/core.html as
> "previousSibling", "nodeValue" for the properties, and "getNamedItem()",
> "removeNamedItem()" for the methods. Both MSXML DOM and Xerces DOM already
> use this naming convention, so I'm inclined to go with it (in part because
> that means I don't have to change anything). The property methods will
> therefore be named get_previousSibling(), put_nodeValue().
>
> This would also mean that I've have to change the class names, i.e. entity_reference => EntityReference, which is a bit of work. But I think conformance to the W3C std outweighs the STL convention, unless someone can
> persuade me otherwise.
>

Naming convention isn't a huge deal for me, as long as it's consistent.


> > I won't have much time to test until January,
> > but should be able to get a little bit in this month.
>
> Today I've rewritten a horrid but non-trivial app - XmlEd - such that it now
> compiles with three configurations: with Synesis XML libraries; with XMLSTL
> (MSXML DOM); with Xerces. That's proved a fair amount of the MSXML DOM wrapping, although it's by no means full coverage. I've also just written a
> small and relatively simple program that reads in an XML file and outputs nodes and attributes.
>
> I'm going to have to move off XMLSTL for a while soon, so maybe the best thing is to do a couple more simple tests over the next few days, and then release an alpha lib for people to play with.
>
> > I may be willing to test the alphas, but it depends on how much effort it
> > will be to get a baseline to compile (which compiler? VC6 or VC7 / 8?) I am
> > using MSXML4 as the parser.
>
> Regarding the compiler support, the properties are enabled for VC++7.1 and other compilers that support my C++ Properties technique. For other compilers, the properties are not enabled, and therefore client code would have to use the property methods. Hence, the following code is equivalent:
>
> xmlstl::msxml::dom::node n = . . .
>
> // VC++ 6.0 and equivalent
>
> string_t name = n.get_nodeName();
>
> // VC++ 7.1 and equivalent
>
> string_t name = n.nodeName;
>
> The latter (.nodeName) simply invokes the former (.get_nodeName()).
>
>

cool.

One thing I will comment, though. You said that your node class has methods / properties corresponding to the IXMLDOMNode interface.  How will implementing other parsers affect the xmlstl interface? In particular, one of the reasons I am looking for a wrapper is that the interface for iteration is cumbersome. How will that be presented?

Eric


November 30, 2005
> > Perhaps, if my contention that all conventions are equally bad, it might be
> > best to follow whatever the convention that W3C use in the DOM specification? I'll look into this ...
> >
> > ... they're defined in http://www.w3.org/TR/DOM-Level-2-Core/core.html as
> > "previousSibling", "nodeValue" for the properties, and "getNamedItem()",
> > "removeNamedItem()" for the methods. Both MSXML DOM and Xerces DOM already
> > use this naming convention, so I'm inclined to go with it (in part because
> > that means I don't have to change anything). The property methods will
> > therefore be named get_previousSibling(), put_nodeValue().
> >
> > This would also mean that I've have to change the class names, i.e. entity_reference => EntityReference, which is a bit of work. But I think conformance to the W3C std outweighs the STL convention, unless someone can
> > persuade me otherwise.
> >
>
> naming convention isn't a huge deal for me, as long as it's consistent.

Cool.

I think applying the Principle of Least Surprise in this case will be helpful to the library's acceptance.

> One thing I will comment, though. You said that your node class has methods
> / properties corresponding to the IXMLDOMNode interface.  How will implementing other parsers affect the xmlstl interface?

Hopefully they should all be the same, or as near as. This remains to be seen.

> In particular, one
> of the reasons I am looking for a wrapper is that the interface for
> iteration is cumbersome. How will that be presented?

Excellent point. At the moment, I've got the xmlstl::msxml::dom::* classes
just using the (get_)length and (get_)item, as in the specified interface
for NodeList:

    interface NodeList {
      Node               item(in unsigned long index);
      readonly attribute unsigned long    length;
    };

So, to enumerate the child nodes of a node n, we currently have two options:

1. We can use the "childNodes" property on the node to obtain the node_list, and then use the "length" and "item" properties on the node_list instance. This is the W3C recommended practice. Here's an extract from the dom_node_print program I wrote yesterday (that also does attributes, and comments).

    static void dump_node(int depth, xmlstl::msxml::dom::node const &n)
    {
        stlsoft::simple_wstring  prefix(depth, ' ');

        wcout << prefix << L"<" << n.nodeName;

        xmlstl::msxml::dom::node_list       childNodes  =   n.childNodes;

        { for(size_t i = 0; i < childNodes.length; ++i)
        {
            // Any one of the following forms will do; the first two work today.
            dump_node(1 + depth, childNodes[i]);

            // or:
            //   dump_node(1 + depth, childNodes.get_item(i));

            // but, since I've not yet done parameterised properties, there's not:
            //   dump_node(1 + depth, childNodes.item[i]);
        }}

        wcout << prefix << L"</" << n.nodeName << L">" << endl;
    }

I think that's neat enough, although there's a subtlety regarding the fact that the item returned by the subscript operator (or get_item()) is not a node, but rather an instance of IXMLDOMNode_ptr, which is stlsoft::ref_ptr<IXMLDOMNode>. That type is convertible to xmlstl::msxml::dom::node, however, which is why the call to dump_node is well-formed. But the following code would not be well-formed, since stlsoft::ref_ptr<IXMLDOMNode> does not have a property "nodeName":

    childNodes[i].nodeName;

At the moment, I've a number of ideas about this, but I need to think about it some more.
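To make the subtlety concrete, here is a minimal, self-contained sketch. None of it is the XMLSTL code: raw_node, raw_node_ptr and name_of are hypothetical stand-ins (raw_node_ptr plays the part of IXMLDOMNode_ptr), nodeName is an ordinary method rather than a property, and the wrapper-returning subscript operator is only one possible resolution, not necessarily the one the library will adopt.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the raw COM interface (IXMLDOMNode in MSXML).
struct raw_node
{
    std::wstring name;
};

// Hypothetical stand-in for IXMLDOMNode_ptr: a plain reference-counted pointer
// that knows nothing about nodeName.
typedef std::shared_ptr<raw_node> raw_node_ptr;

// Plays the role of xmlstl::msxml::dom::node.
class node
{
public:
    // Deliberately non-explicit: this is what lets a raw_node_ptr be passed
    // wherever a node const& is expected.
    node(raw_node_ptr const& p)
        : m_p(p)
    {
        assert(m_p);
    }

    std::wstring const& nodeName() const { return m_p->name; }

private:
    raw_node_ptr m_p;
};

// Plays the role of node_list.
class node_list
{
public:
    void push_back(std::wstring const& name)
    {
        m_items.push_back(std::make_shared<raw_node>(raw_node{name}));
    }

    std::size_t length() const { return m_items.size(); }

    // The behaviour described above: yields the raw pointer, so
    // list.get_item(i).nodeName() would not compile.
    raw_node_ptr get_item(std::size_t i) const
    {
        assert(i < m_items.size());
        return m_items[i];
    }

    // One possible alternative (an assumption): yield the wrapper, so that
    // list[i].nodeName() is well-formed.
    node operator[](std::size_t i) const { return node(get_item(i)); }

private:
    std::vector<raw_node_ptr> m_items;
};

// Takes the wrapper, so a raw_node_ptr argument is converted implicitly.
inline std::wstring name_of(node const& n) { return n.nodeName(); }
```

With these definitions, name_of(list.get_item(0)) compiles thanks to the implicit conversion, and list[0].nodeName() compiles only because the subscript operator in this sketch returns the wrapper.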

2. We can pass the node into an instance of a class I wrote several years ago - child_node_sequence - which would afford the following use pattern:

    static void dump_node(int depth, xmlstl::msxml::dom::node const &n)
    {
        stlsoft::simple_wstring  prefix(depth, ' ');

        wcout << prefix << L"<" << n.nodeName;

        xmlstl::msxml::dom::child_node_sequence children(xmlstl::get_ref(n));
        { for(xmlstl::msxml::dom::child_node_sequence::iterator begin = children.begin();
                begin != children.end(); ++begin)
        {
            dump_node(1 + depth, *begin);
        }}


        wcout << prefix << L"</" << n.nodeName << L">" << endl;
    }

The child_node_sequence class and the node class currently have no relationship. They exchange instances of IXMLDOMNode_ptr. (This is the Ref element of the Handle::Ref pattern, which I'm busily writing up at long last at the moment.) The ugly use of the xmlstl::get_ref() shim is needed only until I build that into child_node_sequence.

3. I've several ideas about how to make things a little more succinct:

a. Have node_list present begin() and end() methods, i.e. turn it into an
STL sequence. And the same for named_node_map
b. Have the node class present enum_children(), enum_attributes(), which
would return instances of child_node_sequence and attribute_sequence (which
is another sequence class from the early XMLSTL attempts some time back)
c. Have the node class present child_begin(), child_end(), attr_begin(),
attr_end()
d. Something else I've not yet thought of ...

At the moment, I'm leaning towards (a). For both (a) and (b), however, I'll
have to ensure that iterators from disparate sequence instances are
compatible, i.e. the "state" they hold will have to pertain to the
underlying DOM instances (for MSXML this is IXMLDOMNodeList* and
IXMLDOMNamedNodeMap*)
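As a rough illustration of option (a) and of the compatibility requirement, here is a self-contained sketch. It is not XMLSTL code: raw_list is a hypothetical stand-in for IXMLDOMNodeList that offers only length/item access, and the elements are plain strings rather than nodes. The iterator's state is the underlying list plus an index, so iterators obtained from two separate wrapper instances over the same underlying list compare equal.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the underlying DOM list (IXMLDOMNodeList in
// MSXML), offering only W3C-style length/item access.
struct raw_list
{
    std::vector<std::string> names;

    std::size_t length() const { return names.size(); }
    std::string const& item(std::size_t i) const { return names[i]; }
};

// Plays the role of node_list, presented as an STL-style sequence.
class node_list
{
public:
    class const_iterator
    {
    public:
        typedef std::forward_iterator_tag iterator_category;
        typedef std::string               value_type;
        typedef std::ptrdiff_t            difference_type;
        typedef std::string const*        pointer;
        typedef std::string const&        reference;

        const_iterator() : m_list(), m_index(0) {}
        const_iterator(std::shared_ptr<raw_list const> list, std::size_t index)
            : m_list(list), m_index(index)
        {}

        reference operator*() const
        {
            assert(m_list && m_index < m_list->length());
            return m_list->item(m_index);
        }
        pointer operator->() const { return &**this; }

        const_iterator& operator++() { ++m_index; return *this; }
        const_iterator operator++(int)
        {
            const_iterator previous(*this);
            ++m_index;
            return previous;
        }

        // Equality depends only on the underlying list and the position,
        // never on which wrapper instance produced the iterator.
        friend bool operator==(const_iterator const& a, const_iterator const& b)
        {
            return a.m_list == b.m_list && a.m_index == b.m_index;
        }
        friend bool operator!=(const_iterator const& a, const_iterator const& b)
        {
            return !(a == b);
        }

    private:
        std::shared_ptr<raw_list const> m_list;  // shared, like a COM reference
        std::size_t                     m_index;
    };

    explicit node_list(std::shared_ptr<raw_list const> list)
        : m_list(list)
    {
        assert(m_list);
    }

    const_iterator begin() const { return const_iterator(m_list, 0); }
    const_iterator end() const { return const_iterator(m_list, m_list->length()); }

private:
    std::shared_ptr<raw_list const> m_list;
};
```

Because each iterator shares ownership of the underlying list, it also stays valid after the wrapper that produced it has gone away, which is roughly what holding a reference-counted interface pointer would give for MSXML.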

Whatever the end result, the implementation should be relatively straightforward. It's deciding on the interface that's the key. I think that that will be informed by other people using the current lib, and by my mapping other XML libs, such as Xerces.

I'll try and get an alpha out in the next few days, after I've written another test program. It'll be a lot easier to talk about once people can get their hands on some code.

Cheers

Matthew



December 01, 2005
>
> > In particular, one
> > of the reasons I am looking for a wrapper is that the interface for
> > iteration is cumbersome. How will that be presented?
>
> Excellent point. At the moment, I've got the xmlstl::msxml::dom::* classes
> just using the (get_)length and (get_)item, as in the specified interface
> for NodeList:
>
>     interface NodeList {
>       Node               item(in unsigned long index);
>       readonly attribute unsigned long    length;
>     };
>
> So, to enumerate the child nodes of a node n, we currently have two options:
>
> 1. We can use the "childNodes" property on the node to obtain the node_list,
> and then use the "length" and "item" properties on the node_list instance.
> This is the W3C recommended practice. Here's an extract from the
> dom_node_print program I wrote yesterday (that also does attributes, and
> comments).
>
>
> 2. We can pass the node into an instance of a class I wrote several years ago - child_node_sequence - which would afford the following use pattern:
>
>     static void dump_node(int depth, xmlstl::msxml::dom::node const &n)
>     {
>         stlsoft::simple_wstring  prefix(depth, ' ');
>
>         wcout << prefix << L"<" << n.nodeName;
>
>         xmlstl::msxml::dom::child_node_sequence children(xmlstl::get_ref(n));
>         { for(xmlstl::msxml::dom::child_node_sequence::iterator begin = children.begin();
>                 begin != children.end(); ++begin)
>         {
>             dump_node(1 + depth, *begin);
>         }}
>
>
>         wcout << prefix << L"</" << n.nodeName << L">" << endl;
>     }
>
> The child_node_sequence class and the node class currently have no
> relationship. They exchange instances of IXMLDOMNode_ptr. (This is the Ref
> element of the Handle::Ref pattern, which I'm busily writing up at long last
> at the moment.) The ugly use of the xmlstl::get_ref() shim is needed only
> until I build that into child_node_sequence.
>
> 3. I've several ideas about how to make things a little more succinct
>
> a. Have node_list present begin() and end() methods, i.e. turn it into an
> STL sequence. And the same for named_node_map
> b. Have the node class present enum_children(), enum_attributes(), which
> would return instances of child_node_sequence and attribute_sequence (which
> is another sequence class from the early XMLSTL attempts some time back)
> c. Have the node class present child_begin(), child_end(), attr_begin(),
> attr_end()
> d. Something else I've not yet thought of ...
>
> At the moment, I'm leaning towards (a). For both (a) and (b), however, I'll
> have to ensure that iterators from disparate sequence instances are
> compatible, i.e. the "state" they hold will have to pertain to the
> underlying DOM instances (for MSXML this is IXMLDOMNodeList* and
> IXMLDOMNamedNodeMap*)
>

I was thinking along the lines of 3-c or 3-b. That's what I'm looking for:
STL-style iteration through the children. It wouldn't hurt to have 3-a as
well.

Another method of iteration people may want is a depth-first iteration of
all subelements of a node, not just its immediate children. I don't know how
that would factor in, but it shouldn't affect the decisions about the
shallow child iteration.
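For what it's worth, a depth-first walk can be layered on top of whatever shallow child iteration is chosen. The sketch below is only an illustration under stated assumptions: element is a hypothetical in-memory tree type, not an XMLSTL or MSXML class, and descendants_depth_first is a made-up name. It visits every descendant of a node in document (pre-order) order using an explicit stack, so it needs nothing beyond access to each node's immediate children.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory tree node; stands in for a DOM element.
struct element
{
    std::string          name;
    std::vector<element> children;
};

// Returns the names of all descendants of root (root itself excluded),
// depth-first, in document order.
inline std::vector<std::string> descendants_depth_first(element const& root)
{
    std::vector<std::string>    names;
    std::vector<element const*> pending;  // explicit stack of nodes to visit

    // Push children in reverse so that the first child is popped first.
    for (std::vector<element>::const_reverse_iterator it = root.children.rbegin();
            it != root.children.rend(); ++it)
    {
        pending.push_back(&*it);
    }

    while (!pending.empty())
    {
        element const* e = pending.back();
        pending.pop_back();
        assert(e != 0);

        names.push_back(e->name);

        // Descend: this node's children are visited before its siblings.
        for (std::vector<element>::const_reverse_iterator it = e->children.rbegin();
                it != e->children.rend(); ++it)
        {
            pending.push_back(&*it);
        }
    }

    return names;
}
```

The same loop could be wrapped in an iterator class whose state is the pending stack, which would keep it independent of how the shallow child iteration ends up being exposed.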

Eric