Thread overview
String Manipulation
Jul 10, 2007
okibi
Jul 10, 2007
okibi
Jul 10, 2007
Gilles G.
Jul 10, 2007
Gilles G.
Jul 10, 2007
Frits van Bommel
Jul 10, 2007
okibi
July 10, 2007
I have a question for you all.

If I have the following string or char[], how would I get the xsl filename out of it?

char[] myStr = "...<?xml-stylesheet type=\"text/xsl\" href=\"example.xsl\"?>...";

Is there a way to get it to return just example.xsl?

Thanks!
July 10, 2007
Well, I ended up just doing some splits to get the location of the xsl file. Still, I'd like someone to tell me if there is an easier way.

Thanks!
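
For readers wondering what "some splits" might look like, here is a minimal sketch. It is not okibi's actual code, which was never posted. It assumes D 1.0 Phobos as in the rest of the thread, and `hrefBySplit` is a hypothetical helper name invented for illustration.

```d
import std.string;
import std.stdio;

// Hypothetical helper: returns the text between href=" and the next
// double quote, or null if the input does not contain href=".
char[] hrefBySplit(char[] s) {
    // Everything after the first href=" ends up in parts[1].
    char[][] parts = split(s, "href=\"");
    if (parts.length < 2) {
        return null;
    }
    // Cut again at the closing double quote; the first piece is the value.
    char[][] rest = split(parts[1], "\"");
    if (rest.length == 0) {
        return null;
    }
    return rest[0];
}

void main() {
    char[] myStr = "...<?xml-stylesheet type=\"text/xsl\" href=\"example.xsl\"?>...";
    writefln("Found: '%s'", hrefBySplit(myStr));
}
```

This only handles double-quoted attributes and takes the first href=" anywhere in the input, which is one reason a regular expression may be the easier route.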

July 10, 2007
Hello,
maybe you can just use a regular expression...
In D, this will give something like:
import std.regexp;
void main()
{
    char[] myStr = "...<?xml-stylesheet type=\"text/xsl\" href=\"example.xsl\"?>...";
    // this is the first RegExp I found, it may be flawed... It assumes the xsl
    // name has the extension ".xsl"
    auto nameOfStyleSheetRe = new RegExp(^.*href=\\\"(.*\.xsl).*$);
    auto m = nameOfStyleSheetRe.match(myStr);
    // m[1] should now contain the string "example.xsl"
    // ... do what you want
}

July 10, 2007
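Oops, sorry!
Of course you should use
RegExp(`^.*href="(.*\.xsl).*$`);
instead of
RegExp(^.*href=\\\"(.*\.xsl).*$);

(I forgot the quotes around the pattern. A backquoted string keeps D from treating the backslash as an escape character; r"..." does the same but cannot contain a double quote, so it does not work for this pattern. The \\\" is gone as well: myStr contains no literal backslash, so the regexp has to match a bare double quote.)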

Regards.
--
Gilles

July 10, 2007
[fixed upside-down reply]
okibi wrote:
> okibi Wrote:
> 
>> I have a question for you all.
>>
>> If I have the following string or char[], how would I get the xsl filename out of it?
>>
>> char[] myStr = "...<?xml-stylesheet type=\"text/xsl\" href=\"example.xsl\"?>...";
>>
>> Is there a way to get it to return just example.xsl?
>>
> Well, I ended up just doing some splits to get what the location of the xsl file will be. Still, I'd like someone to tell me if there is an easier way.

Did you try regexes (regular expressions)? (see http://www.digitalmars.com/d/1.0/phobos/std_regexp.html)

(I see Gilles G. has already suggested regexes since I started this post, but I'll post it anyway since I think my suggested regex is better :) )

For example:
---
import std.regexp;
import std.stdio;

void main() {
    char[] myStr = "...<?xml-stylesheet type=\"text/xsl\" href=\"example.xsl\"?>...";

    if (auto m = search(myStr, `<\?.*href="(.*)".*\?>`)) {
        writefln("Match: '%s'", m.match(1));
    } else {
        writefln("No match found.");
    }
}
---
(Correct for linewrapping before use)
This picks out the text between (double) quotes after 'href=' in a '<?'-'?>' block. You'll need to be a bit trickier if you want to handle single quotes as well (or is that an HTML-only thing?). Perhaps `<\?.*href=(["'])(.*)\1.*\?>`: the part between the first parentheses captures the opening quote after 'href=', and the \1 says to match the same quote at the closing position. The .xsl file name is then m.match(2), since m.match(1) is now the opening quote.
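
The single-quote variant could be tried out with something like the sketch below. It assumes D 1.0 Phobos (std.regexp) as in the example above and that backreferences behave as described; `hrefOf` is a hypothetical helper name, not part of any library.

```d
import std.regexp;
import std.stdio;

// Hypothetical helper: returns the href value, or null if nothing matches.
char[] hrefOf(char[] s) {
    // Group 1 captures the opening quote and \1 demands the same closing
    // quote, so the file name ends up in group 2.
    if (auto m = search(s, `<\?.*href=(["'])(.*)\1.*\?>`)) {
        return m.match(2);
    }
    return null;
}

void main() {
    writefln("double quotes: '%s'",
             hrefOf(`...<?xml-stylesheet type="text/xsl" href="example.xsl"?>...`));
    writefln("single quotes: '%s'",
             hrefOf(`...<?xml-stylesheet type='text/xsl' href='example.xsl'?>...`));
}
```

Note that (.*) is greedy, so if more quoted attributes follow href inside the same block, the capture may run up to a later quote; a character class that excludes quotes would be a tighter choice in that case.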

By the way: note the use of a backquoted string (``s) to avoid escaping the '\'s and '"'s, otherwise the original regexp would be "<\\?.*href=\"(.*)\".*\\?>" which is equivalent but uglier (IMHO). You could also use this for your XML string to avoid '\"' all over the place.
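
To make the equivalence concrete, here is a small sketch comparing the escaped and backquoted spellings of both the XML string and the regexp. It assumes D 1.0, where a string literal can be assigned to char[] as elsewhere in this thread; the variable names are made up for illustration.

```d
import std.stdio;

void main() {
    // The same XML text, once with escapes and once as a backquoted string.
    char[] escaped = "...<?xml-stylesheet type=\"text/xsl\" href=\"example.xsl\"?>...";
    char[] wysiwyg = `...<?xml-stylesheet type="text/xsl" href="example.xsl"?>...`;
    assert(escaped == wysiwyg);

    // The two spellings of the regexp are the same string as well.
    char[] reEscaped = "<\\?.*href=\"(.*)\".*\\?>";
    char[] reWysiwyg = `<\?.*href="(.*)".*\?>`;
    assert(reEscaped == reWysiwyg);

    writefln("both pairs are identical");
}
```

The one thing a backquoted string cannot contain is a backquote itself, so it suits patterns and XML like these but not every literal.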

Also notice the quoting of '?' as '\?' since '?' is a special character in regexes.
July 10, 2007
That's exactly what I wanted! I tried using regex, but I've never really understood how they work. It's one of those things I think I need someone to spell out for me, lol.

Thanks!
