
I stumbled upon a behaviour I cannot explain and hope someone here can help me out.

I am trying to generate some documentation from a larger Ant project, so I use sed to extract the information I need from the build files.

I have a normal Ant build file with lines like this:

    <target name="targetA" depends="targetD" description="some fancy description">
...
    <target name="targetB" depends="targetD" description="some fancy description">
...
    <target name="targetC" depends="targetD" description="some fancy description">

I then run this sed command on it:

    sed -nr 's/.*?target name="(.*?)".*="(.*?)".*/ * \1 - \2/p'

It should give me:

 * targetA - some fancy description
 * targetB - some fancy description
 * targetC - some fancy description

Instead I get:

 * targetA" depends="targetD" - some fancy description
 * targetB" depends="targetD" - some fancy description
 * targetC" depends="targetD" - some fancy description

I tried leaving the second group out of the output to confirm that it is the first group that captures the whole "depends" part, even though I wrote the regex to be non-greedy up to the next double quote.

What am I missing here?

A more explicit regex like the following works as expected, but I still don't understand the greediness:

    sed -nr 's/.*?target name="(.*?)".*=.*="(.*?)".*/ * \1 - \2/p'

In case it matters, I'm using sed-4.2.2-4ubuntu1 on Ubuntu Linux (default install).

Jawa

1 Answer


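sed does not support non-greedy matching. Its POSIX regular expressions (basic, or extended with -r) have no lazy quantifier, so the ".*?" in your expression still matches greedily and the first group runs on past the closing double quote you intended.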
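The difference can be seen on a single line. This is only a minimal sketch, assuming GNU sed with -r and using one sample line copied from the question; the variable names `line`, `greedy` and `bounded` are made up for the illustration.

```shell
# Sample line taken from the build file shown in the question.
line='    <target name="targetA" depends="targetD" description="some fancy description">'

# ".*?" is not lazy in sed: the capture still extends as far as the rest
# of the pattern allows, well past the first closing quote.
greedy=$(printf '%s\n' "$line" | sed -nr 's/.*target name="(.*?)".*/\1/p')

# A negated character class cannot cross a double quote, so the capture
# ends at the first closing quote.
bounded=$(printf '%s\n' "$line" | sed -nr 's/.*target name="([^"]*)".*/\1/p')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

With GNU sed, the first capture should still contain the depends attribute, while the second should be just the target name.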

Try this:

    sed -nr 's/.*target name="([^"]*)" .*="(.*)".*/ * \1 - \2/p' file

Output:

 * targetA - some fancy description
 * targetB - some fancy description
 * targetC - some fancy description
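A possible variation, not part of the command above: use the negated class for both groups and tie the second one to the description attribute by name, so it does not depend on description being the last attribute on the line. This is a sketch under the assumption that name comes before description on the same line; the temporary file only stands in for the real build file.

```shell
# Hypothetical sample file standing in for the real Ant build file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
    <target name="targetA" depends="targetD" description="some fancy description">
    <target name="targetB" depends="targetD" description="some fancy description">
    <target name="targetC" depends="targetD" description="some fancy description">
EOF

# Both captures use [^"]* so each one stops at its own closing quote;
# the second capture is anchored on the attribute name "description".
result=$(sed -nr 's/.*<target name="([^"]*)".* description="([^"]*)".*/ * \1 - \2/p' "$tmp")
rm -f "$tmp"

printf '%s\n' "$result"
```

Targets without a description attribute are simply skipped by this pattern, since -n with the p flag prints only lines where the substitution succeeded.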
Cyrus