Your problem is a bit trickier than "Why do I get the first capture group only?", but some of those ideas may help. The trick is to stop thinking about doing everything in a single pattern.
If that's really your input, I'd be tempted to match groups of things around an =. Matching in list context, such as assigning to a hash, returns the list of matches:
use Data::Dumper;
my $input = "<cell> cell1=cell2 <pin> pin1=pin2 pin3=pin4 <type> type1=type2";
my %values = $input =~ m/ (\S+) = (\S+) /gx;
print Dumper( \%values );
The things before the = become keys and the things after become the values:
$VAR1 = {
  'pin1' => 'pin2',
  'type1' => 'type2',
  'cell1' => 'cell2',
  'pin3' => 'pin4'
};
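To see why the hash assignment works, it may help to look at the list the match returns before it becomes a hash. This is a small sketch using the same input; the `@pairs` variable is only for illustration:

```perl
use strict;
use warnings;

my $input = "<cell> cell1=cell2 <pin> pin1=pin2 pin3=pin4 <type> type1=type2";

# In list context, a /g match returns the captures from every match
# as one flat list: key, value, key, value, ...
my @pairs = $input =~ m/ (\S+) = (\S+) /gx;

print join( ' ', @pairs ), "\n";
```

This should print `cell1 cell2 pin1 pin2 pin3 pin4 type1 type2`. Assigning that list to a hash pairs the items up in order.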
But life probably isn't that easy. The example names probably don't really have pin, cell, and so on.
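One way to read that caveat: if the real names don't carry their section in them, the flat hash can't tell the sections apart. Here's a sketch with made-up input (the names `clk` and `rst` are hypothetical, not from your data) where the same name shows up in two sections:

```perl
use strict;
use warnings;

# Hypothetical input: "clk" appears under both <cell> and <pin>
my $input = "<cell> clk=a <pin> clk=b rst=c";

my %values = $input =~ m/ (\S+) = (\S+) /gx;

# The section names are gone, and the later clk replaced the earlier one
for my $key ( sort keys %values ) {
    print "$key => $values{$key}\n";
}
```

Only `clk => b` and `rst => c` survive, and nothing records which section either one came from.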
There's another thing I like to do, though, because I miss having all that fun with sscanf. You can walk a string by matching part of it at a time, then on the next match, start where you left off. Here's the whole thing first:
use v5.10;
use Data::Dumper;
my $input = "<cell> cell1=cell2 <pin> pin1=pin2 pin3=pin4 <type> type1=type2";
my %hash;
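while( 1 ) {
    state $type;  # the most recent <...> section seen
    # a section marker such as <cell> sets the current type
    if( $input =~ /\G < (.*?) > \s* /xgc ) {
        $type = $1;
    }
    # a name=value pair is filed under the current type
    elsif( $input =~ /\G (\S+) = (\S+) \s* /xgc ) {
        $hash{$type}{$1}{$2}++;
    }
    # nothing matched at this position, so stop
    else { last }
}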
print Dumper( \%hash );
And the data structure, which really doesn't matter and can be anything that you like:
$VAR1 = {
  'type' => {
    'type1' => {
      'type2' => 1
    }
  },
  'pin' => {
    'pin1' => {
      'pin2' => 1
    },
    'pin3' => {
      'pin4' => 1
    }
  },
  'cell' => {
    'cell1' => {
      'cell2' => 1
    }
  }
};
But let's talk about this for a moment. First, all of the matches are in scalar context since they are in the conditional parts of the if-elsif-else branches. That means each one makes only the next match.
However, I've anchored the start of each pattern with \G. This makes the pattern match at the beginning of the string or the position where the previous successful match left off when I use the /g flag in scalar context.
But, I want to try several patterns, so some of them are going to fail. That's where the /c flag comes in. It doesn't reset the match position on failure. That means the \G anchor won't reset on an unsuccessful match. So, I can try a pattern, and if that doesn't work, start at the same position with the next one.
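Here's a tiny sketch of that difference, separate from the parsing code. The string and variable names are made up for the demonstration; it tries the same pattern three times and records `pos` after each attempt:

```perl
use v5.10;
use strict;
use warnings;

my $text = "abc123";

# Succeeds: matches "abc" and leaves the position at 3
$text =~ /\G [a-z]+ /xgc;
my $after_success = pos($text);

# Fails at position 3 (a digit is next), but /c keeps the position
$text =~ /\G [a-z]+ /xgc;
my $after_failure_c = pos($text);

# Fails again, this time without /c, so the position is reset
$text =~ /\G [a-z]+ /xg;
my $after_failure = pos($text);

say "after success:           $after_success";
say "after failure with /c:   $after_failure_c";
say "after failure, no /c:    " . ( $after_failure // 'undef' );
```

The first two lines should report 3 and the last should report undef, which is why every branch in the loop carries /c.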
So, when I encounter something in angle brackets, I remember that type. Until I match another thing in angle brackets, that's the type of thing I'm matching. Now when I match (\S+) = (\S+), I can assign the matches to the right type.
To watch this happen, you can output the remembered string position. Each scalar maintains its own cursor and pos(VAR) returns that position:
use v5.10;
use Data::Dumper;
my $input = "<cell> cell1=cell2 <pin> pin1=pin2 pin3=pin4 <type> type1=type2";
my %hash;
while( 1 ) {
    state $type;
    say "Starting matches at " . ( pos($input) // 0 );
    if( $input =~ /\G < (.*?) > \s* /xgc ) {
        $type = $1;
        say "Matched <$type>, left off at " . pos($input);
    }
    elsif( $input =~ /\G (\S+) = (\S+) \s* /xgc ) {
        $hash{$type}{$1}{$2}++;
        say "Matched <$1|$2>, left off at " . pos($input);
    }
    else {
        say "Nothing left to do, left off at " . pos($input);
        last;
    }
}
print Dumper( \%hash );
Before the Dumper output, you now see the global matches in scalar context walk the string:
Starting matches at 0
Matched <cell>, left off at 7
Starting matches at 7
Matched <cell1|cell2>, left off at 19
Starting matches at 19
Matched <pin>, left off at 25
Starting matches at 25
Matched <pin1|pin2>, left off at 35
Starting matches at 35
Matched <pin3|pin4>, left off at 45
Starting matches at 45
Matched <type>, left off at 52
Starting matches at 52
Matched <type1|type2>, left off at 63
Starting matches at 63
Nothing left to do, left off at 63
Finally, as a bonus, here's a recursive descent grammar that does it. It's certainly overkill for what you've provided, but it does better in trickier situations. I won't explain it other than to say it produces the same data structure:
use v5.26;  # the indented heredoc (<<~) needs Perl 5.26 or later
use Parse::RecDescent;
use Data::Dumper;
my $grammar = <<~'HERE';
startrule: context_pairlist(s)
context_pairlist: context /\s*/ pair(s)
context: '<' /[^>]+/ '>'
{ $::context = $item[2] }
pair: /[A-Za-z0-9]+/ '=' /[A-Za-z0-9]+/
{ main::build_hash( $::context, @item[1,3] ) }
HERE
my $parser = Parse::RecDescent->new( $grammar );
my %hash;
sub build_hash {
my( $context, $name, $value ) = @_;
$hash{$context}{$name}{$value}++;
}
my $input = "<cell> cell1=cell2 <pin> pin1=pin2 pin3=pin4 <type> type1=type2";
$parser->startrule( $input );
say Dumper( \%hash );