I'm trying to learn Python 3 and I'm a little stuck on regular expressions. I studied the HOWTO for this (this page), but I did not understand it very well.
1\d2\D2
^a\w+z$
you can generate example strings by reading the expression and choosing appropriate characters step by step.
for example, 1\d2\D2:
1\d2\D2 -> 1
^        1 means a literal number 1
1\d2\D2 -> 17
 ^^      \d means any digit (0-9). let's choose 7.
1\d2\D2 -> 172
   ^     2 means a literal number 2.
1\d2\D2 -> 172X
    ^^   \D means anything *but* a digit (0-9). let's choose X
1\d2\D2 -> 172X2
      ^  2 means a literal number 2.
so 172X2 would be matched by 1\d2\D2
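if you want to check a hand-built string like this, a minimal sketch with Python's re module could look like the following. the helper name matches and the extra candidate strings are just for illustration; re.fullmatch is used so the whole string has to fit the pattern, not only a piece of it.

```python
import re

# the first pattern from the question; the r"" prefix keeps the backslashes intact
pattern = re.compile(r"1\d2\D2")

def matches(text):
    """Return True if the entire string is matched by the pattern."""
    return pattern.fullmatch(text) is not None

# 172X2 is the string built step by step above; the others are extra test cases
for candidate in ["172X2", "102-2", "17222", "1a2X2"]:
    print(candidate, matches(candidate))
```

the first two candidates should print True. 17222 fails because \D refuses the digit in fourth position, and 1a2X2 fails because \d refuses the letter in second position.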
your next one - ^a\w+z$ - can have multiple lengths:
^a\w+z$
^        this means we need to be at the start of a line (and we are, so that's cool)
^a\w+z$ -> a
 ^       a means a literal letter a
^a\w+z$ -> a4
  ^^     \w means a digit, letter, or "_". let's choose 4.
^a\w+z$ -> a4
    ^    + means we can return to whatever is to the left, if we want, so let's do that...
^a\w+z$ -> a4Q
  ^^     \w means a digit, letter, or "_". let's choose Q.
^a\w+z$ -> a4Q
    ^    + means we can return to whatever is to the left, if we want, so let's do that...
^a\w+z$ -> a4Q1
  ^^     \w means a digit, letter, or "_". let's choose 1.
^a\w+z$ -> a4Q1
    ^    + means we can return to whatever is to the left, but now let's stop
^a\w+z$ -> a4Q1z
     ^   z means a literal letter z
^a\w+z$ -> a4Q1z
      ^  $ means we must be at the end of the line, and we are (and so cannot add more)
so a4Q1z would be matched by ^a\w+z$. so would a4z (you can check...)
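here is one way you might do that check, as a small sketch. the helper name is_match and the failing examples are made up for illustration; re.search is enough here because the pattern carries its own ^ and $ anchors.

```python
import re

# the second pattern from the question, anchored at both ends
pattern = re.compile(r"^a\w+z$")

def is_match(text):
    """Return True if the pattern matches the given string."""
    return pattern.search(text) is not None

print(is_match("a4Q1z"))  # True: the string built step by step above
print(is_match("a4z"))    # True: \w+ matched just once
print(is_match("az"))     # False: \w+ needs at least one character between a and z
print(is_match("a-z"))    # False: "-" is not a digit, letter, or "_"
print(is_match("ba4z"))   # False: ^ requires the a to be the first character
```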
note that * is like + in that you can jump back and repeat, but it also lets you completely skip what is to the left. in other words, + means "repeat one or more times", while * means "repeat zero or more times" (the "zero" being the skip).
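to see the difference, you could compare the original pattern with a variant that swaps + for *. the variant ^a\w*z$ is not from the question, it is only here to show the "skip" case.

```python
import re

plus_pattern = re.compile(r"^a\w+z$")   # at least one \w between a and z
star_pattern = re.compile(r"^a\w*z$")   # hypothetical variant: zero or more \w

def compare(text):
    """Return (matches with +, matches with *) for the given string."""
    return (plus_pattern.search(text) is not None,
            star_pattern.search(text) is not None)

print(compare("az"))     # (False, True): only * may skip the \w entirely
print(compare("a4z"))    # (True, True)
print(compare("a4Q1z"))  # (True, True)
```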
update:
[abc] means pick any one of a, b or c.
x{2,3} means add x 2 to 3 times (like + but with limits to the number of times). so, xx or xxx.
\1 is a bit more complicated. you need to find what was matched inside the first (because of the number 1) set of parentheses and add that again. so, for example, (\d+)\1 would match 2323 if you had worked from left to right and chosen 23 for (\d+).
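these three constructs can be tried out the same way. the sketch below assumes a small helper called full (a made-up name) that checks whether the whole string fits a pattern.

```python
import re

def full(pattern, text):
    """Return True if the entire string is matched by the pattern."""
    return re.fullmatch(pattern, text) is not None

# [abc] picks exactly one of a, b or c
print(full(r"[abc]", "b"), full(r"[abc]", "d"), full(r"[abc]", "ab"))
# True False False

# x{2,3} repeats x two or three times
print([full(r"x{2,3}", "x" * n) for n in range(1, 5)])
# [False, True, True, False]

# \1 repeats whatever the first group matched
m = re.fullmatch(r"(\d+)\1", "2323")
print(m.group(1))                 # 23
print(full(r"(\d+)\1", "2324"))   # False: the second half is not a repeat of the first
```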
To generate some samples that would be matched, one would probably parse the regex and send each chunk of it to a function you would write, something like getRandomSatisfyingText. Call it a bunch of times until you get 3 unique strings. It probably wouldn't be too hard until you started supporting zero-width assertions (lookaheads and lookbehinds).
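A rough sketch of that idea follows, under heavy assumptions: it only understands literal characters, the escapes \d, \D and \w, the quantifiers + and *, and the anchors ^ and $. The names tokenize, get_random_satisfying_text and sample_matches are hypothetical, and the character sets chosen for each escape are a small subset of what the real classes accept.

```python
import random
import re
import string

# Candidate characters per supported escape (an assumed, deliberately small subset).
CLASS_CHARS = {
    "d": string.digits,
    "D": string.ascii_letters + "_-",
    "w": string.ascii_letters + string.digits + "_",
}

def tokenize(pattern):
    """Split a pattern into (candidate_chars, quantifier) pairs."""
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch in "^$":                      # anchors contribute no text
            i += 1
            continue
        if ch == "\\":                      # escape such as \d or \w
            chars = CLASS_CHARS[pattern[i + 1]]
            i += 2
        else:                               # plain literal character
            chars = ch
            i += 1
        quant = ""
        if i < len(pattern) and pattern[i] in "+*":
            quant = pattern[i]
            i += 1
        tokens.append((chars, quant))
    return tokens

def get_random_satisfying_text(pattern, max_repeat=4):
    """Build one random string by choosing characters chunk by chunk."""
    pieces = []
    for chars, quant in tokenize(pattern):
        low = 0 if quant == "*" else 1      # * may skip the chunk entirely
        high = max_repeat if quant else 1   # no quantifier means exactly once
        count = random.randint(low, high)
        pieces.append("".join(random.choice(chars) for _ in range(count)))
    return "".join(pieces)

def sample_matches(pattern, wanted=3, max_attempts=1000):
    """Collect up to `wanted` unique strings, each verified with re.fullmatch."""
    samples = set()
    for _ in range(max_attempts):
        text = get_random_satisfying_text(pattern)
        if re.fullmatch(pattern, text):     # sanity check against the real engine
            samples.add(text)
        if len(samples) == wanted:
            break
    return sorted(samples)

print(sample_matches(r"1\d2\D2"))
print(sample_matches(r"^a\w+z$"))
```

Every generated string is checked against re.fullmatch before it is kept, so a mistake in the generator shows up as missing samples rather than wrong ones. Anything outside the supported subset (groups, character sets, {m,n}, backreferences) would need a real parser.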