An example describes it better. Suppose you have a structure like this:
<h1>TITLE OF HEAD 1</h1>
<table>
<tbody>
<tr>
<td class="one">ITEM 1, AFTER HEAD 1</td>
</tr>
<tr>
<td class="one">ITEM 2, AFTER HEAD 1</td>
</tr>
</tbody>
</table>
<table>
<tbody>
<tr>
<td class="one">ITEM 3, AFTER HEAD 1</td>
</tr>
<tr>
<td class="one">ITEM 4, AFTER HEAD 1</td>
</tr>
<tr>
<td class="one">ITEM 5, AFTER HEAD 1</td>
</tr>
</tbody>
</table>
<h1>TITLE OF HEAD 2</h1>
<table>
<tbody>
<tr>
<td class="one">ITEM 6, AFTER HEAD 2</td>
</tr>
</tbody>
</table>
<h1>TITLE OF HEAD 3</h1>
<table>
<tbody>
<tr>
<td class="one">ITEM 7, AFTER HEAD 3</td>
</tr>
<tr>
<td class="one">ITEM 8, AFTER HEAD 3</td>
</tr>
<tr>
<td class="one">ITEM 9, AFTER HEAD 3</td>
</tr>
<tr>
<td class="one">ITEM 10, AFTER HEAD 3</td>
</tr>
</tbody>
</table>
<h1>TITLE OF HEAD 4</h1>
<table>
<tbody>
<tr>
<td class="one">ITEM 11, AFTER HEAD 4</td>
</tr>
<tr>
<td class="one">ITEM 12, AFTER HEAD 4</td>
</tr>
</tbody>
</table>
After a regex find/replace, the result should be:
<table>
<tbody>
<tr>
<td class="one">ITEM 1, AFTER HEAD 1</td>
<td class="two">TITLE OF HEAD 1</td>
</tr>
<tr>
<td class="one">ITEM 2, AFTER HEAD 1</td>
<td class="two">TITLE OF HEAD 1</td>
</tr>
</tbody>
</table>
<table>
<tbody>
<tr>
<td class="one">ITEM 3, AFTER HEAD 1</td>
<td class="two">TITLE OF HEAD 1</td>
</tr>
<tr>
<td class="one">ITEM 4, AFTER HEAD 1</td>
<td class="two">TITLE OF HEAD 1</td>
</tr>
<tr>
<td class="one">ITEM 5, AFTER HEAD 1</td>
<td class="two">TITLE OF HEAD 1</td>
</tr>
</tbody>
</table>
<h1>TITLE OF HEAD 2</h1>
<table>
<tbody>
<tr>
<td class="one">ITEM 6, AFTER HEAD 2</td>
<td class="two">TITLE OF HEAD 2</td>
</tr>
</tbody>
</table>
<h1>TITLE OF HEAD 3</h1>
<table>
<tbody>
<tr>
<td class="one">ITEM 7, AFTER HEAD 3</td>
<td class="two">TITLE OF HEAD 3</td>
</tr>
<tr>
<td class="one">ITEM 8, AFTER HEAD 3</td>
<td class="two">TITLE OF HEAD 3</td>
</tr>
<tr>
<td class="one">ITEM 9, AFTER HEAD 3</td>
<td class="two">TITLE OF HEAD 3</td>
</tr>
<tr>
<td class="one">ITEM 10, AFTER HEAD 3</td>
<td class="two">TITLE OF HEAD 3</td>
</tr>
</tbody>
</table>
<h1>TITLE OF HEAD 4</h1>
<table>
<tbody>
<tr>
<td class="one">ITEM 11, AFTER HEAD 4</td>
<td class="two">TITLE OF HEAD 4</td>
</tr>
<tr>
<td class="one">ITEM 12, AFTER HEAD 4</td>
<td class="two">TITLE OF HEAD 4</td>
</tr>
</tbody>
</table>
What I've tried so far:
Getting the text inside each <h1> is easy:
find: (<h1>)(.*?)(</h1>)
replace: $2
Then I tried:
find: (<h1>)(.*?)(</h1>)(\n|.)*?(<td class="one">.*?</td>)
replace: $5<td class="two">$2</td>
which works, but the other tags are removed as well, so I've modified it:
find: (<h1>)(.*?)(</h1>)((\n|.)*?)(<td class="one">.*?</td>)
replace: $4$6<td class="two">$2</td>
The text of each h1 should be used for every td that follows it, until the next h1 occurs, whose text then takes over. The problem is that this only works for the first td after each h1, not for all of them.
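To make the intended behaviour concrete, here is a minimal sketch in Python rather than a single find/replace pattern. It relies on a replacement callback, which a plain editor find/replace may not offer. The function name add_heading_cells is made up for illustration, and the sketch assumes that the h1 lines stay in place and that the markup is as regular as in the sample above. It scans the text once, remembers the most recent h1 text, and appends a td class="two" after every td class="one".

```python
import re

# One pattern, two alternatives: either a heading (captured as "title")
# or a <td class="one"> cell. re.S lets "." span line breaks.
PATTERN = re.compile(r'<h1>(?P<title>.*?)</h1>|<td class="one">.*?</td>', re.S)


def add_heading_cells(html: str) -> str:
    """Append <td class="two">HEADING</td> after every <td class="one"> cell.

    HEADING is the text of the closest preceding <h1>. Hypothetical helper,
    written only to illustrate the desired result.
    """
    current = None  # text of the latest <h1> seen so far

    def repl(match: re.Match) -> str:
        nonlocal current
        title = match.group("title")
        if title is not None:
            # Matched a heading: remember its text, leave the heading as-is.
            current = title
            return match.group(0)
        if current is None:
            # A cell before any heading: nothing to add (assumption).
            return match.group(0)
        # Matched a cell: keep it and add the new cell on the next line.
        return f'{match.group(0)}\n<td class="two">{current}</td>'

    return PATTERN.sub(repl, html)
```

Calling add_heading_cells on the first snippet should add one td class="two" per row, each carrying the title of the nearest h1 above it. A real HTML parser would be more robust than a regex if the markup is less uniform than this sample.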
Could somebody tell me what needs to be added to the regex for this to work?
Thank you!