
I have an XML file that I need to edit using a batch file. How do you insert a newline after every occurrence of the word abstract_ in the whole file?

Here's a line in the xml (9999999x.xml)

<related-object content-type="image.extract" object-type="image/jpeg" specific-use="data" xlink:href="99999999_abstract_ddd.jpg"/><related-object content-type="image.extract" object-type="image/jpeg" specific-use="data" xlink:href="99999988_abstract_ddd.jpg"/><related-object content-type="image.extract" object-type="image/jpeg" specific-use="data" xlink:href="99999977_abstract_ddd.jpg"/><related-object content-type="image.extract" object-type="image/jpeg" specific-use="data" xlink:href="99999966_abstract_ddd.jpg"/>

What I want it to look like:

<related-object content-type="image.extract" object-type="image/jpeg" specific-use="data" xlink:href="99999999_abstract_
ddd.jpg"/><related-object content-type="image.extract" object-type="image/jpeg" specific-use="data" xlink:href="99999988_abstract_
ddd.jpg"/><related-object content-type="image.extract" object-type="image/jpeg" specific-use="data" xlink:href="99999977_abstract_
ddd.jpg"/><related-object content-type="image.extract" object-type="image/jpeg" specific-use="data" xlink:href="99999966_abstract_
ddd.jpg"/>

It doesn't have to overwrite the file; it just has to be saved in another text or temp file.

Thanks!

2 Answers

sed 's/abstract_/abstract_\n/g' 9999999x.xml > 9999999xa.xml

sed is a weird Unix editor which few people ever use, except to do in-line editing. It comes installed in practically all Linux distributions; on Windows it has to be installed separately.

In this case it takes the input file and applies the command between the quotes, which tells the editor to substitute abstract_ with abstract_\n. The g flag tells sed to replace every occurrence on a line, not just the first one; sed processes every line of the file either way.

It writes the result to stdout, which is redirected here to 9999999xa.xml. Don't redirect to the same file as the input: the shell usually truncates the output file before sed gets to read it, so you can end up with an empty file.
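To try the command without touching real data, here is a minimal sketch. It assumes GNU sed (for \n in the replacement) and a Unix-like shell; sample.xml and sample_out.xml are made-up names standing in for 9999999x.xml and 9999999xa.xml. In cmd.exe you would likely need double quotes around the sed expression instead of single quotes.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
cd "$workdir"

# sample.xml is a hypothetical stand-in for 9999999x.xml: one line, two matches.
printf '%s\n' '<a href="1_abstract_ddd.jpg"/><a href="2_abstract_ddd.jpg"/>' > sample.xml

# Same substitution as above; \n in the replacement is a GNU sed feature.
sed 's/abstract_/abstract_\n/g' sample.xml > sample_out.xml

# The input stays as it was; the output gains one line per match.
in_lines=$(wc -l < sample.xml)
out_lines=$(wc -l < sample_out.xml)
echo "input: $in_lines line(s), output: $out_lines line(s)"
```

Because the output goes to a second file, the original stays available for comparison.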

Edit: We're all so addicted to screen editors, where the text is shown and you move around and edit what you want.

sed is different - you have to know beforehand which commands you are going to apply to the file, and either write them into a 'script' file, or include the commands on the command line itself. Then sed will apply those commands in a (mostly) line-by-line based way to the input file.

The title 'in-line-editor' was probably earned by the fact that you can insert sed in a command line and use redirection to make it part of the process... An example (which could be optimized):

cat some.txt | sed 's/abstract_/abstract_\n/g' | sort

So the text flows from cat, through sed, to sort; this is called a pipeline. For the same reason sed is called a stream editor, which is in fact what its name stands for. Have a look at the introduction of sed's manual.
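The optimization hinted at above is to drop cat, since sed can read the file itself. A small sketch, again assuming GNU sed; the contents of some.txt are invented for the demonstration:

```shell
# some.txt is the placeholder name from the example above; create a tiny one here.
workdir=$(mktemp -d)
printf '%s\n' 'b_abstract_a' > "$workdir/some.txt"

# sed opens the file directly, so the leading cat is unnecessary.
result=$(sed 's/abstract_/abstract_\n/g' "$workdir/some.txt" | sort)
printf '%s\n' "$result"
```

The pipeline is one process shorter, and the rest behaves the same: sort still receives the split lines on its stdin.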

jcoppens

I have seen a hybrid of batch and JavaScript, where the file looks like a batch file but it's actually JavaScript running: jrepl.bat by Dave Benham http://www.dostips.com/forum/viewtopic.php?f=3&t=6044

Or use a third-party program like perl or sed. jcoppens' answer shows the sed approach. Sed is old; nowadays people tend to use perl. The question "Perl for matching with regular expressions in Terminal?" shows how to convert a sed search-and-replace line into perl. But it's OK to use sed. Sed works line by line, so it doesn't support \n in the find section (even the 'latest' version doesn't) unless you first join lines yourself. But sed (apart from an old version in unxutils) does support \n in the replace section, which is what you need. Perl of course supports it in both find and replace.

You can get sed from gnuwin32: http://gnuwin32.sourceforge.net/packages.html. Download sed there, and if you do, I suggest getting the gnuwin32 coreutils too, plus whatever other packages you find you want.

There is also an old sed in unxutils, but I don't suggest it: it is GNU sed version 3.02, which doesn't support \n, so it won't help. There is a sed in Windows SUA as well, but that one doesn't support \n either and doesn't even say what version it is.

So get sed from gnuwin32 or cygwin or MinGW or gow. Not SUA and not unxutils.
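If you are not sure which sed you ended up with, a quick check along these lines may help. It is only a sketch for a Unix-like shell (for example the one that comes with cygwin), and the messages are my own wording:

```shell
# Feed one short line through sed and see whether \n in the replacement
# really produced a line break.
out=$(printf 'a_b\n' | sed 's/_/_\n/')
lines=$(printf '%s\n' "$out" | wc -l)

if [ "$lines" -eq 2 ]; then
    printf '%s\n' 'this sed turns \n in the replacement into a newline'
else
    printf '%s\n' 'this sed does not; try another build'
fi
```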

barlop