In a bash script, I have logfileA.txt, which contains output from wget. I'd like to run grep on it to check for any instances of words such as "error" or "fail", like so:
grep -ni --color=never -e "error" -e "fail" logfileA.txt | awk -F: '{print "Line "$1": "$2}'
# grep -n line number, -i ignore case; awk to add better format to the line numbers (https://stackoverflow.com/questions/3968103)
The trouble is that I'm not getting reliable matches, and I suspect the wget output in logfileA.txt is full of characters that are messing up the input for grep.
While troubleshooting this, I found I cannot even cat the contents of the log file reliably. For instance, with cat logfileA.txt, all I get is the last line, which is garbled:
FINISHED --2019-05-29 17:08:52--me@here:/home/n$ 71913592/3871913592]atmed out). Retrying.
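Output that overwrites itself like this is often a sign of carriage returns (\r) in the file, although that is only an assumption here and has not been confirmed for this log. A minimal sketch of how one might check follows; sample.log and its contents are hypothetical stand-ins for logfileA.txt:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for logfileA.txt: updates separated by \r, with one final \n.
printf 'progress 0\rprogress 49\rRead error. Retrying.\rFINISHED\n' > sample.log

# Count carriage returns and newlines; many \r and few \n would explain
# why cat appears to show only one overwritten line.
cr_count=$(tr -cd '\r' < sample.log | wc -c | tr -d ' ')
nl_count=$(wc -l < sample.log | tr -d ' ')
echo "carriage returns: $cr_count, newlines: $nl_count"

# Dump the first bytes with control characters made visible (\r, \n, ...).
od -c sample.log | head -n 3
```

Running the same two counts against the real log would show whether it contains far more \r than \n.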
The contents of logfileA.txt are:
--2019-05-29 15:26:50--  http://somesite.com/somepath/a0_FooBar/BarFile.dat
Reusing existing connection to somesite.com:80.
HTTP request sent, awaiting response... 302 Found
Location: http://cdn.somesite.com/storage/a0_FooBar/BarFile.dat [following]
--2019-05-29 15:26:50--  http://cdn.somesite.com/storage/a0_FooBar/BarFile.dat
Resolving cdn.somesite.com (cdn.somesite.com)... xxx.xxx.xx.xx
Connecting to cdn.somesite.com (cdn.somesite.com)|xxx.xxx.xx.xx|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3871913592 (3.6G) [application/octet-stream]
Saving to: ‘a0_FooBar/BarFile.dat’
a0_FooBar/BarFile.dat   0%[                    ]       0  --.-KB/s               
a0_FooBar/BarFile.dat   0%[                    ]  15.47K  70.5KB/s               
...
a0_FooBar/BarFile.dat  49%[========>           ]   1.80G  --.-KB/s    in 50m 32s 
2019-05-29 16:17:23 (622 KB/s) - Read error at byte 1931163840/3871913592 (Connection timed out). Retrying.
--2019-05-29 16:17:24--  (try: 2)  http://cdn.somesite.com/storage/a0_FooBar/BarFile.dat
Connecting to cdn.somesite.com (cdn.somesite.com)|xxx.xxx.xx.xx|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 3871913592 (3.6G), 1940749752 (1.8G) remaining [application/octet-stream]
Saving to: ‘a0_FooBar/BarFile.dat’
a0_FooBar/BarFile.dat  49%[+++++++++           ]   1.80G  --.-KB/s               
...
a0_FooBar/BarFile.dat 100%[+++++++++==========>]   3.61G  1.09MB/s    in 34m 44s 
2019-05-29 16:52:09 (909 KB/s) - ‘a0_FooBar/BarFile.dat’ saved [3871913592/3871913592]
FINISHED --2019-05-29 17:08:52--
I assume the problem could be characters such as /, ---, >, ==> or |?
But since the output from wget can vary, how do I anticipate and escape anything problematic for grep?
Command:
grep -ni --color=never -e "error" -e "fail" logfileA.txt | awk -F: '{print "Line "$1": "$2}'
Expected output:
Line 17: 2019-05-29 16:17:23 (622 KB/s) - Read error at byte 1931163840/3871913592 (Connection timed out). Retrying.
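If carriage returns turn out to be the cause (an assumption, not something established above), one possible workaround is to convert each \r to \n before grep sees the data. The sketch below uses a hypothetical sample.log rather than the real logfileA.txt. It also swaps awk -F: for sed, because the timestamp itself contains colons and $2 would stop at the first of them:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for logfileA.txt: progress updates separated by \r.
printf 'Saving to: file.dat\nprogress 0\rprogress 49\n2019-05-29 16:17:23 (622 KB/s) - Read error at byte 100/200 (Connection timed out). Retrying.\nFINISHED\n' > sample.log

# Turn every carriage return into a newline so each update becomes its own line,
# match case-insensitively with line numbers, then rewrite only the leading
# "N:" prefix so colons inside the matched line are left untouched.
result=$(tr '\r' '\n' < sample.log \
  | grep -ni --color=never -e "error" -e "fail" \
  | sed 's/^\([0-9][0-9]*\):/Line \1: /')
echo "$result"
```

Note that the reported line numbers refer to the stream after conversion, so they may not match the line numbers an editor shows for the original file.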
Also, would ack be better suited to this job? If so, what would the command look like?