I am trying to read the output of a subprocess called from Python. To do this I am using Popen (because I do not think it is possible to pipe stdout if using subprocess.call).
As of now I have two ways of doing it which, in testing, seem to provide the same results. The code is as follows:
    from subprocess import Popen, PIPE

    with Popen(['Robocopy', source, destination, '/E', '/TEE', '/R:3', '/W:5', '/log+:log.txt'], stdout=PIPE) as Robocopy:
        for line in Robocopy.stdout:
            # stdout yields bytes here, so decode each line manually
            line = line.decode('ascii')
            message_list = [item.strip(' \t\n').replace('\r', '') for item in line.split('\t') if item != '']
            print(message_list[0], message_list[2])
        Robocopy.wait()
    returncode = Robocopy.returncode
and
    with Popen(['Robocopy', source, destination, '/E', '/TEE', '/R:3', '/W:5', '/log+:log.txt'], stdout=PIPE, universal_newlines=True, bufsize=1) as Robocopy:
        for line in Robocopy.stdout:
            message_list = [item.strip() for item in line.split('\t') if item != '']
            print(message_list[0], message_list[2])
        Robocopy.wait()
    returncode = Robocopy.returncode
The first method does not include bufsize=1, as the documentation states this is only usable if universal_newlines=True, i.e., in text mode.
The second version does include universal_newlines and therefore I specify a bufsize.
Can somebody explain the difference to me? I can't find the article now, but I did read that an overflowing pipe buffer can cause problems (the process hanging, I think), which is why it is important to read the output with for line in stdout.
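To make the buffer concern concrete, here is a minimal sketch of the reading pattern I mean. It uses a hypothetical stand-in child (a Python one-liner, not Robocopy) that writes roughly 400 KB, which I believe is more than a typical OS pipe buffer holds. My understanding is that the loop drains the pipe while the child is still running, whereas calling wait() before reading could block once the pipe fills up.

```python
import subprocess
import sys

# Hypothetical stand-in for Robocopy: a child that prints 5000 lines of 80 characters.
CHILD = [sys.executable, "-c", "for i in range(5000): print('x' * 80)"]

line_count = 0
with subprocess.Popen(CHILD, stdout=subprocess.PIPE) as proc:
    # Read while the child is still running, so the pipe never stays full.
    for line in proc.stdout:
        line_count += 1
    # Only wait for the exit status after stdout has been read to the end.
    proc.wait()
returncode = proc.returncode

print(line_count, returncode)
```

In this sketch the loop only ends when the child closes its stdout, so by the time wait() is called there is nothing left in the pipe to block on.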
Additionally, when looking at the output, not specifying universal_newlines makes each line read from stdout a bytes object, but I am not sure what difference that makes (in terms of newlines and tabs) if I just decode the bytes with ascii, compared to using universal_newlines mode.
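To show what I am comparing, here is a small sketch that reads the same output once in bytes mode and once with universal_newlines=True. The child is a hypothetical stand-in for Robocopy that writes two tab-separated lines ending in \r\n, and read_lines is just a helper name I made up for this test.

```python
import subprocess
import sys

# Hypothetical stand-in for Robocopy: writes two tab-separated, CRLF-terminated lines.
CHILD = [sys.executable, "-c",
         r"import sys; sys.stdout.buffer.write(b'a\tb\r\nc\td\r\n')"]

def read_lines(**popen_kwargs):
    """Run the child and return every line that iterating over stdout yields."""
    with subprocess.Popen(CHILD, stdout=subprocess.PIPE, **popen_kwargs) as proc:
        return list(proc.stdout)

raw_lines = read_lines()                          # bytes mode
decoded_lines = [line.decode('ascii') for line in raw_lines]
text_lines = read_lines(universal_newlines=True)  # text mode

print(raw_lines)      # bytes objects, '\r\n' kept
print(decoded_lines)  # str objects, '\r\n' still kept after decoding
print(text_lines)     # str objects, '\r\n' already translated to '\n'
```

So, as far as I can tell from this, decoding by hand leaves the \r in place (hence my replace('\r', '') in the first version), while text mode removes it for me and leaves the tabs untouched in both cases.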
Lastly, setting the bufsize to 1 makes the output "line-buffered" but I am not sure what that means. I would appreciate an explanation about how these various elements tie together. Thanks
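For reference, this is a stripped-down sketch of the second version that I have been using to experiment with bufsize=1. The child is a hypothetical stand-in for Robocopy that prints three lines with a short pause between them. I pass -u to the child on the assumption that the child's own output buffering also affects when lines arrive, which is part of what I would like to have confirmed.

```python
import subprocess
import sys

# Hypothetical stand-in for Robocopy: prints three lines, pausing between them.
CHILD = [sys.executable, "-u", "-c",
         "import time\nfor i in range(3):\n    print('line', i)\n    time.sleep(0.1)"]

received = []
with subprocess.Popen(CHILD, stdout=subprocess.PIPE,
                      universal_newlines=True, bufsize=1) as proc:
    for line in proc.stdout:
        # In text mode each line is a str ending in '\n'.
        received.append(line.rstrip('\n'))
    proc.wait()
returncode = proc.returncode

print(received, returncode)
```

In my tests the collected lines are the same whether or not I pass bufsize=1, so I cannot tell from the output alone what the setting changes.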