Problem
This code that I wrote is supposed to read a pipe-delimited file line by line and write it to a new file with some simple text manipulation. It also adds two new columns and prints a “Status Update” every 100,000 lines to keep me updated on how close it is to completion.
I previously posted this code on StackOverflow to get help with incrementing, and someone mentioned that it would be faster if I did not open the second text file, but being extremely new at Python, I do not understand how to do that without potentially breaking the code.

    counter=1
    for line in open(r"C:Pathname.txt"):
        spline = line.split("|")
        if counter==1:
            with open(r"C:PATH2019.txt",'a') as NewFile:
                spline.insert(23,"Column A")
                spline.insert(23,"Column B")
                s="|"
                newline=s.join(spline)
                NewFile.write(newline)
        elif counter > 1 and not spline[22]=="0.00":
            spline.insert(23,"")
            spline.insert(23,"")
            gl=spline[0]
            gl=gl.strip()
            if gl[0]=="-": gl="000" + gl
            gl=gl.upper()
            spline[0]=gl
            if gl[:3]=="000": spline[24]="Incorrect"
            s="|"
            newline=s.join(spline)
            with open(r"C:PATHPythonWrittenData.txt",'a') as NewFile:
                NewFile.write(newline)
        counter+=1
        if counter%100000==0: print("Status Update: n", "{:,}".format(counter))

Solution
A nice trick you can use in Python is to open two (or more) files at once in a single `with` statement. This is done with something like:

    with open('file_one.txt', 'r') as file_one, open('file_two.txt', 'r') as file_two:
        for line in file_one:
            ...
        for line in file_two:
            ...

This is a very common way of reading from one file and writing to another without continually opening and closing one of them.
Currently, you’re opening and closing the files with each iteration of the loop. Your program loops through the lines in `name.txt`, checks an `if` / `elif` condition, and whenever either branch is taken, a file is opened, written to, then closed again.
Simply by opening both files at the same time you can stop opening and closing them repeatedly.
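As a rough illustration of the point above, here is a minimal sketch that opens a source and a destination once, before the loop, and keeps both open until the loop finishes. The function name `copy_upper`, the file names, and the upper-casing step are all made up for the example; they only stand in for whatever per-line manipulation you need.

```python
import os
import tempfile


def copy_upper(src_path, dst_path):
    """Read src_path line by line and write each line, upper-cased, to dst_path."""
    # Both files are opened once, before the loop, and closed together
    # when the with block exits.
    with open(src_path, "r") as source, open(dst_path, "w") as dest:
        for line in source:
            dest.write(line.upper())


# Small demonstration using throwaway files (hypothetical data).
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "in.txt")
dst = os.path.join(tmp_dir, "out.txt")
with open(src, "w") as f:
    f.write("a|b\nc|d\n")

copy_upper(src, dst)

with open(dst) as f:
    result = f.read()
```

The destination is opened with `"w"` here so that repeated runs start from an empty file; use `"a"` instead if you want to keep appending as in your original code.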
For more info on the `with` statement and other context managers, see the Python documentation.
Another small improvement can be made. At the moment, you check the first `if` condition every time, but you know it will only actually evaluate to `True` once. It would be better to remove that check and just always perform that block once. Assign `counter` after the first block (after where `if counter == 1` currently is) then replace the `elif` statement with a `while` loop.
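One possible reading of this suggestion is sketched below. Note that it uses a `for` loop with a plain `if` filter rather than a literal `while` loop, since the remaining test (skipping rows whose amount is `0.00`) is a per-row filter and not a loop condition. The helper `split_header`, the two-column sample data, and the column index are assumptions made only for the example.

```python
def split_header(lines):
    """Return (header, rows): the first line, and an iterator over the rest."""
    lines = iter(lines)
    header = next(lines)  # consumed once; no per-line counter check needed
    return header, lines


# Hypothetical two-column data; the amount is in column 1 here.
sample = ["id|amount\n", "a|0.00\n", "b|7.25\n"]
header, rows = split_header(sample)

kept = []
for row in rows:
    fields = row.rstrip("\n").split("|")
    # A plain `if` filters the rows; the header test is gone from the loop.
    if fields[1] != "0.00":
        kept.append(fields)
```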
It would be worth getting familiar with PEP 8 if you’re going to use Python a lot in the future. It’s the standard style guide and will help with the readability of your code (for you and others). It covers small things like starting a new line after a colon and putting a space on either side of assignment and comparison operators.
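To give a feel for what that looks like, here is a sketch of the `gl` clean-up from your loop written in PEP 8 style: one statement per line, spaces around `=` and `==`. Wrapping it in a function called `normalize_gl` is my own choice, and using `startswith` instead of `gl[0]` is a small change so that an empty field does not raise an `IndexError`.

```python
def normalize_gl(raw):
    """Strip whitespace, prefix "000" to values starting with "-", upper-case."""
    gl = raw.strip()
    if gl.startswith("-"):
        gl = "000" + gl
    return gl.upper()


# Hypothetical sample values.
cleaned = [normalize_gl(value) for value in [" -ab12 ", "x1", ""]]
```

In the loop this would replace the block that builds `gl`, for example `spline[0] = normalize_gl(spline[0])`.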
If you include an example file and desired output, there may be more I can help with.
Here is another way to organize your code. Instead of an `if` within the loop, use iterators more explicitly. Concretely:

    with open(r"C:Pathname.txt") as source:
        lines = iter(source)

        # first line
        first_line = next(lines)
        # opened in append mode so the header can be written
        with open(r"C:PATH2019.txt", 'a') as summary:
            # ... omitted ...

        # remaining lines
        with open(r"C:PATHPythonWrittenData.txt", 'a') as dest:
            for counter, line in enumerate(lines, start=1):
                # ... omitted ...

I have also used `enumerate` to update `counter` and `line` simultaneously.
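For reference, a filled-in version of this structure might look roughly like the sketch below. It is only an illustration under some assumptions: the function name `process`, its parameters, and the 25-column sample rows are invented, the `gl` clean-up is left out for brevity, and `counter` counts data lines only (the header is not included).

```python
import os
import tempfile


def process(source_path, header_path, data_path, report_every=100_000):
    """Write the header to header_path and the kept rows to data_path."""
    with open(source_path) as source:
        lines = iter(source)

        # First line: the header gets the two new column names.
        header = next(lines).split("|")
        header.insert(23, "Column A")
        header.insert(23, "Column B")
        with open(header_path, "a") as summary:
            summary.write("|".join(header))

        # Remaining lines: the destination is opened once for the whole loop.
        kept = 0
        with open(data_path, "a") as dest:
            for counter, line in enumerate(lines, start=1):
                fields = line.split("|")
                if fields[22] != "0.00":
                    fields.insert(23, "")
                    fields.insert(23, "")
                    dest.write("|".join(fields))
                    kept += 1
                if counter % report_every == 0:
                    print("Status Update:", "{:,}".format(counter))
    return kept


def make_row(amount):
    """Build a hypothetical 25-column row with `amount` in column 22."""
    cells = ["c{}".format(i) for i in range(25)]
    cells[22] = amount
    return "|".join(cells) + "\n"


# Demonstration with throwaway files.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "source.txt")
hdr = os.path.join(tmp_dir, "header.txt")
out = os.path.join(tmp_dir, "data.txt")
with open(src, "w") as f:
    f.write(make_row("Amount"))
    f.write(make_row("0.00"))
    f.write(make_row("12.50"))

kept = process(src, hdr, out, report_every=2)
```

Each output file is opened exactly once, no matter how many lines the source has.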
The other answer has some more tips on writing good Python code. But as far as structuring the opening and closing of files, as well as the main loop, this approach should get you started.