I am trying to renumber the atoms in a certain file format and am struggling to get it right.
First, a little background may help: computational chemistry uses a file format with the extension .xyz to describe the structure of a molecule. The first column is the number that identifies a specific atom (carbon, hydrogen, etc.), and the subsequent columns list the numbers of the atoms it is connected to. Below is a small sample of such a file; a typical file is significantly larger.
  259   252             
  260   254                  
  261   255                  
  262   256                  
  264   248   265   268      
  265   264   266   269   270
  266   265   267   282      
  267   266                  
  268   264                  
  269   265       
  270   265   271   276   277
  271   270   272   273      
  272   271   274   278      
  273   271   275   279      
  274   272   275   280      
  275   273   274   281      
  276   270                  
  277   270                  
  278   272                  
  279   273                  
  280   274                  
  282   266   283   286      
  283   282   284   287   288
  284   283   285   289      
  285   284                  
  286   282                  
  287   283                  
  288   283                  
  289   284   290   293      
  290   289   291   294   295
  291   290   292   304
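For illustration, here is a minimal sketch of how the gaps in the first column could be detected. `find_missing_ids` is a hypothetical helper name, and the sketch assumes every non-blank line starts with an integer atom number.

```python
def find_missing_ids(text):
    """Return the atom numbers absent from the first column, within its min..max range."""
    ids = []
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            ids.append(int(fields[0]))
    if not ids:
        return []
    present = set(ids)
    return [n for n in range(min(ids), max(ids) + 1) if n not in present]


# Three lines taken from the sample above
sample = """\
  262   256
  264   248   265   268
  265   264   266   269   270
"""
print(find_missing_ids(sample))
```

Only the first column is inspected, so atom numbers that appear solely as neighbors are not counted as present.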
As you can see, the numbers 263 and 281 are missing. There could be many more missing numbers, so my script needs to account for that. Below is the code I have so far. The lists missing_nums and missing_nums2 are hard-coded here, but I would normally obtain them from an earlier part of the script. The last element of missing_nums2 is where I want the numbering to finish, so in this case: 289.
    missing_nums = ['263', '281']
    missing_nums2 = ['281', '289']
    with open("atom_nums.xyz", "r") as f2:
        lines = f2.read()
    
    for i in range(0, len(missing_nums) - 1):
        if i == 0:
            with open("atom_nums_out.xyz", "w") as f2: 
                
                replacement = int(missing_nums[i])
                
                for number in range(int(missing_nums[i]) + 1, int(missing_nums2[i])):
                    lines = lines.replace(str(number), str(replacement))
                    replacement += 1
                
                f2.write(lines)
    
        else:
            with open("atom_nums_out.xyz", "r") as f2:  
                lines = f2.read()
                
            with open("atom_nums_out.xyz", "w") as f2:   
                
                replacement = int(missing_nums[i]) - (i + 1)
                print(replacement)
                
                for number in range(int(missing_nums[i]), int(missing_nums2[i])):
                    lines = lines.replace(str(number), str(replacement))
                    replacement += 1
                    
                f2.write(lines)
The problem is that as the file gets larger, some numbers end up repeated, for reasons I cannot figure out. I hope somebody can help me here.
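One possible cause, offered as an assumption since the full file is not shown: `str.replace` works on substrings of the whole text, so replacing `264` also rewrites the `264` inside a four-digit number such as `1264`. The two-line input below is made up for illustration. The snippet also shows that the loop bound `len(missing_nums) - 1` yields a single iteration for a two-element list.

```python
# Hypothetical two-line excerpt containing a four-digit atom number
lines = "  264  1264\n 1264   264\n"

# str.replace matches the substring "264" everywhere, including inside "1264"
replaced = lines.replace("264", "263")
print(replaced)

# With two missing numbers, the outer loop in the code above only visits i == 0
missing_nums = ['263', '281']
loop_indices = list(range(0, len(missing_nums) - 1))
print(loop_indices)
```

Here atom 1264 silently becomes 1263, which would collide with the real atom 1263 if the file has one.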
EDIT: The desired output of the code for the sample above would be:
  259   252                  
  260   254                  
  261   255                  
  262   256                  
  263   248   264   267      
  264   263   265   268   269
  265   264   266   280      
  266   265                  
  267   263                  
  268   264                  
  269   264   270   275   276
  270   269   271   272      
  271   270   273   277      
  272   270   274   278      
  273   271   274   279      
  274   272   273   279      
  275   269                  
  276   269                  
  277   271                  
  278   272                  
  279   273                  
  280   265   281   284      
  281   280   282   285   286
  282   281   283   287      
  283   282                  
  284   280                  
  285   281                  
  286   281                  
  287   282   288   291      
  288   287   289   292   293
  289   288   290   302
This is indeed the output I get for this small sample, but as the number of missing numbers increases it stops working and I get duplicate numbers. I can provide the whole file if anyone wants it.
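As a hedged sketch of an alternative, the renumbering could be done one whole number at a time: each number is lowered by the count of missing numbers below it. `renumber` is a hypothetical helper, not part of my script. It assumes the file contains nothing but atom numbers and whitespace, and it right-justifies each new number to the width of the old one so the columns stay aligned.

```python
import re
from bisect import bisect_left


def renumber(text, missing):
    """Shift every integer in text down by the number of missing ids below it."""
    gaps = sorted(int(n) for n in missing)

    def shift(match):
        old = int(match.group())
        new = old - bisect_left(gaps, old)  # count of gaps strictly below old
        # Pad to the original token width to preserve column alignment
        return str(new).rjust(len(match.group()))

    # \d+ matches a whole run of digits, so "264" inside "1264" is never touched
    return re.sub(r"\d+", shift, text)


# Three lines taken from the sample input
sample = "  262   256\n  264   248   265   268\n  282   266   283   286\n"
result = renumber(sample, ["263", "281"])
print(result)
```

A neighbor entry equal to a missing number (such as the 281 on the 275 line of the sample) is shifted only by the gaps below it, which may or may not be what is wanted.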
Thanks!
 
    