I'm trying to read a file in chunks of multiple lines. For example, if a file has 100 lines and I want each chunk to have 10 lines, then there should be 10 chunks. I should then be able to iterate over the chunks as follows:
# Here 'read_chunk' function should return a generator.
for chunk in read_chunk(file_path="./file.txt", line_count=10):
    print(chunk)
This is how I attempted it:
from typing import Generator
def read_chunk(
    *,
    file_path: str,
    line_count: int = 10,  # Number of lines per chunk.
) -> Generator[str, None, None]:
    """Read a file in chunks of 'line_count' lines."""
    with open(file_path, "r") as f:
        chunk = []
        for idx, line in enumerate(f):
            if line.strip():
                chunk.append(line)
            if not idx == 0 and idx % line_count == 0:
                yield "\n".join(chunk)
                chunk = []
        # Yield whatever is left over as the last chunk.
        yield "\n".join(chunk)
Let's run this on the following file:
# file.txt
* [What is Normalization in DBMS (SQL)? 1NF, 2NF, 3NF, BCNF Database with Example - Richard Peterson](https://www.guru99.com/database-normalization.html) 
-> Normalization roughly means deduplication of data in a table by leveraging foreign keys, multiple tables, and intermediary join tables. 
This article explains it in finer detail.
* [OLTP vs OLAP System](https://www.guru99.com/oltp-vs-olap.html) 
-> OLTP is an online transactional system that manages database modification whereas OLAP is an online analysis and data retrieving process.
for chunk in read_chunk(file_path='./file.txt', line_count=2):
    print('============\n')      # This is to discern between the chunks better.
    print(chunk)
    print('============\n')
This returns:
============
# file.txt
* [What is Normalization in DBMS (SQL)? 1NF, 2NF, 3NF, BCNF Database with Example - Richard Peterson](https://www.guru99.com/database-normalization.html)
============
============
-> Normalization roughly means deduplication of data in a table by leveraging foreign keys, multiple tables, and intermediary join tables.
This article explains it in finer detail.
============
============
* [OLTP vs OLAP System](https://www.guru99.com/oltp-vs-olap.html)
============
============
-> OLTP is an online transactional system that manages database modification whereas OLAP is an online analysis and data retrieving process.
============
The first two chunks look right, but the last two don't make sense to me. Shouldn't there be a single final chunk with 2 lines instead of two chunks with 1 line each? Also, is there a better way of doing this?
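For comparison, here is a minimal sketch of one possible alternative, not necessarily the best one. It assumes blank lines should be skipped and should not count toward line_count, which is my assumption about the intended behaviour. The name read_chunk_islice is hypothetical. It filters the lines first and then lets itertools.islice take up to line_count of them at a time, so no index arithmetic is involved. It uses the := operator, so it needs Python 3.8 or later.

```python
from itertools import islice
from typing import Iterator


def read_chunk_islice(
    *,
    file_path: str,
    line_count: int = 10,  # Maximum number of non-blank lines per chunk.
) -> Iterator[str]:
    """Yield chunks of up to 'line_count' non-blank lines (hypothetical helper)."""
    with open(file_path, "r") as f:
        # Assumption: blank lines are dropped before chunking, so they
        # never count toward the size of a chunk.
        lines = (line.rstrip("\n") for line in f if line.strip())
        # islice consumes at most 'line_count' items per call; an empty
        # list means the file is exhausted and the loop stops.
        while chunk := list(islice(lines, line_count)):
            yield "\n".join(chunk)
```

With this version only the final chunk can be shorter than line_count, and an empty file yields no chunks at all.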
