I have large Excel worksheets that I want to be able to read into MySQL using PHPExcel.
I am using the recent patch which allows you to read in Worksheets without opening the whole file. This way I can read one worksheet at a time.
However, one Excel file is 27MB large. I can successfully read in the first worksheet since it is small, but the second worksheet is so large that the cron job that started the process at 22:00 was not finished at 8:00 AM, the worksheet is simple too big.
Is there any way to read in a worksheet line by line, e.g. something like this:
$inputFileType = 'Excel2007';
$inputFileName = 'big_file.xlsx';
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$worksheetNames = $objReader->listWorksheetNames($inputFileName);
foreach ($worksheetNames as $sheetName) {
    //BELOW IS "WISH CODE":
    foreach($row = 1; $row <=$max_rows; $row+= 100) {
        $dataset = $objReader->getWorksheetWithRows($row, $row+100);
        save_dataset_to_database($dataset);
    }
}
Addendum
@mark, I used the code you posted to create the following example:
function readRowsFromWorksheet() {
    $file_name = htmlentities($_POST['file_name']);
    $file_type = htmlentities($_POST['file_type']);
    echo 'Read rows from worksheet:<br />';
    debug_log('----------start');
    $objReader = PHPExcel_IOFactory::createReader($file_type);
    $chunkSize = 20;
    $chunkFilter = new ChunkReadFilter();
    $objReader->setReadFilter($chunkFilter);
    for ($startRow = 2; $startRow <= 240; $startRow += $chunkSize) {
        $chunkFilter->setRows($startRow, $chunkSize);
        $objPHPExcel = $objReader->load('data/' . $file_name);
        debug_log('reading chunk starting at row '.$startRow);
        $sheetData = $objPHPExcel->getActiveSheet()->toArray(null, true, true, true);
        var_dump($sheetData);
        echo '<hr />';
    }
    debug_log('end');
}
As the following log file shows, it runs fine on a small 8K Excel file, but when I run it on a 3 MB Excel file, it never gets past the first chunk, is there any way I can optimize this code for performance, otherwise it doesn't look like it is not performant enough to get chunks out of a large Excel file:
2011-01-12 11:07:15: ----------start
2011-01-12 11:07:15: reading chunk starting at row 2
2011-01-12 11:07:15: reading chunk starting at row 22
2011-01-12 11:07:15: reading chunk starting at row 42
2011-01-12 11:07:15: reading chunk starting at row 62
2011-01-12 11:07:15: reading chunk starting at row 82
2011-01-12 11:07:15: reading chunk starting at row 102
2011-01-12 11:07:15: reading chunk starting at row 122
2011-01-12 11:07:15: reading chunk starting at row 142
2011-01-12 11:07:15: reading chunk starting at row 162
2011-01-12 11:07:15: reading chunk starting at row 182
2011-01-12 11:07:15: reading chunk starting at row 202
2011-01-12 11:07:15: reading chunk starting at row 222
2011-01-12 11:07:15: end
2011-01-12 11:07:52: ----------start
2011-01-12 11:08:01: reading chunk starting at row 2
(...at 11:18, CPU usage at 93% still running...)
Addendum 2
When I comment out:
//$sheetData = $objPHPExcel->getActiveSheet()->toArray(null, true, true, true);
//var_dump($sheetData);
Then it parses at an acceptable speed (about 2 rows per second), is there anyway to increase the performance of toArray()?
2011-01-12 11:40:51: ----------start
2011-01-12 11:40:59: reading chunk starting at row 2
2011-01-12 11:41:07: reading chunk starting at row 22
2011-01-12 11:41:14: reading chunk starting at row 42
2011-01-12 11:41:22: reading chunk starting at row 62
2011-01-12 11:41:29: reading chunk starting at row 82
2011-01-12 11:41:37: reading chunk starting at row 102
2011-01-12 11:41:45: reading chunk starting at row 122
2011-01-12 11:41:52: reading chunk starting at row 142
2011-01-12 11:42:00: reading chunk starting at row 162
2011-01-12 11:42:07: reading chunk starting at row 182
2011-01-12 11:42:15: reading chunk starting at row 202
2011-01-12 11:42:22: reading chunk starting at row 222
2011-01-12 11:42:22: end
Addendum 3
This seems to work adequately, for instance, at least on the 3 MB file:
for ($startRow = 2; $startRow <= 240; $startRow += $chunkSize) {
    echo 'Loading WorkSheet using configurable filter for headings row 1 and for rows ', $startRow, ' to ', ($startRow + $chunkSize - 1), '<br />';
    $chunkFilter->setRows($startRow, $chunkSize);
    $objPHPExcel = $objReader->load('data/' . $file_name);
    debug_log('reading chunk starting at row ' . $startRow);
    foreach ($objPHPExcel->getActiveSheet()->getRowIterator() as $row) {
        $cellIterator = $row->getCellIterator();
        $cellIterator->setIterateOnlyExistingCells(false);
        echo '<tr>';
        foreach ($cellIterator as $cell) {
            if (!is_null($cell)) {
                //$value = $cell->getCalculatedValue();
                $rawValue = $cell->getValue();
                debug_log($rawValue);
            }
        }
    }
}