
The code is exactly the same -- I copied it from one computer to the other. On both machines it is compiled with g++-4 (4.9.1), obtained from Fink on OS X, and it is not run in parallel.

The only compiler option is "-O2", and both computers are otherwise essentially idle (low CPU and memory usage). The code is a 2,400-line research code (link).

Machine 1:

  • Late 2013 MacBook Pro Retina
  • 2.8 GHz i7-4558U
  • 16 GB 1600 MHz DDR3
  • 500 GB flash storage

Machine 2:

  • Late 2013 Mac Pro workstation
  • 3.5 GHz 6-core Intel Xeon E5-1650
  • 32 GB 1867 MHz DDR3
  • 251 GB flash storage
  • 3 TB external SATA drive

Run-times:

  • Machine 1: 200 s with output, 18 s without
  • Machine 2 (/ directory, which should be the internal flash drive): 2230 s with output, 2075 s without
  • Machine 2 (~ directory, which should be the external drive): 2262 s with output, 2080 s without
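
To separate compute from output, the output phase can be timed on its own; here is a minimal sketch with std::chrono (compile with -std=c++11; the file name, data, and structure are placeholders, not the actual research code):

    #include <chrono>
    #include <cstddef>
    #include <fstream>
    #include <iostream>
    #include <vector>

    int main() {
        // Stand-in for results produced by the compute phase.
        std::vector<double> results(1000000, 3.14);

        auto t0 = std::chrono::steady_clock::now();

        // Output phase under test: write everything to disk.
        std::ofstream out("results.txt");
        for (std::size_t i = 0; i < results.size(); ++i)
            out << results[i] << '\n';
        out.close();

        auto t1 = std::chrono::steady_clock::now();
        std::chrono::duration<double> dt = t1 - t0;
        std::cerr << "output phase: " << dt.count() << " s\n";
        return 0;
    }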

Any ideas on how to improve the runtime on the Mac Pro?


2 Answers


This is a speculative guess, but your code does disk I/O, and I am going to assume that this is your bottleneck. You mentioned that it runs faster on the machine with 500 GB of flash storage than on the one with 251 GB. That makes sense: flash storage is essentially a RAID 0 of smaller (32/64 GB) flash chips, and more chips in a RAID 0 array means more parallelism and thus higher throughput. I do not know the particular make, model, firmware, or controller of either drive, but I suspect that a disk I/O test would show a similar discrepancy between the two machines. Such a test is best done with a tool like XBench.
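
If you want a rough number without extra tools, a sequential-write microbenchmark is easy to hack up; here is a minimal sketch (the buffer size, total volume, and file name are arbitrary choices, and the OS page cache can absorb writes, so treat the result as indicative only -- XBench does this far more carefully):

    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <fstream>
    #include <vector>

    int main() {
        const std::size_t chunk_mib = 1;          // 1 MiB per write
        const std::size_t total_mib = 256;        // 256 MiB in total
        std::vector<char> buf(chunk_mib << 20, 'x');

        auto t0 = std::chrono::steady_clock::now();
        {
            std::ofstream out("io_test.bin", std::ios::binary);
            for (std::size_t i = 0; i < total_mib; ++i)
                out.write(buf.data(), buf.size());
        }                                         // stream flushes and closes here
        auto t1 = std::chrono::steady_clock::now();

        std::chrono::duration<double> dt = t1 - t0;
        std::printf("wrote %zu MiB in %.2f s (%.1f MiB/s)\n",
                    total_mib, dt.count(), total_mib / dt.count());
        std::remove("io_test.bin");               // clean up the test file
        return 0;
    }

Run it once in / and once in ~ on the Mac Pro, and once on the MacBook Pro, and compare the throughput figures.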


The proper way to approach the question "why does this code take so long to run", whether "long" is in absolute or relative terms, is to use a tool called a profiler.

Basically, you run the program under the profiler (or with the profiler attached), and the profiler records how much time the program spends in each function. The results are presented in a form that lets you pinpoint the parts of the program that took the longest during that run. The report usually contains further information as well, such as how many times each function was called, which can also point toward areas that deserve scrutiny.

Based on that data, it is usually easy to tell which parts need to be optimized to make the program run faster, without playing the guessing game known as "premature optimization" or relying on the particulars of a specific piece of hardware.
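
With g++ the traditional route is gprof (compile and link with -pg, run the program to produce gmon.out, then run gprof on the binary and that file), though -pg support on OS X has historically been unreliable; Apple's Instruments is the native alternative. If neither is handy, crude manual instrumentation can also narrow things down; here is a minimal sketch (compile with -std=c++11; simulate is a placeholder for one of your own functions):

    #include <chrono>
    #include <cstdio>

    // Prints how long the enclosing scope took when it is destroyed.
    // A real profiler gathers this for every function automatically;
    // this is only a hand-rolled stopgap for finding hot spots by hand.
    struct ScopeTimer {
        const char* label;
        std::chrono::steady_clock::time_point start;
        explicit ScopeTimer(const char* l)
            : label(l), start(std::chrono::steady_clock::now()) {}
        ~ScopeTimer() {
            std::chrono::duration<double> dt =
                std::chrono::steady_clock::now() - start;
            std::fprintf(stderr, "%s: %.3f s\n", label, dt.count());
        }
    };

    void simulate() {                   // placeholder for a real function
        ScopeTimer t("simulate");
        for (volatile long i = 0; i < 100000000L; ++i) {}  // stand-in work
    }

    int main() {
        ScopeTimer t("total");
        simulate();
        return 0;
    }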
