
I am trying to understand the intricate terms related to performance of processors.

Computer performance is measured in FLOPS, which is short for FLOPs per second; FLOP itself stands for Floating-point Operation.

Now, why are floating-point operations the ones considered when measuring the performance of a computer? What about integer operations? Is there any online source (an official citation) that explains this convention? Google did not give me anything with my searching.

Now, what exactly does operation in FLOP mean? Does that mean a mathematical operation like MUL, ADD, DIV etc?

In this case, what would be an instruction? If an instruction is something like ADDPD, ADDSD etc, as I can see here (http://docs.oracle.com/cd/E26502_01/html/E28388/epmpv.html), then one instruction can lead to many internal operations. Would that be micro-operations or uops?

I did not find the explanation for micro-operation on Wikipedia helpful. Whoever explains this clearly here will probably have to improve the Wikipedia article as well :)

According to the Hennessy/Patterson book on Computer Architecture (5th edition, page 233), the ARM Cortex-A8 (RISC) is able to execute two instructions per clock. Does that mean that the processor can perform, for example, an ADDPD and an ADDSD (2 instructions in total) in one cycle?

This source (http://en.community.dell.com/techcenter/high-performance-computing/w/wiki/2329) states the following:

Most microprocessors today can do four (4) FLOPs per clock cycle, that is, 4 FLOPs per Hz.

I believe the author is wrong. He probably meant 4 instructions per cycle, limiting himself to CISC-based processors (Intel, for example). That is because some instructions, like FMA on Haswell, can boost performance: the processor can perform more operations per cycle. In other words, one FMA instruction translates into a couple of operations. Am I right?
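For what it's worth, figures like "4 FLOPs per cycle" can be reproduced from a machine's parameters. A rough sketch of that arithmetic (the SIMD widths and port counts below are illustrative assumptions, not exact specifications of any particular chip):

```python
# Peak double-precision FLOPs per cycle = (doubles per SIMD register)
# * (FP instructions issued per cycle) * (FLOPs per instruction).
def peak_flops_per_cycle(simd_doubles, fp_issues_per_cycle, flops_per_instruction):
    return simd_doubles * fp_issues_per_cycle * flops_per_instruction

# SSE2-era core: 2 doubles per register, 1 ADD + 1 MUL issue, 1 FLOP each
sse2 = peak_flops_per_cycle(2, 2, 1)      # 4 FLOPs/cycle

# Haswell-style core: 4 doubles per AVX register, 2 FMA issues,
# 2 FLOPs per FMA (a multiply and an add)
haswell = peak_flops_per_cycle(4, 2, 2)   # 16 FLOPs/cycle

print(sse2, haswell)
```

This also illustrates the asker's point: the jump from 4 to 16 comes partly from wider registers and partly from FMA counting as two operations per instruction.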

IPC stands for Instructions Per Cycle. Which instructions are referred to here? The instructions retired by the processor? With a hardware counter, I can count the number of CPU cycles and the number of INST_RETIRED.ANY events. Would that be the right way to calculate IPC?
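Dividing retired instructions by elapsed core cycles is indeed the usual calculation. A minimal sketch, using hypothetical counter readings (the numbers are made up for illustration):

```python
def ipc(instructions_retired, cpu_cycles):
    """IPC = instructions retired / core cycles over the same interval."""
    return instructions_retired / cpu_cycles

# Hypothetical readings from INST_RETIRED.ANY and a cycle counter
# over one measurement interval.
print(ipc(8_000_000, 4_000_000))  # 2.0
```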

Thank you very much for any answers and comments. Hopefully my question will help many other confused souls :)

Shailen

1 Answer


Now, why is floating-point operations considered for the performance of a computer. What about integer operations?

The floating-point operation rate is just one of several metrics that have been used over the years to benchmark computer performance. Measuring FP operations is considered more applicable to some real-world applications (such as weather simulations) than measuring integer operations. If you were evaluating computers for a database application, you would probably ignore the FLOPS specification and focus on IPS (instructions per second) and I/O performance.

Now, what exactly does operation in FLOP mean? Does that mean a mathematical operation like MUL, ADD, DIV etc?

An "operation" is the execution of an "instruction" (a machine code, i.e. a binary value), or one calculation by the FPU (Floating-Point Unit). An (older) FPU typically ran asynchronously with the CPU and ALU, so as not to hold up program execution that did not depend on the FP result.

Note that a computer (circa 1980) that did not have a FPU could be upgraded with a FPU peripheral. The FP library of software routines that implemented fundamental FP operations (add, subtract, multiply, divide, square root etc.) would be replaced with a library that invoked I/O instructions to access the FPU peripheral. An interrupt from the FPU would notify the CPU that the FP operation was complete.
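To make the idea of a software FP library concrete, here is a toy sketch of floating-point addition built from integer operations on (mantissa, exponent) pairs. This is nothing like IEEE 754 (no rounding, no normalization, no special values); it only illustrates that FP arithmetic can be implemented in software when no FPU is present:

```python
def fp_add(a, b):
    """Toy floating-point add: a pair (m, e) represents m * 2**e."""
    (ma, ea), (mb, eb) = a, b
    # Align exponents: shift the larger-exponent mantissa left so both
    # operands share the smaller exponent (exact here, since Python
    # integers are unbounded; real libraries must round).
    if ea > eb:
        ma <<= ea - eb
        ea = eb
    else:
        mb <<= eb - ea
        eb = ea
    return (ma + mb, ea)

# 3 * 2**1 + 1 * 2**0  =  6 + 1  =  7 * 2**0
print(fp_add((3, 1), (1, 0)))  # (7, 0)
```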

Early PCs were of similar construction. The original IBM PC used the Intel 8088 microprocessor, which did not have hardware FP capability. But an 8087 math co-processor could be installed, so that FP instructions could be performed by hardware instead of being redirected to software routines. Eventually Intel integrated the math co-processor into the CPU package with the i486.

In this case, what would be an instruction?

"Instruction" should not be an ambiguous entity. It's one machine code, or one mnemonic, of the processor.

then one instruction can lead to many internal operations. Would that be micro-operations or uops?

Apparently you are referring to microprogramming.
(There used to be a computer company that took microprogramming one iteration lower: to the nanoprogramming level. The products were for CPU emulation.)
Microprogramming is not really relevant to application program performance. That is, you typically cannot rewrite/improve the microprogramming as you could on a nanoprogram processor.

Does that mean that the processor can perform, for example an ADDPD and an ADDSD (total=2 Instructions) in one cycle?

Sort of. Performing more than one instruction per clock cycle requires a pipeline of "execution units". Think of a (vehicle) manufacturing assembly line. At each station a specific task is performed. At the end of the assembly line (pipeline), only one vehicle (instruction) is completed at a time. The concurrency is staggered, rather than synchronized.
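The assembly-line analogy can be put in numbers. Below is a toy model (ignoring stalls, hazards, and branches; the stage count is illustrative, not the Cortex-A8's actual depth) of a pipeline `depth` stages long that can issue `width` instructions per cycle; a dual-issue core corresponds to `width=2`:

```python
import math

def cycles_to_complete(n_instructions, depth, width=1):
    # The first group takes `depth` cycles to flow through the pipeline;
    # after that, one group of up to `width` instructions completes per cycle.
    return depth + math.ceil(n_instructions / width) - 1

print(cycles_to_complete(100, 5, 1))  # scalar: 104 cycles
print(cycles_to_complete(100, 5, 2))  # dual-issue: 54 cycles
```

For long instruction streams the startup latency washes out, and throughput approaches `width` instructions per cycle, which is the sense in which a dual-issue core "executes two instructions per clock".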

What instructions are referred here?

Each instruction is a machine code.

In other words, 1 FMA instruction translates to a couple of operations. Am I right?

No, one instruction correlates to one operation.

sawdust