Floating point operations per second (FLOPS, flops or flop/s) is a measure of computer performance, useful in fields of scientific computation that require floating-point calculations.[1]
For such cases, it is a more accurate measure than instructions per second.[citation needed]
Floating-point arithmetic
Floating-point arithmetic is needed for very large or very small real numbers, or computations that require a large dynamic range. Floating-point representation is similar to scientific notation, except everything is carried out in base two, rather than base ten. The encoding scheme stores the sign, the exponent (in base two for Cray and VAX, base two or ten for IEEE floating point formats, and base 16 for IBM Floating Point Architecture) and the significand (number after the radix point). While several similar formats are in use, the most common is ANSI/IEEE Std. 754-1985. This standard defines the format for 32-bit numbers called single precision, as well as 64-bit numbers called double precision and longer numbers called extended precision (used for intermediate results). Floating-point representations can support a much wider range of values than fixed-point, with the ability to represent very small numbers and very large numbers.[2]
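The sign, exponent and significand fields described above can be inspected directly. The following is a minimal sketch, assuming the IEEE 754 double-precision layout (1 sign bit, 11 exponent bits with a bias of 1023, 52 fraction bits); the function names `decompose_double` and `reassemble` are illustrative, not part of any standard API, and `reassemble` handles normal numbers only.

```python
import struct

def decompose_double(x: float):
    """Split an IEEE 754 double into (sign, unbiased exponent, fraction bits)."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                  # 1 bit
    biased_exp = (bits >> 52) & 0x7FF  # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)  # 52 bits after the radix point
    return sign, biased_exp - 1023, fraction

def reassemble(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild a normal (non-zero, non-subnormal, finite) double from its fields."""
    # Normal numbers carry an implicit leading 1 before the radix point.
    return (-1) ** sign * (1 + fraction / 2**52) * 2.0 ** exponent

if __name__ == "__main__":
    for value in (1.0, -2.0, 0.75, 0.1):
        print(value, decompose_double(value))
```

For example, 0.75 is stored as 1.5 × 2^-1, so its unbiased exponent is -1 and only the top fraction bit is set.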
Dynamic range and precision
The exponentiation inherent in floating-point computation assures a much larger dynamic range (the span between the largest and smallest numbers that can be represented), which is especially important when processing data sets where some of the data may have an extremely large range of numerical values or where the range may be unpredictable. As such, floating-point processors are ideally suited for computationally intensive applications.[3]
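To give a rough sense of scale, the sketch below compares the dynamic range of a double-precision float with that of a 64-bit signed integer, measured in decimal orders of magnitude. It assumes the platform's Python `float` is an IEEE 754 double (true on mainstream platforms) and uses the smallest positive normal value as the lower bound; the helper name `decades_of_range` is hypothetical.

```python
import math
import sys

def decades_of_range(largest, smallest):
    """Orders of magnitude (base 10) between the largest and smallest positive values."""
    # Subtract logarithms instead of dividing, so the ratio itself never overflows.
    return math.log10(largest) - math.log10(smallest)

# Largest finite double versus smallest positive normal double.
double_range = decades_of_range(sys.float_info.max, sys.float_info.min)

# 64-bit signed integer: largest value versus smallest positive step (1).
int64_range = decades_of_range(2**63 - 1, 1)

if __name__ == "__main__":
    print(f"double: about {double_range:.0f} decades")
    print(f"int64:  about {int64_range:.0f} decades")
```

Under these assumptions the double spans roughly 616 decades against about 19 for the integer, at the cost of a fixed number of significant digits.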
Computational performance
FLOPS and MIPS are units of measure for the numerical computing performance of a computer. Floating-point operations are typically used in fields such as scientific computational research, as well as in machine learning. However, before the late 1980s floating-point hardware was typically an optional feature (floating-point arithmetic can be implemented in software on any integer hardware), and computers that had it were said to be "scientific computers", or to have "scientific computation" capability. Thus the unit MIPS was useful to measure the integer performance of any computer, including those without such a capability; to account for architecture differences, the similar unit MOPS (million operations per second) was used as early as 1970[4] as well. Note that besides integer (or fixed-point) arithmetic, examples of integer operations include data movement (A to B) and value testing (If A = B, then C). This is why MIPS is an adequate performance benchmark when a computer is used for database queries, word processing, spreadsheets, or to run multiple virtual operating systems.[5][6] In 1974 David Kuck coined the terms flops and megaflops to describe the supercomputer performance of the day by the number of floating-point calculations performed per second.[7] This was much better than using the prevalent MIPS to compare computers, as that statistic usually had little bearing on the arithmetic capability of a machine on scientific tasks.
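FLOPS on an HPC system can be calculated using this equation:[8]

$$\text{FLOPS} = \text{racks} \times \frac{\text{nodes}}{\text{rack}} \times \frac{\text{sockets}}{\text{node}} \times \frac{\text{cores}}{\text{socket}} \times \frac{\text{cycles}}{\text{second}} \times \frac{\text{FLOPs}}{\text{cycle}}$$

This can be simplified to the most common case, a computer that has exactly 1 CPU:

$$\text{FLOPS} = \text{cores} \times \frac{\text{cycles}}{\text{second}} \times \frac{\text{FLOPs}}{\text{cycle}}$$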
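As a worked illustration, theoretical peak FLOPS can be computed as the product of the machine's core count, clock rate and floating-point operations per cycle, scaled by sockets, nodes and racks for a larger system. The sketch below assumes that product form; the function name `peak_flops`, its parameters, and the example machine (4 cores at 3.0 GHz with 16 FP64 operations per cycle per core) are hypothetical and chosen only for illustration.

```python
def peak_flops(cores, clock_hz, flops_per_cycle,
               sockets_per_node=1, nodes_per_rack=1, racks=1):
    """Theoretical peak FLOPS as a product of the hardware factors.

    `cores` is cores per socket; the defaults describe a single-CPU machine.
    """
    return (racks * nodes_per_rack * sockets_per_node
            * cores * clock_hz * flops_per_cycle)

# Hypothetical single-CPU machine: 4 cores, 3.0 GHz, 16 FP64 FLOPs per cycle per core.
single_cpu = peak_flops(cores=4, clock_hz=3_000_000_000, flops_per_cycle=16)

# Hypothetical cluster built from the same CPU: 2 racks x 10 nodes x 2 sockets.
cluster = peak_flops(cores=4, clock_hz=3_000_000_000, flops_per_cycle=16,
                     sockets_per_node=2, nodes_per_rack=10, racks=2)

if __name__ == "__main__":
    print(f"single CPU: {single_cpu / 1e9:.0f} GFLOPS")
    print(f"cluster:    {cluster / 1e12:.2f} TFLOPS")
```

With these assumed inputs the single CPU reaches 192 GFLOPS and the cluster 7.68 TFLOPS. These are theoretical peaks; sustained performance on real workloads is normally lower.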
FLOPS can be recorded in different measures of precision; for example, the TOP500 supercomputer list ranks computers by 64-bit (double-precision floating-point format) operations per second, abbreviated to FP64.[9] Similar measures are available for 32-bit (FP32) and 16-bit (FP16) operations.
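The practical difference between FP16, FP32 and FP64 can be seen by storing the same value at each width. The sketch below uses the IEEE 754 binary16, binary32 and binary64 encodings provided by Python's standard `struct` module (format codes `e`, `f` and `d`); the names `FORMATS`, `round_trip` and `storage_bits` are illustrative.

```python
import struct

# struct format codes for IEEE 754 half, single and double precision (little-endian).
FORMATS = {"FP16": "<e", "FP32": "<f", "FP64": "<d"}

def round_trip(value: float, fmt: str) -> float:
    """Store a value in the given format and read it back as a Python float."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

def storage_bits(fmt: str) -> int:
    """Number of bits the format occupies."""
    return 8 * struct.calcsize(fmt)

if __name__ == "__main__":
    for name, fmt in FORMATS.items():
        stored = round_trip(0.1, fmt)
        print(f"{name}: {storage_bits(fmt)} bits, "
              f"0.1 stored as {stored!r}, error {abs(stored - 0.1):.3e}")
```

The narrower the format, the larger the rounding error on the stored value, which is why an operation count is only comparable between machines when the precision is stated alongside it.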
Floating-point operations per clock cycle for various processors
Microarchitecture | Instruction set architecture | FP64 | FP32 | FP16 |
---|---|---|---|---|
Intel CPU | ||||
Intel 80486 | x87 (32-bit) | ? | 0.128[11] | ? |
| | x87 (32-bit) | ? | 0.5[11] | ? |
| | MMX (64-bit) | ? | 1[12] | ? |
Intel P6 Pentium III | SSE (64-bit) | ? | 2[12] | ? |
Intel NetBurst Pentium 4 (Willamette, Northwood) | SSE2 (64-bit) | 2 | 4 | ? |
Intel P6 Pentium M | SSE2 (64-bit) | 1 | 2 | ? |
| | SSE3 (64-bit) | 2 | 4 | ? |
| | | 4 | 8 | ? |
Intel Atom (Bonnell, Saltwell, Silvermont and Goldmont) | SSE3 (128-bit) | 2 | 4 | ? |
Intel Sandy Bridge (Sandy Bridge, Ivy Bridge) | AVX (256-bit) | 8 | 16 | 0 |
| | AVX2 & FMA (256-bit) | 16 | 32 | 0 |
Intel Xeon Phi (Knights Corner) | IMCI (512-bit) | 16 | 32 | 0 |
| | AVX-512 & FMA (512-bit) | 32 | 64 | 0 |
AMD CPU | ||||
AMD Bobcat | AMD64 (64-bit) | 2 | 4 | 0 |
| | | 4 | 8 | 0 |
AMD K10 | SSE4/4a (128-bit) | 4 | 8 | 0 |
AMD Bulldozer[13] (Piledriver, Steamroller, Excavator) | | 4 | 8 | 0 |
| | AVX2 & FMA (128-bit, 256-bit decoding)[18] | 8 | 16 | 0 |
| | AVX2 & FMA (256-bit) | 16 | 32 | 0 |
ARM CPU | ||||
ARM Cortex-A7, A9, A15 | ARMv7 | 1 | 8 | 0 |
ARM Cortex-A32, A35 | ARMv8 | 2 | 8 | 0 |
ARM Cortex-A53, A55, A57,[13] A72, A73, A75 | ARMv8 | 4 | 8 | 0 |
ARM Cortex-A76, A77, A78 | ARMv8 | 8 | 16 | 0 |
ARM Cortex-X1 | ARMv8 | 16 | 32 | ? |
Qualcomm Krait | ARMv7 | 1 | 8 | 0 |
Qualcomm Kryo (1xx - 3xx) | ARMv8 | 2 | 8 | 0 |
Qualcomm Kryo (4xx - 5xx) | ARMv8 | 8 | 16 | 0 |
Samsung Exynos M1 and M2 | ARMv8 | 2 | 8 | 0 |
Samsung Exynos M3 and M4 | ARMv8 | 3 | 12 | 0 |
IBM PowerPC A2 (Blue Gene/Q) | ? | 8 | 8 (as FP64) | 0 |
Hitachi SH-4[20][21] | SH-4 | 1 | 7 | 0 |
Nvidia GPU | ||||
Nvidia Curie (GeForce 6 series and GeForce 7 series) | PTX | ? | 8 | ? |
Nvidia Tesla 2.0 (GeForce GTX 260–295) | PTX | ? | 2 | ? |
Nvidia Fermi (only GeForce GTX 465–480, 560 Ti, 570–590) | PTX | 1/4 (locked by driver, 1 in hardware) | 2 | 0 |
Nvidia Fermi (only Quadro 600–2000) | PTX | 1/8 | 2 | 0 |
Nvidia Fermi (only Quadro 4000–7000, Tesla) | PTX | 1 | 2 | 0 |
Nvidia Kepler (GeForce (except Titan and Titan Black), Quadro (except K6000), Tesla K10) | PTX | 1/12 (for GK110: locked by driver, 2/3 in hardware) | 2 | 0 |
Nvidia Kepler (GeForce GTX Titan and Titan Black, Quadro K6000, Tesla (except K10)) | PTX | 2/3 | 2 | 0 |
| | PTX | 1/16 | 2 | 1/32 |
Nvidia Pascal (only Quadro GP100 and Tesla P100) | PTX | 1 | 2 | 4 |
Nvidia Volta[22] | PTX | 1 | 2 (FP32) + 2 (INT32) | 16 |
Nvidia Turing (only GeForce 16XX) | PTX | 1/16 | 2 (FP32) + 2 (INT32) | 4 |
Nvidia Turing (all except GeForce 16XX) | PTX | 1/16 | 2 (FP32) + 2 (INT32) | 16 |
Nvidia Ampere[23][24] (only Tesla A100/A30) | PTX | 2 | 2 (FP32) + 2 (INT32) | 32 |
Nvidia Ampere (all GeForce and Quadro, Tesla A40/A10) | PTX | 1/32 | 2 (FP32) + 0 (INT32) or 1 (FP32) + 1 (INT32) | 8 |
AMD GPU | ||||
AMD TeraScale 1 (Radeon HD 4000 series) | TeraScale 1 | 0.4 | 2 | ? |
AMD TeraScale 2 (Radeon HD 5000 series) | TeraScale 2 | 1 | 2 | ? |
AMD TeraScale 3 (Radeon HD 6000 series) | TeraScale 3 | 1 | 4 | ? |
AMD GCN (only Radeon Pro W 8100–9100) | GCN | 1 | 2 | ? |
AMD GCN (all except Radeon Pro W 8100–9100, Vega 10–20) | GCN | 1/8 | 2 | 4 |
AMD GCN Vega 10 | GCN | 1/8 | 2 | 4 |
AMD GCN Vega 20 (only Radeon VII) | GCN | 1/2 (locked by driver, 1 in hardware) | 2 | 4 |
AMD GCN Vega 20 (only Radeon Instinct MI50 / MI60 and Radeon Pro VII) | GCN | 1 | 2 | 4 |
| | RDNA | 1/8 | 2 | 4 |
AMD RDNA3 | RDNA | 1/8? | 4 | 8? |
AMD CDNA | CDNA | 1 | 4 (Tensor)[27] | 16 |
AMD CDNA 2 | CDNA 2 | 4 (Tensor) | 4 (Tensor) | 16 |
Intel GPU | ||||
Intel Xe-LP (Iris Xe MAX)[28] | Xe | 1/2? | 2 | 4 |
Intel Xe-HPG (Arc Alchemist)[28] | Xe | 0 | 2 | 16 |
Intel Xe-HPC (Ponte Vecchio)[29] | Xe | 2 | 2 | 32 |
Qualcomm GPU | ||||
Qualcomm Adreno 5x0 | Adreno 5xx | 1 | 2 | 4 |
Qualcomm Adreno 6x0 | Adreno 6xx | 1 | 2 | 4 |
Graphcore | ||||
Graphcore Colossus GC2[30][31] | ? | 0 | 16 | 64 |
| | ? | 0 | 32 | 128 |
Supercomputer | ||||
ENIAC @ 100 kHz in 1945 |
Source: https://en.wikipedia.org?pojem=TeraFLOPS. Text is available under the Creative Commons Attribution/Share-Alike License 3.0 Unported; additional terms may apply. See the Terms of Use page for details.