AMD FireStream

AMD FireStream was AMD's brand name for its Radeon-based product line targeting stream processing and GPGPU in supercomputers. Originally developed by ATI Technologies around the Radeon X1900 XTX in 2006, the product line was previously branded as both ATI FireSTREAM and AMD Stream Processor.[1] The AMD FireStream can also be used as a floating-point co-processor for offloading CPU calculations, which is part of the Torrenza initiative. The FireStream line was discontinued in 2012, when GPGPU workloads were entirely folded into the AMD FirePro line.

Overview

The FireStream line is a series of add-on expansion cards released from 2006 to 2010, based on standard Radeon GPUs but designed to serve as a general-purpose co-processor, rather than rendering and outputting 3D graphics. Like the FireGL/FirePro line, they were given more memory and memory bandwidth, but the FireStream cards do not necessarily have video output ports. All support 32-bit single-precision floating point, and all but the first release support 64-bit double-precision. The line was partnered with new APIs to provide higher performance than existing OpenGL and Direct3D shader APIs could provide, beginning with Close to Metal, followed by OpenCL and the Stream Computing SDK, and eventually integrated into the APP SDK.

For highly parallel floating-point math workloads, the cards can speed up large computations by more than 10 times; Folding@home, the earliest and one of the most visible users of GPGPU, obtained 20 to 40 times the CPU performance.[2] Each pixel and vertex shader, or unified shader in later models, can perform arbitrary floating-point calculations.

History

Following the release of the Radeon R520 and GeForce G70 GPU cores with programmable shaders, their large floating-point throughput drew attention from academic and commercial groups, who experimented with using them for non-graphics work. The interest led ATI (and Nvidia) to create GPGPU products, able to calculate general-purpose mathematical formulas in a massively parallel way, to process heavy calculations traditionally done on CPUs and specialized floating-point math co-processors. GPGPUs were projected to have immediate performance gains of a factor of 10 or more compared to contemporary multi-socket CPU-only calculation.

With the development of the high-performance X1900 XTX nearly finished, ATI based its first Stream Processor design on it, announcing it as the upcoming ATI FireSTREAM together with the new Close to Metal API at SIGGRAPH 2006.[3] The core itself was mostly unchanged, except for doubling the onboard memory and bandwidth, similar to the FireGL V7350; new driver and software support made up most of the difference. Folding@home began using the X1900 for general computation, using a pre-release of version 6.5 of the ATI Catalyst driver, and reported a 20 to 40 times improvement in GPU over CPU.[2] The first product was released in late 2006, rebranded as AMD Stream Processor after the merger with AMD.[4]

The brand became AMD FireStream with the second generation of stream processors in 2007, based on the RV670 chip with new unified shaders and double-precision support.[5] Asynchronous DMA also improved performance by allowing a larger memory pool without the CPU's help. One model was released, the 9170, at an initial price of $1999. Plans included the development of a stream processor on an MXM module by 2008, for laptop computing,[6] but it was never released.

The third generation quickly followed in 2008 with dramatic performance improvements from the RV770 core; the 9250 had nearly double the performance of the 9170 and became the first single-chip teraflop processor, even as the price dropped to under $1000.[7] A faster sibling, the 9270, was released shortly after, for $1999.

In 2010 the final generation of FireStreams came out, the 9350 and 9370 cards, based on the Cypress chip featured in the HD 5800. This generation again doubled the performance relative to the previous, to 2 teraflops in the 9350 and 2.6 teraflops in the 9370,[8] and was the first built from the ground up for OpenCL. This generation was also the only one to feature fully passive cooling, and active cooling was unavailable.

The Northern and Southern Islands generations were skipped, and in 2012, AMD announced that the new FirePro W (workstation) and S (server) series based on the new Graphics Core Next architecture would take the place of FireStream cards.[9]

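Models

API support by model:

  • FireStream 9170: Direct3D 10.1, OpenGL 3.3 and APP Stream
  • FireStream 92x0: Direct3D 10.1, OpenGL 3.3 and OpenCL 1.0
  • FireStream 93x0: Direct3D 11, OpenGL 4.3 and OpenCL 1.2 with the last driver updates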

| Model (codename) | Launch | Architecture (fab) | Bus interface | Stream processors | Core clock (MHz) | Memory clock (MHz) | Memory size (MB) | Memory type | Bus width (bit) | Bandwidth (GB/s) | Single precision (GFLOPS)[a] | Double precision (GFLOPS)[a] | TDP (W) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Stream Processor (R580) | 2006 | R500 (80 nm) | | 240 | 600 | | 1024 | GDDR3 | 256 | 83.2 | 375[10] | N/A | 165 |
| FireStream 9170 (RV670)[11][12] | November 8, 2007 | TeraScale 1 (55 nm) | PCIe 2.0 x16 | 320 | 800 | 800 | 2048 | GDDR3 | 256 | 51.2 | 512 | 102.4 | 105 |
| FireStream 9250 (RV770)[13][14] | June 16, 2008 | TeraScale 1 (55 nm) | PCIe 2.0 x16 | 800 | 625 | 993 | 1024 | GDDR3 | 256 | 63.6 | 1000 | 200 | 150 |
| FireStream 9270 (RV770)[15][16] | November 13, 2008 | TeraScale 1 (55 nm) | PCIe 2.0 x16 | 800 | 750 | 850 | 2048 | GDDR5 | 256 | 108.8 | 1200 | 240 | 160 |
| FireStream 9350 (Cypress XT)[17] | June 23, 2010 | TeraScale 2 (40 nm) | PCIe 2.1 x16 | 1440 | 700 | 1000 | 2048 | GDDR5 | 256 | 128 | 2016 | 403.2 | 150 |
| FireStream 9370 (Cypress XT)[18] | June 23, 2010 | TeraScale 2 (40 nm) | PCIe 2.1 x16 | 1600 | 825 | 1150 | 4096 | GDDR5 | 256 | 147.2 | 2640 | 528 | 225 |

  a. Precision performance is calculated from the base (or boost) core clock speed based on an FMA operation.
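The single- and double-precision figures listed for the FireStream-branded cards can be reproduced from the stream-processor count and the core clock. The following is a minimal sketch, assuming each stream processor completes one FMA (two floating-point operations) per clock and that double-precision throughput is one fifth of the single-precision rate. The one-fifth ratio is inferred from the listed numbers rather than stated in the article, and the function names are illustrative. The 2006 Stream Processor is left out because its 375 GFLOPS figure comes from a cited source and does not follow this formula.

```python
# Peak-throughput arithmetic for the FireStream-branded cards.
# Assumptions (not stated in the article): 2 FLOPs per stream processor per
# clock (one FMA), and double precision at 1/5 of the single-precision rate.

CARDS = {
    # model: (stream processors, core clock in MHz)
    "FireStream 9170": (320, 800),
    "FireStream 9250": (800, 625),
    "FireStream 9270": (800, 750),
    "FireStream 9350": (1440, 700),
    "FireStream 9370": (1600, 825),
}

FLOPS_PER_FMA = 2        # one multiply plus one add
DOUBLE_RATE = 1 / 5      # assumed double:single throughput ratio


def peak_single_gflops(stream_processors, core_mhz):
    """Peak single-precision GFLOPS: processors x MHz x 2, scaled to GFLOPS."""
    return stream_processors * core_mhz * FLOPS_PER_FMA / 1000


def peak_double_gflops(stream_processors, core_mhz):
    """Peak double-precision GFLOPS under the assumed 1/5 ratio."""
    return peak_single_gflops(stream_processors, core_mhz) * DOUBLE_RATE


for model, (processors, mhz) in CARDS.items():
    single = peak_single_gflops(processors, mhz)
    double = peak_double_gflops(processors, mhz)
    print(f"{model}: {single:.1f} single, {double:.1f} double (GFLOPS)")
```

These are theoretical peaks; sustained throughput on real workloads depends on memory bandwidth and on how well the computation maps to the hardware.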

Software

The AMD FireStream was launched with a wide range of software platform support. One of the supporting firms was PeakStream (acquired by Google in June 2007), which was the first to provide an open beta version of software supporting CTM and AMD FireStream as well as x86 and Cell (Cell Broadband Engine) processors. The FireStream was claimed to be 20 times faster than regular CPUs in typical applications when running PeakStream's software [citation needed]. RapidMind also provided stream processing software that worked with ATI and NVIDIA, as well as Cell processors.[19]

Software Development Kit

After abandoning the short-lived Close to Metal API, AMD focused on OpenCL. AMD first released its Stream Computing SDK (v1.0) in December 2007 under the AMD EULA, to be run on Windows XP.[19] The SDK includes "Brook+", an AMD hardware-optimized version of the Brook language developed by Stanford University, itself an open-source variant of ANSI C optimized for stream computing. Also included are the AMD Core Math Library (ACML) and AMD Performance Library (APL), with optimizations for the AMD FireStream, and the COBRA video library (later renamed "Accelerated Video Transcoding", or AVT) for video transcoding acceleration. Another important part of the SDK, the Compute Abstraction Layer (CAL), is a software development layer that provides low-level access to the GPU architecture, through the CTM hardware interface, for performance-tuning software written in various high-level programming languages.

In August 2011, AMD released version 2.5 of the AMD APP Software Development Kit,[19] which includes support for OpenCL 1.1, a parallel computing language developed by the Khronos Group. Compute shaders, officially called DirectCompute and part of Microsoft's DirectX 11 API, are already supported by graphics drivers with DirectX 11 support.

AMD APP SDK

Main article: AMD APP SDK

Benchmarks

In an AMD-demonstrated system[20] with two dual-core AMD Opteron processors and two Radeon R600 GPU cores running Microsoft Windows XP Professional, a universal multiply-add (MADD) calculation reached 1 teraflop (TFLOP). By comparison, an Intel Core 2 Quad Q9650 3.0 GHz processor at the time could achieve 48 GFLOPS.[21]
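To put the two figures side by side, the short sketch below computes their ratio. It also shows one way the 48 GFLOPS CPU figure can be decomposed, assuming 4 cores at 3.0 GHz with 4 floating-point operations per core per cycle; that per-cycle value is an assumption for illustration, not a number taken from the cited source, and the function names are hypothetical.

```python
# Ratio between the two peak figures quoted in the benchmark paragraph.

GPU_SYSTEM_GFLOPS = 1000.0  # 1 TFLOP demonstration system (two R600 GPUs)
CPU_GFLOPS = 48.0           # Intel Core 2 Quad Q9650, as quoted


def cpu_peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Peak GFLOPS as cores x clock (GHz) x FLOPs per core per cycle."""
    return cores * clock_ghz * flops_per_cycle


def speedup(accelerated_gflops, baseline_gflops):
    """How many times larger the accelerated peak is than the baseline."""
    return accelerated_gflops / baseline_gflops


# Assumed decomposition of the CPU figure: 4 cores x 3.0 GHz x 4 FLOPs/cycle.
assumed_cpu_peak = cpu_peak_gflops(cores=4, clock_ghz=3.0, flops_per_cycle=4)

print(f"Assumed CPU peak: {assumed_cpu_peak:.0f} GFLOPS")
print(f"Peak ratio: {speedup(GPU_SYSTEM_GFLOPS, CPU_GFLOPS):.1f}x")
```

Both numbers are peak rates, and the GPU figure covers a whole two-GPU system, so the ratio is an upper bound on what a real application would see rather than a measured speedup.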

In a 2007 demonstration, Kaspersky SafeStream anti-virus scanning that had been optimized for AMD stream processors was able to scan 21 times faster with RV670-based acceleration than with the search running entirely on an Opteron.[22]

Limitations

  • Recursive functions are not supported in Brook+ because all function calls are inlined at compile time. Using CAL, functions (recursive or otherwise) are supported to 32 levels.[23]
  • Only bilinear texture filtering is supported; mipmapped textures and anisotropic filtering are not supported.
  • Functions cannot have a variable number of arguments. The same problem occurs for recursive functions.
  • Conversion of floating-point numbers to integers on GPUs is done differently than on x86 CPUs; it is not fully IEEE-754 compliant.
  • Doing "global synchronization" on the GPU is not very efficient, which forces the GPU to divide the kernel and do synchronization on the CPU. Given the variable number of multiprocessors and other factors, there may not be a perfect solution to this problem.
  • The bus bandwidth and latency between the CPU and the GPU may become a bottleneck.
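The global-synchronization limitation above is commonly worked around by splitting one computation into several kernel launches and letting the host act as the barrier between them. The sketch below simulates that pattern in plain Python for a sum reduction; it is an illustration of the idea only, not Brook+ or CAL code, and the names `partial_sum_kernel` and `reduce_with_host_sync` are hypothetical.

```python
# Simulated multi-pass reduction: each "kernel launch" lets every work-group
# sum its own slice independently, and the host loop is the only place where
# all groups are known to have finished (the global synchronization point).


def partial_sum_kernel(data, group_size):
    """One simulated launch: each work-group reduces its own slice."""
    return [
        sum(data[start:start + group_size])
        for start in range(0, len(data), group_size)
    ]


def reduce_with_host_sync(data, group_size=4):
    """Reduce `data` to one value, returning (result, number of launches)."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 to make progress")
    if not data:
        return 0, 0

    launches = 0
    current = list(data)
    while len(current) > 1:
        # The call returns only after every group is done, which stands in
        # for the host waiting on the GPU before issuing the next launch.
        current = partial_sum_kernel(current, group_size)
        launches += 1
    return current[0], launches


total, launches = reduce_with_host_sync(list(range(1, 17)), group_size=4)
print(total, launches)
```

Every extra launch costs a round trip between CPU and GPU, which is why this workaround interacts with the bus bandwidth and latency limitation in the last bullet.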

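See also

  • Stream processing
  • ROCm
  • Heterogeneous System Architecture
  • NVIDIA Tesla, a similar solution by Nvidia
  • Intel Xeon Phi, a similar solution by Intel
  • Open Computing Language (OpenCL), an industry standard
  • Compute Unified Device Architecture (CUDA), a proprietary Nvidia-only solution
  • List of AMD graphics processing units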

References

  1. AMD Press Release.
  2. Gasior, Geoff (October 16, 2006). "A closer look at Folding@home on the GPU". The Tech Report. Retrieved 2016-05-26.
  3. "ATI SIGGRAPH 2006 Presentation" (PDF) (Report). ATI Technologies. Archived from the original (PDF) on 2016-12-21. Retrieved 2016-05-26.
  4. Valich, Theo (November 16, 2006). "ATI FireSTREAM AMD Stream board revealed". The Inquirer. Archived from the original on August 21, 2009. Retrieved 2016-05-26.
  5. "AMD Delivers First Stream Processor with Double Precision Floating Point Technology". AMD. November 8, 2007. Archived from the original on 2017-06-19. Retrieved 2016-05-26.
  6. AMD WW HPC 2007 presentation (PDF) (Report). p. 37.
  7. "AMD Stream Processor First to Break 1 Teraflop Barrier". AMD. June 16, 2008. Archived from the original on 2017-06-19. Retrieved 2016-05-26.
  8. "Newest AMD FireStream™ GPU Compute Accelerators Deliver Almost 2x Single and Double Precision Peak Performance and Performance Per Watt Over Last Generation". AMD. June 23, 2010. Archived from the original on 2017-06-19. Retrieved 2016-05-26.
  9. Smith, Ryan (14 August 2012). "The AMD Firepro W9000 W8000 Review Part 1". Anandtech.com. Retrieved 28 June 2016.
  10. "Beyond3D - ATI R580: Radeon X1900 XTX & Crossfire". Beyond3D.
  11. "AMD Delivers First Stream Processor with Double Precision Floating Point Technology". AMD. November 8, 2007. Retrieved 2016-05-26.
  12. "AMD FireStream 9170 Specs". TechPowerUp.
  13. AMD FireStream 9250 - Product page. Archived May 13, 2010, at the Wayback Machine.
  14. "AMD FireStream 9250 Specs". TechPowerUp.
  15. AMD FireStream 9270 - Product page. Archived February 16, 2010, at the Wayback Machine.
  16. "AMD FireStream 9270 Specs". TechPowerUp.
  17. "AMD FireStream 9350 Specs". TechPowerUp.
  18. "AMD FireStream 9370 Specs". TechPowerUp.
  19. AMD APP SDK download page (archived 2012-09-03 at the Wayback Machine) and Stream Computing SDK EULA (archived March 6, 2009, at the Wayback Machine). Retrieved December 29, 2007.
  20. HardOCP report. Archived 2016-03-04 at the Wayback Machine. Retrieved July 17, 2007.
  21. Intel microprocessor export compliance metrics.
  22. Valich, Theo (September 12, 2007). "GPGPU drastically accelerates anti-virus software". The Inquirer. Archived from the original on September 23, 2009. Retrieved 2016-05-26.
  23. AMD Intermediate Language Reference Guide, August 2008.

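External links

  • ATI Stream Technology FAQ. Archived 2010-12-30 at the Wayback Machine
  • ATI Stream published papers and presentations
  • ATI Stream SDK
  • AnandTech article on distributed computing
  • AMD Intermediate Language Reference Guide, CAL v2.0, Feb 09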
