Now that daylight saving time has ended and the days are getting shorter, there is more time for some reading by the fireplace. If you would like to read up on GPU architectures, you will find below an introduction to the GPGPU architectures of the last couple of years.
Programmable GPU architectures have been around for about seven years now. In November 2006 NVIDIA launched its first fully programmable GPU architecture, the G80-based GeForce 8800. In June 2008 a major revision was introduced, the GT200. The first architecture is described in detail in IEEE Micro volume 28, issue 2 (March-April 2008). The article NVIDIA Tesla: A Unified Graphics and Computing Architecture describes not only the history of NVIDIA GPUs, from dedicated graphics accelerators to a unified architecture suitable for GPGPU workloads, but also the CUDA programming model. Many architecture details of the GT200 have been revealed by benchmarks in the paper Demystifying GPU Microarchitecture through Microbenchmarking (PDF).
In 2010 NVIDIA launched its next big architecture: Fermi. Many details are described in the Fermi white paper and in the AnandTech article NVIDIA’s GeForce GTX 480 and GTX 470. Later that year an update of the Fermi architecture, oriented more toward gaming than GPGPU compute, was introduced: the GF104 in the GeForce GTX 460. More architecture details are described by AnandTech in NVIDIA’s GeForce GTX 460.
The latest GPGPU architecture by NVIDIA, Kepler, was released in 2012. Another white paper by NVIDIA describes the GK110 architecture used in the Tesla K20 GPGPU compute card. A gaming version of Kepler has also been made: the GK104 used in the GeForce GTX 680. A couple of articles on AnandTech describe the architecture in more detail: the GK104 and the GK110.
For the history of AMD’s programmable GPGPU architecture, the best place to start is the AMD Graphics Core Next (GCN) Architecture Whitepaper. It describes the evolution of AMD GPUs from fixed-function GPUs to the programmable VLIW5 and VLIW4 GPUs, and finally to the GCN architecture. Again, AnandTech gives some nice insights into the transition from VLIW5 to VLIW4 in the article AMD’s Radeon HD6970 & Radeon HD 6950, and from VLIW to GCN in AMD’s Graphics Core Next Preview.
Google Scholar has released its 2013 list of top conferences/journals. In the category Computing Systems the following conferences/journals of interest are ranked in the top 20:
2. Transactions on Parallel and Distributed Systems
4. Supercomputing (SC)
13. IEEE Micro
David Patterson recently published a tech report titled How to Build a Bad Research Center. Patterson’s research centers date back to the X-Tree in 1977, and include the famous RISC, RAID, and Network of Workstations. His recent research centers include the Par Lab, the AMP Lab, and the ASPIRE Lab. In this report he summarizes eight pitfalls of building research centers and offers suggestions for avoiding them.
P.S. The Par Lab has an end-of-project celebration on May 31, 2013, with talk slides available. A book titled The Berkeley Par Lab: Progress in the Parallel Computer Landscape will be published soon.
Update: the June 2013 Top500 list is available now.
The Top500 Supercomputer List will be updated at the International Supercomputing Conference on June 16. Several sources suggest that an Intel MIC (i.e., Xeon Phi) based system will top the list. According to HPCWire, the new system will have a peak performance of 53-55 petaflops and a LINPACK performance of 27-29 petaflops. For now these numbers remain rumours until the official announcement, and they assume the system can be tested in time to make the list.
A paper titled Cache-aware Roofline model: Upgrading the loft is to appear in Computer Architecture Letters. The ideas and experiments in this paper are relevant to, and similar to, the ongoing research of PARSE members.
NVIDIA has updated their GPU and Tegra roadmaps at the GPU Technology Conference (GTC), held in the last week of March 2013.
The GPU roadmap includes Volta as the successor to Maxwell, which in turn succeeds the current Kepler architecture. Volta will be NVIDIA’s first 3D-stacked GPU, stacking DRAM chips on top of the logic. Using this technology, Volta is said to achieve a bandwidth of 1 TB/s.
The Tegra roadmap introduces the Tegra 5 (Logan) and Tegra 6 (Parker) SoCs. Logan will for the first time include a desktop GPU (Kepler architecture), allowing it to run CUDA programs. Parker will feature NVIDIA’s own ARM-based CPU architecture (Denver) and an updated GPU core.
The proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays 2013 are now available in the ACM Digital Library. There are two papers on polyhedral optimization:
The programme of ASPLOS 2013 was recently published online. There are a number of interesting publications related to GPUs:
Co-located with ASPLOS is the 6th edition of the GPGPU workshop, whose programme is also expected to be published in the coming weeks.