Location, Location, Location
Posted on 23-1-2013 by Zhenyu Ye Tags: FPGA, architecture, parallel processing, reconfigurable computing

Location, Location, Location: The Role of Spatial Locality in Asymptotic Energy Minimization (PDF), by André DeHon, is a seminal paper from the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA) 2013. It is an elegant paper with great insight! We need more papers like this.

Redefining the Role of the CPU in the Era of CPU-GPU Integration
Posted on 3-1-2013 by Zhenyu Ye Tags: CPU, GPU, application, architecture, SIMD, parallel processing

Here is the article:
Redefining the Role of the CPU in the Era of CPU-GPU Integration (PDF), in IEEE Micro Nov.-Dec. 2012.

This article points out that, with CPU-GPU integration, the workload characteristics on the CPU side change significantly. It offers valuable insight into the resulting design issues for CPU architecture. To name a few:

  • single thread performance becomes more relevant
  • better branch predictors are needed for fewer but harder branches
  • smarter memory pre-fetchers are needed for hard memory access patterns
  • short vector instructions become less relevant
  • instruction level parallelism diminishes significantly

ACM SIGDA TCFPGA Wiki and Reading List
Posted on 8-10-2012 by Zhenyu Ye Tags: FPGA, parallel processing, reconfigurable computing

The ACM Special Interest Group on Design Automation (SIGDA) Technical Committee on FPGAs and Reconfigurable Computing (TCFPGA) recently opened a wiki website. The wiki links to a recommended reading list for FPGAs and Reconfigurable Computing, which contains the most important papers in the field.

News on Intel's MIC / Knights Corner / Xeon Phi
Posted on 2-8-2012 by Cedric Nugteren Tags: architecture, MIC

Some recent news provides more information on the second generation of Intel’s MIC, Knights Corner. The many-core Pentium project, originally started as Larrabee, will be sold under the name Xeon Phi. VR-Zone has published some details on the chip:

“Knights Corner B0 comes in several flavors, with 57C, 60C and 61 cores being the most common configurations. Yes, the company unlocked an odd number of cores, compared to even number in Larrabee and Aubrey Isle. The change in number of processing cores changed L1 and L2 cache, and we now have 1.8-1.9MB of L1 and 28-30.5MB of L2 cache. Onboard memory now greatly varies between the parts, with available flavors being 3GB, 6GB and 8GB of GDDR5 memory.”

Top Picks from the 2011 Computer Architecture Conferences
Posted on 5-6-2012 by Zhenyu Ye Tags:

IEEE Micro May-June 2012 presents the Top Picks from the 2011 Computer Architecture Conferences.

GPGPU-day in Amsterdam
Posted on 1-6-2012 by Cedric Nugteren Tags: GPU, conference, symposium

In September 2010 we organized a 1-day GPU symposium. Now, we would like to direct your attention to a similar event: the GPGPU-day of the ‘Platform Parallel Netherlands’, held 28 June in Amsterdam. More information can be found on the official website:

"On 28 June a Parallel Programming Conference will be held in Amsterdam, where researchers and professionals talk about their experiences with GPGPU, OpenCL, CUDA and alike techniques. The program is very diverse and given by various researchers and professionals."

The target audience includes researchers, developers, industry, managers and students. Registration is free for students and academia.

As organizers of the 2010 symposium, we encourage you to visit this GPGPU-day in Amsterdam. Our GPU team will be giving talks and will be available for discussions.

Computing Frontiers '12 papers
Posted on 17-5-2012 by Cedric Nugteren Tags: GPU, programming, architecture, conference, parallel processing

This is a short overview of some of the interesting papers presented at the ACM Computing Frontiers ’12 conference held in Cagliari, Italy.

A selection of papers from CF’12:

  • BSArc: Blacksmith Streaming Architecture for HPC Accelerators: A GPU-like processor with a reconfigurable L2 memory. The processor is simulated with SArcs, a new simulator for GPU-like architectures.
  • Compile-Time Loop Fusion for Unstructured Mesh Applications: Loop fusion applied to unstructured mesh applications using the OP2 active library. Targets both GPUs and CPUs.
  • The Tradeoffs of Fused Memory Hierarchies in Heterogeneous Architectures: An evaluation study of AMD’s APU.
  • DMA-circular: An Enhanced High Level Programmable DMA Controller for Optimal Management Of On-chip Local Memories: A DMA controller to perform global (off-chip) to local (on-chip) memory copies.
  • GA-GPU: Extending a Library-based Global Address Space Programming Model for Scalable Heterogeneous Computing Systems: Evaluating the problems with GPU programming models and implementing a global access space memory consistency model onto a GPU.

At CF’12, there was a special session on Exascale projects in Europe. The four projects discussed were:

  • The CRESTA project, focussing on the application part of Exascale computing.
  • The Teraflux project, which models exascale hardware in a simulator. The project uses a dataflow execution model.
  • The DEEP project, creating a ‘Booster’ cluster with Intel MIC processors and a custom network. The Booster cluster is programmed using OmpSs and MPI.
  • The Mont-Blanc project, developing a green supercomputer based on low-power ARM SoCs and GPUs. The first prototype uses Tegra 2 chips; the next prototype will use Tegra 3 boards with Quadro 5010M GPUs. It is programmed using OmpSs (an OpenMP + StarSS directive-based programming model).

NVIDIA GPU Technology Conference and Inpar '12
Posted on 10-5-2012 by Cedric Nugteren Tags: GPU, conference

Next week, May 14-17, the GPU Technology Conference (GTC) 2012 will be held in Silicon Valley. The GTC is hosted by NVIDIA, so expect a lot of interesting CUDA/GPU news coming up (e.g. Kepler, CUDA 5). Apart from news, GPU research will also be presented – either at the main conference or at one of the co-located events.

Inpar is one of these co-located events, presenting research work on a variety of topics, including:

The full program is available at their website.