Proceedings of ASPLOS 2011 available
Posted on 17-3-2011 by Zhenyu Ye Tags: conference

The proceedings of ASPLOS 2011 are available.

Data structures in the multicore age
Posted on 11-3-2011 by Zhenyu Ye Tags: application, architecture, algorithms, multicore

Data structures in the multicore age, by Nir Shavit, is an interesting article in Communications of the ACM 2011. It shows that optimizing even a simple data structure, e.g. a stack, for many cores takes a mind-blowing amount of effort. The stack example is comparable to optimizing histogramming on a GPU. Although algorithms and data structures are tightly coupled, the author's emphasis on data structures seems to motivate a data-structure-driven exploration of parallel algorithms.
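To give a feel for where such a stack starts out, here is a minimal sketch of a lock-free (Treiber-style) stack in C11. It is not code from the article; the names (`lf_stack_t`, `lf_stack_push`, `lf_stack_pop`) are made up for illustration. To keep it short and safe, popped nodes are never reused or freed until the stack is destroyed, which sidesteps the ABA and use-after-free problems at the cost of memory growth. A production version would need a real reclamation scheme.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One stack node. 'next' is written once before the node is published
 * and never changed afterwards, so concurrent readers may safely read it. */
typedef struct node {
    int value;
    struct node *next;          /* link inside the live stack */
    struct node *retired_next;  /* link inside the retired list */
} node_t;

typedef struct {
    _Atomic(node_t *) head;     /* top of the live stack */
    _Atomic(node_t *) retired;  /* popped nodes, freed only on destroy */
} lf_stack_t;

void lf_stack_init(lf_stack_t *s) {
    atomic_init(&s->head, NULL);
    atomic_init(&s->retired, NULL);
}

/* Push: retry the compare-and-swap until our node becomes the new head. */
bool lf_stack_push(lf_stack_t *s, int value) {
    node_t *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    n->value = value;
    n->retired_next = NULL;
    n->next = atomic_load_explicit(&s->head, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               &s->head, &n->next, n,
               memory_order_release, memory_order_relaxed)) {
        /* on failure n->next was refreshed with the current head */
    }
    return true;
}

/* Pop: swing head to head->next; returns false if the stack is empty. */
bool lf_stack_pop(lf_stack_t *s, int *out) {
    node_t *old = atomic_load_explicit(&s->head, memory_order_acquire);
    while (old != NULL &&
           !atomic_compare_exchange_weak_explicit(
               &s->head, &old, old->next,
               memory_order_acquire, memory_order_acquire)) {
        /* on failure 'old' was refreshed with the current head */
    }
    if (old == NULL)
        return false;
    *out = old->value;

    /* Assumption of this sketch: park the node instead of freeing it,
     * so its address can never reappear as head while others still hold it. */
    old->retired_next = atomic_load_explicit(&s->retired, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               &s->retired, &old->retired_next, old,
               memory_order_release, memory_order_relaxed)) {
    }
    return true;
}

/* Destroy: call only when no other thread uses the stack any more. */
void lf_stack_destroy(lf_stack_t *s) {
    node_t *n = atomic_load(&s->head);
    while (n != NULL) {
        node_t *next = n->next;
        free(n);
        n = next;
    }
    n = atomic_load(&s->retired);
    while (n != NULL) {
        node_t *next = n->retired_next;
        free(n);
        n = next;
    }
}
```

Every push and pop still hammers on the single `head` pointer, so this version is correct but not scalable under contention. Getting past that bottleneck is where the effort the article describes goes.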

International Symposium on Field-Programmable Gate Arrays 2011
Posted on 9-3-2011 by Zhenyu Ye Tags: FPGA, architecture, tool, OpenCL, conference

The proceedings of FPGA 2011 have been released. There was a pre-conference workshop on The Role of FPGAs in a Converged Future with Heterogeneous Programmable Processors, where Altera described its OpenCL initiatives.

PIPS: Automatic Parallelizer and Code Transformation Framework
Posted on 4-3-2011 by Zhenyu Ye Tags: tool, GPU, CUDA

The PIPS framework performs source-to-source transformation. The input can be C or Fortran. The output can be OpenMP, SSE, or CUDA (with limited optimization). The team is working on OpenCL output support and on improving the quality of the generated CUDA code. The PIPS tutorial at PPoPP 2010 gives a good overview of the framework.
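As a rough illustration of what source-to-source parallelization means for the OpenMP case, the sketch below shows a sequential C loop next to a hand-annotated OpenMP version. This is not actual PIPS output. The function names are made up, and the second function only shows the general shape of the result a parallelizer aims for once it has proven that the loop iterations are independent.

```c
#include <stddef.h>

/* Input side: a plain sequential loop. Each iteration touches only
 * element i, so there are no loop-carried dependences. */
void saxpy_sequential(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Output side (hand-written illustration, NOT generated by PIPS):
 * the same loop with an OpenMP work-sharing directive added.
 * Built without OpenMP support, the pragma is ignored and the loop
 * simply runs sequentially. */
void saxpy_openmp(size_t n, float a, const float *x, float *y) {
    long i;
    long count = (long)n;
    #pragma omp parallel for
    for (i = 0; i < count; ++i)
        y[i] = a * x[i] + y[i];
}
```

With GCC or Clang, compiling with `-fopenmp` enables the directive. The signed loop index is used because older OpenMP versions require a signed integer loop variable.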

Related projects: Par4All, OpenGPU

Computer Architecture 2010 Top Picks
Posted on 3-3-2011 by Zhenyu Ye Tags: architecture, conference

The Jan-Feb 2011 issue of IEEE Micro presents the 11 best papers published at the top computer architecture conferences in 2010 (5 from ISCA, 3 from Micro, 2 from ASPLOS, 1 from HPCA). In the introduction to this special issue, Yale Patt and Onur Mutlu summarise a few observations regarding future conference reviewing. Their number one observation is that reviewing should focus more on insights than on quantitative results.

ISCA 2010 Slides Updated
Posted on 2-3-2011 by Zhenyu Ye Tags: architecture, conference, neural network

The ISCA 2010 website has posted additional slides of keynotes and oral presentations. There is a motivating keynote on the rebirth of neural networks.

NVIDIA announces CUDA 4.0
Posted on 2-3-2011 by Gert-Jan van den Braak Tags: GPU, CUDA

NVIDIA just announced that the first release candidate of CUDA 4.0 will be available to registered developers next Friday (March 4th). The main improvement in this new CUDA version is better multi-GPU support. With the new release, multiple GPUs can be controlled from a single thread, and multiple threads can work on the same GPU. This will make it a lot easier to create multi-GPU programs. Luckily we already have a multi-GPU setup, and I can hardly wait to test these new features.

Another nice feature is unified virtual addressing, which puts all CPU and GPU data in the same address space. GPU-to-GPU memory copies are now supported as well, which will also help in building multi-GPU applications.

Technical discussion about CUDA 4.0 can be found on the CUDA forum. Pretty pictures by AnandTech: NVIDIA Announces CUDA 4.0

Update: NVIDIA has placed a presentation on its developer website: CUDA Toolkit 4.0 overview

MV5 Simulator -- mv5sim
Posted on 2-3-2011 by Zhenyu Ye Tags: simulator

The mv5sim simulator is an extended version of m5sim. Developed in-house, mv5sim has been used for several many-core related papers (listed at the bottom of its tutorial website). From past experience, m5sim has a steep learning curve compared to gpgpu-sim.