GPGPU-4 proceedings available
Posted on 13-4-2011 by Cedric Nugteren Tags: GPU, conference

The proceedings of the 4th Workshop on General Purpose Graphics Processing Units (GPGPU-4) are now available through the ACM portal.

Interesting papers are:

ISSCC 2011 Processor Sessions
Posted on 9-4-2011 by Zhenyu Ye Tags: architecture, conference

The proceedings of the International Solid-State Circuits Conference (ISSCC) 2011 have been released. There are some interesting papers in the processor sessions:

Some interesting CGO papers
Posted on 7-4-2011 by Cedric Nugteren Tags: conference

Last week, the CGO conference on code generation and optimization was held in France. The whole program can be found here, but the proceedings are not available yet. Some interesting papers:

Embedded Vision Architecture, in Your Optical Mouse, in 1981
Posted on 27-3-2011 by Zhenyu Ye Tags: architecture, vision

The Optical Mouse, and an Architectural Methodology for Smart Digital Sensors is a tech report published in 1981 by Richard F. Lyon, the inventor of the optical mouse. Because of the resource limitations of the 1980s, the image sensor has only dozens of pixels, and the mapping and tracking of pixels are hard-coded into the ASIC. Thirty years later, megapixel image sensors are ubiquitous and computing power is abundant. Now is a great time for smart people to invent the next killer app of embedded vision.

MAPLE (MAssively Parallel Learning/Classification Engine)
Posted on 23-3-2011 by Gert-Jan van den Braak Tags: GPU, FPGA, architecture

NEC Lab has published a paper, titled “A programmable parallel accelerator for learning and classification”, in PACT 2010. MAPLE is a programmable processor for the learning and classification domain. Multiple learning and classification algorithms, including SSI, CNN, K-means, SVM, and GLVQ, have been mapped onto MAPLE. The performance results, obtained from an FPGA-based prototype, are promising. For SVM, MAPLE can be 50% slower than a non-programmable FPGA-based accelerator. For CNN, the performance of MAPLE is comparable to that of a non-programmable FPGA-based accelerator. Compared with the GPU implementations, MAPLE is faster for SSI and CNN but slower for SVM.

They have a follow-up paper, titled “An Energy-Efficient Heterogeneous System for Embedded Learning and Classification”, which plugs an FPGA prototype board of MAPLE into an NVIDIA ION platform and analyses the energy efficiency of the resulting heterogeneous system.

Cyber-Physical Systems
Posted on 22-3-2011 by Zhenyu Ye Tags:

Edward A. Lee has long been advocating the cyber-physical system concept. He has recently published a book, titled Introduction to Embedded Systems – A Cyber-Physical Systems Approach, on this subject.

According to Edward A. Lee, an “Embedded System” differs from a “Cyber-Physical System” in that “embedded systems are closed boxes that do not expose the computing capability to the outside”. Further motivation for the cyber-physical system approach is given in his position paper for the Workshop on Cyber-Physical Systems in 2006, titled Cyber-Physical Systems – Are Computing Foundations Adequate?

FPGA20: Highlighting Significant Contributions from 20 Years of the International Symposium on Field-Programmable Gate Arrays (1992-2012)
Posted on 20-3-2011 by Zhenyu Ye Tags: FPGA

FPGA20 is going to nominate, from this list of candidates, about 25 of the most important papers published at the FPGA conference from 1992 to 2012. One older paper, titled Unifying FPGAs and SIMD Arrays, offers an interesting analysis of “the differences and similarities between the FPGA array architecture and SIMD array architecture”. It also touches on “techniques and lessons which can be transferred between the architectures” and uses a unified model to show the promise of hybrid array architectures.

Random Thoughts on Holistic Approach
Posted on 20-3-2011 by Zhenyu Ye Tags: architecture, NOC

The NOC community has long been calling for a holistic approach. Those calls have been answered, at least partially. For example,
Q: Key research problems in NoC design: a holistic perspective (in CODES+ISSS ’05)
A: Network-on-Chip Architectures: A Holistic Design Exploration (a book by Chrysostomos Nicopoulos, et al. in 2010)

The parallel computing community has also been calling for a holistic approach, but those calls have not (yet) been answered as well as those of the NOC community. For example,
Q: The Landscape of Parallel Computing Research: A View from Berkeley (Berkeley TechReport 2006)
A: Emm…
It is unlikely that the researchers in the parallel computing community are less competent. There could be other reasons:

  • The key problems are simply too hard
  • The key problems are not properly defined
  • The key problems are too hard AND not properly defined
  • Other thoughts