Algorithm classification
Posted on 1-3-2011 by Cedric Nugteren Tags: GPU

This project focuses on the classification of algorithms. Such a classification can support code generation (e.g. using skeletonization), performance prediction, or architecture design. Our classification differs from existing ones by focusing on three requirements:

  • It must be modular.
  • It must cover all applications, i.e. be complete.
  • It must be detailed enough to serve these goals.

Initially, we restrict ourselves to the domain of image processing and computer vision applications, and take a GPU as an example target for these goals.

People involved: Cedric Nugteren

Adaptive Alarm Clock Using Movement Detection to Differentiate Sleep Phases
Posted on 13-2-2011 by Gert-Jan van den Braak Tags: GPU, application

Waking up can be unpleasant. People often feel tired or confused after being woken by an alarm clock during a period of deep sleep. This project aims to prevent this. The amount of movement of the sleeper, which corresponds to the different sleep phases, is measured using a common laptop with a webcam. A movement detection algorithm, consisting of some pre-processing followed by background subtraction, runs on the video feed. The algorithm is executed on the GPU using the CUDA environment, which allows high-speed image processing on cheap consumer hardware.
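To make the movement-detection idea concrete, here is a minimal CPU-side sketch in plain Python. It is not the project's CUDA code. The function names, the grayscale weights, the threshold of 25, and the running-average background model are all assumptions chosen for illustration. Each pixel is handled independently, which is what makes this kind of algorithm a natural fit for a GPU.

```python
def to_gray(frame):
    """Pre-processing (assumed): convert rows of (r, g, b) tuples to grayscale."""
    return [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in frame]


def movement_fraction(gray, background, threshold=25):
    """Background subtraction: fraction of pixels that differ from the
    background model by more than `threshold` (a hypothetical value)."""
    changed = 0
    total = 0
    for row, bg_row in zip(gray, background):
        for pixel, bg in zip(row, bg_row):
            total += 1
            if abs(pixel - bg) > threshold:
                changed += 1
    return changed / total if total else 0.0


def update_background(background, gray, alpha=0.05):
    """Slowly blend the new frame into the background model (running average),
    so gradual lighting changes are not counted as movement."""
    return [[(1 - alpha) * bg + alpha * pixel for bg, pixel in zip(bg_row, row)]
            for bg_row, row in zip(background, gray)]


# Tiny 4x4 example: a dark, still frame and a frame where the top row changed.
still = [[(10, 10, 10)] * 4 for _ in range(4)]
moved = [[(200, 200, 200)] * 4] + [[(10, 10, 10)] * 4 for _ in range(3)]

background = to_gray(still)
print(movement_fraction(to_gray(moved), background))  # 4 of 16 pixels changed
```

An alarm clock built on this would track the movement fraction over time and treat sustained low values as deep sleep. How the project maps movement to sleep phases is not described here, so that step is left out.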

People involved: Jasper Kuijsten

Analyzing CUDA's Compiler through the Visualization of Decoded GPU Binaries
Posted on 13-2-2011 by Cedric Nugteren Tags: GPU, CUDA, tool, compiler

Part of the compiler in the CUDA tool-chain is entirely undocumented, as is its output. To draw conclusions about the behaviour of this compiler, GPU object code is reverse engineered. A visualization tool is introduced that analyzes the previously unknown compiler behaviour and helps the programmer improve the mapping process. The improvements focus on register allocation. The research extends the CUDA tool-chain with a visualization of register live ranges, and gives guidelines on how to apply optimizations to obtain a lower register pressure.
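As a rough illustration of what a register life (live) range and register pressure are, the sketch below computes both for a short straight-line instruction list. This is a toy model, not the visualization tool itself: the `(destination, sources)` instruction format and the register names are hypothetical, and real GPU binaries contain branches and loops that this sketch ignores.

```python
def live_ranges(instructions):
    """Map each register to (first_index, last_index) over straight-line code.

    Each instruction is a (destination, sources) pair. A register is
    considered live from the first instruction that mentions it up to and
    including the last instruction that mentions it.
    """
    ranges = {}
    for index, (dest, sources) in enumerate(instructions):
        for reg in list(sources) + [dest]:
            start = ranges.get(reg, (index, index))[0]
            ranges[reg] = (start, index)
    return ranges


def register_pressure(ranges, length):
    """Number of registers live at each instruction index."""
    return [sum(1 for start, end in ranges.values() if start <= i <= end)
            for i in range(length)]


# Hypothetical five-instruction program.
program = [
    ("r0", []),            # r0 = load a
    ("r1", []),            # r1 = load b
    ("r2", ["r0", "r1"]),  # r2 = r0 + r1
    ("r3", ["r2", "r0"]),  # r3 = r2 * r0
    ("r4", ["r3"]),        # r4 = r3 + 1
]

ranges = live_ranges(program)
pressure = register_pressure(ranges, len(program))
print(ranges)
print(pressure, "peak:", max(pressure))
```

In this model, lowering register pressure means shortening or separating ranges so that fewer of them overlap at the peak, for example by moving a definition closer to its last use.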

People involved: Cedric Nugteren

Automated memory optimizations
Posted on 13-2-2011 by Gert-Jan van den Braak Tags: GPU, auto-tuning, tool

Programming GPUs can be a big challenge. Getting the memory accesses right, in particular, can make the difference between a good and a bad GPU implementation. To help programmers make their global memory accesses coalesced, we created a tool that analyzes the memory accesses at compile time. Where possible, the tool optimizes the kernel (and host code) to improve memory bandwidth utilization by swapping loops (thread index vs. block index) and by combining threads.
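The effect of swapping the thread index and the block index can be sketched with a small address model in plain Python. This is an illustration under stated assumptions, not the tool's analysis: it assumes 32 threads per warp, 4-byte elements, and that memory is fetched in aligned 128-byte segments, which simplifies the actual hardware rules. The function and parameter names are hypothetical.

```python
WARP_SIZE = 32        # threads that issue a memory request together
ELEM_BYTES = 4        # e.g. one 32-bit float per thread
SEGMENT_BYTES = 128   # assumed size of one aligned memory segment


def warp_addresses(block, threads_per_block, num_blocks, swapped):
    """Byte addresses read by the first warp of `block`.

    swapped=False: element = thread * num_blocks + block (strided access)
    swapped=True:  element = block * threads_per_block + thread (consecutive)
    """
    addresses = []
    for thread in range(WARP_SIZE):
        if swapped:
            element = block * threads_per_block + thread
        else:
            element = thread * num_blocks + block
        addresses.append(element * ELEM_BYTES)
    return addresses


def segments_touched(addresses):
    """Number of distinct memory segments needed to serve the warp."""
    return len({address // SEGMENT_BYTES for address in addresses})


before = warp_addresses(block=3, threads_per_block=256, num_blocks=64, swapped=False)
after = warp_addresses(block=3, threads_per_block=256, num_blocks=64, swapped=True)
print(segments_touched(before), "segments before,", segments_touched(after), "after")
```

Under this model, fewer segments per warp means less wasted bandwidth: when neighbouring threads read neighbouring elements, the whole warp can be served from a single segment.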

People involved: Gert-Jan van den Braak