Tools

Bones: A Parallelizing Source-to-Source Compiler Based on Algorithmic Skeletons

By Cedric Nugteren
Bones is a source-to-source compiler based on algorithmic skeletons and the presented algorithm classification. The compiler takes C code annotated with class information as input and generates parallelized target code. Currently, targets include NVIDIA GPUs (through CUDA), AMD GPUs (through OpenCL) and x86 CPUs (through OpenCL). The compiler is based on the C parser CAST (http://cast.rubyforge.org/), which is used to parse the input code into an abstract syntax tree (AST) ...


Algorithm mappings

Hough Transform on GPU

By Gert-Jan van den Braak
Mapping the Hough Transform to a GPU can be tricky, especially when you want to achieve maximum performance. In the paper "Fast Hough Transform on GPUs: Exploration of Algorithm Trade-Offs" (see the Publications page) we introduced three different methods to calculate the Hough transform for lines on a GPU. The first implementation is basic, and is (just a bit) slower than an optimized CPU implementation. The second implementation is aimed at speed ...
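For readers unfamiliar with the underlying computation, here is a minimal CPU reference sketch of Hough voting for lines. It is not one of the three GPU implementations from the paper; the function names, the 1-degree angle resolution, and the rounding of rho to integer bins are all assumptions made for illustration. Each edge point votes once per angle, and those scattered accumulator updates are what make an efficient GPU mapping non-trivial.

```python
import math

def hough_lines(points, width, height, n_theta=180):
    """Accumulate votes in (rho, theta) space for a list of (x, y) edge points.

    Hypothetical helper: returns the accumulator and the rho offset used
    to map negative rho values to non-negative row indices.
    """
    max_rho = int(math.ceil(math.hypot(width, height)))
    acc = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + max_rho][t] += 1
    return acc, max_rho

def strongest_line(acc, max_rho):
    """Return (rho, theta_index, votes) of the accumulator bin with the most votes."""
    votes, row, t = max(
        (v, r, t) for r, line in enumerate(acc) for t, v in enumerate(line)
    )
    return row - max_rho, t, votes

# Example: 20 edge points on the horizontal line y = 5
points = [(x, 5) for x in range(20)]
acc, max_rho = hough_lines(points, width=20, height=10)
rho, theta_index, votes = strongest_line(acc, max_rho)
```

With this coarse binning, neighbouring angle bins can tie with the true one, so a real detector would typically add peak refinement on top of the raw accumulator.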


Speed Sign Detection and Recognition by Convolutional Neural Networks

By Maurice Peemen
Dataset for training and testing a speed sign detection and recognition application: A fully trainable application for speed sign detection and recognition from a video stream is developed in this work. We show that a fully trainable solution can perform reliable classification under varying circumstances (day and night). When such a parallel neural network is mapped to a parallel platform such as a GPU, real-time detection is achieved at 35 fps ...


Highly efficient and predictable histogramming for GPUs

By Cedric Nugteren
Histogramming has been mapped on a GPU prior to this work. Although significant research effort has been spent in optimizing the mapping, we show that the performance and performance predictability of existing methods can still be improved. We present two novel histogramming methods, both achieving higher performance and predictability than existing methods ...
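To illustrate why histogramming is hard to parallelize predictably, the sketch below shows one widely used general idea: give each worker a private sub-histogram and merge them afterwards, so workers never contend on the same bin. This is a plain sequential simulation, and it is not claimed to be either of the two methods presented in this work; the function name, the round-robin assignment of elements to workers, and the worker count are assumptions for illustration only.

```python
def histogram_privatized(data, n_bins, n_workers=4):
    """Simulate privatized histogramming: per-worker sub-histograms, then a merge.

    Hypothetical helper; `data` must contain integers in range(n_bins).
    """
    # One private histogram per (simulated) worker, so no bin is shared
    partial = [[0] * n_bins for _ in range(n_workers)]
    for i, value in enumerate(data):
        partial[i % n_workers][value] += 1
    # Reduction step: sum the private histograms bin by bin
    return [sum(p[b] for p in partial) for b in range(n_bins)]

hist = histogram_privatized([0, 1, 1, 3, 3, 3], n_bins=4)
```

In a shared-histogram design, by contrast, the cost of an update depends on how many workers hit the same bin at once, which is why run time can vary with the input data distribution.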