Section: Journal
R-GPU: A Reconfigurable GPU Architecture
Gert-Jan van den Braak, Henk Corporaal,
ACM Transactions on Architecture and Code Optimization (TACO),
2016
Download PDF | Link to DOI
Tags: GPU, architecture
Configurable XOR Hash Functions for Banked Scratchpad Memories in GPUs
Gert-Jan van den Braak, Juan Gómez-Luna, José María González-Linares, Henk Corporaal, Nicolás Guil,
IEEE Transactions on Computers,
2015
Download PDF | Link to DOI
Tags: GPU, scratchpad memory, hash functions
Correlation Ratio Based Volume Image Registration on GPUs
Ang Li, Akash Kumar, Yajun Ha, Henk Corporaal,
Microprocessors and Microsystems (MICPRO),
2015
Tags: GPU, image registration, correlation ratio
A Co-Design Framework with OpenCL Support for Low-Energy Wide SIMD Processor
Dongrui She, Yifan He, Luc Waeijen, Henk Corporaal,
Journal of Signal Processing Systems (JSPS),
2015
Link to DOI
Tags: OpenCL, SIMD, low energy
Bones: An Automatic Skeleton-Based C-to-CUDA Compiler for GPUs
C. Nugteren and H. Corporaal,
ACM Transactions on Architecture and Code Optimization (TACO),
2015
Link to DOI
Tags: GPU, CUDA, compiler, skeletons
A Low Energy Wide SIMD Architecture with Explicit Datapath
Luc Waeijen, Dongrui She, Henk Corporaal and Yifan He,
Journal of Signal Processing Systems (JSPS),
2014
Link to DOI
Tags: low energy, wide SIMD, explicit datapath
Construction and Exploitation of VLIW ASIPs with Heterogeneous Vector-Widths
Diken, E.; Jordans, R.; Corvino, R.; Jóźwiak, L.; Corporaal, H. and Chies, F. A.,
Microprocessors and Microsystems,
2014
Link to DOI
Tags: VLIW, heterogeneous vectorization, DLP
ASAM: Automatic architecture synthesis and application mapping
Jóźwiak, L.; Lindwer, M.; Corvino, R.; Meloni, P.; Micconi, L.; Madsen, J.; Diken, E.; Gangadharan, D.; Jordans, R.; Pomata, S.; Pop, P.; Tuveri, G.; Raffo, L. and Notarangelo, G.,
Microprocessors and Microsystems,
2013
Link to DOI
Tags: DSE, ASIP, MPSoC
Exploring processor parallelism: Estimation methods and optimization strategies
Roel Jordans, Rosilde Corvino, Lech Jóźwiak, Henk Corporaal,
International Journal of Microelectronics and Computer Science,
2013
Link to publication
Tags: VLIW, issue-width estimation, DSE
An Energy Efficient Method of Supporting Flexible Special Instructions in an Embedded Processor with Compact ISA
D. She, Y. He, and H. Corporaal,
ACM Transactions on Architecture and Code Optimization (TACO),
2013
Link to DOI
Tags: low-power, SFU, architecture
Efficient Communication Support in Predictable Heterogeneous MPSoC Designs for Streaming Applications
Y. He, D. She, S. Stuijk, and H. Corporaal,
Journal of Systems Architecture (JSA),
2013
Link to DOI
Tags: architecture, Communication Assist, Predictable
Algorithmic Species: An Algorithm Classification of Affine Loop Nests for Parallel Programming
C. Nugteren, P. Custers, H. Corporaal,
ACM Transactions on Architecture and Code Optimization (TACO), Vol. 9, No. 4, Article 40,
2013
Link to DOI
Tags: Algorithm Classification, Polyhedral Model
From Xetal-II to Xetal-Pro: On the Road Towards an Ultra Low-Energy and High Throughput SIMD Processor
Y. Pu, Y. He, Z. Ye, S. M. Londono, R. Kleihorst, A. Abbo, and H. Corporaal,
in IEEE Transactions on Circuits and Systems for Video Technology (CAS-VT), Vol. 21, No. 4, pp. 472-484,
2011
Link to publication
Tags: Xetal, SIMD
Section: Conference
Code generation for reconfigurable explicit datapath architectures with LLVM
Adriaansen, M.; Wijtvliet, M.; Jordans, R.; Waeijen, L. and Corporaal, H.,
DSD 2016 - 19th Euromicro Conference on Digital System Design,
2016
Link to publication
Tags: compiler, LLVM, CGRA
SFU-Driven Transparent Approximation Acceleration on GPUs
Ang Li, Shuaiwen Leon Song, Mark Wijtvliet, Akash Kumar and Henk Corporaal,
International Conference on Supercomputing (ICS),
2016
Tags: GPU, SFU, Approximate Computing
The Neuro Vector Engine: Flexibility to Improve Convolutional Net Efficiency for Wearable Vision
Maurice Peemen, Runbin Shi, Sohan Lal, Ben Juurlink, Bart Mesman, and Henk Corporaal,
Design, Automation and Test in Europe (DATE),
2016
Download PDF
Tags: SIMD, VLIW, Convolutional Networks, Accelerator, Locality, ultra-low power
X: A Comprehensive Analytic Model for Parallel Machines
Ang Li, Shuaiwen Leon Song, Eric Brugel, Akash Kumar, Daniel Chavarria, Henk Corporaal,
International Parallel & Distributed Processing Symposium (IPDPS),
2016
Critical Points Based Register-Concurrency Autotuning for GPUs
Ang Li, Shuaiwen Leon Song, Akash Kumar, Eddy Z. Zhang, Daniel Chavarria and Henk Corporaal,
Design, Automation and Test in Europe (DATE),
2016
Download PDF | Link to publication
Tags: GPU, register
Supplementary File to Adaptive and Transparent Cache Bypassing for GPUs
Ang Li, Gert-Jan van den Braak, Akash Kumar and Henk Corporaal,
SC,
2015
Download PDF
Adaptive and Transparent Cache Bypassing for GPUs (nominated for Best Paper and Best Student Paper)
Ang Li, Gert-Jan van den Braak, Akash Kumar and Henk Corporaal,
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC),
2015
Download PDF
Tags: GPU, cache
A Compilation Technique and Performance Profits for VLIW with Heterogeneous Vectors
Erkan Diken, Lech Jóźwiak,
3rd EUROMICRO/IEEE Workshop on Embedded and Cyber-Physical Systems (ECyPS),
2015
Tags: compiler, SIMD, Code Generation, VLIW
VLIW Code Generation for a Convolutional Network Accelerator
Maurice Peemen, Wisnu Pramadi, Bart Mesman, and Henk Corporaal,
18th International Workshop on Software and Compilers for Embedded Systems (SCOPES),
2015
Download PDF
Tags: Code Generation, VLIW, Convolutional Networks, Compilation, Software Pipelining
Mixed-Length SIMD Code Generation for VLIW Architectures with Multiple Native Vector-Widths
Erkan Diken, Martin J. O’Riordan, Roel Jordans, Lech Jóźwiak, Henk Corporaal and David Moloney,
ASAP 2015 - 26th IEEE International Conference on Application-specific Systems, Architectures and Processors,
2015
Link to publication
Tags: compiler, SIMD, Code Generation, VLIW
Transit: A Visual Analytical Model for Multithreaded Machines
Ang Li, Y.C. Tay, Akash Kumar and Henk Corporaal,
International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC),
2015
Download PDF
Tags: GPU, multithreaded machines, performance modeling, optimization
Fine-Grained Synchronizations and Dataflow Programming on GPUs
Ang Li, Gert-Jan van den Braak, Henk Corporaal and Akash Kumar,
International Conference on Supercomputing (ICS),
2015
Download PDF
Tags: GPU, synchronizations, Spin-lock, Dataflow
Inter-Tile Reuse Optimization Applied to Bandwidth Constrained Embedded Accelerators
Maurice Peemen, Bart Mesman, Henk Corporaal,
DATE '15: Design, Automation and Test in Europe,
2015
Download PDF
Tags: FPGA, Data Reuse, High-Level Synthesis, Tiling, Interchange
A Study of the Potential of Locality-Aware Thread Scheduling for GPUs
Cedric Nugteren, Gert-Jan van den Braak, Henk Corporaal,
7th International Workshop on Multi-/Many-Core Computing Systems (MuCoCoS),
2014
Link to publication
Tags: GPU, thread scheduling
A Data-Reuse Aware Accelerator for Large-Scale Convolutional Networks
Maurice Peemen, Bart Mesman, and Henk Corporaal,
NeuroArch Workshop at ISCA,
2014
Download PDF | Link to publication
Tags: Convolutional Networks, Data Reuse, Fusion, Recomputation
Reduction Operator for Wide-SIMDs Reconsidered
Luc Waeijen, Dongrui She, Henk Corporaal and Yifan He,
51st Design Automation Conference (DAC'51),
2014
Construction and Exploitation of VLIW ASIPs with Multiple Vector-Widths
Diken, E.; Jordans, R.; Jóźwiak, L. and Corporaal, H.,
MECO 2014 - 3rd Mediterranean Conference on Embedded Computing,
2014
Tags: VLIW, ASIP, DLP, vectorization
Instruction-set Architecture Exploration of VLIW ASIPs Using a Genetic Algorithm
Jordans, R.; Jóźwiak, L. and Corporaal, H.,
MECO 2014 - 3rd Mediterranean Conference on Embedded Computing,
2014
Tags: DSE, VLIW, ASIP, genetic algorithm
BuildMaster: Efficient ASIP Architecture Exploration Through Compilation and Simulation Result Caching
Jordans, R.; Diken, E.; Jóźwiak, L. and Corporaal, H.,
DDECS 2014 - 17th IEEE Symposium on Design and Diagnostics of Electronic Circuits and Systems,
2014
Tags: DSE, VLIW, ASIP
A Detailed GPU Cache Model Based on Reuse Distance Theory
Cedric Nugteren, Gert-Jan van den Braak, Henk Corporaal, Henri Bal,
High Performance Computer Architecture (HPCA),
2014
Link to publication
Tags: GPU, cache model
Roofline-aware DVFS for GPUs
Cedric Nugteren, Gert-Jan van den Braak and Henk Corporaal,
ADAPT,
2014
Download PDF | Link to DOI
Tags: GPU, roofline, DVFS
Memory-Centric Accelerator Design for Convolutional Neural Networks
Maurice Peemen, Arnaud A. A. Setio, Bart Mesman and Henk Corporaal,
IEEE International Conference on Computer Design (ICCD),
2013
Download PDF | Link to DOI
Tags: FPGA, Memory hierarchy, loop tiling, accelerators, Convolutional Neural Networks
Simulation and Architecture Improvements of Atomic Operations on GPU Scratchpad Memory
Gert-Jan van den Braak, Juan Gómez-Luna, Henk Corporaal, José María González-Linares, Nicolás Guil,
IEEE International Conference on Computer Design (ICCD),
2013
Download PDF | Link to DOI
Tags: GPU, scratchpad memory, atomic operations, hash functions
Automatic Skeleton-Based Compilation through Integration with an Algorithm Classification
C. Nugteren, P.J.J.M. Custers, H. Corporaal,
APPT: Advanced Parallel Processing Technology,
2013
Download PDF
Tags: GPU, compiler, skeletons, Code Generation
Algorithmic Species Revisited: A Program Code Classification Based on Array References
C. Nugteren, R. Corvino, H. Corporaal,
MuCoCoS '13: International Workshop on Multi-/Many-core Computing Systems,
2013
Download PDF
Tags: Algorithm Classification, Polyhedral Model, Species
Exploring Processor Parallelism: Estimation Methods and Optimization Strategies
Roel Jordans, Rosilde Corvino, Lech Jóźwiak, and Henk Corporaal,
DDECS - 16th IEEE Symposium on Design and Diagnostics of Electronic Circuits and Systems,
2013
Link to publication
Tags: DSE, VLIW, issue-width
Instruction-set Architecture Exploration Strategies for Deeply Clustered VLIW ASIPs
Roel Jordans, Rosilde Corvino, Lech Jóźwiak, and Henk Corporaal,
ECyPS - EUROMICRO/IEEE Workshop on Embedded and Cyber-Physical Systems,
2013
Link to publication
Tags: DSE, architecture, VLIW
An Efficient Method for Energy Estimation of Application Specific Instruction-set Processors
Roel Jordans, Rosilde Corvino, Lech Jóźwiak, and Henk Corporaal,
DSD - 16th Euromicro Conference on Digital System Design,
2013
Link to publication
Tags: VLIW, Energy estimation
GPU-CC: a Reconfigurable GPU Architecture with Communicating Cores
Gert-Jan van den Braak, Henk Corporaal,
M-SCOPES,
2013
Download PDF | Link to DOI
Tags: GPU, architecture
OpenCL Code Generation for Low Energy Wide SIMD Architectures with Explicit Datapath
D. She, Y. He, L. Waeijen, and H. Corporaal,
International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS),
2013
Tags: SIMD, Code Generation, low-power
SIMD Made Explicit
L. Waeijen, D. She, H. Corporaal, and Y. He,
International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS),
2013
Tags: architecture, SIMD, low-power
Future of GPGPU Micro-Architectural Parameters
C. Nugteren, G.J.W.v.d. Braak, H. Corporaal,
DATE '13: Design, Automation and Test in Europe,
2013
Download PDF | Link to publication
Tags: GPU, architecture
Algorithm Parallelism Estimation for Constraining Instruction-Set Synthesis for VLIW Processors
Roel Jordans, Rosilde Corvino, and Lech Jóźwiak,
DSD - 15th Euromicro Conference on Digital System Design,
2012
Link to DOI
Tags: DSE, VLIW, issue-width
Energy Efficient Special Instruction Support in an Embedded Processor with Compact ISA
Dongrui She, Yifan He and Henk Corporaal,
CASES '12: International Conference on Compilers, Architecture, and Synthesis for Embedded Systems,
2012
Link to DOI
Tags: low energy, Code Generation, SFU, Reconfigurable Architecture
GPU-Vote: A Framework for Accelerating Voting Algorithms on GPU
Gert-Jan van den Braak, Cedric Nugteren, Bart Mesman and Henk Corporaal,
Euro-Par,
2012
Download PDF | Link to DOI
Tags: GPU, Voting Algorithms, hough transform, Histogram
The Boat Hull Model: Enabling Performance Prediction for Parallel Computing Prior to Code Development
C. Nugteren and H. Corporaal,
CF '12: International Conference on Computing Frontiers,
2012
Download PDF | Link to DOI
Tags: GPU, model, roofline, CPU, architecture
Scheduling for Register File Energy Minimization in Explicit Datapath Architectures
Dongrui She, Yifan He, Bart Mesman, Henk Corporaal,
Design, Automation and Test in Europe (DATE),
2012
Download PDF
Tags: TTA, Code Generation, Low Power, Register File
Introducing 'Bones': A Parallelizing Source-to-Source Compiler Based on Algorithmic Skeletons
C. Nugteren and H. Corporaal,
GPGPU-5: Fifth Workshop on General Purpose Processing on Graphics Processing Units at ASPLOS'12,
2012
Download PDF | Link to DOI
Tags: GPU, skeletons
The Boat Hull Model: Adapting the Roofline Model to Enable Performance Prediction for Parallel Computing
C. Nugteren and H. Corporaal,
PPoPP ’12: 17th ACM Symposium on Principles and Practice of Parallel Programming,
2012
Download PDF | Link to DOI
Tags: CPU, GPU, model, architecture, roofline
An automated flow to map throughput constrained applications to a MPSoC
Jordans, R.; Siyoum, F.; Stuijk, S.; Kumar, A. and Corporaal, H.,
Bringing Theory to Practice: Predictability and Performance in Embedded Systems,
2011
Link to DOI
Tags: FPGA, DSE, architecture, multi-core
Demo: An embedded vision system for high frame rate visual servoing
Z. Ye, Y. He, R. Pieters, B. Mesman, H. Corporaal, and P. Jonker,
5th ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC’11),
2011
Tags: FPGA, visual servoing, demo
Bottlenecks and Tradeoffs in Ultra High Frame Rate Visual Servoing: A Case Study
Z. Ye, Y. He, R. Pieters, B. Mesman, H. Corporaal, and P. Jonker,
12th IAPR Conference on Machine Vision Applications (MVA’11),
2011
Tags: FPGA, visual servoing
Energy Efficient Code Generation for Processors with Exposed Datapath
Dongrui She, Yifan He, Bart Mesman, Henk Corporaal,
9th Workshop on Optimizations for DSP and Embedded Systems (ODES-9),
2011
Download PDF
Tags: TTA, Code Generation, Low Power
Efficiency Optimization of Trainable Feature Extractors for a Consumer Platform
Maurice Peemen, Bart Mesman, Henk Corporaal,
in Proceedings of the 13th International Conference on Advanced Concepts for Intelligent Vision Systems (ACIVS’11), Ghent, Belgium,
2011
Download PDF | Link to DOI
Tags: GPU, vision, neural networks
Fast Hough Transform on GPUs: Exploration of Algorithm Trade-Offs
Gert-Jan van den Braak, Cedric Nugteren, Bart Mesman, Henk Corporaal,
in Proceedings of the 13th International Conference on Advanced Concepts for Intelligent Vision Systems (ACIVS’11), Ghent, Belgium,
2011
Download PDF | Link to DOI
Tags: GPU, hough transform
Skeleton-based Automatic Parallelization of Image Processing Algorithms for GPUs
C. Nugteren, H. Corporaal and B. Mesman,
SAMOS XI: International Conference on Embedded Computer Systems,
2011
Download PDF | Link to DOI
Tags: GPU, skeletons
Feasibility Analysis of Ultra High Frame Rate Visual Servoing on FPGA and SIMD Processor
Y. He, Z. Ye, D. She, B. Mesman, and H. Corporaal,
in Proceedings of the 13th International Conference on Advanced Concepts for Intelligent Vision Systems (ACIVS’11), Ghent, Belgium,
2011
Tags: FPGA, application, vision, SIMD
MOVE-Pro: a Low Power and High Code Density TTA Architecture
Y. He, D. She, B. Mesman, and H. Corporaal,
in Proceedings of the 11th International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS’11), Samos, Greece,
2011
Download PDF
Tags: TTA, architecture, power, low energy
Speed Sign Detection and Recognition by Convolutional Neural Networks
M. Peemen, B. Mesman, H. Corporaal,
International Automotive Congress,
2011
Download PDF
Tags: GPU, application, neural networks
High Performance Predictable Histogramming on GPUs: Exploring and Evaluating Algorithm Trade-offs
C. Nugteren, G.J.W.v.d. Braak, H. Corporaal and B. Mesman,
GPGPU: Fourth Workshop on General Purpose Processing on Graphics Processing Units at ASPLOS'11,
2011
Download PDF | Link to DOI
Tags: GPU, application, Histogram
Fast Huffman Decoding by Exploiting Data Level Parallelism
Tim Drijvers, Carlos Alba Pinto, Henk Corporaal, Bart Mesman, Gert-Jan van den Braak,
SAMOS,
2010
Download PDF | Link to DOI
Tags: SIMD, Huffman
Xetal-Pro: An Ultra-Low Energy and High Throughput SIMD Processor
Y. He, Y. Pu, Z. Ye, S. M. Londono, R. Kleihorst, A. Abbo, and H. Corporaal,
in Proceedings of the 47th ACM/IEEE International Conference on Design Automation (DAC’10), pp. 543-548, Anaheim, USA (HiPEAC Paper Award),
2010
Link to publication
Tags: SIMD, memory, Xetal, low energy
1000 fps Visual Servoing on the Reconfigurable Wide SIMD Processor
Y. He, Z. Ye, D. She, R. Pieters, B. Mesman, and H. Corporaal,
in Proceedings of the 16th Annual Conference of the Advanced School for Computing and Imaging (ASCI’10), Veldhoven, the Netherlands,
2010
Link to publication
Tags: SIMD, application, vision
Compile-time GPU memory access optimizations
Gert-Jan van den Braak, Bart Mesman, Henk Corporaal,
International Conference on Embedded Computer Systems: Architecture, Modelling and Simulation (SAMOS),
2010
Download PDF | Link to DOI
Tags: GPU, memory, auto-tuning
Analyzing CUDA's Compiler through the Visualization of Decoded GPU Binaries
C. Nugteren, B. Mesman and H. Corporaal,
ODES-8: Proceedings of the 8th Workshop on Optimizations for DSP and Embedded Systems at CGO '10,
2010
Download PDF | Link to publication
Tags: GPU, CUDA
Real-Time Implementations of Hough Transform on SIMD Architecture
Y. He, Z. Zivkovic, R. Kleihorst, A. Danilin, and H. Corporaal,
in Proceedings of the 2nd ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC’08), pp. 1-8, Palo Alto, USA,
2008
Link to publication
Tags: hough transform, SIMD
Real-Time Hough Transform on 1-D SIMD Processors: Implementation and Architecture Exploration
Y. He, Z. Zivkovic, R. Kleihorst, A. Danilin, H. Corporaal, and Bart Mesman,
in Proceedings of the 10th International Conference on Advanced Concepts for Intelligent Vision Systems (ACIVS’08), LNCS, Vol. 5259, pp. 254-265, Juan-les-Pins, France,
2008
Link to publication
Tags: Xetal, hough transform
Section: Presentation
GPU research in the ES-group
Gert-Jan van den Braak,
NIRICT GPGPU meeting,
2015
Download PDF
Tags: GPU
A high-level implementation of software pipelining in LLVM
Roel Jordans, David Moloney,
EuroLLVM 2015,
2015
Link to publication
Tags: VLIW, LLVM, software-pipelining
moviCompile: An LLVM based compiler for heterogeneous SIMD code generation
Erkan Diken, Roel Jordans, Martin J. O’Riordan,
LLVM devroom FOSDEM,
2015
Download PDF
Tags: LLVM
Visualizing sound and vibrations using a GPU and a 1024-channel microphone array
Wouter Ouwens,
Applied GPGPU-day (Amsterdam),
2013
Tags: GPU
The ‘Bones’ Source-to-Source Compiler: Making Parallel Programming Easy
Cedric Nugteren,
GPGPU-day (Amsterdam),
2012
Download PDF
Tags: compiler
Introduction to GPGPU and GPU-architectures
Gert-Jan van den Braak,
GPGPU-day (Amsterdam),
2012
Download PDF
Tags: GPU, architecture