Bones and A-Darwin

This README covers both the source-to-source compiler Bones and the species extraction tool A-Darwin. Please refer to the corresponding sections for documentation.

Bones

Introduction

Recent advances in multi-core and many-core processors requires programmers to exploit an increasing amount of parallelism from their applications. Data parallel languages such as CUDA and OpenCL make it possible to take advantage of such processors, but still require a large amount of effort from programmers. To address the challenge of parallel programming, we introduce Bones.

Bones is a source-to-source compiler based on algorithmic skeletons and a new algorithm classification (named 'algorithmic species'). The compiler takes C-code annotated with class information as input and generates parallelized target code. Targets include NVIDIA GPUs (through CUDA), AMD GPUs (through OpenCL) and CPUs (through OpenCL and OpenMP). Bones is open-source, written in the Ruby programming language, and is available through our website. The compiler is based on the C-parser CAST, which is used to parse the input code into an abstract syntax tree (AST) and to generate the target code from a transformed AST.

Usage

The usage is as follows:

bones --application <input> --target <target> [OPTIONS]

With the following flags:

     --application, -a <s>:   Input application file
          --target, -t <s>:   Target processor (choose from: CPU-C, CPU-OPENCL-AMD,
                              CPU-OPENCL-INTEL, CPU-OPENMP,GPU-CUDA, GPU-OPENCL-AMD)
        --measurements, -m:   Enable/disable timers
              --verify, -c:   Verify correctness of the generated code
 --only-alg-number, -o <i>:   Only generate code for the x-th species (99 -> all)
    --merge-factor, -f <i>:   Thread merge factor, default is 1 (==disabled)
--register-caching, -r <i>:   Enable register caching: 1:enabled (default), 0:disabled
       --zero-copy, -z <i>:   Enable OpenCL zero-copy: 1:enabled (default), 0:disabled
       --skeletons, -s <i>:   Enable non-default skeletons: 1:enabled (default), 0:disabled
             --version, -v:   Print version and exit
                --help, -h:   Show this message

Bones can be invoked from the command-line. Two arguments (-a and -t) are mandatory, others are optional. This is an example of the usage of Bones assuming the file 'example.c' to be present:

bones -a example.c -t GPU-CUDA -c

Examples

The best place to start experimenting with Bones is the ‘examples’ directory. A large number of examples are available in this folder, grouped by algorithmic species (either element, neighbourhood, shared or chunk). The examples illustrate different kinds of coding styles and give a large number of different classes to work with. The folder ‘benchmarks’ gives more examples, taken from the PolyBench/C benchmark set. Additionally, a folder ‘applications’ is included, containing example complete applications.

All examples can be run through Bones for a specific target using an automated Rake task. Executing ‘rake examples:generate’ or simply ‘rake’ will execute Bones for all examples for a given target. The target can be changed in the ‘Rakefile’ found in the root directory of Bones.

Limitations

Bones takes C99 source code as input. However, several coding styles are unsupported as of now or might yield worse performance compared to others. The numerous examples provided should give the user an idea of the possibilities and limitations of the tool. A complete list of coding guidelines and limitations will follow in the future. Currently, an initial list of major limitations and guidelines is given below. In this list, we use 'algorithm' to denote an algorithm captured by an algorithmic species.

A-Darwin

Introduction

The original algorithmic species theory included ASET, a polyhedral based algorithmic species extraction tool. Along with a new non-polyhedral theory, we present a new automatic extraction tool named A-Darwin (short for `automatic Darwin’).

The new tool is largely equal to ASET in terms of functionality, but is different internally. The tool is based on CAST, a C99 parser which allows analysis on an abstract syntax tree (AST). From the AST, the tool extracts the array references and constructs a 5 or 6-tuple for each loop nest. Following, merging is applied and the species are extracted. Finally, the species are inserted as pragma’s in the original source code. To perform the dependence tests in A-Darwin, we make use of a combination of the GCD and Banerjee tests. Together, these tests are conservative, i.e. we might not find all species.

Usage

The usage is as follows:

adarwin --application <input> [OPTIONS]

With the following flags:

      --application, -a <s>:   Input application file
--no-memory-annotations, -m:   Disable the printing of memory annotations
  --mem-remove-spurious, -r:   Memcopy optimisation: remove spurious copies
  --mem-copyin-to-front, -f:   Memcopy optimisation: move copyins to front
  --mem-copyout-to-back, -b:   Memcopy optimisation: move copyouts to back
    --mem-to-outer-loop, -l:   Memcopy optimisation: move copies to outer loops
           --fusion, -k <i>:   Type of kernel fusion to perform (0 -> disable)
            --print-arc, -c:   Print array reference characterisations (ARC) instead of species
               --silent, -s:   Become silent (no message printing)
  --only-alg-number, -o <i>:   Only generate code for the x-th species (99 -> all)
              --version, -v:   Print version and exit
                 --help, -h:   Show this message

A-Darwin can be invoked from the command-line. One arguments (-a) is mandatory, others are optional. This is an example of the usage of A-Darwin assuming the file ‘example.c’ to be present:

adarwin -a example.c -m -s

For now, it is recommended to use the ‘-m’ flag. The memory optimisation flags (‘-rfbl’) are not fully tested yet. For a more fine-grained classification, A-Darwin is able to print the internal array reference characterisations (ARC) instead (use the ‘-c’ flag).

Known limitations

Installation procedure

Installation of Bones and A-Darin is a simple matter of extracting the Bones/A-Darwin package to a directory of your choice. Bones can also be installed as a gem (‘gem install bones-compiler’). However, there are a number of prerequisites before doing this.

Prerequisites

Bones/A-Darwin requires the installation of Ruby, the Rubygems gem package manager and several gems:

  1. Any version of Ruby 1.8 or 1.9. Information on Ruby is found at www.ruby-lang.org

    • [OS X]: Ruby is pre-installed on any OS X system since Tiger (10.4).

    • [Linux]: Ruby is pre-installed on some Linux based systems. Most Linux package managers (yum, apt-get) will be able to provide a Ruby installation. Make sure that the ruby development package (‘ruby-devel’) is also installed, as it is required by one of the gems.

    • [Windows]: Ruby for Windows can be obtained from rubyinstaller.org

  2. The Rubygems gem package manager. Information on Rubygems can be found at rubygems.org

    • [OS X]: Rubygems is pre-installed on any OS X system since Tiger (10.4).

    • [Linux]: Most Linux package managers will be able to provide a Rubygems installation by installing the package ‘rubygems’.

    • [Windows]: Rubygems for Windows is obtained automatically when installing from rubyinstaller.org

  3. Bones/A-Darwin require the gems, trollop, cast, and symbolic. These gems can be installed by calling Rubygems from the command line, i.e.: ‘gem install trollop cast symbolic’.

For example, all prerequisites can be installed as follows on a Fedora, Red-Hat or CentOS system:

yum install ruby ruby-devel rubygems
gem install trollop cast symbolic

For an Ubuntu, Debian or Mint system, the equivalent commands are:

apt-get install ruby ruby-devel rubygems
gem install trollop cast symbolic

Installing Bones/A-Darwin manually

To install the tools manually, simply extract the ‘bones_x.x.tar.gz’ or ‘adarwin_x.x.tar.gz’ package into a directory of your choice. The Bones/A-Darwin executables are found in the ‘bin’ subdirectory. Including the path to the ‘bin’ directory to your environmental variable ‘PATH’ will make Bones/A-Darwin available from any directory on your machine. Starting at version 1.1, Bones and A-Darwin are also available as a gem (‘gem install bones-compiler’).

Documentation

There are two ways to go to obtain more information regarding Bones/A-Darwin. To obtain more information about the tools themselves, the ideas behind it and the algorithm classification, it is a good idea to read scientific publications. To get more information about the code structure, HTML documentation can be generated automatically using RDoc.

Code documentation

Code documentation can be generated automatically using RDoc. Navigate to the installation root of Bones/A-Darwin and use Rake to generate documentation: ‘rake rdoc’. More information on using Rake is provided later in this document. Next, open ‘rdoc/index.html’ to navigate through the documentation. The same documentation is also available on the web at parse.ele.tue.nl/tools/bones/rdoc/index.

Scientific publications

Scientific publications related to Bones/A-Darwin can be obtained from www.cedricnugteren.nl/publications.php. Several publications are relevant:

  1. Bones: An Automatic Skeleton-Based C-to-CUDA Compiler for GPUs, which provides details on the Bones source-to-source compiler, including optimizations in host-accelerator transfer and loop fusion in kernel code. When referring to GPU code generation using Bones, loop fusion or optimizations in host-accelerator transfer in scientific work, you are kindly asked to include the following citation:

    @INPROCEEDINGS{Nugteren2015a,
       author = {Cedric Nugteren and and Henk Corporaal},
       title = {Bones: An Automatic Skeleton-Based C-to-CUDA Compiler for GPUs},
       journal = {ACM Trans. Archit. Code Optim.},
       volume = {11},
       number = {4},
       year = {2015},
    }
  2. Algorithmic Species Revisited: A Program Code Classification Based on Array References, which provides details on the algorithm classification (the species) and A-Darwin (the tool). When referring to the algorithm classification in scientific work, you are kindly asked to include the following citation:

    @INPROCEEDINGS{Nugteren2013a,
       author = {Cedric Nugteren and Rosilde Corvino and Henk Corporaal},
       title = {Algorithmic Species Revisited: A Program Code Classification Based on Array References},
       booktitle = {MuCoCoS '13: International Workshop on Multi-/Many-core Computing Systems},
       year = {2013},
    }
  3. Automatic Skeleton-Based Compilation through Integration with an Algorithm Classification, which discusses the Bones source-to-source compiler. When referring to Bones in scientific work, you are kindly asked to include the following citation:

    @INPROCEEDINGS{Nugteren2013b,
      author    = {Cedric Nugteren and Pieter Custers and Henk Corporaal},
      title     = {Automatic Skeleton-Based Compilation through Integration with an Algorithm Classification},
      booktitle = {APPT '13: Advanced Parallel Processing Technology},
      year      = {2013},
    }

Rake

Rake is Ruby’s make and can be used to automate tasks. By invoking ‘rake -T’, a list of commands will become available. For example, for A-Darwin, the following rake commands are available:

rake adarwin[file]  # Extract species descriptions using A-Darwin
rake adarwin_test   # Test A-Darwin`s output against golden samples
rake clean          # Remove any temporary products.
rake clobber        # Remove any generated file.
rake clobber_rdoc   # Remove RDoc HTML files
rake rdoc           # Build RDoc HTML files
rake rerdoc         # Rebuild RDoc HTML files

With rake, A-Darwin can be tested on a set of examples ‘rake adarwin_test’. Pre-created golden samples are available in the ‘test’ folder.

Questions

Questions can be directed by email. You can find contact details on the personal page of the author at www.cedricnugteren.nl or on the project page at GitHub.