Easily write GPU accelerated programs

Write your application using .NET languages like C# and F#*

Kernel methods act like any other method. You don’t need to spend loads of time learning a new language or a specific GPU's programming model, and your application keeps the same .NET look and feel you're used to. *Please see our current list of supported languages.

Debug and release correct code more quickly

With integration into Microsoft’s Visual Studio®, you can now find errors at compile time instead of at runtime. Visual Studio features like IntelliSense®, the Error List and unit testing help you write your code fast while ensuring its correctness. In future versions, GPU.NET will be compatible with NVIDIA's Parallel Nsight® so developers can make use of hardware debugging.

Write code that runs orders of magnitude faster than CPU-based code

Thanks to their many-core architecture, GPUs are well suited to performing computations on large amounts of data in parallel. Not only can they execute certain calculations orders of magnitude faster than a CPU, they also cost a fraction of what an equally powerful CPU-based system costs to own and operate. It has never been easier to create your own personal supercomputer so your team can make data-driven decisions faster.

Be just as fast as native code

GPU.NET has its own compiler and runtime instead of only wrapping around proprietary libraries. It uses light-weight JIT compilation to generate code that is mapped to a hardware vendor's instruction set architecture (ISA). GPU.NET's complete control of the software stack means your code runs as fast as code written in proprietary languages in most cases. Find out more about how it works.

Deploy to all your end users with a single binary

Create applications that stay 100% managed, so they work on 32- and 64-bit Windows on the .NET Framework, or on 32- and 64-bit Linux and Mac OS X through Mono.

Have a built-in contingency plan

Your code won’t break if your end user doesn’t have a GPU. It falls back to normal .NET execution, or to a custom fallback method you define.

Protect your software against ever-changing hardware architecture

Thanks to GPU.NET's unique plug-in system, your code stays fast even when the target hardware changes. Your program automatically supports the different types of hardware your end users have, as well as new hardware architectures as they come out, without you having to rewrite your code.

Accelerate your existing codebases

With GPU.NET's minimalistic, directive-style API, you simply mark the methods you want GPU-accelerated with the [Kernel] attribute. Check out our Coding section to learn more.


How GPU.NET Works

Development

  1. The developer annotates their code using the [Kernel] attribute in the device methods library.
  2. The .NET assembly is then compiled as normal with the annotations to instruct the Build Tool which methods need to be executed on the GPU.
  3. The Build Tool analyzes the assembly and injects calls to the GPU.NET Runtime. The result is a single, cross-platform, GPU-accelerated .NET binary.

Execution

  1. At execution, the Runtime checks the system for available hardware.
  2. The Runtime then passes the GPU (kernel) method to the correct vendor plug-in, so the method can be JIT compiled to the hardware vendor’s Instruction Set Architecture (ISA).
  3. Finally, the Runtime executes the compiled device code, and transfers the result back from the device.
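
The execution steps above can be pictured with a small, CPU-only sketch. Everything in it is hypothetical and exists only for illustration: IVendorPlugin, FakePlugin and SketchRuntime are not GPU.NET types, and "compiling" a kernel here simply returns an ordinary delegate. The build-time injection performed by the Build Tool is not modelled. The sketch only shows the order of events: check for hardware, hand the kernel to the matching plug-in, run the compiled result, and fall back to a CPU method when no supported device is found.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a vendor plug-in; NOT GPU.NET's real interface.
public interface IVendorPlugin
{
    string Vendor { get; }
    bool HardwareAvailable();
    // Stand-in for JIT compilation: returns something the runtime can execute.
    Action<float[], float[], float[]> Compile(string kernelName);
}

// Illustrative plug-in whose "compiled" kernel is a plain CPU loop.
public sealed class FakePlugin : IVendorPlugin
{
    private readonly bool _available;

    public FakePlugin(string vendor, bool available)
    {
        Vendor = vendor;
        _available = available;
    }

    public string Vendor { get; }

    public bool HardwareAvailable() => _available;

    public Action<float[], float[], float[]> Compile(string kernelName) =>
        (a, b, c) => { for (int i = 0; i < a.Length; i++) c[i] = a[i] + b[i]; };
}

// Hypothetical runtime that mirrors the three execution steps.
public sealed class SketchRuntime
{
    private readonly IReadOnlyList<IVendorPlugin> _plugins;

    public SketchRuntime(IReadOnlyList<IVendorPlugin> plugins)
    {
        _plugins = plugins;
    }

    // Returns the name of the path that ran: a vendor name, or "cpu".
    public string Run(
        string kernelName,
        Action<float[], float[], float[]> cpuFallback,
        float[] a, float[] b, float[] c)
    {
        foreach (var plugin in _plugins)
        {
            // Step 1: check the system for available hardware.
            if (!plugin.HardwareAvailable()) continue;

            // Step 2: pass the kernel to the matching plug-in for compilation.
            var compiled = plugin.Compile(kernelName);

            // Step 3: execute; results land in 'c'.
            compiled(a, b, c);
            return plugin.Vendor;
        }

        // No supported device found: use the fallback method instead.
        cpuFallback(a, b, c);
        return "cpu";
    }
}
```

Constructing a SketchRuntime with one unavailable and one available FakePlugin and calling Run takes the available plug-in's path. Constructing it with no plug-ins at all takes the fallback path, which is the behaviour described under "Have a built-in contingency plan".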

Writing Code with GPU.NET

Host-side code

// Create two standard .NET arrays of floats to hold the inputs
const int Count = 0x1000000;
float[] a = new float[Count];
float[] b = new float[Count];

// Create an array to hold the output values
float[] c = new float[Count];

// Set grid/block size for GPU execution
Launcher.SetGridSize(256);
Launcher.SetBlockSize(128);

// Call the kernel method
AddGpu(a, b, c);

// The results are immediately available in 'c'.

Kernel Method

[Kernel]
private static void AddGpu(float[] a, float[] b, float[] c)
{
    // Get the thread id and total number of threads
    int ThreadId = BlockDimension.X * BlockIndex.X + ThreadIndex.X;
    int TotalThreads = BlockDimension.X * GridDimension.X;

    // Loop over the vectors 'a' and 'b', adding them
    // pairwise and storing the sums in 'c'
    for (int ElementIndex = ThreadId; ElementIndex < a.Length; ElementIndex += TotalThreads)
    {
        c[ElementIndex] = a[ElementIndex] + b[ElementIndex];
    }
}

Once the call to AddGpu returns, the results of the GPU computation are stored in the array 'c' and can be used in the host-side code. It's that easy!
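
To see why the loop in AddGpu touches every element exactly once, here is a CPU-only sketch of the same indexing scheme. It does not use GPU.NET at all: GridStrideDemo and its methods are hypothetical names, and the grid size, block size, block index and thread index are passed as plain parameters instead of being read from GridDimension, BlockDimension, BlockIndex and ThreadIndex. Each logical thread starts at its own thread id and then strides forward by the total number of threads.

```csharp
using System;

// CPU-only illustration of the indexing used in AddGpu; not GPU.NET code.
public static class GridStrideDemo
{
    // Does the work of ONE logical thread, identified by (blockIndex, threadIndex).
    public static void AddOneThread(
        float[] a, float[] b, float[] c,
        int gridSize, int blockSize, int blockIndex, int threadIndex)
    {
        // Same arithmetic as the kernel: global thread id and total thread count.
        int threadId = blockSize * blockIndex + threadIndex;
        int totalThreads = blockSize * gridSize;

        // Start at this thread's id and stride by the total thread count,
        // so no two threads ever write to the same element.
        for (int i = threadId; i < a.Length; i += totalThreads)
        {
            c[i] = a[i] + b[i];
        }
    }

    // Runs every logical thread one after another on the CPU.
    public static float[] AddAll(float[] a, float[] b, int gridSize, int blockSize)
    {
        var c = new float[a.Length];
        for (int block = 0; block < gridSize; block++)
        {
            for (int thread = 0; thread < blockSize; thread++)
            {
                AddOneThread(a, b, c, gridSize, blockSize, block, thread);
            }
        }
        return c;
    }
}
```

With the values from the host-side code (grid size 256, block size 128), there are 256 × 128 = 32,768 logical threads. The arrays hold 0x1000000 = 16,777,216 elements, so each thread handles 512 of them.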

Click here if you'd like to view the API documentation

Check out our tutorials for step-by-step instructions on building a GPU-accelerated application, or visit our GitHub repository to download and try some code for yourself.