Parallel Nsight Debugging on a Single GPU!!


If I remember correctly, in October 2009 (to be very precise 😉) NVIDIA launched its first development environment for massively parallel computing. Known as Nexus, it found its place inside Microsoft Visual Studio, the world’s most popular development environment.

By the way, the product is now known as Parallel Nsight, and the current release is 2.0. NVIDIA® Parallel Nsight™ brings a large feature set to massively parallel programmers and developers, giving them more of the tools and workflows they expect when developing on the CPU: support for Microsoft Visual Studio 2008/2010, support for CUDA Toolkit versions 3.2/4.0, attach-to-process support, PTX/SASS assembly debugging, other advanced debugging and analysis capabilities, and graphics performance and stability enhancements. No doubt the environment is fantastic and helpful in many ways.

In the past, lots of developers and enthusiasts have shown their interest in this particular tool. On top of that, the tool is free of charge for Visual Studio developers. Remember that I’m stressing “Visual Studio”!! In fact, the environment is only available on Windows (Windows Vista and Windows 7, both x86 and x64 platforms) with Visual Studio.

NVIDIA® Parallel Nsight™ software supports GeForce 9, 200 and 400 Series graphics processors, as well as select Quadro and Tesla GPUs. For the complete list of supported GPUs, see here. That means you must have at least one of them to use Nsight.

Nsight supports several hardware configurations, namely:

  • Single GPU
  • Dual GPU
  • Two Systems with Single GPU in each
  • Dual GPU System (SLI MultiOS)

To know more about these configurations, see Hardware Configurations.

I guess that’s enough of an introduction to the Nsight tool. If you are interested in massively parallel programming, or are already doing it, try this tool as soon as possible; grab a copy from the Nsight developer site.

Coming back to the actual purpose of this article: “Debugging on a Single GPU system”.

Over the past few months I have seen a lot of people asking, “How can I debug my CUDA program on a single GPU system? I can’t debug my program on a single GPU system; it’s worthless downloading Nsight for single GPU owners”. To be honest, Nsight was originally targeted at enterprise development and scientific R&D. Over the years GPUs have become powerful and affordable, and NVIDIA CUDA made that power accessible. So GPGPU is no longer just an enterprise or scientific R&D field.

Many enthusiasts have shown their interest and have started integrating CUDA into their applications. CUDA has also found its place in student projects, and most of those students have been using single GPU systems.

Back to the topic:

Debugging CUDA applications on a single GPU machine has always been a priority among enthusiasts. This type of debugging is also known as “Local Debugging”.

It is called local debugging because the host and the target are the same machine. According to NVIDIA, you cannot perform C/C++ debugging with Parallel Nsight™ unless that machine has at least two GPUs, each of which must be on the list of supported GPUs. See the available hardware configurations that NVIDIA suggests. So, technically speaking, you cannot debug on a single GPU machine. This is because debugging may produce undesirable results that hang or force a restart of the display driver, and if that one GPU is also driving your display you will not be able to debug the application as you would like.

But don’t worry: this article will show you how to do it even if you have a single GPU system. There is a simple trick behind the procedure.

So before exploring this trick, check your gear first (going with the current versions):

I’m assuming that you are familiar with GPU programming with CUDA and have some background in Nsight. If you have only used CUDA and have no idea about this environment, then the following video might help you get oriented. Have a good look at the features Nsight has to offer.

  • Download and Install NVIDIA CUDA Toolkit 4.0 (You must be a registered developer at NVIDIA Developer Site)
  • Install Microsoft Visual Studio 2010
  • Download and Install NVIDIA Parallel Nsight 2.0 (You must be a registered developer at NVIDIA Developer Site)
  • NVIDIA Drivers (270.61 WHQL as of this writing)
  • Motherboard Drivers and Display drivers.

To debug on a single GPU system:

  • You must switch off your system. Oh yes, and I am not joking.
  • Boot your system and enter the BIOS setup. Go to the advanced BIOS settings and look for an option called “Display Init First” or something similar. The default is usually PCI or PEG (PCI Express Graphics). Change it to “On-board”, then switch off again.
  • Now physically connect your display to the on-board display output. If you have multiple monitors, switch to the one that is connected to the on-board display. If you have a single display, you can also use an external display switch.
  • Boot the PC and log on to Windows. Install the motherboard drivers and display drivers if you haven’t already.
  • Now you can run Visual Studio and Nsight to debug your program.
  • That’s all, you are done.

At this point your primary display is the on-board graphics, and the PCIe GPU acts as a secondary GPU. When you launch Nsight, it checks for a dual-display setup in the Windows registry. So, simply speaking, a system with a single NVIDIA GPU can be used for debugging.
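To picture what the BIOS switch changes, here is a minimal C++ sketch. It is only a toy model and not how Nsight itself detects the setup: the `DisplayAdapter` struct and the `findDebugAdapter` function are names I made up for illustration. The idea is simply that the NVIDIA GPU becomes usable for debugging once some other adapter is driving the desktop.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one display adapter in the machine.
// None of this comes from Nsight or the CUDA toolkit.
struct DisplayAdapter {
    std::string name;
    bool isNvidia;       // a supported NVIDIA GPU
    bool drivesDesktop;  // currently the primary display output
};

// Returns the index of the first NVIDIA adapter that is NOT driving the
// desktop, or -1 if every NVIDIA adapter is busy with the display.
int findDebugAdapter(const std::vector<DisplayAdapter>& adapters) {
    for (std::size_t i = 0; i < adapters.size(); ++i) {
        if (adapters[i].isNvidia && !adapters[i].drivesDesktop) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Quick sanity checks for the two situations described in the post.
inline void sanityCheckAdapters() {
    // Before the trick: the only NVIDIA GPU also drives the desktop.
    std::vector<DisplayAdapter> before = {{"PCIe GeForce", true, true}};
    assert(findDebugAdapter(before) == -1);

    // After the trick: on-board graphics drives the desktop instead.
    std::vector<DisplayAdapter> after = {{"On-board", false, true},
                                         {"PCIe GeForce", true, false}};
    assert(findDebugAdapter(after) == 1);
}
```

With the on-board output as primary, the second entry is the one left free for the debugger, which is exactly the state the steps above put the machine in.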

There is one more thing I would like to clear up at this stage. NVIDIA states and advertises that Nsight is only compatible with the GeForce 9 Series and above. In reality, Nsight can also be used with a few GeForce 8 Series cards. For this, your card must support at least Compute Capability (CC) 1.1. For example, the GeForce 8400 GS and 8500 GT also work with Nsight. In fact, all cards that support CC 1.1 and above work with Nsight.
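The CC 1.1 rule is easy to express in code. The following is a minimal sketch in plain C++, assuming the 1.1 threshold claimed in this post; `ComputeCapability` and `meetsNsightMinimum` are hypothetical names of my own, not part of any NVIDIA API. In a real CUDA program the two numbers would typically come from the `major` and `minor` fields that `cudaGetDeviceProperties` fills in.

```cpp
#include <cassert>

// Hypothetical holder for a card's compute capability, e.g. {1, 1} for CC 1.1.
struct ComputeCapability {
    int ccMajor;
    int ccMinor;
};

// True when the card is at CC 1.1 or above, the minimum this post
// claims is enough for Nsight (an assumption taken from the post).
constexpr bool meetsNsightMinimum(ComputeCapability cc) {
    return cc.ccMajor > 1 || (cc.ccMajor == 1 && cc.ccMinor >= 1);
}

// Quick sanity checks around the threshold.
inline void sanityCheckCapability() {
    assert(!meetsNsightMinimum(ComputeCapability{1, 0}));  // CC 1.0 is too old
    assert(meetsNsightMinimum(ComputeCapability{1, 1}));   // e.g. 8400 GS, 8500 GT
    assert(meetsNsightMinimum(ComputeCapability{2, 0}));   // newer cards pass too
}
```

Comparing the major number first and the minor number only on a tie avoids the classic mistake of treating the version as a single decimal value.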

Learn more about Nsight at Dr. Dobb’s

Android ADK opens new gates for hardware lovers!!


The new Android 3.1 platform introduces Android Open Accessory support, which allows external USB hardware (an Android USB accessory) to interact with an Android-powered device in a special “accessory” mode. This new feature enables accessory hardware such as robotics controllers; docking stations; diagnostic and musical equipment; kiosks; card readers; digital cameras, keyboards, mice, game controllers and much more to be connected. In “accessory” mode, the connected accessory acts as the USB host (it powers the bus and enumerates devices) and the Android-powered device acts as the USB device. Android USB accessories are specifically designed to attach to Android-powered devices and adhere to a simple protocol (the Android accessory protocol) that allows them to detect Android-powered devices that support accessory mode. Accessories must also provide charging power.

Early Android-powered devices can only act as a USB device and cannot initiate connections with external USB devices. Android Open Accessory support, new in 3.1, overcomes this limitation and lets hardware lovers build accessories that can interact with an assortment of Android-powered devices by allowing the accessory to initiate the connection.

The Android Open Accessory Development Kit (ADK) provides an implementation of an Android USB accessory that is based on the Arduino open source electronics prototyping platform, the accessory’s hardware design files, code that implements the accessory’s firmware, and the Android application that interacts with the accessory. The hardware design files and firmware code are contained in the ADK package download.


ENZO 2011


PathScale® ENZO is a complete GPGPU and multi-core solution, which tightly couples the best programming models with highly optimizing code generation for NVIDIA Tesla. ENZO reflects our dedication to and investment in GPGPU with over a decade of combined engineering time invested so far. By leveraging the HMPP open standard directives, ENZO does optimizations that quickly transform any existing C, C++ or Fortran codebase into highly efficient parallel code for GPU or multi-core systems.

ENZO highlights

  • High performance C, C++, and Fortran EKOPath compilers
  • HMPP C, C++ and Fortran compilers
  • PathScale C++ template and class libraries for GPGPU
  • CUDA compatible compiler
  • PathDB debugger with GPGPU support
  • PathAS assembler with GPGPU support
  • PSCNV open source compute Tesla driver
  • True GPGPU network Zero copy
  • Productivity tools for GPGPU programming
  • Only GPGPU solution for Linux, Solaris and FreeBSD

Register here to know more: http://www.pathscale.com/user