How to get CUDA Cores count on Linux

In this article you will learn how to get the CUDA core count of an NVIDIA GPU on Linux. As a test subject we will use the NVIDIA GeForce RTX 3080.

In this tutorial you will learn:

  • How to get CUDA Cores count using NVIDIA drivers
  • How to get CUDA Cores count using NVIDIA CUDA toolkit

NVIDIA RTX 3080 CUDA cores count


Software Requirements and Conventions Used

Software Requirements and Linux Command Line Conventions

Category    | Requirements, Conventions or Software Version Used
System      | Installed or upgraded Ubuntu 20.04 Focal Fossa
Software    | N/A
Other       | Privileged access to your Linux system as root or via the sudo command.
Conventions | # – requires given Linux commands to be executed with root privileges, either directly as the root user or by use of the sudo command
            | $ – requires given Linux commands to be executed as a regular non-privileged user

How to get CUDA cores count on Linux using NVIDIA driver

  1. The first step is to install an appropriate driver for your NVIDIA graphics card. To do so, follow one of our NVIDIA driver installation guides.
  2. Once the driver is installed, run the nvidia-settings command with the options shown below. For example, here is the CUDA core count for our NVIDIA RTX 3080 GPU:
    $ nvidia-settings -q CUDACores -t
    8704
    8704
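
When the value is needed inside a script, the query output can be reduced to one line per distinct count. The following is a minimal sketch: the helper name unique_core_counts is hypothetical, and it assumes that nvidia-settings is installed and can reach a running X session.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NVIDIA tool): reads core counts,
# one per line, from stdin and prints each distinct numeric value once.
unique_core_counts() {
    # Keep purely numeric lines only, then sort numerically and drop repeats.
    grep -E '^[0-9]+$' | sort -n -u
}

# Usage, assuming nvidia-settings is installed and an X display is available:
#   nvidia-settings -q CUDACores -t | unique_core_counts
```

Fed the sample output above, the helper prints 8704 once.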
    


How to get CUDA cores count on Linux using NVIDIA CUDA toolkit

    1. Let’s start by installing the NVIDIA CUDA toolkit. Here are CUDA toolkit installation examples for some common 64-bit Linux distributions:
      UBUNTU 20.04:

      $ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
      $ sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
      $ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
      $ sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
      $ sudo apt-get update
      $ sudo apt-get -y install cuda
      

      DEBIAN 10:

      # apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/7fa2af80.pub
      # add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/ /"
      # add-apt-repository contrib
      # apt-get update
      # apt-get -y install cuda
      

      RHEL 8 / CENTOS 8:

      $ sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
      $ sudo dnf clean all
      $ sudo dnf -y module install nvidia-driver:latest-dkms
      $ sudo dnf -y install cuda
      

      For more CUDA installation guides visit CUDA downloads.

    2. Locate the deviceQuery directory that was installed as part of the CUDA toolkit:
      $ locate deviceQuery
      

      The above command should return output similar to the one below:

      $ locate deviceQuery
      /usr/local/cuda-11.4/extras/demo_suite/deviceQuery
      /usr/local/cuda-11.4/samples/1_Utilities/deviceQuery
      /usr/local/cuda-11.4/samples/1_Utilities/deviceQuery/Makefile
      /usr/local/cuda-11.4/samples/1_Utilities/deviceQuery/NsightEclipse.xml
      /usr/local/cuda-11.4/samples/1_Utilities/deviceQuery/deviceQuery.cpp
      ...
      


  3. Compile the deviceQuery source code:
    $ cd /usr/local/cuda-11.4/samples/1_Utilities/deviceQuery
    # make
    
  4. Execute the newly compiled binary to get the CUDA core count for your NVIDIA GPU:
    $ ./deviceQuery 
    ./deviceQuery Starting...
    
     CUDA Device Query (Runtime API) version (CUDART static linking)
    
    Detected 1 CUDA Capable device(s)
    
    Device 0: "NVIDIA GeForce RTX 3080"
      CUDA Driver Version / Runtime Version          11.4 / 11.4
      CUDA Capability Major/Minor version number:    8.6
      Total amount of global memory:                 10015 MBytes (10501423104 bytes)
      (068) Multiprocessors, (128) CUDA Cores/MP:    8704 CUDA Cores
      GPU Max Clock rate:                            1800 MHz (1.80 GHz)
      Memory Clock rate:                             9501 Mhz
      Memory Bus Width:                              320-bit
      L2 Cache Size:                                 5242880 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
      Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       49152 bytes
      Total shared memory per multiprocessor:        102400 bytes
      Total number of registers available per block: 65536
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  1536
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             512 bytes
      Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
      Run time limit on kernels:                     Yes
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      Device supports Unified Addressing (UVA):      Yes
      Device supports Managed Memory:                Yes
      Device supports Compute Preemption:            Yes
      Supports Cooperative Kernel Launch:            Yes
      Supports MultiDevice Co-op Kernel Launch:      Yes
      Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
    
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.4, CUDA Runtime Version = 11.4, NumDevs = 1
    Result = PASS
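
The line of interest in the output above reports 68 multiprocessors with 128 CUDA cores each, and 68 × 128 = 8704. As a rough sketch, the three numbers can be pulled out of the deviceQuery output and cross-checked in a script. The helper name cuda_core_summary is hypothetical, and the field positions assume the exact line layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads deviceQuery output on stdin and, for each
# "(N) Multiprocessors, (M) CUDA Cores/MP:" line, prints
#   <multiprocessors> <cores per MP> <reported total>
# Exits non-zero if no such line is found or if N * M differs from the total.
cuda_core_summary() {
    awk '
        $2 == "Multiprocessors," && $5 == "Cores/MP:" {
            mp = $1;  gsub(/[()]/, "", mp);  sub(/^0+/, "", mp)    # "(068)" -> "68"
            per = $3; gsub(/[()]/, "", per); sub(/^0+/, "", per)   # "(128)" -> "128"
            total = $6
            print mp + 0, per + 0, total + 0
            if ((mp + 0) * (per + 0) != total + 0) bad = 1
            found = 1
        }
        END { exit (bad || !found) }
    '
}

# Usage, run from the directory that contains the compiled binary:
#   ./deviceQuery | cuda_core_summary
```

For the RTX 3080 output shown above, this prints 68 128 8704. Leading zeros are stripped before the arithmetic so that a value such as 068 is never read as an octal number.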
    

