Updates in 2025.2
General
Added support for collecting C2C link information on Blackwell GPUs.
CPU call stack filtering now supports Python call stacks.
Instruction statistics now show warp- and thread-level instruction counts per opcode category. Added new metrics sass__inst_executed_per_opcode_category and sass__thread_inst_executed_per_opcode_category. See the Metrics Reference for details.
Enhanced several rules to produce tables pointing to the source location of interest.
Improved the NvRules API to support generic tables for the UI and CLI.
Improved the NvRules and Python Report Interface documentation to be more Pythonic.
Added APIs to the Python Report Interface for querying rules and source markers in the report.
Added Occupancy Calculator Python Interface, which provides a Python-based interface for performing occupancy calculations and analysis of kernels on NVIDIA GPUs.
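For readers unfamiliar with what an occupancy calculation involves, the following is a minimal, self-contained sketch of the underlying arithmetic. It does not use the Occupancy Calculator Python Interface and makes no claim about that interface's API. The names SmLimits and theoretical_occupancy are hypothetical, the default per-SM limits are illustrative placeholders rather than values for a specific GPU, and register and shared memory allocation granularity is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmLimits:
    """Per-SM resource limits (hypothetical placeholders, not a specific GPU)."""
    max_warps: int = 64
    max_blocks: int = 32
    registers: int = 65536
    shared_mem_bytes: int = 98304
    warp_size: int = 32


def theoretical_occupancy(threads_per_block, regs_per_thread, smem_per_block,
                          limits=SmLimits()):
    """Return (active_blocks_per_sm, occupancy) for one launch configuration.

    Simplification: allocation granularity for registers and shared memory
    is ignored, so real tools may report lower values.
    """
    if threads_per_block <= 0:
        raise ValueError("threads_per_block must be positive")

    # Warps needed by one block (ceiling division).
    warps_per_block = -(-threads_per_block // limits.warp_size)

    # Each resource independently caps how many blocks fit on one SM.
    by_warps = limits.max_warps // warps_per_block
    regs_per_block = regs_per_thread * warps_per_block * limits.warp_size
    by_regs = (limits.registers // regs_per_block
               if regs_per_block else limits.max_blocks)
    by_smem = (limits.shared_mem_bytes // smem_per_block
               if smem_per_block else limits.max_blocks)

    # The tightest limit determines the number of resident blocks.
    blocks = min(limits.max_blocks, by_warps, by_regs, by_smem)
    occupancy = blocks * warps_per_block / limits.max_warps
    return blocks, occupancy


if __name__ == "__main__":
    for config in [(256, 32, 0), (256, 64, 0), (128, 32, 49152)]:
        blocks, occ = theoretical_occupancy(*config)
        print(config, "->", blocks, "blocks/SM,", f"{occ:.1%} occupancy")
```

With these placeholder limits, doubling the registers per thread from 32 to 64 for a 256-thread block halves the number of resident blocks. This kind of trade-off is what an occupancy calculator is meant to expose.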
NVIDIA Nsight Compute
Added product-wide search functionality via a new search bar and tool window.
The Source page now shows scoreboard dependencies in SASS.
Converted more tooltips into interactive tooltips. Interactive tooltips can now be pinned and dragged.
Added source correlation navigation controls which allow navigation to the previous or next block of correlated lines.
NVIDIA Nsight Compute CLI
Added support for profiling MPS applications.
Added support for filtering kernels based on a renamed kernels configuration during profiling.
Resolved Issues
CUDA Graphs in the Resources View now use the current UI theme.
Resolved several issues when interacting with timelines on the Details page.
Resolved issues with Python syntax highlighting on the Source page.
Disabled deprecated columns in the API Stream tool window.
Fixed that the Source page may show incorrect correlation when some source files were not resolved.
Reduced the number of replay passes required for collecting the PmSampling.section on GH100 with applicable drivers.
Resolved that --native-include did not work properly when using range replay and cu(da)ProfilerStop.
Fixed an Invalid or unsupported charset:ANSI_X3.4-1968 error when using the CLI on some systems.
Fixed that memory available for saving context state during replay may be computed incorrectly when the app was using managed memory.
Fixed that some metrics were not listed for collection in section files for GB20x GPUs.