Go to any of the CUDA examples:
Then edit the Makefile to change two things: NVCCFLAGS and the GPU architecture codes.
First, NVCCFLAGS: append "-ptx" to the end.
Second, the SMS variable, which enumerates all the different GPU architectures: reduce it to a single GPU architecture (e.g. 30 is used here):
The details are explained here: (https://docs.nvidia.com/cuda/cuda-samples/index.html#getting-cuda-samples)
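The two Makefile edits can also be applied with a small script. The following is only a sketch: it assumes the Makefile defines the flags on a line of the form `NVCCFLAGS := ...` and the architecture list on lines of the form `SMS ?= ...`, and `patch_makefile` is a hypothetical helper name, not part of the CUDA samples.

```python
import re


def patch_makefile(text: str, arch: str = "30") -> str:
    """Return Makefile text with -ptx appended to NVCCFLAGS and SMS pinned to one arch."""
    out = []
    for line in text.splitlines():
        # Assumed form: "NVCCFLAGS   := -m${TARGET_SIZE}"
        if re.match(r"^\s*NVCCFLAGS\s*:=", line):
            if "-ptx" not in line.split():
                line = line.rstrip() + " -ptx"
        # Assumed form: "SMS ?= 30 35 37 ..."
        elif re.match(r"^\s*SMS\s*\?=", line):
            line = re.sub(r"(\?=).*$", r"\1 " + arch, line)
        out.append(line)
    return "\n".join(out) + "\n"
```

To use it, read the sample's Makefile into a string, pass it through `patch_makefile`, and write the result back. Keeping a backup copy of the original Makefile is a good idea, since the assumed line forms may not match every version of the samples.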
Running "make" will then generate:
"/home/tthtlc/cuda-10.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -ptx -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -o dxtc.o -c dxtc.cu
"/home/tthtlc/cuda-10.0"/bin/nvcc -ccbin g++ -m64 -ptx -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -o dxtc dxtc.o
mkdir -p ../../bin/x86_64/linux/release
cp dxtc ../../bin/x86_64/linux/release
Then open the dxtc.o file in a text editor. Despite the .o extension, it consists of all the PTX instructions:
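Because the file carries a .o extension, a quick check that it really holds PTX text can be useful. A PTX module starts with a `.version` directive followed by a `.target` directive, and usually an `.address_size` directive. The sketch below collects those three; `ptx_header` is a hypothetical helper name chosen for illustration.

```python
def ptx_header(text: str) -> dict:
    """Collect the .version, .target and .address_size directives of a PTX module."""
    wanted = (".version", ".target", ".address_size")
    header = {}
    for raw in text.splitlines():
        # Drop trailing // comments and surrounding whitespace.
        line = raw.split("//", 1)[0].strip()
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in wanted:
            # Keep the first occurrence of each directive.
            header.setdefault(parts[0], parts[1].strip())
    return header
```

For instance, `ptx_header(open("dxtc.o", errors="replace").read())` should return a non-empty dictionary if the file is PTX text, and an empty one if it is an ordinary binary object file.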
The latest ISA specification is here:
For example:
PTX has some unique features in its memory model:
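One such feature is that PTX places every variable in a named state space (the ISA defines .reg, .sreg, .const, .global, .local, .param, .shared and .tex), and load/store instructions name the space they access. A load or store with no state space uses generic addressing. As a rough sketch, the following counts `ld`/`st` instructions per state space in a PTX listing; `state_space_usage` is a hypothetical helper, and the line-by-line matching is a simplification that assumes one instruction per line.

```python
import re
from collections import Counter

# State spaces defined by the PTX ISA.
STATE_SPACES = ("reg", "sreg", "const", "global", "local", "param", "shared", "tex")

# Optional predicate guard (e.g. "@%p1" or "@!%p1"), then ld/st and its dotted modifiers.
_LD_ST = re.compile(r"^(?:@!?%\w+\s+)?(ld|st)((?:\.[\w:]+)*)\s")


def state_space_usage(ptx: str) -> Counter:
    """Count ld/st instructions per state space; accesses without one are 'generic'."""
    usage = Counter()
    for raw in ptx.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop comments
        m = _LD_ST.match(line)
        if not m:
            continue
        # ".global.f32" -> ["global", "f32"]; "param::entry" -> "param"
        mods = [mod.split(":")[0] for mod in m.group(2).split(".")[1:]]
        space = next((s for s in mods if s in STATE_SPACES), "generic")
        usage[space] += 1
    return usage
```

Running this over the generated dxtc.o gives a quick picture of how much of the kernel's traffic goes to global, shared, local or parameter memory.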