Rohan Reddy is a software engineer based in New York City.
Iterating on General Matrix Multiplication in CUDA for optimal performance on NVIDIA GPUs