2nd Place - Flash Hogs (Github) Flash-HOG is an optimized kernel for running higher-order gradient (HOG) methods for attention on NVIDIA Blackwell. Their kernel implements the backward pass of attention. Many research-level and current SoTA architectures depend on XLA to generate a kernel for this operation, so having an efficient hand-written kernel opens the approach up to much wider use. They built a fast ...
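To make the problem concrete, here is a minimal JAX sketch (not the team's kernel; the attention function, shapes, and loss are illustrative assumptions) showing where higher-order gradients of attention arise: differentiating through the attention backward pass itself, which is the step XLA would otherwise have to derive and compile on its own.

```python
import jax
import jax.numpy as jnp

def attention(q, k, v):
    # Toy single-head scaled dot-product attention (illustrative shapes only).
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v

def loss(q, k, v):
    # Hypothetical scalar objective so we can take gradients.
    return jnp.sum(attention(q, k, v) ** 2)

key = jax.random.PRNGKey(0)
q = jax.random.normal(jax.random.fold_in(key, 0), (4, 8))
k = jax.random.normal(jax.random.fold_in(key, 1), (4, 8))
v = jax.random.normal(jax.random.fold_in(key, 2), (4, 8))

# First-order gradient w.r.t. q: the standard attention backward pass.
g = jax.grad(loss)(q, k, v)

# Higher-order: differentiate the gradient again (here a Hessian-vector
# product). Without a dedicated kernel, XLA must generate the code for
# this second backward pass through attention itself.
hvp = jax.grad(lambda q_: jnp.sum(jax.grad(loss)(q_, k, v) * g))(q)
print(g.shape, hvp.shape)
```

HOG methods nest `jax.grad` calls like the Hessian-vector product above; a hand-optimized backward kernel replaces the XLA-generated code on that inner differentiation path.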