Keynote talk delivered by Didem Unat at Intl Conference on Performance Engineering 2026

2 Updated May 7, 2026

KaMPIng: (Near) zero-overhead MPI wrapper for modern C++

C++ 69 7 Updated Apr 27, 2026

Simple message passing library

Cuda 30 7 Updated Aug 28, 2018

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 762 59 Updated Aug 6, 2025

Official inference framework for 1-bit LLMs

Python 38,975 3,554 Updated Mar 10, 2026

PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for evaluation of training and inference platforms.

Python 155 67 Updated May 6, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,216 375 Updated Apr 20, 2026

Blue Gene/Q driver, see https://repo.anl-external.org/repos/bgq-driver/

C++ 5 2 Updated Jan 8, 2014

🍦 Never use cout/printf to debug again

C++ 741 38 Updated Apr 22, 2026

A Fortran linter, written in Rust and installable with Python.

Rust 196 25 Updated May 13, 2026

Parallel Computing -- Validation Suite: Validation engine for Exascale project benchmarks

C 16 2 Updated Mar 26, 2026

Fortran bindings generated using Coccinelle

C++ 2 Updated Feb 2, 2025

Modern C++ Programming Course (C++03/11/14/17/20/23/26)

HTML 15,585 1,096 Updated Apr 19, 2026

Test if AVX vector loads and stores are atomic

C++ 35 5 Updated Jul 9, 2020

A place to store information for the tensor discussions and possible specifications.

C 24 6 Updated Jul 2, 2025

Do a simple closed-shell Hartree-Fock calculation using McMurchie-Davidson to compute integrals

Python 89 18 Updated Jun 8, 2024

Data structures, algorithms, and C++ reference library

252 35 Updated Apr 18, 2026

LLM inference in Fortran

Fortran 63 9 Updated May 30, 2024

Efficient numerical computation of the Pfaffian for dense and banded skew-symmetric matrices

Python 2 1 Updated Mar 18, 2026
Julia 8 Updated Dec 9, 2024

A framework and suite of cases for testing a Fortran compiler

Python 11 3 Updated Nov 14, 2024

Fast CUDA matrix multiplication from scratch

Cuda 1,181 183 Updated Sep 2, 2025

Flexible and performant GEMM kernels in Julia

Julia 84 12 Updated May 14, 2026

Geometry optimization code that includes the TRIC coordinate system

Python 211 77 Updated Apr 20, 2026

optking: A molecular geometry optimization program

Python 27 14 Updated Apr 17, 2026

Molecular structure optimizer

Python 130 26 Updated Dec 17, 2022

Repository to host supporting information and code samples for Accelerated DFT

Jupyter Notebook 38 2 Updated Apr 29, 2025

LLM training in simple, raw C/CUDA

Cuda 29,901 3,589 Updated Jun 26, 2025