The implementation guide provides reference code for the 64- and 128-bit block-size variants of Simon and Speck, covering key scheduling, encryption, and decryption. It also describes the intended word ordering, gives detailed test vectors, and offers tips for improving performance on some ARM and x86 processors.
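As a rough illustration of what the key scheduling, encryption, decryption, and word ordering look like in practice, here is a minimal Python sketch of Speck128/128. It is not the guide's reference code: the function names (`key_schedule`, `encrypt`, `decrypt`, `words_from_bytes`) are hypothetical, and it assumes the commonly published conventions of little-endian byte-to-word conversion, 32 rounds, and rotation amounts of 8 and 3. It is unoptimized and not constant-time, so treat it as a reading aid only.

```python
import struct

MASK = (1 << 64) - 1   # all word arithmetic is modulo 2**64
ROUNDS = 32            # assumed round count for Speck128/128


def ror(x, r):
    """Rotate a 64-bit word right by r bits."""
    return ((x >> r) | (x << (64 - r))) & MASK


def rol(x, r):
    """Rotate a 64-bit word left by r bits."""
    return ((x << r) | (x >> (64 - r))) & MASK


def round_enc(x, y, k):
    """One Speck round: x = (ROR(x, 8) + y) ^ k, then y = ROL(y, 3) ^ x."""
    x = ((ror(x, 8) + y) & MASK) ^ k
    y = rol(y, 3) ^ x
    return x, y


def round_dec(x, y, k):
    """Inverse of round_enc."""
    y = ror(x ^ y, 3)
    x = rol(((x ^ k) - y) & MASK, 8)
    return x, y


def words_from_bytes(data):
    """Split 16 bytes into two 64-bit words (word 0 first), little-endian."""
    return struct.unpack("<2Q", data)


def key_schedule(key_words):
    """Expand key words (K[0], K[1]) into ROUNDS round keys.

    The round function is reused with the round index in place of a key.
    """
    a, b = key_words
    round_keys = []
    for i in range(ROUNDS):
        round_keys.append(a)
        b, a = round_enc(b, a, i)
    return round_keys


def encrypt(pt_words, round_keys):
    """Encrypt a block given as (word 0, word 1); returns the same layout."""
    y, x = pt_words
    for k in round_keys:
        x, y = round_enc(x, y, k)
    return y, x


def decrypt(ct_words, round_keys):
    """Decrypt a block given as (word 0, word 1); returns the same layout."""
    y, x = ct_words
    for k in reversed(round_keys):
        x, y = round_dec(x, y, k)
    return y, x
```

With the key bytes `00 01 ... 0f` and the 16-byte plaintext `b" made it equival"`, this sketch is intended to reproduce the published Speck128/128 test vector (ciphertext words `a65d985179783265 7860fedf5c570d18`, written high word first):

```python
rk = key_schedule(words_from_bytes(bytes(range(16))))
ct = encrypt(words_from_bytes(b" made it equival"), rk)
print(" ".join(f"{w:016x}" for w in reversed(ct)))
```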
The simon-speck-supercop repository includes high-performance x86 and ARM implementations that use the SSE4.2, AVX2, and NEON instruction sets. Although the implementations are structured for the SUPERCOP benchmarking toolkit, they should be adaptable to other systems.
As an alternative to downloading the full SUPERCOP toolkit, the Simon and Speck code alone is available in .tar.xz (17 KB), .tar.gz (50 KB), and .zip (167 KB) formats.
The University of Luxembourg's Fair Evaluation of Lightweight Cryptographic Systems (FELICS) project includes our contributions: a range of small and fast implementations of Simon and Speck for 8-bit AVR, 16-bit MSP430, and 32-bit Cortex-M microcontrollers.
We also have unreleased implementations of Simon and Speck for ASICs and FPGAs, which are documented in our papers. We are glad to answer questions about implementing the algorithms on ASICs and FPGAs.