Release Notes
[0.9.0] — 2026-05-01
Added
- New layers: LRN and FakeQuant — Local Response Normalization (LRN(size, alpha, beta, bias)) for AlexNet-style models, and FakeQuant(scale, zero_point, qmin, qmax) for simulated quantization. Both are wired through lexer, parser, IR, shape inference, and codegen.
- Explicit per-side pool padding — MaxPool2D and AvgPool2D now accept a padding: [top, left, bottom, right] parameter for asymmetric padding, propagated through shape inference and codegen.
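
As a rough illustration of the two additions above, the sketch below shows the conventional fake-quantization round trip and the output-size arithmetic for a pool with per-side padding. It assumes nnc follows the standard definitions (quantize, clamp to [qmin, qmax], dequantize; floor-mode pooling). The function names are hypothetical and are not part of nnc.

```python
def fake_quant(x, scale, zero_point, qmin, qmax):
    """Simulated quantization of one float value.

    Assumption: the usual quantize -> clamp -> dequantize round trip.
    Python's round() rounds halves to even, as ONNX QuantizeLinear does.
    """
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))          # clamp to the integer range
    return (q - zero_point) * scale      # back to float


def pool_out_size(in_size, kernel, stride, pad_before, pad_after):
    """Output length along one spatial axis with asymmetric padding.

    Assumption: floor mode. For the height axis pad_before/pad_after are
    top/bottom; for the width axis they are left/right.
    """
    return (in_size + pad_before + pad_after - kernel) // stride + 1


if __name__ == "__main__":
    print(fake_quant(0.26, 0.1, 0, -128, 127))
    print(pool_out_size(7, kernel=3, stride=2, pad_before=0, pad_after=1))
```

Values that fall outside the representable range saturate at qmin or qmax before being mapped back to float, which is what makes the layer useful for simulating int8 behavior in a float pipeline.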
Changed
- ONNX import: quantized CNN support — nnc import now maps LRN ONNX nodes, fuses Gemm/MatMul → (Quantize/Dequantize) → Add bias chains, lowers QuantizeLinear nodes into FakeQuant layers, and recognizes asymmetric pads attributes for Conv, MaxPool, and AveragePool shape inference.
- ONNX import: tensor data decoding — initializers stored in int32_data/int64_data (instead of raw_data) are now decoded correctly, fixing imports for many torch-exported quantized models.
Fixed
- ONNX Reshape → Flatten lowering — Reshape to a [1, N] target shape is now lowered to Flatten even when the input rank is < 3, matching how PyTorch exporters serialize the post-conv flatten.
- ONNX DequantizeLinear with missing zero-point — empty zero-point initializers are now treated as zeros instead of failing to dequantize.
[0.8.0] — 2026-04-30
Added
- nnc new project scaffolding — generate a starter host-language project around a sample NNL model. Supports --project rust, go, cpp, and python. The scaffold includes a sample model.nnl (configured with io: "none"), host-language boilerplate wired to the generated C ABI, a build script or build file for compiling the model artifact, and a README with run instructions.
Changed
- Improved missing-weight diagnostics (E003) — nnc compile now produces a structured, actionable error when required weights are missing. Errors list every missing tensor with its expected shape, identify whether the source is a directory of .npy files, an .npz archive, or another path, and include a hint: to run nnc inspect <model> to view expected tensors and shapes. All missing weights are reported in a single error instead of stopping at the first one.
[0.7.0] — 2026-04-26
Added
- Compile-time memory check with optional memory_limit config — nnc now computes total static memory (weights + workspace) and emits a W003 warning when it exceeds 256 MB. Add memory_limit: "128MB" to the config block to turn this into a hard compile error (E009). nnc inspect now shows a “Total memory” line. Accepted units: KB, MB, GB.
- io: "none" config option — skips main() generation, producing a pure library artifact. Use with --emit lib, --emit shared, or --emit obj for embedding models in host applications. io: "none" with --emit exe produces a clear compile error.
- Integration examples — new examples/integration/ directory with documented examples showing how to call an NNL-compiled model from C++, Rust, Go, and Python, using static/shared library linking and FFI.
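
The memory check described above can be pictured with the following minimal sketch. It is not nnc's implementation. It assumes binary units (1 MB = 1024 × 1024 bytes) and assumes that a configured memory_limit replaces the default 256 MB warning threshold; neither detail is stated in these notes. The helper names are hypothetical.

```python
# Assumption: binary multiples. The release notes only list the unit names.
UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
DEFAULT_WARN_BYTES = 256 * UNITS["MB"]


def parse_size(text):
    """Parse a string such as "128MB" into a byte count."""
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    raise ValueError(f"unsupported size: {text!r} (expected KB, MB, or GB)")


def check_memory(weights_bytes, workspace_bytes, memory_limit=None):
    """Return "E009", "W003", or None for a model's static memory total."""
    total = weights_bytes + workspace_bytes
    if memory_limit is not None:
        # Hard error once the user-supplied limit is exceeded.
        return "E009" if total > parse_size(memory_limit) else None
    # No limit configured: only warn above the default threshold.
    return "W003" if total > DEFAULT_WARN_BYTES else None


if __name__ == "__main__":
    mb = UNITS["MB"]
    print(check_memory(100 * mb, 50 * mb, memory_limit="128MB"))
    print(check_memory(200 * mb, 100 * mb))
```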
[0.6.0] — 2026-04-23
Added
- New layers: Hardswish, Upsample, Conv1D, MaxPool1D, LayerNorm — five new layer types across all pipeline stages (lexer, parser, IR, shape inference, codegen, ONNX import), completing the Tier 4 roadmap from the ONNX spec.
- Hardswish activation — Hardswish(x) = x * min(max(0, x+3), 6) / 6, unlocks MobileNetV3. ONNX HardSwish op imported automatically.
- Upsample layer — Upsample(scale: N) with nearest-neighbor interpolation for spatial upsampling. ONNX Upsample and Resize ops imported automatically. Unlocks YOLO-Tiny, U-Net, and encoder-decoder models.
- Conv1D layer — 1D convolution with filters, kernel, stride, padding parameters. ONNX Conv ops with 3D weight tensors auto-detected as Conv1D. Enables audio, time-series, and keyword spotting models.
- MaxPool1D layer — 1D max pooling with kernel and optional stride. ONNX MaxPool ops with 1D kernel_shape auto-detected. Enables audio and time-series models.
- LayerNorm layer — Layer normalization with learnable scale and bias over the last dimension, with configurable epsilon. ONNX LayerNormalization op imported with epsilon and weights. Enables transformer-adjacent models.
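
For reference, two of the layers above can be written out as a small sketch. hardswish follows the formula given in the Hardswish entry. layer_norm is an illustrative reading of "normalization over the last dimension" for a single vector, and its default epsilon of 1e-5 is an assumption (it matches the ONNX default), not a documented nnc default.

```python
import math


def hardswish(x):
    """Hardswish(x) = x * min(max(0, x + 3), 6) / 6, as stated in the notes."""
    return x * min(max(0.0, x + 3.0), 6.0) / 6.0


def layer_norm(x, scale, bias, epsilon=1e-5):
    """Normalize one vector (the last dimension), then apply scale and bias.

    Assumption: population variance and epsilon added inside the square
    root, as in the ONNX LayerNormalization definition.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + epsilon)
    return [(v - mean) * inv_std * s + b for v, s, b in zip(x, scale, bias)]


if __name__ == "__main__":
    print([hardswish(v) for v in (-4.0, 0.0, 1.0, 3.0, 5.0)])
    print(layer_norm([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))
```

Hardswish is zero for inputs at or below -3 and is the identity for inputs at or above 3; in between it is a smooth quadratic ramp.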
[0.5.0] — 2026-04-23
Added
- New layers: GlobalAvgPool2D, ReLU6, LeakyReLU, SiLU, Mul — five new layer types across all pipeline stages (lexer, parser, IR, shape inference, codegen, ONNX import), unlocking ResNet-18, MobileNetV1/V2, and EfficientNet model families.
- Grouped / depthwise Conv2D — Conv2D now accepts a groups parameter (default 1) for grouped convolution, including depthwise separable convolution (groups == in_channels). ONNX Conv group attribute is imported automatically.
- ONNX external tensor data support — nnc import can now load weights stored as external data files (ONNX data_location = EXTERNAL) with offset/length support, fixing import failures for models exported with torch.onnx.export(..., use_external_data_format=True).
Fixed
- CHW→HWC weight permutation at Flatten→Dense boundary — nnc import now automatically detects the Flatten→Gemm pattern in ONNX graphs and permutes Dense weight matrix rows from PyTorch’s CHW flatten order to nnc’s HWC order, fixing incorrect inference results for all imported CNNs with Flatten→Dense transitions.
- ONNX import empty tensor error — nnc import now produces a clear error message ("tensor '...' has no data") instead of a cryptic npy shape mismatch when tensor data is missing.
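
The row permutation in the CHW→HWC fix can be illustrated as follows. This is a minimal sketch, assuming one weight row per flattened input feature; the function name is hypothetical and the code is not taken from nnc.

```python
def permute_rows_chw_to_hwc(rows, channels, height, width):
    """Reorder Dense weight rows from CHW flatten order to HWC flatten order.

    rows[i] is the weight row for flattened input feature i, where i was
    computed in CHW order. The returned list holds the same rows indexed
    in HWC order.
    """
    if len(rows) != channels * height * width:
        raise ValueError("row count does not match C*H*W")
    out = [None] * len(rows)
    for c in range(channels):
        for h in range(height):
            for w in range(width):
                chw = (c * height + h) * width + w      # source index
                hwc = (h * width + w) * channels + c    # destination index
                out[hwc] = rows[chw]
    return out


if __name__ == "__main__":
    # Two channels, one row, two columns; labels stand in for weight rows.
    rows = ["c0w0", "c0w1", "c1w0", "c1w1"]
    print(permute_rows_chw_to_hwc(rows, channels=2, height=1, width=2))
```

Without this reordering, a Dense layer that follows Flatten would pair each weight row with the wrong activation whenever the feature map has more than one channel and more than one spatial position.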
[0.4.0] — 2026-04-23
Added
- --version / -V flag — nnc --version now prints the version from Cargo.toml.
- --emit c flag — nnc compile model.nnl --emit c writes the generated .c and .h files directly without invoking the C compiler, useful for debugging and auditing generated code.
Fixed
- Concat codegen for multi-dimensional tensors — fixed incorrect flat memcpy in Concat codegen that produced wrong results when concatenating 3D (HWC) tensors along the channel axis. Now generates proper strided copies for arbitrary concat axes.
- ONNX import protobuf decode failure — fixed incorrect field tag numbers in AttributeProto that caused all ONNX imports to fail with a protobuf wire type error. Added missing floats field (tag 7).
- Unsupported precision silently accepted — precision: "int8" and precision: "float64" now produce a compile error instead of silently generating incorrect float32 code.
- Website hero demo — the output example now shows the realistic workflow (raw bytes piped through Python) instead of implying the binary outputs formatted text.
- Website copyright year — updated from © 2024 to © 2024–2025.
- README DESIGN.md link — corrected broken link to point to docs/src/DESIGN.md.
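
To show why a flat copy is wrong for the Concat case above, here is a small sketch of a strided concatenation over flat row-major buffers. It illustrates the general technique only, not the C code that nnc generates; concat_flat is a hypothetical name.

```python
from math import prod


def concat_flat(tensors, shapes, axis):
    """Concatenate row-major flat buffers along `axis`.

    For each index over the dimensions before `axis`, one contiguous chunk
    is copied from every input in turn. Appending whole buffers one after
    another is only correct when axis == 0.
    """
    outer = prod(shapes[0][:axis])       # number of interleaved chunk groups
    out = []
    for o in range(outer):
        for data, shape in zip(tensors, shapes):
            chunk = prod(shape[axis:])   # contiguous elements per group
            out.extend(data[o * chunk:(o + 1) * chunk])
    return out


if __name__ == "__main__":
    a = [1, 2, 3, 4]      # HWC shape (1, 2, 2): pixels [1, 2] and [3, 4]
    b = [9, 8]            # HWC shape (1, 2, 1): pixels [9] and [8]
    print(concat_flat([a, b], [(1, 2, 2), (1, 2, 1)], axis=2))
```

In the example, a flat copy would produce 1, 2, 3, 4, 9, 8, which places both channels of b after the last pixel of a instead of next to the pixel they belong to.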
[0.3.0] — 2025-04-23
Fixed
- Conv2D rectangular kernel correctness — fixed a bug where non-square kernels (e.g., kernel: [3, 5]) produced incorrect inference results due to a variable shadowing issue in the generated C code. Square kernels were unaffected. The same shadowing fix was applied to MaxPool2D and AvgPool2D codegen for consistency.
[0.2.0] — 2025-04-20
Initial public release.
Added
- NNLang DSL with version 0.2 syntax for defining neural network models
- Layers: Input, Dense, Conv2D, MaxPool2D, AvgPool2D, Flatten, BatchNorm, Dropout, Add, Concat, ReLU, Sigmoid, Softmax
- C code generation backend with static memory allocation (no heap, no runtime dependencies)
- Output formats: exe, obj, lib, shared, header
- Cross-compilation via --target-triple flag
- SIMD target hints: generic, avx2, avx512, arm_neon
- Weight loading from .npy files and .npz archives
- ONNX model import via nnc import
- nnc inspect command for model summary and shape information
- nnc test command for verifying inference correctness against expected outputs
- Explicit graph connections with connections { } block and skip connections
- Liveness-based buffer reuse for minimal activation memory footprint
- mdbook documentation site
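
The idea behind liveness-based buffer reuse can be sketched as below. This is a generic greedy scheme written for illustration; the notes do not describe nnc's actual allocator, and the function name and tuple layout are assumptions.

```python
def assign_buffers(lifetimes):
    """Greedy buffer sharing based on activation lifetimes.

    lifetimes: list of (name, first_use, last_use, size_bytes), where the
    use positions are layer indices in execution order. An activation may
    take over a slot once the slot's previous tenant is no longer live.
    Returns (name -> slot index, total bytes across all slots).
    """
    slots = []          # each slot is [last_use_of_current_tenant, size]
    assignment = {}
    for name, first, last, size in sorted(lifetimes, key=lambda t: t[1]):
        for idx, slot in enumerate(slots):
            if slot[0] < first:                 # previous tenant is dead
                slot[0] = last
                slot[1] = max(slot[1], size)    # slot grows to largest tenant
                assignment[name] = idx
                break
        else:
            slots.append([last, size])          # nothing free: open a slot
            assignment[name] = len(slots) - 1
    return assignment, sum(slot[1] for slot in slots)


if __name__ == "__main__":
    # A three-layer chain: each output is read only by the next layer.
    chain = [("a", 0, 1, 100), ("b", 1, 2, 50), ("c", 2, 3, 100)]
    print(assign_buffers(chain))
```

For the chain in the example, a and c share one slot because a is dead by the time c is produced, so the footprint is 150 bytes instead of the 250 bytes that separate buffers would need.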