4. Hyena in detail

One-Line Summary

"An FFT-based very long convolution kernel, generated implicitly by a small network, mixes sequences sub-quadratically — catches long-range patterns without attention."

How Does It Work?

Standard 1D convolution: kernel of size K lets each token mix with K neighbors. Hyena's insight: "Make the kernel size the full sequence length N so each token sees all past tokens. But don't learn N parameters directly — have a small network generate them implicitly."

Concretely:

  1. Filter generator: a small MLP + sinusoidal features + exponential decay window takes position t and outputs filter value h(t).
  2. FFT convolution: the long filter and input are FFT'd → elementwise multiply → inverse FFT = convolution in O(N log N).
  3. Gating / multi-order: the full Hyena operator alternates these long convolutions with element-wise multiplicative gating over several orders for expressiveness.
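
Step 1 above can be sketched in a few lines of numpy. This is a minimal illustration, not EulerStack's actual implementation; the feature count, hidden width, and decay rate are assumptions chosen for the demo:

```python
import numpy as np

def sinusoidal_features(N, n_feats=8):
    """Positional features for t = 0..N-1: sin/cos pairs on a geometric frequency ladder."""
    t = np.arange(N)[:, None] / N                       # normalized positions, shape (N, 1)
    freqs = 2.0 ** np.arange(n_feats // 2)[None, :]     # frequencies 1, 2, 4, 8, ...
    return np.concatenate([np.sin(2 * np.pi * freqs * t),
                           np.cos(2 * np.pi * freqs * t)], axis=1)   # (N, n_feats)

def implicit_filter(N, hidden=64, decay_rate=0.02, seed=0):
    """Small MLP maps position features -> filter value h(t), windowed by exponential decay."""
    rng = np.random.default_rng(seed)
    feats = sinusoidal_features(N)                      # (N, 8)
    W1 = rng.normal(size=(feats.shape[1], hidden)) / np.sqrt(feats.shape[1])
    W2 = rng.normal(size=(hidden, 1)) / np.sqrt(hidden)
    h = np.tanh(feats @ W1) @ W2                        # (N, 1): raw filter values
    window = np.exp(-decay_rate * np.arange(N))[:, None]  # exponential decay window
    return (h * window).ravel()                         # filter h(t), length N

h = implicit_filter(1024)  # a length-1024 filter from only ~600 parameters, not 1024 learned taps
```

The point of the sketch: the filter's *length* is the sequence length N, but its *parameter count* (the two MLP weight matrices) is fixed and independent of N.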

Result: full-sequence long-range dependencies without Attention, at O(N log N).
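
Step 2, the O(N log N) FFT convolution, can be sketched as follows (numpy only; zero-padding to 2N turns the FFT's circular convolution into an ordinary causal one):

```python
import numpy as np

def fft_causal_conv(u, h):
    """y[t] = sum_{s <= t} h[s] * u[t - s], computed via FFT in O(N log N)."""
    N = len(u)
    L = 2 * N                          # pad to avoid circular wrap-around
    U = np.fft.rfft(u, n=L)
    H = np.fft.rfft(h, n=L)
    return np.fft.irfft(U * H, n=L)[:N]  # keep the first N samples: the causal part

# Sanity check against the direct O(N^2) convolution.
rng = np.random.default_rng(0)
u, h = rng.normal(size=256), rng.normal(size=256)
assert np.allclose(fft_causal_conv(u, h), np.convolve(u, h)[:256])
```

The two FFTs and the inverse FFT each cost O(N log N), versus O(N²) for sliding a length-N filter directly.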

Strengths

  - Full-sequence receptive field at O(N log N) cost: every token can see all past tokens without Attention's O(N²).
  - Implicit parameterization: the filter spans all N positions, but the learned parameter count stays small and independent of N.
  - Strong on very long, non-text sequences (DNA, audio, sensor data).

Weaknesses

  - Weaker than Attention at exact-match recall and in-context learning, so short chat and coding workloads favor Attention (or hybrids).
  - Stateless: unlike Mamba, there is no compact recurrent state to carry between segments.

Where Does It Shine?

Hyena is strongest on "non-text, extremely long sequences":
  - DNA / genomics and other biological sequences
  - Audio and long sensor streams
  - Very long contexts (≥128K tokens)

In EulerStack, arch_expert_research's Phase 1 uses Hyena alongside Mamba for "bulk token processing" — early layers benefit from capturing broad structural patterns rather than exact matching.

Real-World Use

The best-known deployment is HyenaDNA, a genomic foundation model that scales the Hyena operator to contexts of up to ~1M nucleotides. StripedHyena (Together AI) combines Hyena-style long convolutions with attention in hybrid language models.

When Is Hyena Good?

Scenario                              Hyena quality
DNA / audio / long sensor data        ★★★★★ (original domain)
LLM early layers (bulk processing)    ★★★★ (in a hybrid)
Very long context (≥128K)             ★★★★ (near-linear O(N log N) scaling)
Short chat / ICL-centric              ★★ (Attention wins)
Coding (exact symbol recall)          ★★ (Attention + Mamba better)

EulerStack YAML

layer_templates:
  hyena_layer:
    mixer:
      type: hyena
      hyena:
        depth: 2
        filter_hidden: 64
        filter_decay: 0.0
    ffn:
      type: gated_mlp
      activation: swiglu
    # Note: Hyena is stateless — no state section.

Stage 5 Phase 1 example (mamba + hyena):

layer_schedule:
  - template: mamba_layer
    repeat: 2
  - template: hyena_layer
    repeat: 1
  - template: mamba_layer
    repeat: 2
  - template: hyena_layer
    repeat: 1

Papers

  - Hyena Hierarchy: Towards Larger Convolutional Language Models (Poli et al., ICML 2023)
  - HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution (Nguyen et al., NeurIPS 2023)