Guides
Performance
ShaderPad does a lot of work under the hood to make your graphics pipeline performant by default. If you want to fine-tune performance even further, the most valuable changes typically involve reducing texture bandwidth and avoiding unnecessary transfers between the CPU and GPU.
Use The Smallest Format That Works
The default RGBA8 pipeline is fine for many effects, but it is not always the cheapest choice.
Prefer:
R8orR32Ffor single-channel masksRGBA8for normal color workRGBA32Fonly when the effect truly needs float precision- Integer formats only when GLSL logic depends on integer sampling
Smaller formats reduce memory traffic in both ordinary rendering and history buffers.
Channel count matters as much as numeric precision. For example, RGBA8 and R32F both use 32 bits per pixel. If an effect only needs one scalar value, dropping from RGBA to R can have a significant impact.
Keep Chained Passes On The Same WebGL Context
This is one of the biggest ShaderPad-specific wins.
If multiple passes share the same canvas or context, ShaderPad can keep chained textures on the GPU. If the source and destination are on different contexts, it has to read pixels back to the CPU and upload them again.
Prefer this:
const sharedCanvas = new OffscreenCanvas(width, height)
const passA = new ShaderPad(fragmentA, { canvas: sharedCanvas })
const passB = new ShaderPad(fragmentB, { canvas: sharedCanvas })
This is especially important for:
- Multi-pass pipelines
- Webcam or video texture preprocessing
- High-resolution intermediate textures
Use Lower Resolution For Intermediate Passes
Not every pass needs full output resolution.
Downsample when a pass is only used for:
- Blur or bloom inputs
- Masks and segmentation
- Coarse simulation fields
- Preprocessing before a full-resolution composite
const lowResPass = new ShaderPad(fragmentShaderSrc, {
canvas: { width: 512, height: 512 },
})
If your passes already share one WebGL context, resizing that shared canvas between intermediate steps can be much cheaper than splitting the work across two separate contexts. It keeps the data on the GPU instead of forcing readbacks and reuploads. This is not a good fit for effects that rely on history, since those buffers need a stable size.
Reduce History Size
History increases texture bandwidth, so use it mindfully.
- Disable history entirely when the effect or texture does not sample prior frames
- Use
skipHistoryWrite: trueon updates or steps that should not become a new history frame
For instance, let's say you want to store one sample per second for the past 10 seconds. Instead of storing every rendered frame, you can do something like this:
const shader = new ShaderPad(fragmentShaderSrc, {
canvas,
history: 10,
})
let lastStoredSecond = -1
shader.play(time => {
const currentSecond = Math.floor(time)
if (currentSecond === lastStoredSecond) {
return { skipHistoryWrite: true }
}
lastStoredSecond = currentSecond
})
Prefer Partial Updates For Data Textures
If a typed-array texture changes in a small region, update only that region instead of reallocating or reuploading the whole texture.
shader.updateTextures({
u_data: {
data: patch,
width: patchWidth,
height: patchHeight,
x: 0,
y: 0,
isPartial: true,
},
})
Choose Filtering Deliberately
NEAREST is usually the more performant choice, so start there unless you specifically need interpolation.
- Use
NEARESTfor data textures such as masks, landmarks, IDs, integer buffers, and history lookups that should stay discrete - Use
LINEARfor imagery, smooth scaling, blur inputs, or anything that should blend between texels
If a texture encodes values rather than pixels, NEAREST is usually the right answer.
Batched MediaPipe Detection
ShaderPad's MediaPipe plugins are designed to avoid redundant model work in chained renders.
If you attach a MediaPipe plugin to multiple ShaderPad instances, ShaderPad shares a detector as long as these match:
- The plugin type (
face,pose,hands, orsegmenter) - The
textureName - The plugin options that affect detector setup
- The underlying media source object
That means you can run a chained pipeline like this without paying for multiple pose detections per frame:
const camera = document.querySelector('video')!
const sharedCanvas = new OffscreenCanvas(1, 1)
const preprocess = new ShaderPad(preprocessFrag, {
canvas: sharedCanvas,
plugins: [pose({ textureName: 'u_video', options: { maxPoses: 1 } })],
})
const composite = new ShaderPad(compositeFrag, {
canvas: sharedCanvas,
plugins: [pose({ textureName: 'u_video', options: { maxPoses: 1 } })],
textures: { u_scene: preprocess },
})
preprocess.updateTextures({ u_video: camera })
composite.updateTextures({ u_video: camera })
On the first pass, the shared detector runs and caches the result for that source frame. On later passes in the same render chain, the plugin sees the same source and video timestamp, skips a new MediaPipe call, and just publishes the cached landmark or mask textures to each subscriber.
To keep batching effective:
- Reuse the same
HTMLVideoElement,HTMLImageElement,HTMLCanvasElement, orOffscreenCanvas - Keep
textureNameidentical across passes - Keep plugin options aligned across passes
- Avoid creating duplicate plugins with tiny option differences unless you really need separate detectors
This is especially useful when several passes need the same tracking data, such as a mask-generation pass plus a later stylization or composite pass.
Be Deliberate With Plugins
MediaPipe plugins do real work outside the fragment shader. Enable them only when required, and configure the smallest useful outputs:
- Lower history depths when tracking history is short-lived
- Avoid extra model outputs unless the shader uses them
- Update the source texture once and reuse it across passes
Profile The Whole Workflow
With ShaderPad, the slow part is often not the fragment shader itself. It can be:
- Updating a large texture every step
- Keeping unnecessary history buffers alive
- Transferring chained data across contexts
- Storing more channels or precision than the effect needs
The most significant performance optimizations are often structural.