Releases · replicate/cog

v0.21.0

New features

Server-Sent Event prediction streams. HTTP prediction requests can now ask for Accept: text/event-stream to receive start, output, log, metric, and terminal completed events for predictors that explicitly opt in with @streaming / @cog.streaming. Reconnecting clients can replay retained in-flight prediction events with PUT /predictions/{id}. (#3019)
JSON-native union inputs. Cog now supports union-typed predictor inputs in generated schemas and HTTP request handling, allowing compatible JSON-native values to be resolved correctly. (#3048)
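
To make the streaming item above concrete, here is a minimal sketch of how a client might parse a text/event-stream body. The event names (start, output, completed) come from the note above; the helper name parse_sse and the JSON payloads in the sample are assumptions for illustration, not Cog's documented wire format.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from the lines of a text/event-stream body."""
    event, data = "message", []  # type: str, List[str]
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events with no data are skipped.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream: the payload shapes are invented for this example.
SAMPLE = """event: start
data: {"id": "abc"}

event: output
data: {"chunk": "hello"}

: keep-alive

event: completed
data: {"status": "succeeded"}

"""

events = [(name, json.loads(body)) for name, body in parse_sse(SAMPLE.splitlines())]
```

A real client would feed this parser from the lines of an HTTP response requested with Accept: text/event-stream and stop once it sees the terminal completed event.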

Improvements

Example models now live in the Cog repository. The example models from replicate/cog-examples have moved into this repository under examples/, with updated run.py / BaseRunner examples for common workflows including images, training, notebooks, streaming, concurrency, context, and Replicate API usage. (#3055)
Experimental warning for cog weights. Every cog weights subcommand now prints a warning that the weights workflow is experimental and should not be relied on in production workflows yet. (#3025)
Static schema parser target resolution. The static Python schema parser now uses run() as the primary prediction entry point while preserving legacy predict() fallback behavior, and resolves inherited and imported targets more consistently. (#3027)

Bug fixes

cog.Secret inputs now work with coglet-backed predictions. Predictors that annotate inputs as Secret, Optional[Secret], or Secret | None now receive cog.types.Secret values at runtime again, restoring behavior that regressed in the Rust/coglet runtime rewrite. (#3057)
cog doctor --fix now shows available remediation text. Findings without an auto-fix now display their remediation message instead of incorrectly saying no auto-fix is available. (#3031)
Cog-managed weight uploads now send the correct layer media type. Weight layer uploads now propagate the Cog weight media type during registry finalization while preserving regular image-layer behavior. (#3033)
Predictor validation now handles PEP 563 string annotations. In predictor files using from __future__ import annotations, None annotations are stored as strings, which previously caused validation to reject valid setup() -> None methods and accept invalid run() -> None methods. Both cases are now validated correctly. (#3034)
cog serve now mounts weights like cog run. Models served locally now receive configured weights consistently with local prediction runs. (#3044)
Omitted optional inputs now resolve to None at the HTTP edge. Optional predictor inputs omitted from HTTP requests are now passed as None instead of being treated inconsistently by request handling. (#3051)
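
The PEP 563 fix above is easier to see with a small example. The sketch below writes the annotations as string literals, which is how they are stored under from __future__ import annotations, and shows why a check against the raw annotation misfires. The Predictor class and the returns_none helper are hypothetical and are not Cog's validation code.

```python
import inspect
import typing


class Predictor:
    # String annotations, as PEP 563 would store them.
    def setup(self) -> "None":
        pass

    def run(self, prompt: "str") -> "str":
        return prompt.upper()


def returns_none(fn) -> bool:
    """True if the resolved return annotation is None."""
    hints = typing.get_type_hints(fn)  # evaluates string annotations
    return hints.get("return") is type(None)


# The raw annotation is the string "None", not the None object,
# so a naive `annotation is None` check gives the wrong answer.
raw = inspect.signature(Predictor.setup).return_annotation
naive_check = raw is None
resolved_check = returns_none(Predictor.setup)
```

Resolving annotations before comparing them gives the same result whether or not the predictor file uses the future import.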

v0.21.0-rc.3

Changelog

d6c2b70 Bump version to 0.21.0-rc.3 (#3050)
63c65df feat: support JSON-native union inputs (#3048)

v0.21.0-rc.2

Changelog

a1b710a Bump version to 0.21.0-rc.2 (#3045)
9b9f310 chore: regen lockfile
8428257 ci: pin mise version in release workflows (#3046)
9075ab5 fix: mount weights in cog serve like cog run does (#3044)

v0.21.0-rc.1

New features

Server-Sent Event prediction streams. HTTP prediction requests can now ask for Accept: text/event-stream to receive start, output, log, metric, and terminal completed events for predictors that explicitly opt in with @streaming / @cog.streaming. Reconnecting clients can replay retained in-flight prediction events with PUT /predictions/{id}. (#3019)

Improvements

Experimental warning for cog weights. Every cog weights subcommand now prints a warning that the weights workflow is experimental and should not be relied on in production workflows yet. (#3025)
Static schema parser target resolution. The static Python schema parser now uses run() as the primary prediction entry point while preserving legacy predict() fallback behavior, and resolves inherited and imported targets more consistently. (#3027)

Bug fixes

cog doctor --fix now shows available remediation text. Findings without an auto-fix now display their remediation message instead of incorrectly saying no auto-fix is available. (#3031)
Cog-managed weight uploads now send the correct layer media type. Weight layer uploads now propagate the Cog weight media type during registry finalization while preserving regular image-layer behavior. (#3033)
Predictor validation now handles PEP 563 string annotations. In predictor files using from __future__ import annotations, None annotations are stored as strings, which previously caused validation to reject valid setup() -> None methods and accept invalid run() -> None methods. Both cases are now validated correctly. (#3034)

v0.20.0

New features

cog run command. The cog predict command has been renamed to cog run with full backward compatibility. cog predict still works as an alias. (#3015)
Model refs for cog push and weights commands. You can now reference models by name (e.g., r8.im/user/model) instead of full image URLs when pushing or managing weights. (#3018)
Multi-source weights and HTTPS weight sources. Weights can now be fetched from multiple sources, including direct HTTPS URLs. (#3008)
Opaque annotations in schema generation. Predictor inputs can use the new Opaque annotation to exclude fields from the generated schema. (#3001)

Improvements

Runtime schema generation removed. The legacy runtime Python schema generation path has been removed. Cog now uses static schema generation exclusively, which makes builds faster and more reliable. (#3003)
Centralized build state and cleaner Docker context. Cog now stores build state in a dedicated .cog/ directory that is automatically filtered from the Docker build context. (#3000)
Support durable base-image build context. Base image builds now support a durable build context for better caching and reliability. (#3004)
Support uv-managed Python installs in generated Dockerfiles. Generated Dockerfiles now properly handle Python installations managed by uv. (#2999)

Bug fixes

Pushing a model with a version tag now emits a clean URL. The Replicate model URL printed after cog push no longer includes the image tag (e.g., :latest), preventing 404 errors when users click the link. (#3020)
Prefer latest torch patch when resolving unpatched versions. When resolving PyTorch compatibility, Cog now correctly selects the latest available patch version for unpatched version specs. (#3009)
Deterministic compatibility matrix output. Compatibility matrices are now sorted for consistent, deterministic output. (#3006)
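
As a rough illustration of the clean-URL fix above, the following sketch drops a trailing tag from an image reference. The function name strip_tag is hypothetical, and this is not Cog's implementation, only one way to remove the tag without disturbing a registry port.

```python
def strip_tag(image_ref: str) -> str:
    """Drop a trailing :tag from an image reference, keeping any registry port."""
    # Only the last path component can carry a tag; a colon earlier in the
    # reference belongs to the registry host (for example localhost:5000).
    head, sep, last = image_ref.rpartition("/")
    name, _, _tag = last.partition(":")
    return f"{head}{sep}{name}"


clean = strip_tag("r8.im/user/model:latest")
```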

v0.19.3

Changelog

0780450 Bump version to 0.19.3 (#2995)
cac3e91 feat: experimental managed weights (#2974)

v0.19.2

Changelog

427d486 Bump version to 0.19.2 (#2984)
d9c2b14 fix: cap fuzz test input size to prevent timeout (#2981)
4269575 fix: support typing_extensions TypedDict at runtime (#2983)

v0.19.1

Bug fixes

Support for TypedDict in schema generation. Fixed an issue where TypedDict type annotations would cause schema generation to fail. (#2978)
Build order fix for resource exhaustion. Reordered coglet wheel build to run after cog/sdk builds to prevent resource exhaustion during release builds. (#2977)

Maintenance

Removed dead Go code. Cleaned up unused code identified by deadcode analysis. (#2979)
Removed accidentally committed folder. Deleted a folder that shouldn't have been in the repository. (#2972)
CI lockfile improvements. Switched to strict lockfile mode and regenerated mise.lock. (#2975)

Dependencies

Updated golang.org/x/crypto from 0.49.0 to 0.50.0 (#2965)
Updated golang.org/x/term from 0.41.0 to 0.42.0 (#2933)
Updated google.golang.org/grpc from 1.79.3 to 1.80.0 (#2967)
Updated rand from 0.8.5 to 0.8.6 in /crates (#2958)

v0.19.0

New features

cog doctor command. Diagnose common Cog setup issues, check configuration, and verify that everything is working correctly. Run cog doctor to validate your environment. (#2923)

Improvements

Static schema generation is now the default. Cog now generates prediction schemas statically from your predictor's type annotations rather than importing and inspecting the Python code at build time. This makes builds faster and more reliable. Use COG_LEGACY_SCHEMA=1 to opt out if you encounter issues. (#2950)
Test harness improvements. The Cog integration test harness now runs tests in parallel and provides better error reporting. (#2944)
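
To show what "statically" means in the schema item above, here is a minimal sketch that reads input annotations from predictor source using Python's ast module, without importing or executing the code. It is an illustration of the general idea only; the function name static_inputs is hypothetical, and Cog's actual parser is not shown here.

```python
import ast
from typing import Dict

SOURCE = '''
class Predictor:
    def predict(self, prompt: str, steps: int = 20) -> str:
        return prompt
'''


def static_inputs(source: str, method: str = "predict") -> Dict[str, str]:
    """Map input names to their annotation text without importing the code."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == method:
            return {
                arg.arg: ast.unparse(arg.annotation)
                for arg in node.args.args
                if arg.annotation is not None  # skips the unannotated `self`
            }
    raise LookupError(f"no {method}() found")


inputs = static_inputs(SOURCE)
```

Because nothing is imported, heavy dependencies in the predictor module do not need to be installed or loaded just to read the signature.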

Bug fixes

Separate-weights builds with r8.im image names work correctly. Schema validation no longer fails when building with separate weights and using r8.im/... image names. (#2954)
Static schema generation handles more edge cases. Fixed issues with certain type annotation patterns in static schema generation. (#2948)
Secret = Input(default=None) is treated as optional. Secret inputs with None defaults are now correctly identified as optional in the generated schema. (#2949)

v0.18.0

Breaking changes

cog run is now cog exec. cog run still works as a hidden alias with a deprecation warning -- existing scripts won't break yet, but update them. (#2916)

Bug fixes

async def setup() actually runs now. In 0.17.x, async setup coroutines were silently dropped -- setup appeared to succeed but none of the code executed, causing AttributeError on every prediction. (#2921)
Async setup shares the event loop with predict. Models that create event-loop-bound resources in setup() (httpx clients, aiohttp sessions, asyncio queues) previously crashed because setup and predict ran on different loops. Both now run on the same loop. (#2927)
dict and list[dict] work as input types. These were supported as outputs but rejected as inputs, breaking chat-style message inputs. (#2928)
list[X] | None works as an input type. The type system only had Required, Optional, and Repeated -- not optional-and-repeated. Both the Python SDK and Go schema generator now handle this correctly. (#2882)
Unknown prediction inputs are dropped instead of rejected. Coglet was returning 422 for unrecognized input fields, breaking backwards compatibility when models upgraded to new Cog. Unknown fields are now silently stripped and logged at warn level. (#2943)
Metrics bugs in coglet. Fixed precision loss for large integer increments, empty/malformed metric key panics, missing metrics in error/cancel responses, and inconsistent metrics in state snapshots. (#2896)
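
The two async setup fixes above can be sketched with plain asyncio. The example below awaits setup() so its body actually runs, then calls predict() on the same event loop so that loop-bound resources stay valid. The Predictor class and the serve_one driver are hypothetical stand-ins, not Cog's runtime.

```python
import asyncio


class Predictor:
    async def setup(self) -> None:
        # Remember which loop created the loop-bound resources.
        self.loop = asyncio.get_running_loop()
        self.queue = asyncio.Queue()

    async def predict(self, text: str) -> str:
        # Safe only because this runs on the loop that ran setup().
        assert asyncio.get_running_loop() is self.loop
        await self.queue.put(text)
        return (await self.queue.get()).upper()


async def serve_one(text: str) -> str:
    predictor = Predictor()
    # Awaiting the coroutine is what makes the setup body execute;
    # calling setup() without awaiting would silently do nothing.
    await predictor.setup()
    return await predictor.predict(text)


# One asyncio.run call means setup and predict share a single event loop.
result = asyncio.run(serve_one("hello"))
```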

Improvements

Push progress during image export. cog push now shows status during the docker save phase instead of sitting silent while large images export to disk. (#2797)
Metric name validation. record_metric() enforces naming rules -- must start with a letter, no consecutive underscores, max 128 chars, max 4 segments. predict_time and the cog. prefix are reserved. (#2911)
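
The naming rules above can be sketched as a small validator. This is an illustration under stated assumptions, not the logic inside record_metric(): it assumes segments are separated by dots (suggested by the reserved cog. prefix), and the function name metric_name_error is hypothetical.

```python
from typing import Optional

MAX_LEN = 128
MAX_SEGMENTS = 4


def metric_name_error(name: str) -> Optional[str]:
    """Return the reason a metric name is rejected, or None if it passes."""
    if not name or not name[0].isalpha():
        return "must start with a letter"
    if len(name) > MAX_LEN:
        return "longer than 128 characters"
    if "__" in name:
        return "consecutive underscores"
    if name == "predict_time" or name.startswith("cog."):
        return "reserved name"
    # Assumption: a segment is a dot-separated part of the name.
    if len(name.split(".")) > MAX_SEGMENTS:
        return "more than 4 segments"
    return None


ok = metric_name_error("tokens.generated")
```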

Releases: replicate/cog

v0.21.0: New features, Improvements, Bug fixes
v0.21.0-rc.3: Changelog
v0.21.0-rc.2: Changelog
v0.21.0-rc.1: New features, Improvements, Bug fixes
v0.20.0: New features, Improvements, Bug fixes
v0.19.3: Changelog
v0.19.2: Changelog
v0.19.1: Bug fixes, Maintenance, Dependencies
v0.19.0: New features, Improvements, Bug fixes
v0.18.0: Breaking changes, Bug fixes, Improvements