`mix fly_deploy.hot`

Hot code upgrades for Elixir applications running on Fly.io without restarts or downtime.

FlyDeploy enables zero-downtime deployments by upgrading running BEAM processes with new code while preserving their state. Unlike traditional deployments that restart your entire application, hot upgrades suspend processes, swap in new code, migrate state via code_change callbacks, then resume the processes.

Features

Zero-downtime deployments - upgrade code without restarting your application
State preservation and migration - running processes maintain their state through upgrades and migrate state with standard OTP code_change/3 callbacks
Automatic startup reapply - machines that restart after hot upgrades automatically load the new code
Concurrent upgrades - upgrade all machines in parallel for faster deployments
Safe process suspension - only affects processes using changed modules
Phoenix LiveView auto-reload - LiveView pages automatically re-render after upgrade without page refresh
S3 storage - stores release tarballs for distribution to machines

Comparison with OTP releases and release handlers

FlyDeploy provides a simplified approach to hot code upgrades compared to traditional OTP releases and the release_handler module. Understanding the differences will help you choose the best tool for your needs.

Traditional OTP release upgrades

OTP's release_handler provides the canonical hot upgrade mechanism for Erlang/Elixir applications:

Requires .appup files for each application defining upgrade instructions
Requires a .relup file describing the complete release upgrade path
Uses :release_handler.install_release/1 to perform upgrades
Manages dependencies and startup order automatically
Handles complex upgrade scenarios (adding/removing applications, changing supervision trees)
Persists upgrade state within the release structure itself
Well-tested over decades in production telecom systems

FlyDeploy's simplified approach

FlyDeploy takes a different approach optimized for containerized deployments and simplified upgrades, where we accept that some changes require cold deploys:

No .appup or .relup files required - upgrades work automatically
Detects changed modules using :code.modified_modules() after loading new beams
Upgrades individual processes using :sys.suspend/1, :sys.change_code/4, and :sys.resume/1
Stores upgrade metadata in external storage (S3/Tigris/etc) rather than in the release
Builds on Docker images for distribution rather than release tarballs
Optimized for typical code upgrades that do not require frequent changes to supervision tree structure, dependency upgrades, or careful ordering of the upgrade process
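
As a rough illustration of the detection step, the sketch below narrows the result of :code.modified_modules() to the modules of a single application. ChangedModulesSketch and for_app/1 are hypothetical names used only here; this is not FlyDeploy's actual implementation.

```elixir
defmodule ChangedModulesSketch do
  # Hypothetical helper for illustration; not part of FlyDeploy's API.
  # Returns the loaded modules of `app` whose .beam file on disk differs
  # from the version currently loaded in the VM.
  def for_app(app) do
    # Application.spec/2 returns nil when the application is not loaded.
    app_modules = MapSet.new(Application.spec(app, :modules) || [])

    :code.modified_modules()
    |> Enum.filter(&MapSet.member?(app_modules, &1))
  end
end

# Right after boot nothing has changed on disk, so this is normally empty.
changed = ChangedModulesSketch.for_app(:elixir)
IO.inspect(changed)
```

Filtering by application keeps unrelated modules, such as dependencies, out of the list of candidates.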

Key Differences

State Management:

OTP has complex dependency tracking and upgrade ordering guarantees
FlyDeploy relies on processes implementing code_change/3 for state migration, with no strict ordering guarantees
FlyDeploy detects changes automatically via :code.modified_modules()
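
To make the code_change/3 contract concrete, here is a minimal sketch of a GenServer that migrates its state from a bare integer to a map. CounterSketch is a hypothetical module, and the :sys calls are issued by hand to stand in for what an upgrade would trigger.

```elixir
defmodule CounterSketch do
  use GenServer

  @impl true
  def init(count), do: {:ok, count}

  @impl true
  def handle_call(:get, _from, state), do: {:reply, state, state}

  # Assumed scenario: the old version kept a bare integer, the new one a map.
  @impl true
  def code_change(_old_vsn, count, _extra) when is_integer(count) do
    {:ok, %{count: count}}
  end

  def code_change(_old_vsn, state, _extra), do: {:ok, state}
end

{:ok, pid} = GenServer.start_link(CounterSketch, 5)

# Suspend, migrate, resume: the process keeps running with the new state shape.
:ok = :sys.suspend(pid)
:ok = :sys.change_code(pid, CounterSketch, :undefined, [])
:ok = :sys.resume(pid)

migrated = GenServer.call(pid, :get)
IO.inspect(migrated)
# prints %{count: 5}
```

The second code_change/3 clause makes the migration safe to run again on state that already has the new shape.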

Metadata Storage:

OTP stores upgrade history in releases/RELEASES file on disk
FlyDeploy stores metadata in S3 for distribution across ephemeral machines

Build Artifacts:

OTP requires .appup and .relup files with detailed upgrade instructions
FlyDeploy requires no additional build artifacts - just standard compilation during the mix release build

Limitations

Compared to OTP's release_handler, FlyDeploy cannot:

Add/remove applications or dependencies - The supervision tree is built once at startup
Change supervision tree structure or process hierarchy - Cannot add/remove child processes dynamically
Trigger Application.config_change/3 callbacks - Configuration changes are not detected or propagated
Upgrade the Erlang VM or OTP version - VM version is fixed in the Docker image
Handle multi-step upgrade paths with intermediate versions - Each upgrade is standalone
Upgrade NIFs or port drivers - Native code requires a restart
Guarantee specific module load ordering - Instead, all processes running changed OTP modules are suspended before code_change, the code is upgraded, and then all processes resume. Code changes run in no particular order, but no process resumes until the entire upgrade is complete.

When to Use Each

Use FlyDeploy when:

Deploying to containerized environments
Most of your changes are to the top-level OTP application modules and processes that you wholly own
You want simple hot upgrades without .appup files

Use OTP release_handler when:

Adding/removing applications at runtime
Requiring complex multi-step upgrade paths, with specific module upgrade ordering
Needing VM upgrades without downtime
Requiring telecom-grade reliability guarantees

Installation

Add fly_deploy to your list of dependencies in mix.exs:

def deps do
  [
    {:fly_deploy, "~> 0.1.0"}
  ]
end

Quick Start

1. Configure Fly Secrets

Set up AWS credentials for storage:

fly storage create -a myapp -n my-releases-bucket

Or, to use existing credentials:

fly secrets set AWS_ACCESS_KEY_ID=<key> AWS_SECRET_ACCESS_KEY=<secret>

You also need to set a Fly API token as a secret on the app so the orchestrator machine can use it:

fly secrets set FLY_API_TOKEN=$(fly tokens create machine-exec)

2. Add Startup Hook

In your Application.start/2, you must call startup_reapply_current/1 before starting your supervision tree. This reapplies any previously applied hot upgrade on top of the running container image, allowing hot deploys to survive machine restarts.

defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    # Apply any hot upgrade that builds on top of our static container image on startup
    FlyDeploy.startup_reapply_current(:my_app)

    children = [
      # your supervision tree
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end

3. Deploy

Run a hot deployment:

mix fly_deploy.hot

That's it. Your application will upgrade to the new code without restarting.

How It Works

Hot Upgrade Process

When you run mix fly_deploy.hot:

Build Phase - Creates a new Docker image with fly deploy --build-only
Orchestrator Phase - Spawns a temporary machine with the new image
Tarball Phase - Orchestrator extracts all .beam files and creates a tarball
Upload Phase - Uploads tarball to S3 storage
Metadata Phase - Updates deployment metadata with hot upgrade information
Reload Phase - Each running machine downloads the tarball
Extract Phase - Beam files are extracted and copied to disk
Detection Phase - :code.modified_modules() identifies changed modules
Suspension Phase - Processes using changed modules are suspended with :sys.suspend/1
Code Load Phase - Old modules are purged and new versions loaded
Migration Phase - :sys.change_code/4 is called on each process
Resume Phase - Processes are resumed with :sys.resume/1

Total suspension time is typically under 1 second.
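
The Extract Phase can be pictured with Erlang's built-in :erl_tar module. This sketch builds a tiny tarball containing a placeholder file (not a real .beam) and unpacks it into a temporary directory. The directory names and the choice of :erl_tar are assumptions for illustration; the steps above do not say how FlyDeploy itself unpacks the archive.

```elixir
# Scratch directory so the sketch does not touch a real release.
tmp = Path.join(System.tmp_dir!(), "fly_deploy_sketch_#{System.unique_integer([:positive])}")
dest = Path.join(tmp, "app")
File.mkdir_p!(dest)

tarball = Path.join(tmp, "release.tar.gz")
entry = ~c"lib/my_app-1.2.3/ebin/Elixir.MyModule.beam"

# Create a gzip-compressed tarball holding one in-memory entry.
:ok = :erl_tar.create(String.to_charlist(tarball), [{entry, "placeholder bytes"}], [:compressed])

# Unpack it relative to the destination directory, keeping the relative paths.
:ok = :erl_tar.extract(String.to_charlist(tarball), [:compressed, {:cwd, String.to_charlist(dest)}])

contents = File.read!(Path.join(dest, List.to_string(entry)))
IO.inspect(contents)
# prints "placeholder bytes"

File.rm_rf!(tmp)
```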

Startup Reapply

When a machine restarts after a hot upgrade (for example after a crash, a scaling event, or a manual restart):

FlyDeploy.startup_reapply_current/1 checks for current hot upgrades
Compares the machine's Docker image ref with stored metadata
If refs match and a hot upgrade exists, downloads and applies it
Uses :c.lm() to load all modified modules before supervision tree starts
No process suspension needed since supervision tree hasn't started yet

This ensures machines that restart remain consistent with machines that were hot-upgraded.
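
The decision described above can be sketched as a pure function over the stored metadata. ReapplySketch and decide/2 are hypothetical names. The field names follow the example under Deployment Metadata, and comparing the machine's ref against the top-level "image_ref" field is an assumption, not confirmed behaviour.

```elixir
defmodule ReapplySketch do
  # Hypothetical decision function; not FlyDeploy's implementation.
  # Assumption: the machine's image ref is compared with "image_ref".
  # Using `ref` twice in the head requires both values to be equal.
  def decide(%{"image_ref" => ref, "hot_upgrade" => %{"tarball_url" => url}}, ref)
      when is_binary(url),
      do: {:apply, url}

  # No hot upgrade recorded, or the machine runs a different base image.
  def decide(_metadata, _machine_image_ref), do: :skip
end

metadata = %{
  "image_ref" => "registry.example/my-app:base",
  "hot_upgrade" => %{"tarball_url" => "https://storage.example/releases/my_app-1.2.3.tar.gz"}
}

IO.inspect(ReapplySketch.decide(metadata, "registry.example/my-app:base"))
# prints {:apply, "https://storage.example/releases/my_app-1.2.3.tar.gz"}
IO.inspect(ReapplySketch.decide(metadata, "registry.example/my-app:other"))
# prints :skip
```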

Configuration

Environment Variables

Required:

AWS_ACCESS_KEY_ID - S3 access key
AWS_SECRET_ACCESS_KEY - S3 secret key
FLY_API_TOKEN - Fly API token (set as a secret in Quick Start step 1)
FLY_APP_NAME - Application name (auto-set by Fly)
FLY_IMAGE_REF - Docker image reference (auto-set by Fly)

Optional:

AWS_ENDPOINT_URL_S3 - S3 endpoint (defaults to https://t3.storage.dev)
AWS_REGION - AWS region (defaults to auto for Tigris)

fly.toml Configuration

Environment variables from your [env] section are automatically passed to the orchestrator:

[env]
  AWS_ENDPOINT_URL_S3 = "https://t3.storage.dev"
  AWS_REGION = "auto"

Mix Configuration

In config/config.exs:

config :fly_deploy,
  bucket: "my-custom-bucket",  # Optional - defaults to BUCKET_NAME env var
  max_concurrency: 10,         # Max concurrent machine upgrades (default: 20)
  suspend_timeout: 15_000,     # Timeout for suspending each process in ms (default: 10_000)
  env: %{
    "CUSTOM_VAR" => "value"
  }

Bucket Configuration: FlyDeploy looks up the S3 bucket name from:

Mix config (:bucket key above) - if explicitly configured
BUCKET_NAME environment variable - automatically set by fly storage create

If neither is configured, deployment will fail. When you run fly storage create, the BUCKET_NAME env var is automatically set on all your machines, so no additional configuration is needed.
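
The lookup order can be sketched as follows. BucketLookupSketch and the :bucket_not_configured error atom are hypothetical; only the precedence (Mix config first, then BUCKET_NAME) comes from the description above.

```elixir
defmodule BucketLookupSketch do
  # Hypothetical helper; mirrors the documented precedence only.
  # `config` is the :fly_deploy application env, `env` the OS environment.
  def resolve(config \\ Application.get_all_env(:fly_deploy), env \\ System.get_env()) do
    case config[:bucket] || env["BUCKET_NAME"] do
      nil -> {:error, :bucket_not_configured}
      bucket -> {:ok, bucket}
    end
  end
end

# Explicit Mix config wins over the environment variable.
IO.inspect(BucketLookupSketch.resolve([bucket: "my-custom-bucket"], %{"BUCKET_NAME" => "from-env"}))
# prints {:ok, "my-custom-bucket"}

# Without Mix config, the BUCKET_NAME variable is used.
IO.inspect(BucketLookupSketch.resolve([], %{"BUCKET_NAME" => "from-env"}))
# prints {:ok, "from-env"}
```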

CLI Options

The mix fly_deploy.hot task supports several options:

--config - Path to fly.toml file (default: "fly.toml")
--skip-build - Skip building and use existing image (requires --image)
--image - Use specific pre-built image
--dry-run - Show what would be done without executing
--force - Override deployment lock (use with caution)
--lock-timeout - Lock expiry timeout in seconds (default: 300)

Examples

Basic hot deployment:

mix fly_deploy.hot

Use staging configuration:

mix fly_deploy.hot --config fly-staging.toml

Preview changes without executing:

mix fly_deploy.hot --dry-run

Use pre-built image:

mix fly_deploy.hot --skip-build --image registry.fly.io/my-app:deployment-123

Safety and Error Handling

FlyDeploy uses a 4-phase upgrade cycle to safely upgrade running processes:

Phase 1: Suspend all changed processes - All affected processes are suspended with :sys.suspend/1 before any code loading
Phase 2: Load all changed code - New code is loaded globally using :code.purge/1 and :code.load_file/1 while processes are safely suspended
Phase 3: Upgrade all processes - Each suspended process has :sys.change_code/4 called to trigger its code_change/3 callback
Phase 4: Resume all processes - All processes are resumed with :sys.resume/1
After the cycle: Trigger LiveView reloads (if applicable) - Phoenix LiveView pages automatically re-render with new code

This 4-phase approach eliminates race conditions where one upgraded process calls another that still has old code.
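
The ordering of the phases can be sketched over a list of pids. PhasedUpgradeSketch and SketchWorker are hypothetical, and the code-loading step is passed in as a function so the sketch runs without new .beam files on disk. In a real upgrade that step would be the :code.purge/1 and :code.load_file/1 calls named above.

```elixir
defmodule SketchWorker do
  use GenServer

  @impl true
  def init(arg), do: {:ok, arg}

  @impl true
  def handle_call(:get, _from, state), do: {:reply, state, state}
end

defmodule PhasedUpgradeSketch do
  # Hypothetical helper showing the phase ordering; not FlyDeploy's code.
  def run(pids, module, load_code) when is_function(load_code, 0) do
    # Phase 1: suspend every affected process before touching any code.
    Enum.each(pids, &:sys.suspend/1)

    try do
      # Phase 2: load the new code while nothing is running it.
      load_code.()

      # Phase 3: ask each suspended process to migrate its state.
      Enum.map(pids, fn pid -> {pid, :sys.change_code(pid, module, :undefined, [])} end)
    after
      # Phase 4: resume everything, even if loading or migration raised.
      Enum.each(pids, &:sys.resume/1)
    end
  end
end

pids = for n <- 1..3, do: elem(GenServer.start_link(SketchWorker, n), 1)

# A no-op loader stands in for purging and loading real modules.
results = PhasedUpgradeSketch.run(pids, SketchWorker, fn -> :ok end)
IO.inspect(Enum.map(results, fn {_pid, result} -> result end))
# prints [:ok, :ok, :ok]
```

Resuming in an after block is one way to avoid leaving processes suspended when a step fails. Whether FlyDeploy handles failures this way is not stated in this document.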

Phoenix LiveView Integration

If you're using Phoenix LiveView, FlyDeploy automatically triggers re-renders after hot upgrades:

Detects upgraded LiveView modules by checking for Phoenix.LiveView or Phoenix.LiveComponent behaviors
Finds all active LiveView processes
Sends {:phoenix_live_reload, "fly_deploy", source_path} messages directly to each LiveView PID
LiveView automatically re-renders with the new code
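
Behaviour detection of this kind can be sketched by reading a module's compiled attributes. BehaviourSketch and DemoServer are hypothetical names, and the demo uses GenServer so that it runs without Phoenix installed. How FlyDeploy performs the check internally is not specified here.

```elixir
defmodule BehaviourSketch do
  # Hypothetical helper; reads the behaviours recorded at compile time.
  def implements?(module, behaviour) do
    module.module_info(:attributes)
    |> Keyword.get_values(:behaviour)
    |> List.flatten()
    |> Enum.member?(behaviour)
  end

  # Phoenix.LiveView and Phoenix.LiveComponent are plain atoms here,
  # so Phoenix does not need to be loaded for the check itself.
  def live_module?(module) do
    implements?(module, Phoenix.LiveView) or implements?(module, Phoenix.LiveComponent)
  end
end

defmodule DemoServer do
  use GenServer

  @impl true
  def init(arg), do: {:ok, arg}
end

IO.inspect(BehaviourSketch.implements?(DemoServer, GenServer))
# prints true
IO.inspect(BehaviourSketch.live_module?(DemoServer))
# prints false
```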

CSS Hot Reload

When static assets like CSS change during a hot upgrade, users would normally need to hard refresh to see the new styles. The hot_reload_css component automatically reloads stylesheets when the static manifest changes.

Add to your app layout (e.g., app.html.heex):

<FlyDeploy.Components.hot_reload_css socket={@socket} />

For multiple stylesheets, specify the asset name:

<FlyDeploy.Components.hot_reload_css socket={@socket} asset="app.css" />
<FlyDeploy.Components.hot_reload_css socket={@socket} asset="admin.css" />

Requirements:

Phoenix LiveView 1.1+ with colocated hooks support
Static assets digested via mix phx.digest (standard for production)
cache_static_manifest configured in your endpoint

The component uses a colocated JavaScript hook that detects when the static manifest version changes and automatically updates stylesheet hrefs, preserving any CDN/static host configuration.

Rollback Strategy

Hot upgrades are forward-only. Once new code is loaded into the BEAM VM, FlyDeploy cannot roll it back. If a hot upgrade causes issues, perform a cold deploy to a known good version:

fly deploy

The cold deploy will replace both the base Docker image and any hot upgrade state. This is similar to how OTP release upgrades work - they are also forward-only unless you build explicit downgrade instructions.

Storage Structure

FlyDeploy stores two types of objects in S3:

Release Tarballs

Path: releases/<app>-<version>.tar.gz

Contains all .beam files from /app/lib/**/ebin/*.beam with relative paths like:

lib/my_app-1.2.3/ebin/Elixir.MyModule.beam
lib/my_app-1.2.3/ebin/Elixir.MyModule.Server.beam

Deployment Metadata

Path: releases/<app>-current.json

Tracks current deployment state:

{
  "image_ref": "registry.fly.io/my-app:deployment-01K93Q...",
  "hot_upgrade": {
    "version": "1.2.3",
    "source_image_ref": "registry.fly.io/my-app:deployment-01K94R...",
    "tarball_url": "https://t3.storage.dev/bucket/releases/my_app-1.2.3.tar.gz",
    "deployed_at": "2024-01-15T10:30:00Z"
  }
}

When to use cold deploy instead of hot upgrade

Adding/removing services from supervision tree (database, cache, etc.)
Changing port numbers, protocols, or network config
Enabling/disabling major features that affect app structure
Upgrading dependencies that change supervision requirements

Testing

Run E2E tests (requires a deployed Fly app):

mix test