Upgrades & Maintenance

Keep your self-hosted Align deployment up to date.

Upgrade Process

Step 1: Review Release Notes

Before upgrading, check the release notes:

  • GitHub Releases
  • Breaking changes
  • New configuration options
  • Database migration notes

Step 2: Back Up the Database

Always back up before upgrading:

# For RDS
aws rds create-db-snapshot \
--db-instance-identifier align-prod \
--db-snapshot-identifier align-pre-upgrade-$(date +%Y%m%d)

# For self-hosted PostgreSQL (no -t: a TTY can corrupt the redirected dump)
kubectl exec deploy/postgresql -n align -- \
pg_dump -U align align > backup-$(date +%Y%m%d).sql
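
Before moving on, it can be worth confirming that the dump actually finished. The following is a minimal sketch, assuming a plain-format dump like the one produced above. `verify_dump` is a hypothetical helper name, not part of Align; it relies on the completion comment that pg_dump writes at the end of plain-text dumps.

```shell
# verify_dump FILE: succeed only if FILE is non-empty and ends with
# pg_dump's completion comment. (Hypothetical helper, not part of Align.)
verify_dump() {
  [ -s "$1" ] || { echo "dump is missing or empty: $1" >&2; return 1; }
  tail -n 10 "$1" | grep -q 'PostgreSQL database dump complete' || {
    echo "dump looks truncated: $1" >&2
    return 1
  }
}
```

For example: verify_dump backup-$(date +%Y%m%d).sql && echo "backup looks complete"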

Step 3: Update Helm Repository

helm repo update align

Step 4: Review Value Changes

Compare your values with the new chart defaults:

# Show new default values
helm show values align/align > new-defaults.yaml

# Diff with your current values
diff -u your-values.yaml new-defaults.yaml
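
A full diff against the defaults can be noisy when your values file only contains overrides. As a rough first pass, the sketch below lists top-level keys that exist in the new defaults but not in your file. Both function names are hypothetical, and the approach only looks at unindented `key:` lines, so nested options still need the diff above.

```shell
# top_level_keys FILE: print the unindented "key:" names in a YAML file, sorted.
top_level_keys() {
  grep -E '^[A-Za-z0-9_-]+:' "$1" | cut -d: -f1 | sort -u
}

# new_top_level_keys MINE DEFAULTS: print keys present in DEFAULTS but not in MINE.
# Runs in a subshell so the temporary variables do not leak into the caller.
new_top_level_keys() (
  mine=$(mktemp) && defaults=$(mktemp) || exit 1
  top_level_keys "$1" > "$mine"
  top_level_keys "$2" > "$defaults"
  comm -13 "$mine" "$defaults"   # keep only lines unique to the defaults
  rm -f "$mine" "$defaults"
)
```

For example: new_top_level_keys your-values.yaml new-defaults.yaml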

Step 5: Perform Upgrade

# Dry run first
helm upgrade align align/align \
--namespace align \
--values your-values.yaml \
--dry-run

# If dry run looks good, apply
helm upgrade align align/align \
--namespace align \
--values your-values.yaml
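
The two commands above can be combined into one guarded step. This is a sketch under assumptions, not an official Align script: `upgrade_align` and the `HELM` override variable are hypothetical names, the 10 minute timeout is arbitrary, and it assumes Helm 3, where `--atomic` rolls the release back if the upgrade fails. The dry run here only checks that the chart renders; review its output by hand when a release has breaking changes.

```shell
HELM="${HELM:-helm}"   # override to point at a different helm binary

# upgrade_align VALUES_FILE: dry-run the upgrade, then apply it atomically.
upgrade_align() {
  [ -f "$1" ] || { echo "values file not found: $1" >&2; return 1; }
  # Render the upgrade first; stop if the chart or values fail to render.
  "$HELM" upgrade align align/align --namespace align --values "$1" \
    --dry-run > /dev/null || return 1
  # --atomic rolls the release back if the upgrade does not succeed in time.
  "$HELM" upgrade align align/align --namespace align --values "$1" \
    --atomic --timeout 10m
}
```

For example: upgrade_align your-values.yaml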

Step 6: Verify Upgrade

# Check pods are running
kubectl get pods -n align

# Check logs for errors
kubectl logs -l app.kubernetes.io/name=align -n align

# Verify migrations ran
kubectl get jobs -n align
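
Scanning the pod list by eye gets tedious with many pods. The sketch below filters the output of `kubectl get pods --no-headers` down to pods that need attention. `not_ready_pods` is a hypothetical helper name, and it assumes the default column order (NAME, READY, STATUS, ...).

```shell
# not_ready_pods: read `kubectl get pods --no-headers` output on stdin and
# print every pod that is neither Completed nor Running with all containers ready.
not_ready_pods() {
  awk '{
    split($2, r, "/")                       # READY column, e.g. "1/1"
    ok = ($3 == "Completed") || ($3 == "Running" && r[1] == r[2])
    if (!ok) print $1, $3, $2               # name, status, ready count
  }'
}
```

For example: kubectl get pods -n align --no-headers | not_ready_pods
An empty result means every pod is either ready or a completed job pod.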

Database Migrations

Migrations run automatically as Helm pre-upgrade hooks.

Manual Migration (if needed)

If automatic migration fails:

# Run migrations manually
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: align-migrations-manual
  namespace: align
spec:
  template:
    spec:
      containers:
        - name: migrations
          image: align/migrations:latest
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: align-database
                  key: url
      restartPolicy: Never
EOF

# Check status
kubectl logs job/align-migrations-manual -n align

Rollback Migration

If a migration causes issues:

  1. Check migration status in the database:
     SELECT * FROM schema_migrations ORDER BY version DESC LIMIT 10;
  2. Revert to the previous version (if supported):
     # Roll back to the previous Helm release (omitting the revision targets the previous one)
     helm rollback align -n align

Warning: Not all migrations are reversible. Always back up before upgrading.


Version Compatibility

Align Version   Kubernetes   PostgreSQL   Helm
0.8.x           1.25+        15+          3.10+
0.7.x           1.24+        14+          3.10+
0.6.x           1.23+        14+          3.8+
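
The minimums in the table can be checked in a script before upgrading. The sketch below compares dotted versions numerically, which matters because a plain string comparison would rank 3.8 above 3.10. `version_ge` and `min_kubernetes` are hypothetical helper names; the version mapping is copied from the table above.

```shell
# version_ge A B: succeed if dotted version A >= B, comparing numerically
# component by component (so 3.10 is newer than 3.8).
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, "."); nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      xi = (i <= na) ? x[i] + 0 : 0        # missing components count as 0
      yi = (i <= nb) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0                                 # all components equal
  }'
}

# min_kubernetes ALIGN_VERSION: minimum Kubernetes version from the table above.
min_kubernetes() {
  case "$1" in
    0.8.*) echo 1.25 ;;
    0.7.*) echo 1.24 ;;
    0.6.*) echo 1.23 ;;
    *) echo "unknown Align version: $1" >&2; return 1 ;;
  esac
}
```

For example: version_ge 1.27.3 "$(min_kubernetes 0.8.0)" && echo "cluster is new enough"
Pass versions without a leading "v" (1.27.3, not v1.27.3).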

Rollback

If an upgrade fails:

Helm Rollback

# List release history
helm history align -n align

# Rollback to previous revision
helm rollback align [REVISION] -n align

Database Rollback

If you need to restore the database:

# For RDS: restore the snapshot into a new instance, then point Align at it
aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier align-prod-restored \
--db-snapshot-identifier align-pre-upgrade-20240115

# For self-hosted PostgreSQL
kubectl exec -i deploy/postgresql -n align -- \
psql -U align align < backup-20240115.sql

Maintenance Tasks

Log Rotation

Logs are handled by Kubernetes. Configure log retention in your cluster.

Database Maintenance

# Run VACUUM ANALYZE (during low-traffic period)
kubectl exec -it deploy/postgresql -n align -- \
psql -U align align -c "VACUUM ANALYZE;"

# Check table sizes
kubectl exec -it deploy/postgresql -n align -- \
psql -U align align -c "\dt+ *"

Clean Up Old Data

Align automatically manages data retention based on your settings. To remove old data manually:

-- Clean old telemetry (example: older than 90 days)
DELETE FROM telemetry_events
WHERE created_at < NOW() - INTERVAL '90 days';
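
Because a DELETE cannot be undone without a restore, it helps to count the affected rows first. The sketch below generates either a count or a delete statement for a given retention period and rejects anything that is not a positive integer. `retention_sql` is a hypothetical helper name; the table and column names are taken from the example above.

```shell
# retention_sql ACTION DAYS: print SQL that counts or deletes telemetry rows
# older than DAYS days. DAYS must be a positive integer without leading zeros.
retention_sql() {
  case "$2" in
    ''|*[!0-9]*|0*) echo "days must be a positive integer: $2" >&2; return 1 ;;
  esac
  case "$1" in
    count)  echo "SELECT count(*) FROM telemetry_events WHERE created_at < NOW() - INTERVAL '$2 days';" ;;
    delete) echo "DELETE FROM telemetry_events WHERE created_at < NOW() - INTERVAL '$2 days';" ;;
    *) echo "action must be count or delete: $1" >&2; return 1 ;;
  esac
}
```

For example, to preview how many rows a 90-day cleanup would remove:
kubectl exec -i deploy/postgresql -n align -- psql -U align align -c "$(retention_sql count 90)"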

Certificate Renewal

If using cert-manager, certificates renew automatically. Verify:

kubectl get certificates -n align
kubectl describe certificate app-tls -n align

Health Checks

Verify All Services

# Check all pods
kubectl get pods -n align

# Check services
kubectl get svc -n align

# Check ingress
kubectl get ingress -n align

API Health Check

curl https://api.yourdomain.com/health
# Expected: {"status":"ok"}
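
For scripted checks, the response body can be tested rather than read by eye. This is a minimal sketch, assuming the response shape shown above; `is_healthy` is a hypothetical helper name.

```shell
# is_healthy: read a health response body on stdin and succeed only if it
# reports status "ok" (whitespace around the colon is tolerated).
is_healthy() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}
```

For example: curl -fsS --max-time 5 https://api.yourdomain.com/health | is_healthy || echo "API unhealthy" >&2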

Database Health

kubectl exec -it deploy/align-gateway -n align -- \
curl http://localhost:8080/health/db

Monitoring

Prometheus Metrics

Enable Prometheus scraping:

podMonitor:
  enabled: true
  interval: 30s
  path: /metrics

Key Metrics

Metric                          Description            Alert Threshold
http_requests_total             Total API requests     -
http_request_duration_seconds   Request latency        p99 > 5s
pg_connections                  Database connections   > 80% max
brain_llm_requests_total        LLM API calls          -
brain_llm_errors_total          LLM errors             > 5% error rate
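
The percentage thresholds above are ratios of two numbers: LLM errors over LLM requests, and open connections over the configured maximum. The sketch below only illustrates that arithmetic with a hypothetical helper, `exceeds_pct`. In Prometheus the same comparison would normally be made on rates over a time window rather than on raw counter values.

```shell
# exceeds_pct PART WHOLE PCT: succeed if PART is strictly more than PCT percent
# of WHOLE. A WHOLE of zero never exceeds the threshold.
exceeds_pct() {
  awk -v p="$1" -v w="$2" -v t="$3" 'BEGIN {
    if (w > 0 && p * 100 > t * w) exit 0
    exit 1
  }'
}
```

For example, 6 errors in 100 LLM calls is 6%, so exceeds_pct 6 100 5 succeeds, while exactly 5 errors in 100 does not cross the threshold. Likewise, 85 of 100 connections in use exceeds the 80% limit.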

Alerting

Example Prometheus alert rules:

groups:
  - name: align
    rules:
      - alert: AlignAPIHighLatency
        # histogram_quantile needs per-bucket rates aggregated by "le"
        expr: histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket{job="align-gateway"}[5m]))) > 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High API latency"

      - alert: AlignPodNotReady
        # Select the condition="true" series so ready pods do not match
        expr: kube_pod_status_ready{namespace="align", condition="true"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Align pod not ready"

Support

For self-hosted support: