Quickstart: Deploying GPT-OSS 120B (Ollama) on Serverless Containers
Overview
This guide walks through deploying the gpt-oss:120b model, served by Ollama, on a serverless container platform. It covers an optional custom image that downloads the model automatically on first start, health checks, persistent model storage, and connecting to the resulting HTTP endpoint.
Pre-requisites
Before you begin you will need:
1. Access to a serverless container platform with GPU-backed instances. gpt-oss:120b needs on the order of 80 GB of GPU memory and roughly 70 GB of disk for the model weights.
2. Docker installed locally, if you plan to build the custom image below.
3. A container registry the platform can pull images from.
Preparing a Custom Container Image (Optional)
The stock ollama/ollama image works out of the box, but the custom image below adds curl for health checks and a startup script that pulls the requested model automatically. Save it as Dockerfile:
# syntax=docker/dockerfile:1
# (the heredoc below requires BuildKit)
FROM ollama/ollama:0.12.6

# Install curl for health checks
RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*

# Create a robust startup script inside the image
RUN cat > /start-ollama.sh <<'EOF'
#!/bin/bash
set -e

echo "=== Ollama Container Starting ==="
echo "Model storage path: ${OLLAMA_MODELS:-/data/.ollama/models}"
echo "Host binding: ${OLLAMA_HOST:-0.0.0.0:8000}"

# Set default values for model storage and host
export OLLAMA_MODELS=${OLLAMA_MODELS:-/data/.ollama/models}
export OLLAMA_HOST=${OLLAMA_HOST:-0.0.0.0:8000}

echo "Creating models directory: ${OLLAMA_MODELS}"
mkdir -p "${OLLAMA_MODELS}"

# Start Ollama server in the background
OLLAMA_PORT=${OLLAMA_HOST##*:}
echo "Starting Ollama server on port ${OLLAMA_PORT}..."
ollama serve &
OLLAMA_PID=$!

# Wait for the Ollama API to become available
echo "Waiting for Ollama API to be ready..."
TIMEOUT=600
ELAPSED=0
while ! curl -s "http://localhost:${OLLAMA_PORT}/api/tags" >/dev/null 2>&1; do
    if [ $ELAPSED -ge $TIMEOUT ]; then
        echo "ERROR: Ollama failed to start within $TIMEOUT seconds"
        kill $OLLAMA_PID 2>/dev/null || true
        exit 1
    fi
    sleep 1
    ELAPSED=$((ELAPSED + 1))
done
echo "✓ Ollama API is ready!"

# If a model is specified in the environment variable, download it
if [ -n "$OLLAMA_PULL_MODEL" ]; then
    echo "Model requested: $OLLAMA_PULL_MODEL"
    if ollama list | grep -q "^${OLLAMA_PULL_MODEL}"; then
        echo "✓ Model $OLLAMA_PULL_MODEL already exists."
    else
        echo "→ Downloading model: $OLLAMA_PULL_MODEL. This may take a while..."
        if ollama pull "$OLLAMA_PULL_MODEL"; then
            echo "✓ Model download successful!"
        else
            # Keep the server running even if the pull fails, so the
            # container can be inspected and the pull retried manually.
            echo "ERROR: Failed to download model."
        fi
    fi
fi

# Trap signals for graceful shutdown
trap "echo 'Shutting down...'; kill $OLLAMA_PID; exit 0" SIGTERM SIGINT

echo "=== Ollama server is running on ${OLLAMA_HOST} ==="
wait $OLLAMA_PID
EOF

# Make the startup script executable
RUN chmod +x /start-ollama.sh

# Set default environment variables. These can be overridden at runtime.
ENV OLLAMA_MODELS=/data/.ollama/models
ENV OLLAMA_HOST=0.0.0.0:8000
ENV OLLAMA_PULL_MODEL=gpt-oss:120b

# Add a healthcheck to let Docker know when the container is ready
# (the port must match OLLAMA_HOST if you override it)
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:8000/api/tags || exit 1

LABEL maintainer="[email protected]" \
      version="1.0" \
      description="Ollama with automatic model download support"

# Expose the default port
EXPOSE 8000

# Set the entrypoint to our startup script
ENTRYPOINT ["/start-ollama.sh"]
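A minimal build-and-push sequence looks like the sketch below. The registry path registry.example.com/ollama-gptoss is a placeholder; substitute your own. The local smoke test assumes a machine with enough GPU memory for the model you pull.

# Build the image
docker build -t registry.example.com/ollama-gptoss:1.0 .

# Optional local smoke test: map port 8000 and persist models under ./data.
# --gpus all assumes a local NVIDIA GPU; for a CPU-only sanity check,
# override OLLAMA_PULL_MODEL with a small model instead.
docker run --rm --gpus all \
  -p 8000:8000 \
  -v "$PWD/data:/data" \
  -e OLLAMA_PULL_MODEL=gpt-oss:120b \
  registry.example.com/ollama-gptoss:1.0

# Push to the registry the serverless platform will pull from
docker push registry.example.com/ollama-gptoss:1.0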
Deployment Steps
1. Navigate to New Deployment
In your platform's console, start a new container deployment.
2. Basic Configuration
Name the deployment and pick a GPU-backed instance size with enough memory for gpt-oss:120b (on the order of 80 GB).
3. Container Image Configuration
Point the deployment at the stock ollama/ollama:0.12.6 image, or at the custom image you pushed above.
4. Networking and Ports
Expose container port 8000, matching the OLLAMA_HOST default and the EXPOSE directive in the Dockerfile.
5. Health Check Configuration
Use an HTTP check against /api/tags on port 8000, mirroring the image's HEALTHCHECK, with a start period of at least 60 seconds.
6. Storage and Scaling
Mount persistent storage at /data so downloaded weights survive restarts; budget roughly 70 GB for gpt-oss:120b. If you allow scale-to-zero, expect slow cold starts while the model loads.
7. Deploy
Review and deploy. Most platforms let you override the image's environment variables here; a sketch of typical values, plus a post-deploy health probe, follows this list.
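The values below mirror the Dockerfile's ENV defaults and are what you would typically set (or override) on the deployment itself; the endpoint URL in the probe is a placeholder.

# Environment variables to set on the deployment
OLLAMA_PULL_MODEL=gpt-oss:120b
OLLAMA_MODELS=/data/.ollama/models
OLLAMA_HOST=0.0.0.0:8000

# Once the deployment reports healthy, the same probe works from your machine
curl -f https://YOUR-ENDPOINT.example.com/api/tags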
First-Time Startup
On the first start the container has no cached weights, so the startup script pulls gpt-oss:120b (roughly 65 GB) before the deployment is fully usable; depending on the platform's network throughput this can take tens of minutes. Subsequent starts reuse the weights stored under /data and come up much faster. Watch the container logs to follow progress.
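On a cold start, the startup script from the custom image above prints output along these lines:

=== Ollama Container Starting ===
Model storage path: /data/.ollama/models
Host binding: 0.0.0.0:8000
Creating models directory: /data/.ollama/models
Starting Ollama server on port 8000...
Waiting for Ollama API to be ready...
✓ Ollama API is ready!
Model requested: gpt-oss:120b
→ Downloading model: gpt-oss:120b. This may take a while...
✓ Model download successful!
=== Ollama server is running on 0.0.0.0:8000 ===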
Connecting to the Endpoint
Once healthy, the deployment exposes the standard Ollama HTTP API on the port you configured. Ollama also serves an OpenAI-compatible API under /v1, so existing OpenAI client code can target the deployment by changing only the base URL.
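Two example requests, with https://YOUR-ENDPOINT.example.com standing in for your deployment's URL:

# Native Ollama API: single-shot generation
curl https://YOUR-ENDPOINT.example.com/api/generate -d '{
  "model": "gpt-oss:120b",
  "prompt": "Explain serverless containers in one paragraph.",
  "stream": false
}'

# OpenAI-compatible chat completions
curl https://YOUR-ENDPOINT.example.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss:120b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'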