Set up Ollama for running self-hosted LLMs on macOS

I needed a self-hosted LLM to run on my M4 Mac. This is how I did it.

Purpose

My new project requires a self-hosted LLM server to run in an offline or air-gapped environment.

I set up a development environment on my M4 Mac. This post covers how I did it, and why.

Stack

  • Ollama - A tool for running open-source LLMs locally via a simple API server.
  • GPT-OSS 20B - A 20B parameter open-source LLM.

Why that stack?

I chose Ollama because:

  • It has good GPU support on Apple Silicon (via Metal)
  • It installs easily via Homebrew
  • It exposes a simple CLI and HTTP API (example below)
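As a quick illustration of that last point, once the server is running the same models are visible from both the CLI and the HTTP API (assuming the default port, 11434):

# The CLI and the HTTP API expose the same model list
ollama list                             # CLI
curl http://localhost:11434/api/tags    # HTTP API, returns JSON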

I chose GPT-OSS 20B because:

  • At 20B parameters, it is small enough to run locally in 32GB of unified memory
  • It offers a good balance of capability and resource usage
  • It is designed for reasoning, agentic tasks, and developer use cases
  • US developer (OpenAI)
  • Open source (Apache 2.0 license)
  • Relatively new (August 2025)

Installation

1. Install Ollama

# Install via Homebrew
brew install ollama

# Verify installation
ollama --version

2. Pull GPT-OSS Model

# Pull the 20B parameter model
ollama pull gpt-oss:20b

# This downloads ~12GB and takes 5-15 minutes depending on your connection

# Optional: Pull the larger 120B model if you have more RAM
# ollama pull gpt-oss:120b
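Once the pull completes, confirm the model is available locally:

# List downloaded models and their on-disk sizes
ollama list

# Show model metadata (architecture, parameters, context length, license)
ollama show gpt-oss:20b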

3. Configure Ollama to Listen on All Interfaces (Optional)

I need to access Ollama from Kubernetes pods, so it must listen on all interfaces rather than only on localhost (the default).

# Set environment variable
export OLLAMA_HOST=0.0.0.0:11434

# Add to shell profile for persistence
echo 'export OLLAMA_HOST=0.0.0.0:11434' >> ~/.zshrc
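Note that the export only affects an ollama serve started from a shell where it is set; the Homebrew service needs its own configuration (covered below). To confirm the setting works, query the server from another machine on the network (192.168.1.50 is a placeholder for your Mac's LAN IP):

# From another machine on the same network
curl http://192.168.1.50:11434/api/tags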

4. Start Ollama Service

# Start Ollama server
ollama serve

This runs in the foreground. For background service, see LaunchAgent setup below.

5. Test the Model

# In a new terminal, test the model
ollama run gpt-oss:20b "Write a Python function to parse GitLab webhook payloads"

# Test chain-of-thought reasoning
ollama run gpt-oss:20b "Explain step-by-step how to implement a GitLab webhook parser"
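Since my actual consumers are Kubernetes pods rather than a terminal, the HTTP API is the interface that matters. Here is the same kind of prompt over HTTP, a minimal sketch using the standard /api/generate endpoint with streaming disabled so the response arrives as one JSON object:

# Same prompt via the HTTP API
curl http://localhost:11434/api/generate -d '{
  "model": "gpt-oss:20b",
  "prompt": "Write a Python function to parse GitLab webhook payloads",
  "stream": false
}'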

Running Ollama as a Background Service

Option 1: Homebrew Services (Recommended)

This is the easiest method: Homebrew manages the LaunchAgent for you.

# Start Ollama service (starts now and on boot)
brew services start ollama

# Verify it's running
brew services list | grep ollama
curl http://localhost:11434/api/tags

Configure to listen on all interfaces (optional):

# Set OLLAMA_HOST for the brew-managed service.
# Homebrew generates the plist at:
#   ~/Library/LaunchAgents/homebrew.mxcl.ollama.plist
# Appending to that file would produce invalid XML (the new keys would
# land after </plist>), so edit it in place with PlistBuddy instead.
# Start the service once first so the file exists.
PLIST=~/Library/LaunchAgents/homebrew.mxcl.ollama.plist
/usr/libexec/PlistBuddy -c 'Add :EnvironmentVariables dict' "$PLIST"
/usr/libexec/PlistBuddy -c 'Add :EnvironmentVariables:OLLAMA_HOST string 0.0.0.0:11434' "$PLIST"

# Restart the service
brew services restart ollama
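Homebrew owns that plist and may regenerate it when the service is stopped or the formula is upgraded; if the setting does not stick, use the manual LaunchAgent in Option 2 below instead. Either way, it is worth confirming the server is actually bound to all interfaces:

# Check what Ollama is bound to on port 11434
lsof -nP -iTCP:11434 -sTCP:LISTEN
# *:11434 in the NAME column means all interfaces;
# 127.0.0.1:11434 means loopback only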

Manage the service:

# Start service
brew services start ollama

# Stop service
brew services stop ollama

# Restart service
brew services restart ollama

# Check status
brew services list

# View logs
tail -f ~/Library/Logs/homebrew.mxcl.ollama.log
tail -f ~/Library/Logs/homebrew.mxcl.ollama.err.log

Option 2: Manual LaunchAgent (Alternative)

If you need more control over the LaunchAgent configuration:

# Create custom LaunchAgent plist
cat > ~/Library/LaunchAgents/com.ollama.server.plist <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.ollama.server</string>
    <key>ProgramArguments</key>
    <array>
        <string>/opt/homebrew/bin/ollama</string>
        <string>serve</string>
    </array>
    <key>EnvironmentVariables</key>
    <dict>
        <key>OLLAMA_HOST</key>
        <string>0.0.0.0:11434</string>
    </dict>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
    <key>StandardOutPath</key>
    <string>/tmp/ollama.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/ollama.error.log</string>
</dict>
</plist>
EOF

# Load the LaunchAgent
launchctl load ~/Library/LaunchAgents/com.ollama.server.plist

# Start the service
launchctl start com.ollama.server

# Verify it's running
curl http://localhost:11434/api/tags

Manage manual LaunchAgent:

# Stop service
launchctl stop com.ollama.server

# Restart service
launchctl stop com.ollama.server
launchctl start com.ollama.server

# Unload service
launchctl unload ~/Library/LaunchAgents/com.ollama.server.plist

# View logs
tail -f /tmp/ollama.log
tail -f /tmp/ollama.error.log
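On current macOS versions, load and unload are deprecated (though still functional) in favor of bootstrap and bootout, if you prefer the modern syntax:

# Modern equivalents of load/unload for the per-user domain
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.ollama.server.plist
launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/com.ollama.server.plist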

My Hardware Specifications

  • Machine: Apple M4 Mac
  • RAM: 32GB unified memory
  • GPU: Integrated M4 GPU (Metal)
  • CPU cores: 10
  • Architecture: ARM64
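To check the equivalent specs on your own machine before choosing a model size:

# Chip, logical CPU cores, RAM, and architecture
sysctl -n machdep.cpu.brand_string
sysctl -n hw.ncpu
echo "$(($(sysctl -n hw.memsize) / 1073741824)) GB RAM"
uname -m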
