Self-Hosted AI Assistant: Complete Setup Guide

Apr 2026  ·  12 min read  ·  By the GetMyPersonalAI team

Self-hosting an AI assistant sounds straightforward until you're three hours deep in CUDA drivers at 11pm, wondering why your model only answers in Portuguese. I've been there. This guide covers both paths honestly: full DIY with Ollama, and managed self-hosting where someone else handles the infrastructure.

Neither path is right for everyone. By the end of this you'll know exactly which one fits your situation.

Why bother self-hosting at all?

Before diving into the how, it's worth being clear on the why. Self-hosted AI has two real advantages over services like ChatGPT:

Privacy: your prompts and documents stay on hardware you control instead of passing through a third party's servers.
Control: you pick the model and configuration, and you keep access on your terms rather than a provider's.

The tradeoff is setup complexity, ongoing maintenance, and the fact that consumer-grade hardware runs smaller models. A local Mistral-7B is genuinely good, but it's not GPT-4. That gap is closing fast, but it's still real.

Path 1: Full DIY with Ollama

Ollama is the best tool currently available for running LLMs locally. It handles model downloads and quantization, and exposes an OpenAI-compatible API endpoint. Here's the full setup process.

Step 1: Install Ollama

On macOS or Linux, installation is a single command:

curl -fsSL https://ollama.com/install.sh | sh

On Windows, download the installer from ollama.com. After installation, Ollama runs as a background service and exposes a local API at http://localhost:11434.

Step 2: Pull a model

Model selection is the most consequential decision. Here's how to think about it:

The limiting factor is memory. As a rough rule, plan on at least 8 GB of RAM for 7–8B models, 16 GB for 13B models, and 32 GB or more for anything in the 30B+ range. Bigger models give better answers but respond more slowly on the same hardware.

For most users starting out: llama3:8b for general use, or mistral:7b-instruct if you want something tuned for following instructions. Pulling one is a single command:

ollama pull llama3:8b
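To confirm a pull succeeded, you can ask Ollama's /api/tags endpoint which models are installed locally. A minimal Python sketch using only the standard library, assuming the default port (the helper names are ours, for illustration):

```python
import json
import urllib.request

# Default Ollama API base; adjust if you changed the port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"

def model_names(raw: bytes) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in json.loads(raw)["models"]]

def list_local_models() -> list[str]:
    """Ask the local Ollama service which models are installed."""
    with urllib.request.urlopen(OLLAMA_TAGS_URL) as resp:
        return model_names(resp.read())

if __name__ == "__main__":
    print(list_local_models())  # e.g. ['llama3:8b', 'mistral:7b-instruct']
```

The same information is available on the command line via `ollama list`.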

Step 3: Test the API

Once a model is pulled, verify it's working:

curl http://localhost:11434/api/generate \
  -d '{"model": "llama3", "prompt": "What is 2+2?", "stream": false}'

You should get a JSON response with a response field. If you do, your local LLM is running.
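The same check can be scripted. Here's a minimal Python client sketch against the default endpoint, standard library only (`build_payload` and `extract_answer` are illustrative helper names, not part of Ollama):

```python
import json
import urllib.request

# Default Ollama generate endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> bytes:
    """Build the JSON body /api/generate expects (non-streaming)."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def extract_answer(raw: bytes) -> str:
    """Pull the generated text out of a non-streaming response."""
    return json.loads(raw)["response"]

def ask(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_answer(resp.read())

if __name__ == "__main__":
    print(ask("llama3", "What is 2+2?"))
```

With `"stream": false` the whole answer arrives in one JSON object; set it to true and Ollama streams one JSON object per token instead.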

Step 4: Connect a frontend

Ollama's API is OpenAI-compatible, which means most AI tools work with it. Options range from Open WebUI (a full ChatGPT-like web interface) to building your own Telegram bot (covered in the next post). For a quick test, Open WebUI is the fastest path:

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  ghcr.io/open-webui/open-webui:main

Visit http://localhost:3000 and you have a working AI chat interface backed by your local model.
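Because Ollama also speaks the OpenAI wire format (under /v1), you don't need a full frontend to integrate it into your own scripts. A standard-library Python sketch, assuming the default port (helper names are illustrative):

```python
import json
import urllib.request

# Ollama serves an OpenAI-compatible API under /v1.
CHAT_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_payload(model: str, user_message: str) -> bytes:
    """Build an OpenAI-style chat completion request body."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }).encode()

def extract_chat_answer(raw: bytes) -> str:
    """Pull the assistant's reply out of an OpenAI-style response."""
    return json.loads(raw)["choices"][0]["message"]["content"]

def chat(model: str, user_message: str) -> str:
    req = urllib.request.Request(
        CHAT_URL,
        data=build_chat_payload(model, user_message),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_chat_answer(resp.read())

if __name__ == "__main__":
    print(chat("llama3", "Say hello in one word."))
```

Any tool that lets you override the OpenAI base URL can be pointed at http://localhost:11434/v1 the same way.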

Realistic setup time: If everything goes smoothly — no CUDA issues, no model download interruptions, no Docker network problems — plan for 3–5 hours for a first-time setup. Experienced Linux users can do it in under an hour. Windows users often hit driver issues that add time. Budget a full evening.

The ongoing maintenance reality

Setup is one-time; maintenance is forever. Running your own AI infrastructure means:

Keeping Ollama and your models updated as new releases land.
Applying OS and driver updates, which occasionally break a working GPU setup.
Watching disk space: each model is several gigabytes, and old versions pile up.
Debugging when something stops responding, usually at an inconvenient time.

None of this is insurmountable. But it's real work, and it compounds over time.

Path 2: Managed self-hosting

Managed self-hosting means the infrastructure is still yours — a real server running under your account, with your data — but someone else handles the setup, maintenance, and operations. This is what GetMyPersonalAI does.

The difference in experience is stark:

Setup drops from an evening of installs and debugging to about a minute.
Updates, monitoring, and security patching happen without you.
When something breaks at 11pm, it's someone else's problem to fix.

The tradeoff is control. With full DIY, you choose the exact model and configuration, and you can modify anything. With managed self-hosting, you're working within the parameters of what's offered. For most users, that's fine — the out-of-box configuration is well-tuned.

Cost comparison

Let's be concrete about money. These are real numbers, not marketing estimates.

| Setup | Monthly Cost | Setup Time | Maintenance | Data Location |
| --- | --- | --- | --- | --- |
| ChatGPT Plus | $20/mo | 2 minutes | None | OpenAI servers |
| DIY: local laptop | $0 (hardware you own) | 3–5 hours | 1–2 hrs/month | Your laptop |
| DIY: cloud GPU instance | $40–80/mo (g4dn.xlarge or similar) | 5–8 hours | 2–4 hrs/month | Your cloud instance |
| GetMyPersonalAI | $19.99/mo | 60 seconds | None | Your EC2 instance |

The DIY local laptop path is technically cheapest if you already own the hardware. But it only works while your computer is on, and your model quality is limited by your RAM. A 7B model running on a MacBook Pro is good, not great.

The DIY cloud path gives you a 24/7 server with a larger model. But once you factor in an EC2 instance with enough RAM to run a real model — a g4dn.xlarge or similar — you're at $40–80/mo before counting your time. And there's real time involved: initial setup, debugging, monthly maintenance. If your time is worth anything, the cost advantage shrinks fast.
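To see how the time factor moves the numbers, here's the table's arithmetic as a small sketch. The $50/hr value of time is an assumed figure for illustration, not from the comparison above:

```python
def effective_monthly_cost(subscription: float, maintenance_hours: float,
                           hourly_rate: float) -> float:
    """Cash cost plus the value of the time maintenance consumes."""
    return subscription + maintenance_hours * hourly_rate

HOURLY_RATE = 50.0  # assumption: what an hour of your time is worth

# Midpoints of the ranges in the table above.
diy_cloud = effective_monthly_cost(60.0, 3.0, HOURLY_RATE)   # $40–80/mo, 2–4 hrs
managed   = effective_monthly_cost(19.99, 0.0, HOURLY_RATE)  # no maintenance

print(f"DIY cloud: ${diy_cloud:.2f}/mo, managed: ${managed:.2f}/mo")
# → DIY cloud: $210.00/mo, managed: $19.99/mo
```

Even at a modest hourly rate, the maintenance hours dominate the subscription price.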

Which path is right for you?

Here's the honest decision framework:

Choose full DIY if:

You enjoy tinkering and infrastructure work is part of the fun.
You want total control over the model, configuration, and every byte of data.
You already have capable hardware and the Linux comfort to maintain it.

Choose managed self-hosting if:

You want private AI that works today, not after a weekend project.
You'd rather not own updates, monitoring, and 11pm debugging sessions.
You value your time at more than the cost difference between the two paths.

The goal of self-hosting isn't self-hosting. The goal is private AI that works reliably. If DIY gets you there, great. If managed self-hosting gets you there with less friction, that's a legitimate choice — and one that costs about the same as a ChatGPT subscription.

Skip the setup. Get private AI in 60 seconds.

GetMyPersonalAI deploys your own EC2-hosted AI assistant — no server setup, no maintenance. Private by architecture, not policy.

Start your $1 trial
