Tutorial: Ultimate Privacy Smart Home with Local LLMs

How to integrate Llama 4 with Home Assistant for a voice assistant whose data never leaves your house.

Stop sending your voice to the cloud. With a $300 mini PC and Home Assistant, you can build a voice assistant that is smarter than Alexa, faster than Siri, and completely private.

Why Local?

  • Privacy: No one is listening.
  • Speed: No cloud latency. Responses are near-instant.
  • Continuity: Works when the internet is down.

Prerequisites

  1. Hardware: A mini PC (NUC, Beelink) with at least 16GB RAM. (A Raspberry Pi 5 can handle basic commands, but struggles with capable LLMs.)
  2. Software: Home Assistant OS installed.
  3. Voice Hardware: ESP32-S3 Box (or any “Home Assistant Satellite” compatible device).

Step 1: Install “LocalAI” or “Ollama” Add-on

We recommend Ollama for ease of use in 2026.

  1. Go to Home Assistant Settings -> Add-ons.
  2. Search for “Ollama” and install.
  3. Start the add-on and check the logs to ensure it’s running.
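To confirm the add-on is actually serving requests, you can query Ollama's local HTTP API, which listens on port 11434 by default (the host and port here are Ollama's defaults and may differ in your add-on configuration):

```python
import json
from urllib.request import urlopen

def installed_models(raw_json: str) -> list[str]:
    """Extract model names from Ollama's /api/tags response."""
    payload = json.loads(raw_json)
    return [m["name"] for m in payload.get("models", [])]

if __name__ == "__main__":
    # Query the local Ollama server (default port 11434).
    with urlopen("http://localhost:11434/api/tags") as resp:
        print(installed_models(resp.read().decode()))
```

An empty list just means the server is up but no model has been pulled yet, which is what Step 2 fixes.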

Step 2: Download a Model

You need a quantized model that fits in your RAM.

  • Recommendation: Llama-4-8b-instruct-q4. It’s lightweight but remarkably good at following instructions.
  • In the Ollama add-on configuration, set the model to llama4 so it is pulled automatically.
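As a rule of thumb, a model's weight footprint is roughly parameter count times bytes per weight, plus some headroom for the KV cache and runtime. A quick back-of-the-envelope check (the 20% overhead figure is an assumption, not a measurement):

```python
def est_ram_gb(params_billion: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    """Rough RAM estimate: params * bytes-per-weight, padded for KV cache/runtime."""
    weight_gb = params_billion * (bits_per_weight / 8)
    return round(weight_gb * overhead, 1)

# An 8B model at 4-bit quantization: ~4 GB of weights, ~4.8 GB with overhead,
# so it fits comfortably in 16 GB alongside Home Assistant itself.
print(est_ram_gb(8, 4))
# The same model at full 16-bit precision would need roughly four times as much:
print(est_ram_gb(8, 16))
```

This is why the quantized (q4) build is recommended over the full-precision one on a 16GB machine.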

Step 3: Configure “Assist” Pipeline

  1. Go to Settings -> Voice Assistants.
  2. Create a new Assistant pipeline.
  3. Conversation Agent: Select “Ollama”.
  4. Speech-to-Text (STT): Use Faster-Whisper (runs locally).
  5. Text-to-Speech (TTS): Use Piper (great neural voices, runs locally).

Step 4: System Prompt Engineering

This is the secret sauce: you need to tell the LLM that it controls a home and what it is allowed to do. Example system prompt:

You are a helpful smart home assistant named Jarvis.
You answer briefly over voice.
You have access to the following tools: turn_on, turn_off, set_temperature.
Current time is {{ now() }}.
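Home Assistant renders the prompt as a Jinja-style template, so {{ now() }} is substituted at request time rather than sent literally to the model. The effect can be sketched in a few lines (a simplified stand-in for the real template engine):

```python
from datetime import datetime

SYSTEM_PROMPT = """You are a helpful smart home assistant named Jarvis.
You answer briefly over voice.
You have access to the following tools: turn_on, turn_off, set_temperature.
Current time is {{ now() }}."""

def render_prompt(template: str, now: datetime) -> str:
    """Substitute the time placeholder, as the template engine would."""
    return template.replace("{{ now() }}", now.isoformat(timespec="minutes"))

print(render_prompt(SYSTEM_PROMPT, datetime(2026, 1, 15, 7, 30)))
```

Giving the model the current time is what lets commands like "turn the lights off in an hour" resolve to a concrete timestamp.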

Step 5: Testing

Speak to your ESP32 Box: “Turn off the lights and set the living room to 72 degrees.” The LLM will parse this into two commands and execute them via Home Assistant’s Intent API.
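Under the hood, the LLM returns structured tool calls and Home Assistant executes each one in turn. A minimal sketch of that dispatch loop (the tool-call format and entity names here are illustrative, not Home Assistant's exact wire format):

```python
# Map tool names to handlers. In Home Assistant these would fire intents;
# here they just record what would happen.
executed: list[str] = []

TOOLS = {
    "turn_off": lambda args: executed.append(f"turn_off {args['entity']}"),
    "set_temperature": lambda args: executed.append(
        f"set_temperature {args['entity']} -> {args['temperature']}"),
}

def dispatch(tool_calls: list[dict]) -> list[str]:
    """Run each tool call the model emitted, in order."""
    for call in tool_calls:
        TOOLS[call["name"]](call["args"])
    return executed

# "Turn off the lights and set the living room to 72 degrees"
# might come back from the model as two separate calls:
dispatch([
    {"name": "turn_off", "args": {"entity": "light.living_room"}},
    {"name": "set_temperature", "args": {"entity": "climate.living_room",
                                         "temperature": 72}},
])
print(executed)
```

The point is that one spoken sentence can fan out into several device actions without any cloud round trip.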

Troubleshooting

  • Slow Responses? Your model is too big for your RAM. Try a Phi-4 model or a 4-bit quantization.
  • Hallucinations? Make sure your system prompt strictly lists the available devices.
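One way to keep the model grounded is to generate the device list for the system prompt from your actual entities rather than typing it by hand. A small sketch (the entity names are examples):

```python
def device_section(entities: list[str]) -> str:
    """Build a strict device whitelist to append to the system prompt."""
    lines = ["You may ONLY control these devices:"]
    lines += [f"- {e}" for e in sorted(entities)]
    lines.append("If asked about any other device, say you cannot control it.")
    return "\n".join(lines)

print(device_section(["light.kitchen", "light.living_room", "climate.living_room"]))
```

An explicit whitelist plus a refusal instruction is usually enough to stop the model from inventing devices you don't own.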

The Result

You now have a Star Trek-like computer that controls your house, understands complex context (“I’m going to bed” -> locks doors, turns off lights, lowers blinds), and doesn’t share a single byte of data with Big Tech.