Set Up Your Local AI

Scribely works with a local language model for private, free AI assistance. Pick a provider, download a model, and you're ready to go.

1. Pick a local AI provider

Both options are free and run entirely on your Mac. Your data never leaves your machine. Pick whichever you prefer — they both work great with Scribely.

Ollama

Command-line tool that makes running local models dead simple. Download, run one command, done.

Download Ollama

How to get started

1. Install the app

Download from ollama.com, open the .dmg, and drag Ollama to Applications. Launch it once — it runs in the menu bar.

2. Pull a model

Open Terminal and run: ollama pull qwen3:1.7b. The model (~1.1 GB) downloads automatically.

3. That's it

Ollama runs a local server on port 11434. Scribely detects it automatically — no configuration needed.
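If you want to double-check that everything is in place before opening Scribely, you can do so from Terminal. This is an optional sanity check and assumes Ollama is running on its default port, 11434:

```shell
# List the models Ollama has installed locally
ollama list

# Or ask the local server directly for its installed models
curl http://localhost:11434/api/tags
```

If the model you pulled appears in the output, Scribely will be able to see it too.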

LM Studio

Beautiful desktop app with a visual interface for browsing, downloading, and running local models.

Download LM Studio

How to get started

1. Install the app

Download from lmstudio.ai, open the .dmg, and drag LM Studio to Applications.

2. Download a model

Open LM Studio, go to the Discover tab (magnifying glass icon), search for "Qwen3 1.7B Instruct", and click Download.

3. Start the server

Go to the Developer tab (</> icon) and click "Start Server". Scribely connects automatically.
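To confirm the server is up, you can query it from Terminal. LM Studio exposes an OpenAI-compatible API, by default on port 1234 (the Developer tab shows the actual address if you've changed it):

```shell
# List the models LM Studio's local server is currently serving
curl http://localhost:1234/v1/models
```

A JSON response listing your downloaded model means the server is ready.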

2. Choose a model

For real-time meeting assistance, you want a small, fast model. Anything under 3B parameters runs smoothly alongside your meeting without hogging resources. We recommend models in the 1–2B range for the best speed-to-quality ratio.

Why small models? Scribely runs the model continuously during meetings. A 1.5B model responds in under a second on Apple Silicon while using minimal RAM. Larger models (7B+) are smarter but can cause lag and drain your battery.
Recommended

Qwen 3 1.7B

~1.1 GB

Best balance of speed and quality at this size. Excellent for meeting Q&A and summarization.

Ollama

ollama pull qwen3:1.7b

LM Studio

Search for Qwen3-1.7B-Instruct in the Discover tab

Lightweight

Gemma 3 1B

~0.8 GB

Google's compact model. Fastest option, great if you want minimal resource usage.

Ollama

ollama pull gemma3:1b

LM Studio

Search for gemma-3-1b-it in the Discover tab

Higher quality

Llama 3.2 3B

~2.0 GB

Slightly larger but noticeably smarter. Good if your Mac has 16 GB+ RAM.

Ollama

ollama pull llama3.2:3b

LM Studio

Search for Llama-3.2-3B-Instruct in the Discover tab
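Before connecting Scribely, you can give whichever model you chose a quick smoke test. With Ollama, for example, run it directly from Terminal (substitute the tag of the model you actually pulled); in LM Studio, you can do the same from its built-in Chat tab:

```shell
# One-off prompt to verify the model loads and responds
ollama run qwen3:1.7b "Summarize in one sentence: local models keep your data on-device."
```

If you get a coherent sentence back within a second or two, the model is ready for real-time use.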

3. Connect to Scribely

Open Scribely and go to Settings → LLM

Select "Ollama" or "LM Studio" as your provider. Scribely auto-detects the local server.

Pick your downloaded model

The model you pulled or downloaded will appear in the model dropdown. Select it.

You're all set

Start a meeting and ask questions — everything runs locally, privately, and for free.

Don't have Scribely yet?

Download Scribely