Offline‑First AI SOP: Travel‑Day Stack for Summaries, Tags, and Drafts
A copy‑paste SOP for running summaries, tags, and short drafts fully offline on travel days. Includes model/run profiles, SQLite WAL schema, watchers, a Litestream sync worker, conflict policy, and ‘Travel Day Mode’ scripts for macOS and Windows.
When you’re boarding in 12 minutes, this SOP keeps the critical loop alive: capture audio, transcribe offline, summarize/tag/draft locally on a lightweight model, and queue everything to SQLite for safe sync when you’re back online. Follow this to ship an offline‑first pattern in a weekend and run it reliably on travel days.
Architecture at a glance:
- Local runners: LM Studio, GPT4All Desktop, or Ollama (choose one) for 7–13B instruct models; Whisper.cpp or faster‑whisper for offline STT.
- Queue: SQLite with WAL. One table for work items, one for canonical docs/notes. Triggers for updated_at.
- Watchers: Watchman (or fswatch) turns file drops into queue entries.
- Sync: Litestream to S3‑compatible storage (default). LiteFS is an alternative if you need multi‑node replication later.
- Security: Full‑file DB encryption (SQLCipher or SEE). Keys from OS keychain, not .env.
Run profiles you will pick in Step 1:
- Battery Saver: 7–8B instruct, Q4 quantization, CPU/NPU only, threads ≈ physical cores − 1, context 2–4k. Targets: summaries, tags, short drafts.
- Throughput: 13B instruct with partial GPU offload (if you have it), threads ≈ physical cores, context 4–8k. Use when plugged in or on long rides.
Conservative hardware notes (plan for margin, not the floor):
- 7–8B models: ≥8 GB RAM recommended; practical Q4 footprints ~5–8 GB depending on family/quantization.
- 13B models: ≥16 GB RAM recommended; offload GPU layers if available to keep tokens/sec decent.
- 70B+ models: not for travel day. Keep this in the cloud queue.
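To sanity-check whether a model fits before you download it, weight memory scales with parameter count times bits per weight, plus runtime overhead. A minimal estimator (the overhead figure and effective bits-per-weight are ballpark assumptions, not measurements):

```python
def model_ram_gb(params_billion: float, bits_per_weight: float, overhead_gb: float = 1.5) -> float:
    """Rough resident-memory estimate: weights plus KV-cache/runtime overhead.

    overhead_gb is an assumed ballpark, not a measured value.
    """
    weights_gb = params_billion * 1e9 * (bits_per_weight / 8) / 1e9
    return round(weights_gb + overhead_gb, 1)

# 7B at Q4 (~4.5 effective bits/weight) lands near the ~5 GB floor noted above
print(model_ram_gb(7, 4.5))   # 5.4
print(model_ram_gb(13, 4.5))  # 8.8
```

Numbers this small confirm why 7–8B fits the ≥8 GB recommendation with margin and 13B wants ≥16 GB once context and OS overhead pile on.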
Expected outcome: a portable, GUI‑friendly stack that works fully offline for 2–6 hours, queues everything safely, and resumes syncing/hand‑offs without manual cleanup.
Step 1: Decide your run profile and set guardrails
Pick Battery Saver or Throughput based on the next 3 hours of power and work.
- Battery Saver (default): offline summaries/tags/short drafts; 7–8B Q4, CPU/NPU only; context ≤4k; threads = physical cores − 1.
- Throughput: plugged‑in or power bank; 13B with partial GPU offload; context 4–8k; threads = physical cores.

Guardrails:
- If tokens/sec < 15 on summaries, fall back to Battery Saver.
- If battery < 35% with ≥60 min left, force Battery Saver until charging.

Outcome: A chosen profile with clear triggers to scale down.
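The guardrail triggers above are small enough to encode in a helper, so the decision is mechanical instead of a judgment call at the gate. A sketch (the function name and inputs are ours, not part of the stack):

```python
def pick_profile(plugged_in: bool, tokens_per_sec: float,
                 battery_pct: float, minutes_left: float) -> str:
    """Apply the Step 1 guardrails; Battery Saver is the default."""
    # Throughput is only on the table while on power
    profile = "throughput" if plugged_in else "battery_saver"
    # Guardrail 1: slow summaries mean the bigger model isn't paying for itself
    if tokens_per_sec < 15:
        profile = "battery_saver"
    # Guardrail 2: low battery with >=60 min of work left forces Battery Saver
    if battery_pct < 35 and minutes_left >= 60 and not plugged_in:
        profile = "battery_saver"
    return profile

print(pick_profile(True, 22, 80, 120))   # throughput
print(pick_profile(True, 9, 80, 120))    # battery_saver (guardrail 1)
print(pick_profile(False, 20, 30, 90))   # battery_saver (guardrail 2)
```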
Step 2: Install a local LLM runner with a local API
Choose one runner you’re comfortable operating:
- LM Studio (GUI + OpenAI‑compatible local API)
- GPT4All Desktop (GUI + optional local server)
- Ollama (CLI + Windows GUI; ships an OpenAI‑compatible /v1 endpoint)

Action:
- Start the runner and confirm a local API base URL. Set these in an .env file you’ll create in Step 4:
```
LLM_API_BASE=http://localhost:[PORT]
LLM_MODEL=[your-7-8B-or-13B-instruct]
LLM_CTX=4096
LLM_TEMPERATURE=0.2
LLM_TOP_P=0.9
```

Outcome: A reachable local LLM endpoint with a selected instruct model.
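Before going offline, it's worth one smoke test against the endpoint. The sketch below assumes an OpenAI-compatible `/chat/completions` route (true for LM Studio and Ollama's `/v1` adapter); the helper name is ours:

```python
import json
import os
import urllib.request

def chat_request(prompt: str) -> urllib.request.Request:
    """Build a chat-completion request against the local runner from .env vars."""
    base = os.getenv("LLM_API_BASE", "http://localhost:11434/v1")
    body = {
        "model": os.getenv("LLM_MODEL", "local-7b-instruct-q4"),
        "messages": [{"role": "user", "content": prompt}],
        "temperature": float(os.getenv("LLM_TEMPERATURE", "0.2")),
        "max_tokens": 32,
    }
    return urllib.request.Request(
        f"{base}/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )

req = chat_request("Reply with the single word: ready")
# With the runner up:
#   resp = urllib.request.urlopen(req, timeout=30)
#   text = json.load(resp)["choices"][0]["message"]["content"]
print(req.full_url)
```

If the call times out or 404s, fix the base URL now rather than mid-flight.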
Step 3: Install offline STT
Pick one:
- Whisper.cpp (portable binary) for zero‑Python setups.
- faster‑whisper (Python) for GPU acceleration when available.

Create an audio inbox folder (Step 4) and test:
```bash
# whisper.cpp example
./main -m ./models/ggml-medium.en.bin -f ./inbox/audio/test.m4a -otxt -of ./inbox/audio/test

# faster-whisper (Python) example
pip install faster-whisper
python - <<'PY'
from faster_whisper import WhisperModel
m = WhisperModel("medium.en", compute_type="int8")
segments, info = m.transcribe("./inbox/audio/test.m4a")
open("./inbox/audio/test.txt", "w").write("\n".join(s.text for s in segments))
print("ok")
PY
```

Outcome: You can drop audio in inbox/audio and get a local transcript .txt beside it.
Step 4: Create project folders and environment
Make a simple, portable layout:
```
~/travel-day-stack/
  .env                 # LLM vars + DB path
  db/                  # SQLite lives here (encrypted)
  inbox/
    audio/             # raw audio drops
    notes/             # .md or .txt drops
  out/
    summaries/
    tags/
    drafts/
  bin/
    enqueue.sh
    worker.py
    travel_day_on.sh
    travel_day_off.sh
    TravelDayOn.ps1
    TravelDayOff.ps1
  sync/
    litestream.yml
    docker-compose.yml
```

Example .env (don’t store secrets here in production):
```
DB_PATH=./db/ops.sqlite3
LLM_API_BASE=http://localhost:11434/v1   # set to your runner’s OpenAI-compatible endpoint
LLM_MODEL=local-7b-instruct-q4           # replace with your actual model id/name
LLM_CTX=4096
LLM_TEMPERATURE=0.2
LLM_TOP_P=0.9
SYNC_BUCKET=s3://your-bucket/sqlite
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
```

Outcome: A clean workspace and environment configuration.
Step 5: Create SQLite with WAL and the queue schema
Initialize the database with safe defaults and two tables: docs (canonical) and queue (work items).
```bash
sqlite3 $DB_PATH <<'SQL'
PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;
PRAGMA busy_timeout=5000;
PRAGMA foreign_keys=ON;

CREATE TABLE IF NOT EXISTS docs (
  doc_id     TEXT PRIMARY KEY,
  title      TEXT,
  src_path   TEXT,
  text       TEXT,
  summary    TEXT,
  tags_json  TEXT,                 -- JSON array of strings
  model      TEXT,
  hash       TEXT,
  origin     TEXT DEFAULT 'local', -- local|cloud
  updated_at INTEGER NOT NULL DEFAULT (strftime('%s','now')),
  created_at INTEGER NOT NULL DEFAULT (strftime('%s','now'))
);

CREATE TABLE IF NOT EXISTS queue (
  id              INTEGER PRIMARY KEY,
  kind            TEXT NOT NULL CHECK (kind IN ('stt','summarize','tag','draft')),
  doc_id          TEXT,
  src_path        TEXT,
  payload_json    TEXT,
  priority        INTEGER NOT NULL DEFAULT 5,
  status          TEXT NOT NULL DEFAULT 'enqueued'
                  CHECK (status IN ('enqueued','processing','done','failed')),
  attempts        INTEGER NOT NULL DEFAULT 0,
  result_json     TEXT,
  error           TEXT,
  idempotency_key TEXT UNIQUE,
  updated_at      INTEGER NOT NULL DEFAULT (strftime('%s','now')),
  created_at      INTEGER NOT NULL DEFAULT (strftime('%s','now')),
  FOREIGN KEY(doc_id) REFERENCES docs(doc_id)
);

CREATE INDEX IF NOT EXISTS idx_queue_status ON queue(status, priority, created_at);

CREATE TRIGGER IF NOT EXISTS trg_docs_updated AFTER UPDATE ON docs BEGIN
  UPDATE docs SET updated_at=strftime('%s','now') WHERE doc_id=OLD.doc_id;
END;

CREATE TRIGGER IF NOT EXISTS trg_queue_updated AFTER UPDATE ON queue BEGIN
  UPDATE queue SET updated_at=strftime('%s','now') WHERE id=OLD.id;
END;
SQL
```

Outcome: A WAL‑backed queue ready for offline work.
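To see the idempotency guarantee in action, the snippet below loads a stripped-down copy of the queue table into an in-memory database and enqueues the same work item twice; the UNIQUE constraint makes the second `INSERT OR IGNORE` a no-op:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE queue (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,
    doc_id TEXT,
    idempotency_key TEXT UNIQUE
)""")

# Same file hash -> same idempotency_key -> only one row survives
for _ in range(2):
    con.execute(
        "INSERT OR IGNORE INTO queue(kind,doc_id,idempotency_key) VALUES(?,?,?)",
        ("summarize", "meeting-notes", "summarize:abc123"))

count = con.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
print(count)  # 1
```

This is what lets watchers fire on duplicate filesystem events without double-processing.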
Step 6: Add a cross‑platform enqueue helper
Turn file drops into queue items with a deterministic id and idempotency key.

bin/enqueue.sh:
```bash
#!/usr/bin/env bash
set -euo pipefail
DB="${DB_PATH:-./db/ops.sqlite3}"
PATH_IN="$1"   # /full/path/to/file
KIND="$2"      # stt|summarize|tag|draft
HASH=$(shasum -a 256 "$PATH_IN" | awk '{print $1}')
DOC_ID=$(basename "$PATH_IN" | sed 's/\.[^.]*$//')
TITLE="$DOC_ID"
IDEMP="$KIND:$HASH"

# Insert/ensure doc row
sqlite3 "$DB" "INSERT OR IGNORE INTO docs(doc_id,title,src_path,hash) VALUES('$DOC_ID','$TITLE','$PATH_IN','$HASH');"

# Enqueue work
sqlite3 "$DB" "INSERT OR IGNORE INTO queue(kind,doc_id,src_path,payload_json,priority,idempotency_key) VALUES('$KIND','$DOC_ID','$PATH_IN','{}',5,'$IDEMP');"

echo "enqueued $KIND for $DOC_ID"
```

Windows PowerShell variant (bin/Enqueue.ps1):
```powershell
param([string]$PathIn,[string]$Kind)
$DB=$env:DB_PATH
$hash=(Get-FileHash -Algorithm SHA256 $PathIn).Hash.ToLower()
$docId=[IO.Path]::GetFileNameWithoutExtension($PathIn)
# ${Kind} braces are required: "$Kind:$hash" would parse the colon as a scope qualifier
$idemp="${Kind}:$hash"
sqlite3 $DB "INSERT OR IGNORE INTO docs(doc_id,title,src_path,hash) VALUES('$docId','$docId','$PathIn','$hash');"
sqlite3 $DB "INSERT OR IGNORE INTO queue(kind,doc_id,src_path,payload_json,priority,idempotency_key) VALUES('$Kind','$docId','$PathIn','{}',5,'$idemp');"
Write-Output "enqueued $Kind for $docId"
```

Outcome: One command to add work reliably without duplicates.
Step 7: Wire up file watchers for audio and notes
Use Watchman (macOS/Linux/Windows) or fswatch (macOS/Linux). Examples:

Watchman config (watchman.json):
```json
{
  "inbox_audio": {
    "root": "./inbox/audio",
    "pattern": "**/*.(m4a|mp3|wav)",
    "command": ["bash","-lc","./bin/enqueue.sh ${WATCHMAN_MATCH} stt"]
  },
  "inbox_notes": {
    "root": "./inbox/notes",
    "pattern": "**/*.(md|txt)",
    "command": ["bash","-lc","./bin/enqueue.sh ${WATCHMAN_MATCH} summarize && ./bin/enqueue.sh ${WATCHMAN_MATCH} tag"]
  }
}
```

fswatch example (macOS/Linux):
```bash
fswatch -0 ./inbox/audio | xargs -0 -n1 -I{} bash -lc './bin/enqueue.sh "{}" stt'
fswatch -0 ./inbox/notes | xargs -0 -n1 -I{} bash -lc './bin/enqueue.sh "{}" summarize && ./bin/enqueue.sh "{}" tag'
```

Outcome: Dropping a file automatically creates queue items.
Step 8: Run the offline worker to drain the queue
A minimal worker loops: claim → do work → write results → ack. Python example (bin/worker.py):
```python
#!/usr/bin/env python3
import json, os, sqlite3, subprocess, time, requests

DB = os.getenv('DB_PATH', './db/ops.sqlite3')
API = os.getenv('LLM_API_BASE')
MODEL = os.getenv('LLM_MODEL')
HEADERS = {'Content-Type': 'application/json'}

PROMPT_SUMMARY = lambda text: f"Summarize in 5 bullets. Keep it factual.\n\nText:\n{text[:12000]}"
PROMPT_TAGS = lambda text: f"Return 5-8 comma-separated tags capturing topics, entities, and next actions.\n\nText:\n{text[:8000]}"
PROMPT_DRAFT = lambda text: f"Draft a 120-200 word follow-up email with a clear CTA based on this note.\n\nNote:\n{text[:8000]}"

def chat(prompt):
    body = {"model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": float(os.getenv('LLM_TEMPERATURE', '0.2')),
            "top_p": float(os.getenv('LLM_TOP_P', '0.9')),
            "max_tokens": 500,
            "stream": False}
    r = requests.post(f"{API}/chat/completions", headers=HEADERS, data=json.dumps(body), timeout=120)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"].strip()

con = sqlite3.connect(DB)
con.row_factory = sqlite3.Row
con.execute('PRAGMA busy_timeout=5000')

while True:
    cur = con.execute("SELECT id,kind,doc_id,src_path FROM queue "
                      "WHERE status='enqueued' ORDER BY priority,created_at LIMIT 1")
    row = cur.fetchone()
    if not row:
        time.sleep(1)
        continue
    qid = row['id']
    con.execute("UPDATE queue SET status='processing', attempts=attempts+1 WHERE id=?", (qid,))
    con.commit()
    try:
        if row['kind'] == 'stt':
            # Prefer whisper.cpp if present
            out_txt = f"{row['src_path']}.txt"
            if os.path.exists('./main'):
                subprocess.run(['./main', '-m', './models/ggml-medium.en.bin',
                                '-f', row['src_path'], '-otxt', '-of', row['src_path']], check=True)
            else:
                # fallback: faster-whisper via python callable script or skip
                raise RuntimeError('No STT binary found')
            text = open(out_txt).read()
            con.execute("UPDATE docs SET text=? WHERE doc_id=?", (text, row['doc_id']))
            # chain: enqueue summarize+tag
            for k in ('summarize', 'tag'):
                con.execute("INSERT OR IGNORE INTO queue(kind,doc_id,src_path,payload_json,priority,idempotency_key) "
                            "VALUES(?,?,?,?,?,?)",
                            (k, row['doc_id'], out_txt, '{}', 5, f"{k}:{row['doc_id']}:chain"))
        elif row['kind'] in ('summarize', 'tag', 'draft'):
            text = con.execute("SELECT text FROM docs WHERE doc_id=?", (row['doc_id'],)).fetchone()[0]
            if not text:
                raise RuntimeError('No text for doc')
            prompt = (PROMPT_SUMMARY(text) if row['kind'] == 'summarize'
                      else PROMPT_TAGS(text) if row['kind'] == 'tag'
                      else PROMPT_DRAFT(text))
            out = chat(prompt)
            if row['kind'] == 'summarize':
                con.execute("UPDATE docs SET summary=?, model=? WHERE doc_id=?", (out, MODEL, row['doc_id']))
            elif row['kind'] == 'tag':
                tags = [t.strip() for t in out.replace('\n', ' ').split(',') if t.strip()]
                con.execute("UPDATE docs SET tags_json=?, model=? WHERE doc_id=?",
                            (json.dumps(tags), MODEL, row['doc_id']))
            else:
                open(f"out/drafts/{row['doc_id']}.md", "w").write(out)
        con.execute("UPDATE queue SET status='done', result_json=? WHERE id=?",
                    (json.dumps({"ok": True}), qid))
        con.commit()
    except Exception as e:
        con.execute("UPDATE queue SET status='failed', error=? WHERE id=?", (str(e), qid))
        con.commit()
```

Run it:
```bash
python3 ./bin/worker.py
```

Outcome: Queue drains offline; outputs land in docs + out/ folders.
Step 9: Set up Litestream sync to S3‑compatible storage
Use Litestream as the default way to ship WAL changes to S3 when online. Create sync/litestream.yml:
```yaml
dbs:
  - path: /data/ops.sqlite3
    replicas:
      - url: ${SYNC_BUCKET}
        snapshot-interval: 1h
        retention: 72h
```

Create sync/docker-compose.yml:
```yaml
version: "3.8"
services:
  litestream:
    image: litestream/litestream:0.3
    environment:
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - AWS_REGION=${AWS_REGION}
    volumes:
      - ../db:/data
      - ./litestream.yml:/etc/litestream.yml:ro
    command: ["replicate","-config","/etc/litestream.yml"]
    restart: unless-stopped
```

Start it when you have connectivity; it will idle gracefully offline:
```bash
(cd sync && docker compose up -d)
```

Alternative (advanced): LiteFS for multi‑node replication. Use only if you need live reads/writes across nodes; it requires FUSE privileges in Docker and a lease/primary strategy.

Outcome: DB changes stream to object storage without you thinking about it.
Step 10: Define your sync conflict policy
Conflicts are inevitable. Keep it boring and deterministic:
- Identity: docs.doc_id is the canonical key. Include docs.hash of the source to detect content changes.
- Writes: docs is last‑writer‑wins by updated_at IF origin differs. Local edits set origin='local'. Cloud transforms set origin='cloud'.
- Merges: tags_json merges as a set union; duplicates removed case‑insensitively.
- Idempotency: queue.idempotency_key = kind:hash (or kind:doc_id:chain for chained items) prevents duplicate work.
- Failures: queue.status='failed' items are retried while attempts ≤ 3, with exponential backoff; after that they are parked for manual review.
- Manual override: a one‑liner tool can set origin and bump updated_at when you must force a version:
```bash
# The sqlite3 CLI cannot bind '?' parameters; substitute the bracketed placeholders
sqlite3 $DB_PATH "UPDATE docs SET summary='[NEW_SUMMARY]', origin='local', updated_at=strftime('%s','now') WHERE doc_id='[DOC_ID]';"
```

Outcome: Everyone knows what wins and how to resolve the rare tie.
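The merge rules above are small enough to sketch in a few lines. `resolve` below implements last-writer-wins on updated_at, with the case-insensitive set union for tags (function names and record shapes are ours, for illustration):

```python
import json

def merge_tags(local_json: str, cloud_json: str) -> list:
    """Set union, deduplicated case-insensitively; keeps first-seen casing."""
    seen, merged = set(), []
    for tag in json.loads(local_json) + json.loads(cloud_json):
        if tag.lower() not in seen:
            seen.add(tag.lower())
            merged.append(tag)
    return merged

def resolve(local: dict, cloud: dict) -> dict:
    """Last-writer-wins by updated_at; tags always merge rather than clobber."""
    winner = local if local["updated_at"] >= cloud["updated_at"] else cloud
    doc = dict(winner)
    doc["tags_json"] = json.dumps(merge_tags(local["tags_json"], cloud["tags_json"]))
    return doc

local = {"summary": "local take", "origin": "local", "updated_at": 1700000100,
         "tags_json": '["Travel", "ai"]'}
cloud = {"summary": "cloud take", "origin": "cloud", "updated_at": 1700000050,
         "tags_json": '["AI", "queue"]'}
doc = resolve(local, cloud)
print(doc["summary"])                # local take (newer updated_at wins)
print(json.loads(doc["tags_json"]))  # ['Travel', 'ai', 'queue']
```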
Step 11: Add ‘Travel Day Mode’ scripts (macOS)
Toggle low‑power settings and start only what you need. bin/travel_day_on.sh:
```bash
#!/usr/bin/env bash
set -euo pipefail
source ./.env
pmset -a lowpowermode 1 || true
export LLM_TEMPERATURE=0.1
export LLM_CTX=${LLM_CTX:-4096}
# Start runner GUI/API yourself (LM Studio/GPT4All/Ollama)
# Start watchers and worker
(nohup fswatch -0 ./inbox/audio | xargs -0 -n1 -I{} bash -lc './bin/enqueue.sh "{}" stt' >/tmp/audio.watch.log 2>&1 &)
(nohup fswatch -0 ./inbox/notes | xargs -0 -n1 -I{} bash -lc './bin/enqueue.sh "{}" summarize && ./bin/enqueue.sh "{}" tag' >/tmp/notes.watch.log 2>&1 &)
(nohup python3 ./bin/worker.py >/tmp/worker.log 2>&1 &)
echo "Travel Day Mode: ON"
```

bin/travel_day_off.sh:
```bash
#!/usr/bin/env bash
pkill -f worker.py || true
pkill -f fswatch || true
pmset -a lowpowermode 0 || true
echo "Travel Day Mode: OFF"
```

Outcome: One command pre‑flight; your laptop runs lean and keeps producing.
Step 12: Add ‘Travel Day Mode’ scripts (Windows)
Switch to a power‑saving plan and start watchers/worker. bin/TravelDayOn.ps1:
```powershell
$env:LLM_TEMPERATURE="0.1"
# Choose a power plan GUID beforehand: powercfg /L
# Example: set to Power saver if available
# powercfg /S SCHEME_MAX   # replace with your saver GUID
Start-Process powershell -ArgumentList "-NoProfile -Command python .\bin\worker.py" -WindowStyle Minimized
# Use PowerShell FileSystemWatcher for notes
$fw1=New-Object IO.FileSystemWatcher (Resolve-Path .\inbox\notes), '*.*'; $fw1.EnableRaisingEvents=$true
Register-ObjectEvent $fw1 Created -Action { & .\bin\Enqueue.ps1 $Event.SourceEventArgs.FullPath 'summarize'; & .\bin\Enqueue.ps1 $Event.SourceEventArgs.FullPath 'tag' } | Out-Null
$fw2=New-Object IO.FileSystemWatcher (Resolve-Path .\inbox\audio), '*.*'; $fw2.EnableRaisingEvents=$true
Register-ObjectEvent $fw2 Created -Action { & .\bin\Enqueue.ps1 $Event.SourceEventArgs.FullPath 'stt' } | Out-Null
Write-Output "Travel Day Mode: ON"
```

bin/TravelDayOff.ps1:
```powershell
# Stop the worker by matching its command line ($_.Path only shows python.exe, not the script)
Get-CimInstance Win32_Process -Filter "Name LIKE 'python%'" |
  Where-Object { $_.CommandLine -like '*worker.py*' } |
  ForEach-Object { Stop-Process -Id $_.ProcessId -Force }
# Unregister all watcher events
Get-EventSubscriber | Unregister-Event
Write-Output "Travel Day Mode: OFF"
```

Outcome: Windows laptops behave just as well offline.
Step 13: Encrypt the database at rest
Prefer SQLCipher (open‑source) or SEE (commercial). Example with SQLCipher:
- Create encrypted DB and migrate:
```bash
# new encrypted db
sqlcipher ./db/ops.enc <<'SQL'
PRAGMA key='file:./dbkey?cipher=chacha20&kdf_iter=256000';
ATTACH DATABASE './db/ops.sqlite3' AS plaintext KEY '';
SELECT sqlcipher_export('main','plaintext');
DETACH DATABASE plaintext;
SQL
```

- Store the key reference in the OS keychain (recommended) and inject at runtime, not in .env.
- Update your scripts to use ops.enc and `sqlcipher` instead of `sqlite3`.

Outcome: A stolen laptop leaks nothing from your queue.
Step 14: Run a 2‑hour offline drill
Practice it before you need it.
- Disconnect all networks. Start Travel Day Mode.
- Drop: one 5–10 min audio, two .md notes.
- Observe: queue growth, worker throughput (tokens/sec in runner logs), CPU temps, battery drain.
- After 2 hours, reconnect and start Litestream. Confirm:
- queue has 0 enqueued/processing; failed≤1 with clear error.
- docs rows updated; out/summaries, out/drafts populated.
- latest DB snapshot exists in S3.

Outcome: Confidence the loop survives real outages.
Step 15: Operational checks and rollback
Add boring checks to avoid surprises:
- On boot: verify WAL is active: `PRAGMA journal_mode;` should return `wal`.
- Hourly: run `sqlite3 $DB_PATH "SELECT status, COUNT(*) FROM queue GROUP BY 1;"` and alert if processing stalls > 10 min.
- Corruption plan: stop writes, restore from the Litestream snapshot, then re‑enqueue from src files by hashing and inserting missing items only.
- Disk guard: keep 2–5 GB free; Litestream snapshots need headroom.

Outcome: Clear defaults when something goes sideways.
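The hourly stall check can be scripted against the same schema. This sketch (function name is ours) flags anything stuck in processing past the 10-minute threshold, demonstrated on an in-memory table with synthetic timestamps:

```python
import sqlite3
import time

STALL_SECONDS = 600  # the 10-minute threshold from the checklist

def stalled_items(con: sqlite3.Connection, now: int) -> list:
    """Return (id, kind, age_seconds) for processing rows older than the threshold."""
    return con.execute(
        "SELECT id, kind, ? - updated_at AS age FROM queue "
        "WHERE status='processing' AND ? - updated_at > ?",
        (now, now, STALL_SECONDS)).fetchall()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE queue (id INTEGER PRIMARY KEY, kind TEXT, status TEXT, updated_at INTEGER)")
now = int(time.time())
con.execute("INSERT INTO queue(kind,status,updated_at) VALUES('stt','processing',?)", (now - 900,))
con.execute("INSERT INTO queue(kind,status,updated_at) VALUES('tag','processing',?)", (now - 60,))

print(stalled_items(con, now))  # only the 15-minute-old stt item
```

Pointing the same query at $DB_PATH from cron or Task Scheduler gives you the alert with no extra infrastructure.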
Step 16: Escalate heavy work to cloud after reconnect
Keep travel‑day work light. When back online:
- Automatically enqueue cloud tasks for anything that exceeded thresholds offline (e.g., context > 6k, slow tokens/sec):
```bash
sqlite3 $DB_PATH "INSERT OR IGNORE INTO queue(kind,doc_id,payload_json,priority,idempotency_key) \
  SELECT 'draft', doc_id, json_object('target','cloud','reason','long_context'), 7, 'cloud-draft:'||doc_id \
  FROM docs WHERE length(text) > 6000;"
```

- Your cloud worker (outside this SOP) reads from the same queue table replicated via Litestream and writes results back.

Outcome: The laptop does the first 80%; the cloud finishes the rest when you land.
Step 17: Throughput tuning (safe defaults)
Only touch these if you must, and reset after travel day.
- Threads: physical cores − 1 (Battery Saver) or = cores (Throughput).
- Context: 2–4k for summaries/tags; don’t waste RAM on 16k unless required.
- Quantization: Q4 for 7–8B; Q5 buys modest quality at higher power; avoid FP16 on battery.
- GPU offload: only when plugged in or with a large power bank; cap layers to keep temps sane.

Outcome: Predictable perf without melting your battery.
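The tuning defaults above reduce to a small lookup. The sketch below assumes `os.cpu_count()` reports logical cores and halves it as a rough physical-core count on SMT machines (a heuristic, not a guarantee):

```python
import os

def tuning(profile: str, physical_cores: int) -> dict:
    """Map this step's safe defaults to concrete runner settings."""
    if profile == "battery_saver":
        return {"threads": max(1, physical_cores - 1), "ctx": 4096,
                "quant": "Q4", "gpu_layers": 0}
    # Throughput: all cores, larger context, partial GPU offload when plugged in
    return {"threads": physical_cores, "ctx": 8192,
            "quant": "Q4", "gpu_layers": "partial"}

physical = max(1, (os.cpu_count() or 2) // 2)  # heuristic: logical cores / 2
print(tuning("battery_saver", 8))
```

Reset to your runner's defaults after travel day so desk sessions get full context and threads back.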