llm-observability-stack is an umbrella Helm chart for a local, single-node, GPU-capable k3s workstation. It packages a practical LLM demo environment around Ollama, Open WebUI, a LangChain-based proxy/demo API, LangSmith tracing, and an optional in-cluster Python toolbox for diagnostics.
GitHub repository: https://github.com/waqasm86/llm-observability-stack
Typical use cases:
- Local k3s + NVIDIA workstation experiments
- GGUF-based Ollama model serving
- Open WebUI chat demos on a single node
- LangSmith-traced proxy traffic for observability walkthroughs
- Kubernetes networking, notebook, and troubleshooting exercises
What the chart bundles:
- Vendored `ollama` Helm chart in `charts/ollama`
- Vendored `open-webui` Helm chart in `charts/open-webui`
- A local FastAPI-based LangChain demo/proxy in `langchain-demo`
- An in-cluster Python toolbox image in python-toolbox
- Optional Redis resources for Open WebUI websocket/state flows
- Optional LangSmith dashboard seeder CronJob
- Optional etcd simulation resources for troubleshooting drills
- Custom root templates in templates that glue the stack together
Primary request path:
- Browser -> `open-webui` (LoadBalancer on local k3s)
- `open-webui` -> `langchain-demo` proxy (`/ollama/api/*`)
- `langchain-demo` -> `ollama`
- `langchain-demo` -> LangSmith API when tracing is configured
- `python-toolbox` -> in-cluster DNS/service checks and optional LangSmith support scripts
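In-cluster, those hops resolve through standard Kubernetes service DNS. A minimal sketch of how `python-toolbox`-style checks can build the internal URLs (`service_url` is a hypothetical helper; the service names and ports mirror this chart's defaults):

```python
# Build in-cluster DNS names for the stack's services, following the
# standard Kubernetes convention <service>.<namespace>.svc.<cluster-domain>.
# Service names and ports here mirror this chart's defaults.

def service_url(service: str, namespace: str, port: int,
                cluster_domain: str = "cluster.local") -> str:
    return f"http://{service}.{namespace}.svc.{cluster_domain}:{port}"

NAMESPACE = "llm-observability"

OLLAMA_URL = service_url("ollama", NAMESPACE, 11434)
LANGCHAIN_DEMO_URL = service_url("langchain-demo", NAMESPACE, 8000)

print(OLLAMA_URL)  # http://ollama.llm-observability.svc.cluster.local:11434
```

Inside the cluster (for example from the `python-toolbox` pod) these URLs are directly reachable; from the host they require `kubectl port-forward`.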
Typical local exposure strategy:
- `open-webui` is exposed directly for browser use
- `ollama` and `langchain-demo` stay ClusterIP
- host access to internal APIs is done with `kubectl port-forward`
- the local example profile keeps `pythonToolbox.enabled: true`
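In values terms, that exposure split looks roughly like this (key names are illustrative, not the chart's authoritative schema; check `values.yaml` and docs/CONFIG-PROFILES.md for the real structure):

```yaml
# Illustrative sketch only -- the real key names live in values.yaml.
open-webui:
  service:
    type: LoadBalancer    # exposed directly for the browser
ollama:
  service:
    type: ClusterIP       # internal; use kubectl port-forward from the host
langchain-demo:
  service:
    type: ClusterIP
pythonToolbox:
  enabled: true           # the local example profile keeps this on
```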
```
llm-observability-stack/
├── Chart.yaml
├── values.yaml
├── values.local-k3s.example.yaml
├── templates/                   # root chart manifests
├── charts/                      # vendored dependency charts
├── langchain-demo/              # local image source for proxy/demo API
├── python-toolbox/              # local image source for in-cluster helpers
├── jupyter-notebooks/           # notebook walkthroughs and diagnostics
├── docs/                        # extended project documentation
├── hack/                        # local image build/import helpers
└── tests/                       # smoke tests and chart checks
```
Core documentation:
- docs/README.md
- docs/QUICKSTART.md
- docs/CONFIG-PROFILES.md
- docs/ARCHITECTURE.md
- docs/OPERATIONS-RUNBOOK.md
- docs/NOTEBOOKS-GUIDE.md
- docs/PROJECT-DOCUMENTATION.md
- docs/KUBERNETES-NETWORKING.md
- docs/KUBECTL-COMMAND-REFERENCE.md
- docs/PYTHON-KUBERNETES-AUTOMATION.md
- docs/GITHUB-PUBLISHING.md
Component documentation:
- langchain-demo/README.md
- python-toolbox/README.md
- hack/README.md
- jupyter-notebooks/README.md
- jupyter-notebooks/llm-observability-in-action/README.md
Prerequisites:
- k3s cluster reachable from `kubectl`
- NVIDIA runtime configured on the node
- NVIDIA device plugin already installed in the cluster
- Helm 3
- Docker or `nerdctl`
- A local GGUF model file on host storage
- Python 3.11 for notebook workflows
Recommended quick checks:
```
kubectl get nodes -o wide
kubectl get pods -n nvidia-device-plugin
helm version
```

Setup steps:
- Copy the local values template:

```
cp values.local-k3s.example.yaml values.local-k3s.yaml
```

- Edit `values.local-k3s.yaml` and set:
  - the GGUF host path values for Ollama
  - LangSmith credentials or existing secret references
  - Open WebUI secret handling
  - any local service exposure overrides you want to keep
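A hypothetical shape for those edits (every key name here is illustrative; copy the real structure from `values.local-k3s.example.yaml`):

```yaml
# Illustrative sketch only -- mirror the keys in values.local-k3s.example.yaml.
ollama:
  modelHostPath: /srv/models/your-model.Q4_K_M.gguf   # GGUF file on the host
langsmith:
  apiKey: ""           # or reference an existing Secret instead of inlining
  project: "llm-observability-stack"
open-webui:
  secretKey: ""        # Open WebUI secret handling
```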
Profile guidance:
- generic defaults and local-example differences: docs/CONFIG-PROFILES.md
- sanitized local example: values.local-k3s.example.yaml
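The LangSmith credentials ultimately surface as environment variables in the traced process. A minimal sketch of what enabling tracing amounts to, using the standard LangChain/LangSmith variable names (how this chart actually injects them is defined in its templates):

```python
import os

# Standard LangChain/LangSmith tracing environment variables.
# The chart's template wiring is authoritative; this only shows
# the conventional names a traced LangChain process reads.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "changeme"   # placeholder -- never commit real keys
os.environ["LANGCHAIN_PROJECT"] = "llm-observability-stack"
```

With these set, LangChain runs in that process are reported to the named LangSmith project.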
- Build/import local images:
```
./hack/build-local-image.sh langchain-demo 0.1.1 ./langchain-demo
./hack/import-local-image-to-k3s.sh langchain-demo 0.1.1
./hack/build-local-image.sh python-toolbox 0.2.0 ./python-toolbox
./hack/import-local-image-to-k3s.sh python-toolbox 0.2.0
```

- Install or upgrade:

```
helm dependency build .
helm upgrade --install llm-observability-stack . \
  -n llm-observability --create-namespace \
  -f values.local-k3s.yaml
```

- Verify:

```
kubectl get all -n llm-observability
kubectl get svc -n llm-observability
```

Browser access:
- Open WebUI: http://localhost:8080/
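A scriptable way to confirm that local endpoints answer, usable for the UI and for any port-forwarded internal API; `probe` is a hypothetical helper, not part of the chart:

```python
import urllib.request
import urllib.error

def probe(url: str, timeout: float = 3.0, opener=urllib.request.urlopen) -> bool:
    """Return True if the endpoint answered at all; any HTTP status counts."""
    try:
        with opener(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server responded, just not with 2xx
    except OSError:
        return False  # refused, unreachable, or timed out

# After install (and port-forwards for internal services), for example:
# probe("http://localhost:8080/")   # Open WebUI
# probe("http://localhost:8000/")   # langchain-demo via port-forward
```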
Port-forward internal APIs when needed:
```
kubectl port-forward -n llm-observability svc/ollama 11434:11434
kubectl port-forward -n llm-observability svc/langchain-demo 8000:8000
```

Notebook launch:

```
PYTHON_BIN="${PYTHON_BIN:-python3.11}"
cd jupyter-notebooks
"${PYTHON_BIN}" -m jupyter lab
```

Rebuild and roll out `langchain-demo` after local changes:

```
./hack/build-local-image.sh langchain-demo 0.1.1 ./langchain-demo
./hack/import-local-image-to-k3s.sh langchain-demo 0.1.1
kubectl rollout restart deploy/langchain-demo -n llm-observability
```

Rebuild and roll out `python-toolbox`:

```
./hack/build-local-image.sh python-toolbox 0.2.0 ./python-toolbox
./hack/import-local-image-to-k3s.sh python-toolbox 0.2.0
kubectl rollout restart deploy/python-toolbox -n llm-observability
```

Disable the Python toolbox if you do not need it:

```
helm upgrade --install llm-observability-stack . \
  -n llm-observability --create-namespace \
  -f values.local-k3s.yaml \
  --set pythonToolbox.enabled=false
```

Render manifests for inspection:

```
helm template llm-observability-stack . > /tmp/rendered-default.yaml
helm template llm-observability-stack . -f values.local-k3s.example.yaml > /tmp/rendered-example.yaml
```

Recommended local validation:
```
helm lint .
helm template llm-observability-stack . > /tmp/rendered-default.yaml
helm template llm-observability-stack . -f values.local-k3s.example.yaml > /tmp/rendered-local.yaml
pytest -q tests
```

Troubleshooting:
- If `open-webui` is reachable but internal API notebooks fail, check port-forwards for `ollama` and `langchain-demo`.
- If `langchain-demo` is unhealthy, inspect:
```
kubectl logs -n llm-observability deploy/langchain-demo --tail=100
kubectl describe deploy -n llm-observability langchain-demo
```

- If Ollama is slow or unavailable, inspect:
```
kubectl logs -n llm-observability deploy/ollama --tail=100
kubectl top pods -n llm-observability
watch -n 0.5 nvidia-smi
```

- If GPU scheduling fails, inspect:
```
kubectl get pods -n nvidia-device-plugin
kubectl get nodes -o json | jq '.items[0].status.allocatable'
```

- If notebook diagnostics fail inside the cluster, inspect:
```
kubectl get pods -n llm-observability -l app.kubernetes.io/name=python-toolbox -o wide
kubectl exec -it -n llm-observability deploy/python-toolbox -- bash
```

- GitHub remote: `origin` -> https://github.com/waqasm86/llm-observability-stack.git
- Publishing guide: docs/GITHUB-PUBLISHING.md
- Local-only artifacts and secrets are excluded by .gitignore
Never commit:
- `values.local-k3s.yaml`
- `.env`
- `*.webui_secret_key`
- TLS keys/certs
- rendered debug manifests
- large model binaries
- notebook checkpoint directories
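Those exclusions translate into `.gitignore` entries along these lines (illustrative; the repository's actual `.gitignore` is authoritative):

```
values.local-k3s.yaml
.env
*.webui_secret_key
*.pem
*.key
*.crt
rendered-*.yaml
*.gguf
.ipynb_checkpoints/
```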
- Profile reference: docs/CONFIG-PROFILES.md
- `open-webui` is intended for direct browser use
- `ollama` and `langchain-demo` are internal by default
- the stack is optimized for local Xubuntu + k3s + NVIDIA workflows, not generic multi-node production deployment