feat: LLM routing by tier (free→Ollama, pro→Timeweb)
Some checks failed
Build and Deploy GooSeek / build-and-deploy (push) Failing after 8m25s
Some checks failed
Build and Deploy GooSeek / build-and-deploy (push) Failing after 8m25s
- Add tier-based provider routing in llm-svc - free tier → Ollama (local qwen3.5:9b) - pro/business → Timeweb Cloud AI - Add /api/v1/embed endpoint for embeddings via Ollama - Update Ollama client: qwen3.5:9b default, remove auth - Add GenerateEmbedding() function for qwen3-embedding:0.6b - Add Ollama K8s deployment with GPU support (RTX 4060 Ti) - Add monitoring stack (Prometheus, Grafana, Alertmanager) - Add Grafana dashboards for LLM and security metrics - Update deploy.sh with monitoring and Ollama deployment Made-with: Cursor
This commit is contained in:
@@ -5,6 +5,7 @@ go 1.24
|
||||
toolchain go1.24.13
|
||||
|
||||
require (
|
||||
github.com/gofiber/adaptor/v2 v2.2.1
|
||||
github.com/gofiber/fiber/v2 v2.52.0
|
||||
github.com/golang-jwt/jwt/v5 v5.2.1
|
||||
github.com/google/uuid v1.6.0
|
||||
@@ -12,6 +13,7 @@ require (
|
||||
github.com/ledongthuc/pdf v0.0.0-20240201131950-da5b75280b06
|
||||
github.com/lib/pq v1.10.9
|
||||
github.com/minio/minio-go/v7 v7.0.70
|
||||
github.com/prometheus/client_golang v1.19.0
|
||||
github.com/redis/go-redis/v9 v9.4.0
|
||||
github.com/sashabaranov/go-openai v1.20.0
|
||||
go.uber.org/zap v1.27.0
|
||||
|
||||
Reference in New Issue
Block a user