feat: LLM routing by tier (free→Ollama, pro→Timeweb)
Some checks failed
Build and Deploy GooSeek / build-and-deploy (push) Failing after 8m25s

- Add tier-based provider routing in llm-svc
  - free tier → Ollama (local qwen3.5:9b)
  - pro/business → Timeweb Cloud AI
- Add /api/v1/embed endpoint for embeddings via Ollama
- Update Ollama client: qwen3.5:9b default, remove auth
- Add GenerateEmbedding() function for qwen3-embedding:0.6b
- Add Ollama K8s deployment with GPU support (RTX 4060 Ti)
- Add monitoring stack (Prometheus, Grafana, Alertmanager)
- Add Grafana dashboards for LLM and security metrics
- Update deploy.sh with monitoring and Ollama deployment
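
The tier-based routing described in the list above could be sketched roughly as follows. This is a minimal illustration, not the code from llm-svc: the `Provider` type, the `ProviderForTier` function, and the fallback of unknown tiers to the free-tier backend are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Provider identifies an LLM backend. The type and constant names are
// hypothetical; the actual identifiers in llm-svc may differ.
type Provider string

const (
	ProviderOllama  Provider = "ollama"  // local model (free tier)
	ProviderTimeweb Provider = "timeweb" // Timeweb Cloud AI (pro/business)
)

// ProviderForTier maps a subscription tier to a backend.
// Assumption: unknown or empty tiers fall back to the free-tier backend.
func ProviderForTier(tier string) Provider {
	switch strings.ToLower(strings.TrimSpace(tier)) {
	case "pro", "business":
		return ProviderTimeweb
	default:
		return ProviderOllama
	}
}

func main() {
	// Print the backend chosen for each tier, including an empty one.
	for _, tier := range []string{"free", "pro", "business", ""} {
		fmt.Printf("%q -> %s\n", tier, ProviderForTier(tier))
	}
}
```

Falling back to the local backend for unrecognized tiers keeps a malformed tier value from reaching the paid provider. A service could just as reasonably reject such requests instead.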

Made-with: Cursor
home
2026-03-03 02:25:22 +03:00
parent 5ac082a7c6
commit 7a40ff629e
19 changed files with 1759 additions and 35 deletions

@@ -15,6 +15,7 @@ import (
 	"github.com/gofiber/fiber/v2/middleware/cors"
 	"github.com/gofiber/fiber/v2/middleware/logger"
 	"github.com/gooseek/backend/pkg/config"
+	"github.com/gooseek/backend/pkg/metrics"
 	"github.com/gooseek/backend/pkg/middleware"
 	"github.com/redis/go-redis/v9"
 )
@@ -73,6 +74,11 @@ func main() {
 		AllowHeaders: "Origin, Content-Type, Accept, Authorization",
 		AllowMethods: "GET, POST, PUT, PATCH, DELETE, OPTIONS",
 	}))
+	app.Use(metrics.PrometheusMiddleware(metrics.MetricsConfig{
+		ServiceName: "api-gateway",
+	}))
+	app.Get("/metrics", metrics.MetricsHandler())
 	app.Use(middleware.JWT(middleware.JWTConfig{
 		Secret: cfg.JWTSecret,