feat: LLM routing by tier (free→Ollama, pro→Timeweb)
Some checks failed
Build and Deploy GooSeek / build-and-deploy (push) Failing after 8m25s
- Add tier-based provider routing in llm-svc
  - free tier → Ollama (local qwen3.5:9b)
  - pro/business → Timeweb Cloud AI
- Add /api/v1/embed endpoint for embeddings via Ollama
- Update Ollama client: qwen3.5:9b default, remove auth
- Add GenerateEmbedding() function for qwen3-embedding:0.6b
- Add Ollama K8s deployment with GPU support (RTX 4060 Ti)
- Add monitoring stack (Prometheus, Grafana, Alertmanager)
- Add Grafana dashboards for LLM and security metrics
- Update deploy.sh with monitoring and Ollama deployment

Made-with: Cursor
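For reviewers, the tier routing described in the message above could look roughly like the sketch below. It is not the code from this commit: the function name `providerForTier`, the provider constants, and the fallback for unknown tiers are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Provider identifiers are hypothetical; the real names in llm-svc may differ.
const (
	ProviderOllama  = "ollama"
	ProviderTimeweb = "timeweb"
)

// providerForTier maps a subscription tier to an LLM provider:
// free -> Ollama, pro and business -> Timeweb Cloud AI.
// Routing unknown tiers to Ollama is an assumption, not taken from the commit.
func providerForTier(tier string) string {
	switch strings.ToLower(strings.TrimSpace(tier)) {
	case "pro", "business":
		return ProviderTimeweb
	default:
		return ProviderOllama
	}
}

func main() {
	for _, tier := range []string{"free", "pro", "business"} {
		fmt.Printf("%s -> %s\n", tier, providerForTier(tier))
	}
}
```

Defaulting unknown tiers to the local model keeps paid capacity from being used by accident; a stricter service might return an error instead.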
@@ -23,6 +23,10 @@ data:
AUTH_SVC_URL: "http://auth-svc:3050"
TRAVEL_SVC_URL: "http://travel-svc:3035"
ADMIN_SVC_URL: "http://admin-svc:3040"
OLLAMA_BASE_URL: "http://ollama:11434"
OLLAMA_MODEL: "qwen3.5:9b"
OLLAMA_EMBEDDING_MODEL: "qwen3-embedding:0.6b"
OLLAMA_NUM_PARALLEL: "2"
DEFAULT_LLM_MODEL: "${DEFAULT_LLM_MODEL}"
DEFAULT_LLM_PROVIDER: "${DEFAULT_LLM_PROVIDER}"
TIMEWEB_API_BASE_URL: "${TIMEWEB_API_BASE_URL}"
@@ -50,5 +54,6 @@ stringData:
GEMINI_API_KEY: "${GEMINI_API_KEY}"
JWT_SECRET: "${JWT_SECRET}"
TIMEWEB_API_KEY: "${TIMEWEB_API_KEY}"
OLLAMA_API_TOKEN: "${OLLAMA_API_TOKEN}"
POSTGRES_USER: "gooseek"
POSTGRES_PASSWORD: "gooseek"