feat: LLM routing by tier (free→Ollama, pro→Timeweb)
Some checks failed
Build and Deploy GooSeek / build-and-deploy (push) Failing after 8m25s
- Add tier-based provider routing in llm-svc
  - free tier → Ollama (local qwen3.5:9b)
  - pro/business → Timeweb Cloud AI
- Add /api/v1/embed endpoint for embeddings via Ollama
- Update Ollama client: qwen3.5:9b default, remove auth
- Add GenerateEmbedding() function for qwen3-embedding:0.6b
- Add Ollama K8s deployment with GPU support (RTX 4060 Ti)
- Add monitoring stack (Prometheus, Grafana, Alertmanager)
- Add Grafana dashboards for LLM and security metrics
- Update deploy.sh with monitoring and Ollama deployment

Made-with: Cursor
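The tier-to-provider mapping described in the commit message can be sketched roughly as follows. This is only an illustration and not the actual llm-svc code: the `Provider` type, its constants, and `ProviderForTier` are hypothetical names, and sending unknown or empty tiers to Ollama is an assumption made here.

```go
package main

import "strings"

// Provider identifies an LLM backend. The type and constant names are
// hypothetical; they are not taken from the llm-svc source.
type Provider string

const (
	ProviderOllama  Provider = "ollama"  // local model served by Ollama
	ProviderTimeweb Provider = "timeweb" // Timeweb Cloud AI
)

// ProviderForTier maps a subscription tier to a provider:
// pro and business go to Timeweb, everything else goes to Ollama.
// Treating unknown or empty tiers like "free" is an assumption.
func ProviderForTier(tier string) Provider {
	switch strings.ToLower(strings.TrimSpace(tier)) {
	case "pro", "business":
		return ProviderTimeweb
	default:
		return ProviderOllama
	}
}
```

A request handler would call `ProviderForTier(user.Tier)` (with `user.Tier` standing in for however the service stores the tier) and dispatch to the matching client.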
@@ -79,8 +79,11 @@ type Config struct {
     TimewebProxySource string
 
     // Ollama (local LLM)
-    OllamaBaseURL  string
-    OllamaModelKey string
+    OllamaBaseURL        string
+    OllamaModelKey       string
+    OllamaEmbeddingModel string
+    OllamaNumParallel    int
+    OllamaAPIToken       string
 
     // Timeouts
     HTTPTimeout time.Duration
@@ -160,8 +163,11 @@ func Load() (*Config, error) {
     TimewebAPIKey:      getEnv("TIMEWEB_API_KEY", ""),
     TimewebProxySource: getEnv("TIMEWEB_X_PROXY_SOURCE", "gooseek"),
 
-    OllamaBaseURL:  getEnv("OLLAMA_BASE_URL", "http://ollama:11434"),
-    OllamaModelKey: getEnv("OLLAMA_MODEL", "llama3.2"),
+    OllamaBaseURL:        getEnv("OLLAMA_BASE_URL", "http://ollama:11434"),
+    OllamaModelKey:       getEnv("OLLAMA_MODEL", "qwen3.5:9b"),
+    OllamaEmbeddingModel: getEnv("OLLAMA_EMBEDDING_MODEL", "qwen3-embedding:0.6b"),
+    OllamaNumParallel:    getEnvInt("OLLAMA_NUM_PARALLEL", 2),
+    OllamaAPIToken:       getEnv("OLLAMA_API_TOKEN", ""),
 
     HTTPTimeout: time.Duration(getEnvInt("HTTP_TIMEOUT_MS", 60000)) * time.Millisecond,
     LLMTimeout:  time.Duration(getEnvInt("LLM_TIMEOUT_MS", 120000)) * time.Millisecond,