Full commit of the Perplexica project

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-20 14:56:42 +03:00
parent d389676f50
commit b65d24c1e8
222 changed files with 30405 additions and 0 deletions
--- a/docs/API/SEARCH.md
+++ b/docs/API/SEARCH.md
@@ -0,0 +1,190 @@
+# Perplexica Search API Documentation
+
+## Overview
+
+Perplexica's Search API lets you use our AI-powered search engine programmatically. You can choose which search sources to enable, pick the chat and embedding models you want to use, and get up-to-date answers together with the sources behind them. The sections below describe the available endpoints, request parameters, and response formats.
+
+## Endpoints
+
+### Get Available Providers and Models
+
+Before making search requests, you'll need to get the available providers and their models.
+
+#### **GET** `/api/providers`
+
+**Full URL**: `http://localhost:3000/api/providers`
+
+Returns a list of all active providers with their available chat and embedding models.
+
+**Response Example:**
+
+```json
+{
+  "providers": [
+    {
+      "id": "550e8400-e29b-41d4-a716-446655440000",
+      "name": "OpenAI",
+      "chatModels": [
+        {
+          "name": "GPT 4 Omni Mini",
+          "key": "gpt-4o-mini"
+        },
+        {
+          "name": "GPT 4 Omni",
+          "key": "gpt-4o"
+        }
+      ],
+      "embeddingModels": [
+        {
+          "name": "Text Embedding 3 Large",
+          "key": "text-embedding-3-large"
+        }
+      ]
+    }
+  ]
+}
+```
+
+Use the `id` field as the `providerId` and the `key` field from the models arrays when making search requests.
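
As an illustration of the mapping described above, here is a minimal Python sketch that turns a `/api/providers` response into the `chatModel` and `embeddingModel` objects used by search requests. The helper name `pick_models` and the choice of "first listed model" are assumptions made for this example, not part of the API; the sample data is copied from the response example.

```python
from typing import Any


def pick_models(providers_response: dict[str, Any], provider_name: str) -> dict[str, dict[str, str]]:
    """Build chatModel/embeddingModel objects for the named provider.

    Hypothetical helper: it simply takes the first chat model and the first
    embedding model listed; a real client would let the user choose.
    """
    for provider in providers_response["providers"]:
        if provider["name"] == provider_name:
            return {
                "chatModel": {
                    "providerId": provider["id"],  # the provider's `id` becomes `providerId`
                    "key": provider["chatModels"][0]["key"],  # use `key`, not the display name
                },
                "embeddingModel": {
                    "providerId": provider["id"],
                    "key": provider["embeddingModels"][0]["key"],
                },
            }
    raise LookupError(f"no provider named {provider_name!r}")


# Sample data copied from the response example above.
sample = {
    "providers": [
        {
            "id": "550e8400-e29b-41d4-a716-446655440000",
            "name": "OpenAI",
            "chatModels": [
                {"name": "GPT 4 Omni Mini", "key": "gpt-4o-mini"},
                {"name": "GPT 4 Omni", "key": "gpt-4o"},
            ],
            "embeddingModels": [
                {"name": "Text Embedding 3 Large", "key": "text-embedding-3-large"}
            ],
        }
    ]
}

selection = pick_models(sample, "OpenAI")
print(selection["chatModel"]["key"])  # gpt-4o-mini
```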
+
+### Search Query
+
+#### **POST** `/api/search`
+
+**Full URL**: `http://localhost:3000/api/search`
+
+**Note**: Replace `localhost:3000` with your Perplexica instance URL if it is running on a different host or port.
+
+### Request
+
+The API accepts a JSON object in the request body, where you define the enabled search `sources`, chat models, embedding models, and your query.
+
+#### Request Body Structure
+
+```json
+{
+  "chatModel": {
+    "providerId": "550e8400-e29b-41d4-a716-446655440000",
+    "key": "gpt-4o-mini"
+  },
+  "embeddingModel": {
+    "providerId": "550e8400-e29b-41d4-a716-446655440000",
+    "key": "text-embedding-3-large"
+  },
+  "optimizationMode": "speed",
+  "sources": ["web"],
+  "query": "What is Perplexica",
+  "history": [
+    ["human", "Hi, how are you?"],
+    ["assistant", "I am doing well, how can I help you today?"]
+  ],
+  "systemInstructions": "Focus on providing technical details about Perplexica's architecture.",
+  "stream": false
+}
+```
+
+**Note**: The `providerId` must be a valid UUID obtained from the `/api/providers` endpoint. The example above uses a sample UUID for demonstration.
+
+### Request Parameters
+
+- **`chatModel`** (object, required): Defines the chat model to be used for the query. To get available providers and models, send a GET request to `http://localhost:3000/api/providers`.
+
+  - `providerId` (string): The UUID of the provider. You can get this from the `/api/providers` endpoint response.
+  - `key` (string): The model key/identifier (e.g., `gpt-4o-mini`, `llama3.1:latest`). Use the `key` value from the provider's `chatModels` array, not the display name.
+
+- **`embeddingModel`** (object, required): Defines the embedding model for similarity-based searching. To get available providers and models, send a GET request to `http://localhost:3000/api/providers`.
+
+  - `providerId` (string): The UUID of the embedding provider. You can get this from the `/api/providers` endpoint response.
+  - `key` (string): The embedding model key (e.g., `text-embedding-3-large`, `nomic-embed-text`). Use the `key` value from the provider's `embeddingModels` array, not the display name.
+
+- **`sources`** (array, required): Which search sources to enable. Available values:
+
+  - `web`, `academic`, `discussions`.
+
+- **`optimizationMode`** (string, optional): Specifies the optimization mode to control the balance between performance and quality. Available modes:
+
+  - `speed`: Prioritize speed and return the fastest answer.
+  - `balanced`: Provide a balanced answer with good speed and reasonable quality.
+  - `quality`: Prioritize answer quality (may be slower).
+
+- **`query`** (string, required): The search query or question.
+
+- **`systemInstructions`** (string, optional): Custom instructions provided by the user to guide the AI's response. These instructions are treated as user preferences and have lower priority than the system's core instructions. For example, you can specify a particular writing style, format, or focus area.
+
+- **`history`** (array, optional): An array of message pairs representing the conversation history. Each pair consists of a role (either 'human' or 'assistant') and the message content. This allows the system to use the context of the conversation to refine results. Example:
+
+  ```json
+  [
+    ["human", "What is Perplexica?"],
+    ["assistant", "Perplexica is an AI-powered search engine..."]
+  ]
+  ```
+
+- **`stream`** (boolean, optional): When set to `true`, enables streaming responses. Default is `false`.
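
The parameters above can be assembled into a request body with a small helper. The following Python sketch is one possible way to do it; the function name `build_search_body` and the client-side checks are assumptions for illustration (the server's own validation rules are not described here), while the field names and allowed values come from the parameter list.

```python
from typing import Any, Optional, Sequence

# Allowed values taken from the parameter list above.
ALLOWED_SOURCES = {"web", "academic", "discussions"}
ALLOWED_MODES = {"speed", "balanced", "quality"}


def build_search_body(
    chat_model: dict[str, str],
    embedding_model: dict[str, str],
    query: str,
    sources: Sequence[str],
    optimization_mode: Optional[str] = None,
    history: Optional[list[list[str]]] = None,
    system_instructions: Optional[str] = None,
    stream: bool = False,
) -> dict[str, Any]:
    """Hypothetical helper that builds the JSON body for POST /api/search."""
    if not query:
        raise ValueError("query is required")
    if not sources:
        raise ValueError("sources is required")
    unknown = set(sources) - ALLOWED_SOURCES
    if unknown:
        raise ValueError(f"unknown sources: {sorted(unknown)}")
    if optimization_mode is not None and optimization_mode not in ALLOWED_MODES:
        raise ValueError(f"unknown optimizationMode: {optimization_mode!r}")

    body: dict[str, Any] = {
        "chatModel": chat_model,
        "embeddingModel": embedding_model,
        "sources": list(sources),
        "query": query,
        "stream": stream,
    }
    # Optional fields are only included when the caller provides them.
    if optimization_mode is not None:
        body["optimizationMode"] = optimization_mode
    if history is not None:
        body["history"] = history
    if system_instructions is not None:
        body["systemInstructions"] = system_instructions
    return body


provider_id = "550e8400-e29b-41d4-a716-446655440000"  # sample UUID from the docs
body = build_search_body(
    chat_model={"providerId": provider_id, "key": "gpt-4o-mini"},
    embedding_model={"providerId": provider_id, "key": "text-embedding-3-large"},
    query="What is Perplexica",
    sources=["web"],
    optimization_mode="speed",
)
print(sorted(body))
```

Serializing `body` with `json.dumps` gives a payload of the same shape as the request body example.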
+
+### Response
+
+The response from the API includes both the final message and the sources used to generate that message.
+
+#### Standard Response (stream: false)
+
+```json
+{
+  "message": "Perplexica is an innovative, open-source AI-powered search engine designed to enhance the way users search for information online. Here are some key features and characteristics of Perplexica:\n\n- **AI-Powered Technology**: It utilizes advanced machine learning algorithms to not only retrieve information but also to understand the context and intent behind user queries, providing more relevant results [1][5].\n\n- **Open-Source**: Being open-source, Perplexica offers flexibility and transparency, allowing users to explore its functionalities without the constraints of proprietary software [3][10].",
+  "sources": [
+    {
+      "content": "Perplexica is an innovative, open-source AI-powered search engine designed to enhance the way users search for information online.",
+      "metadata": {
+        "title": "What is Perplexica, and how does it function as an AI-powered search ...",
+        "url": "https://askai.glarity.app/search/What-is-Perplexica--and-how-does-it-function-as-an-AI-powered-search-engine"
+      }
+    },
+    {
+      "content": "Perplexica is an open-source AI-powered search tool that dives deep into the internet to find precise answers.",
+      "metadata": {
+        "title": "Sahar Mor's Post",
+        "url": "https://www.linkedin.com/posts/sahar-mor_a-new-open-source-project-called-perplexica-activity-7204489745668694016-ncja"
+      }
+    }
+        ....
+  ]
+}
+```
+
+#### Streaming Response (stream: true)
+
+When streaming is enabled, the API returns a stream of newline-delimited JSON objects with `Content-Type: text/event-stream`, the content type used for Server-Sent Events (SSE). Each line contains a complete, valid JSON object; as the example below shows, the lines are bare JSON objects without an SSE `data:` prefix.
+
+Example of streamed response objects:
+
+```
+{"type":"init","data":"Stream connected"}
+{"type":"sources","data":[{"content":"...","metadata":{"title":"...","url":"..."}},...]}
+{"type":"response","data":"Perplexica is an "}
+{"type":"response","data":"innovative, open-source "}
+{"type":"response","data":"AI-powered search engine..."}
+{"type":"done"}
+```
+
+Clients should process each line as a separate JSON object. The different message types include:
+
+- **`init`**: Initial connection message
+- **`sources`**: All sources used for the response
+- **`response`**: Chunks of the generated answer text
+- **`done`**: Indicates the stream is complete
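
A client-side reader for these message types might look like the Python sketch below. It works on any iterable of text lines, so it can be tried without a running server; the function name `collect_stream` and the shape of its return value are assumptions for this example. With a real HTTP client you would feed it the decoded lines of the response body.

```python
import json
from typing import Any, Iterable


def collect_stream(lines: Iterable[str]) -> dict[str, Any]:
    """Fold a sequence of streamed JSON lines into one result (hypothetical helper)."""
    message_parts: list[str] = []
    sources: list[Any] = []
    done = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines between objects
        event = json.loads(line)
        kind = event.get("type")
        if kind == "sources":
            sources = event["data"]
        elif kind == "response":
            message_parts.append(event["data"])  # answer text arrives in chunks
        elif kind == "done":
            done = True
            break
        # "init" (and any unrecognized type) carries nothing we need to keep
    return {"message": "".join(message_parts), "sources": sources, "done": done}


# Lines modeled on the streamed example above (with the elided parts left out).
example_lines = [
    '{"type":"init","data":"Stream connected"}',
    '{"type":"sources","data":[{"content":"...","metadata":{"title":"...","url":"..."}}]}',
    '{"type":"response","data":"Perplexica is an "}',
    '{"type":"response","data":"innovative, open-source "}',
    '{"type":"done"}',
]
result = collect_stream(example_lines)
print(repr(result["message"]))
```

Keeping the parser separate from the transport makes it easy to test and lets `done` tell a finished stream apart from a connection that dropped early.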
+
+### Fields in the Response
+
+- **`message`** (string): The generated answer, based on the query and the enabled `sources`.
+- **`sources`** (array): A list of sources that were used to generate the search result. Each source includes:
+  - `content`: A snippet of the relevant content from the source.
+  - `metadata`: Metadata about the source, including:
+    - `title`: The title of the webpage.
+    - `url`: The URL of the webpage.
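
The sample `message` contains bracketed markers such as `[1]`. The Python sketch below shows one way a client could resolve them to entries in `sources`. It assumes that a marker `[n]` is a 1-based index into the `sources` array; this mapping is an assumption of the example and is not stated in this document, so verify it against your instance before relying on it. The name `cited_sources` is likewise hypothetical.

```python
import re
from typing import Any

CITATION = re.compile(r"\[(\d+)\]")


def cited_sources(message: str, sources: list[dict[str, Any]]) -> list[tuple[int, str, str]]:
    """Return (number, title, url) for each distinct marker that maps to a source.

    ASSUMPTION: marker [n] refers to sources[n - 1]. Markers that fall
    outside the list are skipped rather than treated as errors.
    """
    seen: set[int] = set()
    resolved: list[tuple[int, str, str]] = []
    for match in CITATION.finditer(message):
        number = int(match.group(1))
        if number in seen or not 1 <= number <= len(sources):
            continue
        seen.add(number)
        metadata = sources[number - 1]["metadata"]
        resolved.append((number, metadata["title"], metadata["url"]))
    return resolved


demo_sources = [
    {"content": "first snippet", "metadata": {"title": "First", "url": "https://example.com/1"}},
    {"content": "second snippet", "metadata": {"title": "Second", "url": "https://example.com/2"}},
]
demo_message = "Claim A [1][2]. Claim B [1]. Claim C [7]."
for number, title, url in cited_sources(demo_message, demo_sources):
    print(f"[{number}] {title} <{url}>")
```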
+
+### Error Handling
+
+If an error occurs during the search process, the API will return an appropriate error message with an HTTP status code.
+
+- **400**: If the request is malformed or missing required fields (e.g., no `sources` or `query`).
+- **500**: If an internal server error occurs during the search.
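
A client can translate these status codes into distinct exceptions so that callers can tell a bad request from a server failure. The Python sketch below uses only the standard library; the exception classes, `error_for_status`, and `post_search` are hypothetical names, and because the error body format is not specified here it is treated as opaque text.

```python
import json
import urllib.error
import urllib.request
from typing import Any


class SearchRequestError(Exception):
    """The request was malformed or missing required fields (HTTP 400)."""


class SearchServerError(Exception):
    """The server failed while running the search (HTTP 5xx)."""


def error_for_status(status: int, detail: str) -> Exception:
    """Map an HTTP status code to an exception instance (hypothetical helper)."""
    if status == 400:
        return SearchRequestError(detail)
    if status >= 500:
        return SearchServerError(detail)
    return RuntimeError(f"unexpected HTTP status {status}: {detail}")


def post_search(body: dict[str, Any], base_url: str = "http://localhost:3000") -> dict[str, Any]:
    """Send a non-streaming search request and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{base_url}/api/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # The error body format is not documented, so keep it as raw text.
        detail = exc.read().decode("utf-8", errors="replace")
        raise error_for_status(exc.code, detail) from exc
```

A caller would typically fix and resend the request after a `SearchRequestError`, and retry or report a `SearchServerError`.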
--- a/docs/architecture/README.md
+++ b/docs/architecture/README.md
@@ -0,0 +1,38 @@
+# Perplexica Architecture
+
+Perplexica is a Next.js application that combines an AI chat experience with search.
+
+For a high-level flow, see [WORKING.md](WORKING.md). For deeper implementation details, see [CONTRIBUTING.md](../../CONTRIBUTING.md).
+
+## Key components
+
+1. **User Interface**
+
+   - A web-based UI that lets users chat, search, and view citations.
+
+2. **API Routes**
+
+   - `POST /api/chat` powers the chat UI.
+   - `POST /api/search` provides a programmatic search endpoint.
+   - `GET /api/providers` lists available providers and model keys.
+
+3. **Agents and Orchestration**
+
+   - The system classifies the question first.
+   - It can run research and widgets in parallel.
+   - It generates the final answer and includes citations.
+
+4. **Search Backend**
+
+   - A metasearch backend is used to fetch relevant web results when research is enabled.
+
+5. **LLMs (Large Language Models)**
+
+   - Used for classification, writing answers, and producing citations.
+
+6. **Embedding Models**
+
+   - Used for semantic search over user-uploaded files.
+
+7. **Storage**
+   - Chats and messages are stored so conversations can be reloaded.
--- a/docs/architecture/WORKING.md
+++ b/docs/architecture/WORKING.md
@@ -0,0 +1,72 @@
+# How Perplexica Works
+
+This is a high-level overview of how Perplexica answers a question.
+
+If you want a component-level overview, see [README.md](README.md).
+
+If you want implementation details, see [CONTRIBUTING.md](../../CONTRIBUTING.md).
+
+## What happens when you ask a question
+
+When you send a message in the UI, the app calls `POST /api/chat`.
+
+At a high level, we do three things:
+
+1. Classify the question and decide what to do next.
+2. Run research and widgets in parallel.
+3. Write the final answer and include citations.
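
The three steps above can be sketched with `asyncio`. This is an illustrative Python outline only: Perplexica itself is a Next.js application, and every function here (`classify`, `run_research`, `run_widgets`, `write_answer`, `answer`) is a hypothetical stub with canned return values, not the project's actual code. The point is the control flow: classify first, run research and widgets concurrently, then write the answer.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Classification:
    needs_research: bool
    widgets: list[str]
    standalone_question: str


async def classify(question: str) -> Classification:
    # Stub: a real classifier would ask the chat model to decide these fields.
    return Classification(True, ["weather"], question.strip())


async def run_research(question: str) -> list[str]:
    await asyncio.sleep(0)  # stands in for web lookup or file search
    return [f"result for: {question}"]


async def no_research() -> list[str]:
    return []


async def run_widgets(names: list[str]) -> list[str]:
    await asyncio.sleep(0)  # stands in for fetching widget data
    return [f"{name} widget" for name in names]


async def write_answer(question: str, research: list[str], widgets: list[str]) -> str:
    # Stub: a real writer would prompt the chat model with the gathered context.
    return f"Answer to {question!r} using {len(research)} source(s) [1]"


async def answer(question: str) -> str:
    # Step 1: classify and rewrite the question.
    plan = await classify(question)
    # Step 2: research and widgets run concurrently.
    research_job = run_research(plan.standalone_question) if plan.needs_research else no_research()
    research, widgets = await asyncio.gather(research_job, run_widgets(plan.widgets))
    # Step 3: write the final answer with citations.
    return await write_answer(plan.standalone_question, research, widgets)


final = asyncio.run(answer("  What is Perplexica? "))
print(final)
```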
+
+## Classification
+
+Before searching or answering, we run a classification step.
+
+This step decides things like:
+
+- Whether we should do research for this question
+- Whether we should show any widgets
+- How to rewrite the question into a clearer standalone form
+
+## Widgets
+
+Widgets are small, structured helpers that can run alongside research.
+
+Examples include weather, stocks, and simple calculations.
+
+If a widget is relevant, we show it in the UI while the answer is still being generated.
+
+Widgets are helpful context for the answer, but they are not part of what the model should cite.
+
+## Research
+
+If research is needed, we gather information in the background while any widgets run in parallel.
+
+Depending on configuration, research may include web lookup and searching user-uploaded files.
+
+## Answer generation
+
+Once we have enough context, the chat model generates the final response.
+
+You can control the tradeoff between speed and quality using `optimizationMode`:
+
+- `speed`
+- `balanced`
+- `quality`
+
+## How citations work
+
+We prompt the model to cite the references it used. The UI then renders those citations alongside the supporting links.
+
+## Search API
+
+If you are integrating Perplexica into another product, you can call `POST /api/search`.
+
+It returns:
+
+- `message`: the generated answer
+- `sources`: supporting references used for the answer
+
+You can also enable streaming by setting `stream: true`.
+
+## Image and video search
+
+Image and video search use separate endpoints (`POST /api/images` and `POST /api/videos`). We generate a focused query using the chat model, then fetch matching results from a search backend.
--- a/docs/installation/UPDATING.md
+++ b/docs/installation/UPDATING.md
@@ -0,0 +1,81 @@
+# Update Perplexica to the latest version
+
+To update Perplexica to the latest version, follow these steps:
+
+## For Docker users (Using pre-built images)
+
+Simply pull the latest image and restart your container:
+
+```bash
+docker pull itzcrazykns1337/perplexica:latest
+docker stop perplexica
+docker rm perplexica
+docker run -d -p 3000:3000 -v perplexica-data:/home/perplexica/data --name perplexica itzcrazykns1337/perplexica:latest
+```
+
+For the slim version:
+
+```bash
+docker pull itzcrazykns1337/perplexica:slim-latest
+docker stop perplexica
+docker rm perplexica
+docker run -d -p 3000:3000 -e SEARXNG_API_URL=http://your-searxng-url:8080 -v perplexica-data:/home/perplexica/data --name perplexica itzcrazykns1337/perplexica:slim-latest
+```
+
+Once updated, go to http://localhost:3000 and verify the latest changes. Your settings are preserved automatically.
+
+## For Docker users (Building from source)
+
+1. Navigate to your Perplexica directory and pull the latest changes:
+
+   ```bash
+   cd Perplexica
+   git pull origin master
+   ```
+
+2. Rebuild the Docker image:
+
+   ```bash
+   docker build -t perplexica .
+   ```
+
+3. Stop and remove the old container, then start the new one:
+
+   ```bash
+   docker stop perplexica
+   docker rm perplexica
+   docker run -p 3000:3000 -p 8080:8080 --name perplexica perplexica
+   ```
+
+4. Once the container is up and running, go to http://localhost:3000 and verify the latest changes.
+
+## For non-Docker users
+
+1. Navigate to your Perplexica directory and pull the latest changes:
+
+   ```bash
+   cd Perplexica
+   git pull origin master
+   ```
+
+2. Install any new dependencies:
+
+   ```bash
+   npm i
+   ```
+
+3. Rebuild the application:
+
+   ```bash
+   npm run build
+   ```
+
+4. Restart the application:
+
+   ```bash
+   npm run start
+   ```
+
+5. Go to http://localhost:3000 and verify the latest changes. Your settings are preserved automatically.
+
+---