AI Configuration
The AI Configuration category controls the AI assistant backend that powers the Ask AI feature in Hawkra workspaces. You can choose between Google's Gemini cloud API or a self-hosted local LLM server.
Settings Reference
Gemini Model
| Key | gemini_model |
|---|---|
| Type | Dropdown |
| Default | gemini-2.0-flash |
| Encrypted | No |
Selects which Google Gemini model to use when LLM mode is set to cloud. The available options are:
| Model | Description |
|---|---|
| gemini-2.0-flash | Fast responses with good quality. Recommended for most use cases where speed matters. Lowest API cost per request. |
| gemini-2.0-pro | Higher quality responses with deeper reasoning. Good balance of quality and speed for complex security analysis. |
| gemini-2.5-pro | Latest model with the best quality. Best for complex multi-step analysis where response quality is the top priority. |
This setting has no effect when LLM mode is set to local.
LLM Mode
| Key | llm_mode |
|---|---|
| Type | Dropdown (cloud or local) |
| Default | cloud |
| Encrypted | No |
Determines which AI backend processes Ask AI requests:
- cloud — Uses the Google Gemini API. Requires a valid GEMINI_API_KEY. Your selected context and questions are sent to Google's servers for processing.
- local — Uses a self-hosted LLM server running on your infrastructure. Requires LOCAL_LLM_SERVER to be configured. All data stays within your network.
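Conceptually, the mode switch routes each Ask AI request to one of the two backends. The sketch below illustrates that dispatch logic; the function and settings keys are hypothetical, not Hawkra's actual internals:

```python
# Illustrative sketch of how llm_mode might select a backend.
# Function name and settings keys are hypothetical, not Hawkra internals.

def route_request(settings: dict) -> str:
    """Return the base URL an AI request would be sent to."""
    mode = settings.get("llm_mode", "cloud")
    if mode == "cloud":
        if not settings.get("gemini_api_key"):
            raise ValueError("cloud mode requires GEMINI_API_KEY")
        return "https://generativelanguage.googleapis.com"
    if mode == "local":
        server = settings.get("local_llm_server")
        if not server:
            raise ValueError("local mode requires LOCAL_LLM_SERVER")
        return server
    raise ValueError(f"unknown llm_mode: {mode}")
```

Note that each mode validates its own prerequisite setting, which mirrors the requirements listed above.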
Local LLM Server
| Key | local_llm_server |
|---|---|
| Type | String |
| Default | Empty |
| Encrypted | No |
The URL of your local LLM inference server. This is only used when LLM mode is set to local.
Examples:
- http://ollama:11434 (Ollama running as a Docker service on the same network)
- http://192.168.1.50:11434 (Ollama on a separate machine)
- http://localhost:8080 (llama.cpp or vLLM running locally)
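A common misconfiguration is saving a bare host:port value without the http:// scheme. A minimal sanity check for the URL shape, using only Python's standard library (this checks format only, not reachability):

```python
from urllib.parse import urlparse

def is_valid_llm_server_url(url: str) -> bool:
    """Return True if the URL has an http/https scheme and a hostname.

    A format check only; it does not verify that a server is listening.
    """
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.hostname)
```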
Gemini API Key
| Key | gemini_api_key |
|---|---|
| Type | String |
| Default | Empty |
| Encrypted | Yes |
Your Google AI Studio API key for accessing the Gemini API. This is only required when LLM mode is set to cloud. The key is stored encrypted in the database and appears masked on the settings page.
Getting a Gemini API Key
- Go to ai.google.dev.
- Click Get API Key in the top navigation.
- Sign in with your Google account if prompted.
- Click Create API Key and select or create a Google Cloud project.
- Your API key is generated immediately. Copy it.
- Return to the Hawkra admin dashboard, click Change next to the Gemini API Key field, paste the key, and save.
The Gemini API has a generous free tier for initial testing. For production usage, review Google's pricing at ai.google.dev/pricing.
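Once you have a key, you can exercise it directly against the Gemini REST API. The sketch below only constructs the request URL and JSON body (no network call is made); the v1beta generateContent endpoint shape follows Google's published REST API, but check the current API reference before relying on it:

```python
import json

def build_gemini_request(model: str, api_key: str, question: str):
    """Build the URL and JSON body for a Gemini generateContent call.

    Endpoint shape follows Google's public REST API (v1beta).
    """
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{model}:generateContent?key={api_key}"
    )
    body = {"contents": [{"parts": [{"text": question}]}]}
    return url, json.dumps(body).encode("utf-8")
```

You could POST this body to the URL with any HTTP client to confirm the key is valid before saving it in Hawkra.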
Setting Up a Local LLM with Ollama
Ollama is the recommended way to run a local LLM for Hawkra. It provides a simple API server that is compatible with Hawkra's local LLM integration.
Option 1: Ollama on the Same Host
If you want to run Ollama alongside Hawkra on the same server, add it to your Docker Compose configuration:
```yaml
services:
  ollama:
    image: ollama/ollama
    container_name: hawkra-ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    restart: unless-stopped

volumes:
  ollama_data:
```
After starting the Ollama container, pull a model:
```shell
docker exec hawkra-ollama ollama pull llama3
```
Then configure Hawkra:
| Setting | Value |
|---|---|
| LLM Mode | local |
| Local LLM Server | http://ollama:11434 |
If Ollama is on the same Docker network as Hawkra, use the container name (ollama) as the hostname. If it is on a different network, use the host machine's IP address.
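To confirm the server is reachable before pointing Hawkra at it, you can probe Ollama's /api/tags endpoint (its standard model-listing route). A small standard-library sketch:

```python
import json
import urllib.error
import urllib.request

def ollama_is_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if an Ollama server answers on /api/tags.

    /api/tags is Ollama's model-listing endpoint; a healthy server
    returns a JSON object containing a "models" list.
    """
    try:
        url = f"{base_url.rstrip('/')}/api/tags"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            json.load(resp)  # must parse as JSON
            return resp.status == 200
    except (urllib.error.URLError, json.JSONDecodeError, OSError):
        return False
```

For example, ollama_is_reachable("http://ollama:11434") should return True from inside the Hawkra container once the Ollama service is up.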
Option 2: Ollama on a Separate Machine
- Install Ollama on the target machine following the instructions at ollama.com/download.
- Pull a model:
```shell
ollama pull llama3
```
- Ensure the Ollama server is accessible from your Hawkra server on port 11434.
- Configure Hawkra:
| Setting | Value |
|---|---|
| LLM Mode | local |
| Local LLM Server | http://<ollama-server-ip>:11434 |
Recommended Models
| Model | Size | Notes |
|---|---|---|
| llama3 | 8B | Good balance of quality and resource usage |
| llama3:70b | 70B | Higher quality but requires significant GPU memory |
| mistral | 7B | Fast and efficient for general tasks |
| mixtral | 8x7B | MoE architecture, good quality with moderate resources |
Using a local LLM means your data never leaves your infrastructure. There are no API costs, no rate limits, and no dependency on external services. This is ideal for air-gapped environments or organizations with strict data sovereignty requirements.
Configuration via Environment Variables
| Setting | Environment Variable |
|---|---|
| Gemini Model | GEMINI_MODEL |
| LLM Mode | LLM_MODE |
| Local LLM Server | LOCAL_LLM_SERVER |
| Gemini API Key | GEMINI_API_KEY |
When both cloud and local modes are available, switching between them in the admin dashboard takes effect immediately for new AI requests. There is no need to restart the server.
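For deployments managed entirely through Docker Compose, the same settings can be pinned as environment variables on the Hawkra service. A sketch, assuming a service named hawkra (match your own compose file; the service name here is illustrative):

```yaml
services:
  hawkra:
    environment:
      LLM_MODE: local
      LOCAL_LLM_SERVER: http://ollama:11434
      # For cloud mode instead:
      # LLM_MODE: cloud
      # GEMINI_MODEL: gemini-2.0-flash
      # GEMINI_API_KEY: ${GEMINI_API_KEY}
```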