Your API key is the credential that authenticates your requests to the Alveare inference API. Here is how to obtain one:
alv_test_...: for sandbox/testing (no billing, limited rate)
alv_live_...: for production use (billed against your plan)

Store your API key in an environment variable like ALVEARE_API_KEY. Never hardcode it in source files or commit it to version control.
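As a minimal sketch of the advice above, the following helper reads the key from the ALVEARE_API_KEY environment variable and rejects values that do not carry one of the two documented prefixes. The function name load_api_key and its error messages are illustrative assumptions, not part of any Alveare SDK.

```python
import os


def load_api_key(env=None, var_name="ALVEARE_API_KEY"):
    """Return the API key stored in the environment (hypothetical helper)."""
    # Allow a plain dict to be passed in so the function is easy to test.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    # The two prefixes come from the key types described above.
    if not key.startswith(("alv_test_", "alv_live_")):
        raise ValueError("API key must start with alv_test_ or alv_live_")
    return key
```

Calling load_api_key() once at start-up and passing the result to your client keeps the key out of source files.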
Once you have your API key, you can make your first request with a simple cURL command:
curl -X POST https://api.alveare.ai/v1/infer \
  -H "Authorization: Bearer alv_test_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "specialist": "summarise",
    "prompt": "Summarise this text: Alveare provides private AI inference.",
    "max_tokens": 128
  }'
You should receive a JSON response containing:
result: the model's response text
tokens_used: how many tokens were consumed
latency_ms: server-side processing time in milliseconds

If you receive an error, check that your API key is correct and that you are using the right endpoint URL.
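If you call the endpoint without the SDK, you may want to validate the response before using it. The sketch below assumes the three fields listed above are top-level keys of the JSON body and that tokens_used and latency_ms are numeric; the InferResult class, the parse_infer_response function, and the sample body are made up for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class InferResult:
    result: str
    tokens_used: int
    latency_ms: float


def parse_infer_response(body: str) -> InferResult:
    """Parse a response body and fail loudly if a documented field is absent."""
    data = json.loads(body)
    missing = [k for k in ("result", "tokens_used", "latency_ms") if k not in data]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return InferResult(
        result=str(data["result"]),
        tokens_used=int(data["tokens_used"]),
        latency_ms=float(data["latency_ms"]),
    )


# Illustrative body only; real values will differ.
sample = '{"result": "Alveare offers private AI inference.", "tokens_used": 21, "latency_ms": 312}'
parsed = parse_infer_response(sample)
```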
The Alveare Python SDK provides a convenient wrapper around the REST API. Install it using pip:
pip install alveare
Basic usage:
import alveare

client = alveare.Client(api_key="alv_live_your_key_here")

response = client.infer(
    specialist="summarise",
    prompt="Summarise this quarterly report...",
    max_tokens=256
)
print(response.result)
print(f"Tokens used: {response.tokens_used}")
The SDK handles retries, error parsing, and connection pooling automatically. It requires Python 3.8 or later. For async usage, use alveare.AsyncClient.
Install the Alveare TypeScript/JavaScript SDK via npm or yarn:
npm install @alveare/sdk
# or
yarn add @alveare/sdk
Basic usage:
import { Alveare } from '@alveare/sdk';

const client = new Alveare({ apiKey: 'alv_live_your_key_here' });

const response = await client.infer({
  specialist: 'summarise',
  prompt: 'Summarise this quarterly report...',
  maxTokens: 256,
});
console.log(response.result);
The SDK supports both CommonJS and ESM. It works in Node.js 18+ and modern browsers (with appropriate CORS headers). TypeScript types are included out of the box.
The Alveare CLI lets you interact with your hives and specialists from the terminal.
npm install -g @alveare/cli
# or
brew install alveare
Configure your credentials:
alveare auth login
# Enter your API key when prompted
# Or set it via environment variable:
export ALVEARE_API_KEY=alv_live_your_key_here
Common commands:
alveare infer --specialist summarise --prompt "...": run an inference
alveare specialists list: list available specialists
alveare usage: view current billing period usage
alveare hives status: check hive health

Each Alveare plan includes a set number of monthly requests and specialists:
Your current usage is always visible in the Dashboard > Usage tab. You will receive email alerts at 75%, 90%, and 100% of your request limit. Exceeding your limit does not immediately cut off service — see the overage charges article for details.
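To mirror the email alerts in your own monitoring, you could compute which thresholds a given usage figure has crossed. This is a small sketch: the 75%, 90%, and 100% thresholds come from the text above, while the function name and the request counts in the usage note are hypothetical, not actual plan limits.

```python
from typing import List

# Alert thresholds (percent of the monthly request limit) stated above.
ALERT_THRESHOLDS = (75, 90, 100)


def alerts_reached(requests_used: int, monthly_limit: int) -> List[int]:
    """Return the alert thresholds that the current usage has reached."""
    if monthly_limit <= 0:
        raise ValueError("monthly_limit must be positive")
    # Integer arithmetic avoids float rounding exactly at a threshold.
    return [t for t in ALERT_THRESHOLDS if requests_used * 100 >= t * monthly_limit]
```

For example, with a hypothetical limit of 10,000 requests, alerts_reached(9100, 10000) reports that the 75% and 90% thresholds have been passed.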
You can change your plan at any time from Dashboard > Billing > Change Plan.
If you downgrade from a plan with more specialists than the lower plan allows, you will need to select which specialists to keep. Inactive specialists are archived (not deleted) and can be restored if you upgrade again.
Every new account gets a 7-day free trial on any plan. During the trial:
When the trial ends, you can choose to subscribe to any plan. If you do not subscribe, your API key is deactivated but your account and configuration are preserved for 30 days. Simply subscribe to reactivate everything.
If you exceed your plan's monthly request limit, requests continue to work but are billed at the overage rate:
You can set a hard spending cap in Dashboard > Billing > Spending Limits. Once the cap is reached, requests will return a 429 Too Many Requests error until the next billing cycle. This prevents unexpected charges.
Overage charges are billed at the end of each billing cycle alongside your plan subscription.
You can cancel at any time from Dashboard > Billing > Cancel Plan.
If you are cancelling due to an issue we can help with, please submit a ticket first. We may be able to resolve your concern.
Alveare uses bearer token authentication. Include your API key in the Authorization header:
Authorization: Bearer alv_live_abc123...
There are two key types:
alv_test_...: Sandbox keys. Requests hit a test environment with lower rate limits. No billing. Useful for development and CI/CD.
alv_live_...: Production keys. Requests are billed against your plan. Full rate limits and SLAs apply.

Both key types are 48 characters long (including the prefix). If you receive a 401 Unauthorized error, verify that you are using the correct key type for your target environment and that the key has not been rotated.
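A quick local check can catch a mistyped or wrong-environment key before any request is sent. The sketch below relies only on the prefixes and the 48-character length stated above; the function name key_environment is an assumption for illustration, and passing this check does not prove the key is still active.

```python
KEY_LENGTH = 48  # total length, including the prefix
PREFIXES = {"alv_test_": "sandbox", "alv_live_": "production"}


def key_environment(api_key: str) -> str:
    """Return 'sandbox' or 'production' for a well-formed key, else raise."""
    for prefix, environment in PREFIXES.items():
        if api_key.startswith(prefix):
            if len(api_key) != KEY_LENGTH:
                raise ValueError(
                    f"expected {KEY_LENGTH} characters, got {len(api_key)}"
                )
            return environment
    raise ValueError("key must start with alv_test_ or alv_live_")
```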
Rate limits protect the infrastructure and ensure fair usage. Limits are per API key:
When you exceed the rate limit, the API returns 429 Too Many Requests with a Retry-After header indicating how many seconds to wait.
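One way to handle a 429 response is sketched below. It treats the Retry-After value as the minimum wait and combines it with exponential backoff; the backoff schedule, the helper names, and the shape of the send callable (returning a status code, a headers dict, and a body) are assumptions made for this example, not documented behavior.

```python
import time


def wait_seconds(retry_after_header, attempt, base=1.0, cap=60.0):
    """Seconds to wait before the next try; `attempt` is 0-based."""
    backoff = min(cap, base * (2 ** attempt))
    try:
        retry_after = float(retry_after_header)
    except (TypeError, ValueError):
        # Header missing or not a number: fall back to backoff alone.
        return backoff
    # Never wait less than the server asked for.
    return max(retry_after, backoff)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it stops returning 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        sleep(wait_seconds(headers.get("Retry-After"), attempt))
```

The sleep parameter exists so that the wait can be replaced in tests; in production the default time.sleep is used.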
Best practices:
Use the Retry-After header value as the minimum wait time.

Alveare provides two endpoint styles:
Native endpoint (/v1/infer):
POST https://api.alveare.ai/v1/infer
{
  "specialist": "summarise",
  "prompt": "...",
  "max_tokens": 256
}
OpenAI-compatible endpoint (/v1/chat/completions):
POST https://api.alveare.ai/v1/chat/completions
{
  "model": "summarise",
  "messages": [{"role": "user", "content": "..."}],
  "max_tokens": 256
}
The OpenAI-compatible endpoint lets you switch from OpenAI by changing only the base URL and API key. The model field maps to your specialist name. Both endpoints return equivalent data; the response format matches whichever style you use.
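The relationship between the two request bodies can be sketched as a small conversion function. It follows the two examples above: specialist becomes model, and the prompt becomes a single user message. The function name to_chat_completions is hypothetical, and sending the whole prompt as one user message is an assumption that matches the example rather than a documented rule.

```python
def to_chat_completions(native: dict) -> dict:
    """Convert a /v1/infer body into a /v1/chat/completions body."""
    payload = {
        # The model field maps to the specialist name.
        "model": native["specialist"],
        "messages": [{"role": "user", "content": native["prompt"]}],
    }
    # max_tokens carries over unchanged when present.
    if "max_tokens" in native:
        payload["max_tokens"] = native["max_tokens"]
    return payload
```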
Default server-side timeout is 30 seconds. Most requests complete in under 1 second. If a request exceeds the timeout, you will receive a 504 Gateway Timeout.
Common causes of timeouts:
Large max_tokens values (e.g., >4096)

Retry strategy:
Retry on 429, 500, 502, 503, and 504 errors.
Do not retry on 400, 401, or 403 errors: these indicate client-side issues.

Common HTTP status codes returned by the Alveare API:
Check the error.message field in the response body.
Check the Authorization header.

All error responses include a JSON body with error.code, error.message, and error.request_id. Include the request_id when contacting support.
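The error-handling guidance can be combined into one helper, sketched here. It assumes the three fields are nested under a top-level "error" object, as the dotted names suggest, and it treats only 429, 500, 502, 503, and 504 as retryable; any other status is treated as non-retryable by default, which is a choice made for this example. The function name and the sample values in the usage note are made up.

```python
import json

# Statuses the retry guidance marks as worth retrying.
RETRYABLE = {429, 500, 502, 503, 504}


def describe_error(status: int, body: str) -> dict:
    """Summarise an error response: retry decision plus support details."""
    try:
        error = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        # Body is not JSON, or has no "error" member.
        error = {}
    if not isinstance(error, dict):
        error = {}
    return {
        "retry": status in RETRYABLE,
        "code": error.get("code"),
        "message": error.get("message"),
        "request_id": error.get("request_id"),
    }
```

For instance, describe_error(503, body) on a body containing a request_id of "req_123" returns retry set to True and that request_id, ready to log or pass to support.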
Alveare provides pre-configured specialists optimized for specific tasks. Choosing the right one ensures better results and lower token usage:
Each specialist has a tuned system prompt and parameter set. Using a specialist that matches your task typically produces better output than using a general-purpose one. You can list available specialists via the CLI: alveare specialists list
A cognitive hive is Alveare's core architectural innovation. Instead of loading a separate model for each specialist, a single language model (e.g., Mistral 7B) is loaded once in GPU memory, and multiple specialists share it.
How it works:
This is why Alveare can offer 10+ specialists at a fraction of the cost of running 10 separate model deployments. You get the equivalent of 10 fine-tuned models for the GPU cost of one.
Even though specialists have optimized system prompts, the quality of your input prompt matters:
Token budget tip: Set max_tokens to the minimum needed for your expected output. This prevents runaway generation and reduces costs.
Professional and Scale plan users can define custom system prompts for their specialists, overriding the default ones.
Setting a custom system prompt:
alveare specialists update summarise \
  --system-prompt "You are a financial analyst. Summarise reports focusing on key metrics, risks, and recommendations. Use bullet points. Never exceed 200 words."
Or via the API:
PUT https://api.alveare.ai/v1/specialists/summarise
{
  "system_prompt": "You are a financial analyst..."
}
Custom system prompts let you tailor specialist behavior to your domain without fine-tuning. Changes take effect immediately for all subsequent requests.
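If you prefer to script the API call, the PUT request above could be built with Python's standard library as follows. This is a sketch under assumptions: it presumes the endpoint accepts the same bearer Authorization header as the inference API and a JSON content type, and build_update_request is a hypothetical helper name. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_update_request(specialist: str, system_prompt: str, api_key: str):
    """Build (but do not send) a PUT request that sets a custom system prompt."""
    # Percent-encode the specialist name so it is safe inside the URL path.
    name = urllib.parse.quote(specialist, safe="")
    url = f"https://api.alveare.ai/v1/specialists/{name}"
    body = json.dumps({"system_prompt": system_prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed to match /v1/infer
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; try this with a sandbox key before using a production one.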
Tips:
Test prompt changes with alv_test_ keys first.

Tokens are the units that language models use to process text. As a rough guide, 1 token is approximately 4 characters or 0.75 words in English.
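The rule of thumb above can be turned into a rough estimator for budgeting. This sketch applies both heuristics (4 characters per token, 0.75 words per token) and takes the larger figure to stay on the conservative side; that choice and the function name are assumptions, and the actual tokens_used reported by the API may differ noticeably.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text; not an exact tokenizer count."""
    by_chars = len(text) / 4            # about 4 characters per token
    by_words = len(text.split()) / 0.75  # about 0.75 words per token
    return math.ceil(max(by_chars, by_words))
```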
What counts toward your usage:
How to reduce token usage:
Set max_tokens to the minimum needed.
Use the classify specialist for yes/no decisions instead of chat.

Every API response includes a tokens_used field. Monitor this in your Dashboard under Usage > Token Breakdown.
Regular key rotation is a security best practice. To rotate your key:
Important: You can have up to 5 active keys at once, which allows zero-downtime rotation. Generate the new key, deploy it, verify it works, then revoke the old one.
If you believe a key has been compromised, revoke it immediately from the Dashboard or via the CLI: alveare auth revoke KEY_PREFIX
Alveare is designed with data privacy as a core principle:
For enterprise data processing agreements (DPAs), contact privacy@alveare.ai.
Alveare's architecture supports the following compliance frameworks:
For detailed compliance documentation or to request a BAA/DPA, email compliance@alveare.ai.
Alveare supports multiple API keys per account, which is useful for team environments:
Team management (Professional+ plans):
Follow these best practices to keep your Alveare account secure:
Use alv_live_ keys only in production environments.

If you suspect unauthorized access, immediately revoke all API keys and contact security@alveare.ai.