Getting Started

How to get your API key

Your API key is the credential that authenticates your requests to the Alveare inference API. Here is how to obtain one:

  1. Sign up at alveare.ai and complete your account registration.
  2. Navigate to the Dashboard and click API Keys in the sidebar.
  3. Click Generate New Key. You will see two options:
    • alv_test_... — for sandbox/testing (no billing, limited rate)
    • alv_live_... — for production use (billed against your plan)
  4. Copy your key immediately. For security, the full key is only shown once.

Store your API key in an environment variable like ALVEARE_API_KEY. Never hardcode it in source files or commit it to version control.
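
As a minimal sketch of the advice above, the following Python snippet reads the key from the environment and fails early if it is missing. The helper name load_api_key is hypothetical and not part of any Alveare SDK; only the ALVEARE_API_KEY variable name comes from this article.

```python
import os


def load_api_key(env=None):
    """Return the API key from ALVEARE_API_KEY, or raise if it is not set."""
    # Accept an explicit mapping so the function is easy to test.
    env = os.environ if env is None else env
    key = env.get("ALVEARE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ALVEARE_API_KEY is not set; export it before running.")
    return key
```

Failing at startup with a clear message is usually easier to debug than a 401 response later on.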

Making your first API request

Once you have your API key, you can make your first request with a single cURL command. The example assumes your sandbox key (alv_test_...) is exported as ALVEARE_API_KEY:

curl -X POST https://api.alveare.ai/v1/infer \
  -H "Authorization: Bearer $ALVEARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "specialist": "summarise",
    "prompt": "Summarise this text: Alveare provides private AI inference.",
    "max_tokens": 128
  }'

You should receive a JSON response containing:

  • result — the model's response text
  • tokens_used — how many tokens were consumed
  • latency_ms — server-side processing time in milliseconds

If you receive an error, check that your API key is correct and that you are using the right endpoint URL.
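
The same request can be sketched in Python using only the standard library. The helper names below (build_infer_request, parse_infer_response) are illustrative assumptions, not part of the Alveare SDK, and the code only builds the request so that you can inspect it before sending anything.

```python
import json
import urllib.request

API_URL = "https://api.alveare.ai/v1/infer"  # endpoint from the cURL example


def build_infer_request(api_key, specialist, prompt, max_tokens=128):
    """Build (but do not send) a POST request equivalent to the cURL call."""
    body = json.dumps(
        {"specialist": specialist, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def parse_infer_response(raw):
    """Extract the three documented fields from a JSON response body."""
    data = json.loads(raw)
    return data["result"], data["tokens_used"], data["latency_ms"]


# To send the request for real (requires network access and a valid key):
#   request = build_infer_request(key, "summarise", "Summarise this text: ...")
#   with urllib.request.urlopen(request) as resp:
#       result, tokens_used, latency_ms = parse_infer_response(resp.read())
```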

Installing the Python SDK

The Alveare Python SDK provides a convenient wrapper around the REST API. Install it using pip:

pip install alveare

Basic usage:

import os

import alveare

# Read the key from the environment instead of hardcoding it.
client = alveare.Client(api_key=os.environ["ALVEARE_API_KEY"])

response = client.infer(
    specialist="summarise",
    prompt="Summarise this quarterly report...",
    max_tokens=256
)

print(response.result)
print(f"Tokens used: {response.tokens_used}")

The SDK handles retries, error parsing, and connection pooling automatically. It requires Python 3.8 or later. For async usage, use alveare.AsyncClient.

Installing the TypeScript SDK

Install the Alveare TypeScript/JavaScript SDK via npm or yarn:

npm install @alveare/sdk
# or
yarn add @alveare/sdk

Basic usage:

import { Alveare } from '@alveare/sdk';

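// In Node.js, read the key from the environment instead of hardcoding it.
const apiKey = process.env.ALVEARE_API_KEY;
if (!apiKey) throw new Error('ALVEARE_API_KEY is not set');
const client = new Alveare({ apiKey });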

const response = await client.infer({
  specialist: 'summarise',
  prompt: 'Summarise this quarterly report...',
  maxTokens: 256,
});

console.log(response.result);

The SDK supports both CommonJS and ESM. It works in Node.js 18+ and modern browsers (with appropriate CORS headers). TypeScript types are included out of the box.

Setting up the CLI

The Alveare CLI lets you interact with your hives and specialists from the terminal.

npm install -g @alveare/cli
# or
brew install alveare

Configure your credentials:

alveare auth login
# Enter your API key when prompted

# Or set it via environment variable:
export ALVEARE_API_KEY=alv_live_your_key_here

Common commands:

  • alveare infer --specialist summarise --prompt "..." — run an inference
  • alveare specialists list — list available specialists
  • alveare usage — view current billing period usage
  • alveare hives status — check hive health

Billing & Plans

Understanding your plan limits

Each Alveare plan includes a fixed number of hives, specialists, and monthly requests:

  • Solo ($49/mo): 1 shared hive, 3 specialists, 10K requests/month
  • Starter ($499/mo): 1 dedicated hive, 3 specialists, 100K requests/month
  • Professional ($1,499/mo): 3 hives, 10 specialists, 500K requests/month
  • Scale ($2,999/mo): 10 hives, unlimited specialists, 2M requests/month

Your current usage is always visible in the Dashboard > Usage tab. You will receive email alerts at 75%, 90%, and 100% of your request limit. Exceeding your limit does not immediately cut off service — see the overage charges article for details.
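
To illustrate the alert thresholds, here is a small sketch that reports which of the 75%, 90%, and 100% marks a given usage figure has reached. The plan limits are taken from the list above; the function name is hypothetical, and computing this client-side is only an illustration, since Alveare sends these alerts itself.

```python
PLAN_REQUEST_LIMITS = {
    "Solo": 10_000,
    "Starter": 100_000,
    "Professional": 500_000,
    "Scale": 2_000_000,
}
ALERT_THRESHOLDS = (75, 90, 100)  # percent of the monthly request limit


def crossed_alerts(plan, requests_used):
    """Return the alert thresholds (in percent) already reached this month."""
    limit = PLAN_REQUEST_LIMITS[plan]
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    return [t for t in ALERT_THRESHOLDS if requests_used * 100 >= t * limit]
```

For example, a Starter account that has made 95,000 requests has reached the 75% and 90% marks but not yet the 100% mark.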

How to upgrade or downgrade your plan

You can change your plan at any time from Dashboard > Billing > Change Plan.

  • Upgrades take effect immediately. You will be charged a prorated amount for the remainder of the current billing cycle.
  • Downgrades take effect at the start of your next billing cycle. You keep your current plan's features until then.

If you downgrade from a plan with more specialists than the lower plan allows, you will need to select which specialists to keep. The specialists you do not keep are archived (not deleted) and can be restored if you upgrade again.

How the 7-day free trial works

Every new account gets a 7-day free trial on any plan. During the trial:

  • You get full access to the plan's hives and specialists.
  • You can make up to 10,000 requests (the full monthly limit on Solo, a portion of the monthly limit on higher plans).
  • No credit card is required to start.
  • You will receive a reminder email on day 5.

When the trial ends, you can choose to subscribe to any plan. If you do not subscribe, your API key is deactivated but your account and configuration are preserved for 30 days. Simply subscribe to reactivate everything.

Understanding overage charges

If you exceed your plan's monthly request limit, requests continue to work but are billed at the overage rate:

  • Solo: $0.01 per additional request
  • Starter: $0.005 per additional request
  • Professional: $0.003 per additional request
  • Scale: $0.002 per additional request

You can set a hard spending cap in Dashboard > Billing > Spending Limits. Once the cap is reached, requests will return a 429 Too Many Requests error until the next billing cycle. This prevents unexpected charges.

Overage charges are billed at the end of each billing cycle alongside your plan subscription.
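
The overage arithmetic can be sketched as follows. The request limits and per-request rates come from the plan and overage lists in these articles; the function name is hypothetical, and the sketch ignores spending caps and the base subscription price.

```python
from decimal import Decimal

# plan: (monthly request limit, overage rate in USD per additional request)
PLANS = {
    "Solo": (10_000, Decimal("0.01")),
    "Starter": (100_000, Decimal("0.005")),
    "Professional": (500_000, Decimal("0.003")),
    "Scale": (2_000_000, Decimal("0.002")),
}


def overage_charge(plan, requests_made):
    """Return the overage charge in USD for one billing cycle."""
    limit, rate = PLANS[plan]
    extra_requests = max(0, requests_made - limit)
    # Decimal keeps currency arithmetic exact.
    return extra_requests * rate
```

For example, a Solo account that makes 12,000 requests in a cycle is 2,000 requests over its limit, which comes to 2,000 x $0.01 = $20.00 in overage.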

How to cancel your subscription

You can cancel at any time from Dashboard > Billing > Cancel Plan.

  • Cancellation takes effect at the end of your current billing cycle. You retain full access until then.
  • No partial refunds are issued for unused time in the current cycle.
  • Your API keys are deactivated at the end of the cycle.
  • Your account data (specialist configurations, usage history) is preserved for 90 days. Re-subscribing within that window restores everything.

If you are cancelling due to an issue we can help with, please submit a ticket first. We may be able to resolve your concern.

API & SDKs

Authentication and API key formats (alv_live_ vs alv_test_)

Alveare uses bearer token authentication. Include your API key in the Authorization header:

Authorization: Bearer alv_live_abc123...

There are two key types:

  • alv_test_... — Sandbox keys. Requests hit a test environment with lower rate limits. No billing. Useful for development and CI/CD.
  • alv_live_... — Production keys. Requests are billed against your plan. Full rate limits and SLAs apply.

Both key types are 48 characters long (including the prefix). If you receive a 401 Unauthorized error, verify that you are using the correct key type for your target environment and that the key has not been rotated.
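
A quick client-side sanity check based on the prefix and length described above might look like the sketch below. The function name is hypothetical, and because the character set of the rest of the key is not documented here, the sketch does not check it.

```python
KEY_LENGTH = 48  # total length, including the prefix, per this article
PREFIXES = {"alv_test_": "test", "alv_live_": "live"}


def key_environment(api_key):
    """Return "test" or "live" for a well-formed key, else raise ValueError."""
    for prefix, environment in PREFIXES.items():
        if api_key.startswith(prefix):
            if len(api_key) != KEY_LENGTH:
                raise ValueError(
                    f"expected {KEY_LENGTH} characters, got {len(api_key)}"
                )
            return environment
    raise ValueError("key must start with alv_test_ or alv_live_")
```

A check like this catches truncated or mixed-up keys before a request is sent, but only the API can confirm that a key is valid and has not been revoked.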

Rate limiting and how to handle 429 errors

Rate limits protect the infrastructure and ensure fair usage. Limits are per API key:

  • Solo: 20 requests/second
  • Starter: 50 requests/second
  • Professional: 200 requests/second
  • Scale: 500 requests/second
  • Test keys: 10 requests/second

When you exceed the rate limit, the API returns 429 Too Many Requests with a Retry-After header indicating how many seconds to wait.

Best practices:

  • Implement exponential backoff with jitter.
  • Use the Retry-After header value as the minimum wait time.
  • Queue requests client-side and process them at a steady rate.
  • The SDKs handle retries automatically with sensible defaults.
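
If you are not using an SDK, the first two practices can be sketched as a delay calculation. This is one common variant ("full jitter"), not necessarily what the SDKs do internally; the function name, the 1-second base, and the 30-second cap are assumptions, and the sketch assumes Retry-After is given in seconds as described above.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0, rng=random.random):
    """Return the seconds to wait before retry number `attempt` (0-based).

    Picks a random delay in [0, min(cap, base * 2**attempt)), but never
    less than the server's Retry-After value when one was provided.
    """
    ceiling = min(cap, base * (2 ** attempt))
    delay = rng() * ceiling
    if retry_after is not None:
        # Treat Retry-After as the minimum wait time.
        delay = max(delay, float(retry_after))
    return delay
```

The random component spreads out retries from many clients so that they do not all hit the API again at the same instant.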

OpenAI-compatible endpoint vs native endpoint

Alveare provides two endpoint styles:

Native endpoint (/v1/infer):

POST https://api.alveare.ai/v1/infer
{
  "specialist": "summarise",
  "prompt": "...",
  "max_tokens": 256
}

OpenAI-compatible endpoint (/v1/chat/completions):

POST https://api.alveare.ai/v1/chat/completions
{
  "model": "summarise",
  "messages": [{"role": "user", "content": "..."}],
  "max_tokens": 256
}

The OpenAI-compatible endpoint lets you switch from OpenAI by changing only the base URL and API key. The model field maps to your specialist name. Both endpoints return equivalent data; the response format matches whichever style you use.
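
The mapping between the two request bodies shown above can be sketched as a small conversion function. The function name is hypothetical, and the sketch assumes a native prompt corresponds to a single user message, as in the examples.

```python
def native_to_chat(payload):
    """Convert a /v1/infer body into a /v1/chat/completions body."""
    chat = {
        # The model field maps to the specialist name.
        "model": payload["specialist"],
        "messages": [{"role": "user", "content": payload["prompt"]}],
    }
    if "max_tokens" in payload:
        chat["max_tokens"] = payload["max_tokens"]
    return chat
```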

Request timeouts and retries

The default server-side timeout is 30 seconds. Most requests complete in under 1 second. If a request exceeds the timeout, you will receive a 504 Gateway Timeout.

Common causes of timeouts:

  • Very high max_tokens values (e.g., >4096)
  • Complex prompts with large context windows
  • Temporary infrastructure scaling events

Retry strategy:

  • Retry on 429, 500, 502, 503, and 504 errors.
  • Do not retry 400, 401, or 403 errors — these indicate client-side issues.
  • Use exponential backoff: 1s, 2s, 4s, with a maximum of 3 retries.
  • Both the Python and TypeScript SDKs implement this automatically.
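
For clients that do not use an SDK, the strategy above can be sketched as a small retry loop. This is an illustration under stated assumptions, not the SDKs' actual implementation: call_with_retries is a hypothetical name, and `send` stands for any function of yours that performs one HTTP request and returns a (status_code, body) pair.

```python
import time

RETRYABLE = {429, 500, 502, 503, 504}
MAX_RETRIES = 3


def call_with_retries(send, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or retries run out.

    `send` is a zero-argument callable returning (status_code, body).
    """
    for attempt in range(MAX_RETRIES + 1):
        status, body = send()
        # Client-side errors (400, 401, 403, ...) are returned immediately.
        if status not in RETRYABLE or attempt == MAX_RETRIES:
            return status, body
        sleep(2 ** attempt)  # waits 1s, 2s, then 4s
```

Passing `sleep` as a parameter makes the loop easy to test without real delays. Note that a 429 caused by a spending cap lasts until the next billing cycle, so retrying it will not help.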

Error codes and troubleshooting

Common HTTP status codes returned by the Alveare API:

  • 200 OK — Request succeeded.
  • 400 Bad Request — Invalid JSON, missing required fields, or invalid parameter values. Check the error.message field in the response body.
  • 401 Unauthorized — Invalid or missing API key. Verify your Authorization header.
  • 403 Forbidden — Your API key does not have access to the requested specialist or feature. Check your plan level.
  • 404 Not Found — Unknown specialist name or invalid endpoint path.
  • 429 Too Many Requests — Rate limit exceeded. See the rate limiting article.
  • 500 Internal Server Error — Unexpected server error. Safe to retry.
  • 502 Bad Gateway — Hive temporarily unreachable. Retry after a brief delay.
  • 503 Service Unavailable — Maintenance or scaling in progress. Check status.alveare.ai.
  • 504 Gateway Timeout — Request exceeded the 30-second timeout.

All error responses include a JSON body with error.code, error.message, and error.request_id. Include the request_id when contacting support.
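
A minimal sketch of turning an error response into a Python exception is shown below. It assumes the dotted names above describe a nested JSON object ({"error": {"code": ..., "message": ..., "request_id": ...}}); the class and function names are hypothetical and are not the SDK's own exception types.

```python
import json


class AlveareAPIError(Exception):
    """Hypothetical exception type carrying the documented error fields."""

    def __init__(self, status, code, message, request_id):
        super().__init__(f"{status} {code}: {message} (request_id={request_id})")
        self.status = status
        self.code = code
        self.message = message
        self.request_id = request_id


def error_from_response(status, raw_body):
    """Build an AlveareAPIError from an HTTP status and a JSON error body."""
    error = json.loads(raw_body).get("error", {})
    return AlveareAPIError(
        status,
        error.get("code"),
        error.get("message"),
        error.get("request_id"),
    )
```

Keeping request_id on the exception means it ends up in your logs, ready to include in a support ticket.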

Specialists

Choosing the right specialist for your task

Alveare provides pre-configured specialists optimized for specific tasks. Choosing the right one ensures better results and lower token usage:

  • summarise — condensing long documents into key points
  • classify — categorizing text (sentiment, topic, intent)
  • extract — pulling structured data from unstructured text (names, dates, amounts)
  • qa — answering questions given a context passage
  • chat — conversational interactions with memory
  • code — code generation, explanation, and debugging
  • translate — language translation
  • rewrite — tone adjustment, grammar correction, paraphrasing

Each specialist has a tuned system prompt and parameter set. Using a specialist that matches your task typically produces better output than using a general-purpose one. You can list available specialists via the CLI: alveare specialists list

How cognitive hives share a single model

A cognitive hive is Alveare's core architectural innovation. Instead of loading a separate model for each specialist, a single language model (e.g., Mistral 7B) is loaded once into GPU memory, and multiple specialists share it.

How it works:

  1. One base model is loaded onto dedicated GPU(s).
  2. Each specialist is defined by a system prompt, parameter set (temperature, top_p, etc.), and optional LoRA adapter.
  3. When a request arrives for a specialist, the hive applies that specialist's configuration before inference.
  4. Context switching between specialists is near-instantaneous since the model weights are already in memory.

This is why Alveare can offer 10+ specialists at a fraction of the cost of running 10 separate model deployments. You get the equivalent of 10 fine-tuned models for the GPU cost of one.
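
The idea can be illustrated with a toy model in Python. This is a conceptual sketch only and not Alveare's implementation: the class names, the default parameter values, and the model_loads counter are all assumptions, and no real model is loaded.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Specialist:
    """Per-specialist configuration applied on top of the shared model."""
    name: str
    system_prompt: str
    temperature: float = 0.7  # illustrative default
    top_p: float = 1.0        # illustrative default
    lora_adapter: Optional[str] = None


@dataclass
class Hive:
    """Toy hive: one base model, many lightweight specialist configurations."""
    base_model: str
    specialists: Dict[str, Specialist] = field(default_factory=dict)
    model_loads: int = 0

    def __post_init__(self):
        # Stand-in for loading weights into GPU memory; happens once per hive.
        self.model_loads += 1

    def register(self, specialist):
        self.specialists[specialist.name] = specialist

    def prepare(self, specialist_name, prompt):
        """Return the inputs the shared model would receive for this request."""
        s = self.specialists[specialist_name]
        return {
            "model": self.base_model,
            "system": s.system_prompt,
            "prompt": prompt,
            "temperature": s.temperature,
            "top_p": s.top_p,
            "lora_adapter": s.lora_adapter,
        }
```

Registering a second or tenth specialist only adds a small configuration entry; the expensive step, loading the base model, is not repeated.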

Optimizing prompts for better results

Even though specialists have optimized system prompts, the quality of your input prompt matters:

  • Be specific: Instead of "summarise this", say "summarise this financial report in 3 bullet points focusing on revenue trends."
  • Provide context: Include relevant background information directly in the prompt.
  • Set format expectations: Specify if you want JSON, bullet points, a single paragraph, etc.
  • Use examples: For classification or extraction tasks, include 1-2 examples of the desired output.
  • Keep it concise: Avoid unnecessary preamble. Shorter prompts use fewer tokens and often produce better results.

Token budget tip: Set max_tokens to the minimum needed for your expected output. This prevents runaway generation and reduces costs.

Using custom system prompts (Professional+ plans)

Professional and Scale plan users can define custom system prompts for their specialists, overriding the default ones.

Setting a custom system prompt:

alveare specialists update summarise \
  --system-prompt "You are a financial analyst. Summarise reports focusing on key metrics, risks, and recommendations. Use bullet points. Never exceed 200 words."

Or via the API:

PUT https://api.alveare.ai/v1/specialists/summarise
{
  "system_prompt": "You are a financial analyst..."
}

Custom system prompts let you tailor specialist behavior to your domain without fine-tuning. Changes take effect immediately for all subsequent requests.

Tips:

  • Keep system prompts under 500 tokens for best performance.
  • Include output format instructions in the system prompt rather than repeating them in every request.
  • Test changes with alv_test_ keys first.

Understanding token usage and costs

Tokens are the units that language models use to process text. As a rough guide, 1 token is approximately 4 characters or 0.75 words in English.

What counts toward your usage:

  • Input tokens: your prompt + the specialist's system prompt
  • Output tokens: the model's generated response
  • Tokens do not affect your plan's request count: 1 request = 1 API call, regardless of how many input and output tokens it uses.

How to reduce token usage:

  • Set max_tokens to the minimum needed.
  • Trim unnecessary context from prompts.
  • Use the classify specialist for yes/no decisions instead of chat.
  • Batch multiple small items into a single prompt where possible.

Every API response includes a tokens_used field. Monitor this in your Dashboard under Usage > Token Breakdown.
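
Before sending a request, you can get a rough estimate from the rule of thumb at the top of this article. The sketch below is only an approximation for English text and is not the tokenizer the API uses; the function names are hypothetical, and the tokens_used field remains the authoritative figure.

```python
import math


def estimate_tokens(text):
    """Estimate tokens from both rules of thumb and return the larger value."""
    by_chars = math.ceil(len(text) / 4)             # about 4 characters per token
    by_words = math.ceil(len(text.split()) / 0.75)  # about 0.75 words per token
    return max(by_chars, by_words)


def estimate_input_tokens(prompt, system_prompt=""):
    """Input tokens include the specialist's system prompt as well as your prompt."""
    return estimate_tokens(prompt) + estimate_tokens(system_prompt)
```

Taking the larger of the two estimates gives a slightly conservative number, which is usually the safer choice when budgeting max_tokens.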

Account & Security

Rotating your API key

Regular key rotation is a security best practice. To rotate your key:

  1. Go to Dashboard > API Keys.
  2. Click Generate New Key to create a replacement.
  3. Update your application with the new key.
  4. Once your application is using the new key, click Revoke on the old key.

Important: You can have up to 5 active keys at once, which allows zero-downtime rotation. Generate the new key, deploy it, verify it works, then revoke the old one.

If you believe a key has been compromised, revoke it immediately from the Dashboard or via the CLI: alveare auth revoke KEY_PREFIX

Data privacy and where your data is processed

Alveare is designed with data privacy as a core principle:

  • Data isolation: On plans with a dedicated hive, your requests are processed on that hive, and no other customer's data touches your infrastructure. The Solo plan runs on a shared hive.
  • No training: Your data is never used to train or fine-tune models for other customers.
  • No logging of prompts: We do not log your prompt content or model responses. Only metadata (timestamps, token counts, latency) is retained for billing and monitoring.
  • Data residency: Hives are deployed in the region you select (US-East, US-West, EU-West, AP-Southeast). Data does not leave your selected region.
  • Encryption: All data is encrypted in transit (TLS 1.3) and at rest (AES-256).

For enterprise data processing agreements (DPAs), contact privacy@alveare.ai.

Compliance (HIPAA, SOC 2, GDPR readiness)

Alveare's architecture supports the following compliance frameworks:

  • SOC 2 Type II: Audit in progress. Expected completion Q2 2026. Our infrastructure is designed to meet all Trust Service Criteria.
  • HIPAA: Business Associate Agreements (BAAs) are available for Scale plan customers. Dedicated hives with no prompt logging meet HIPAA technical safeguard requirements.
  • GDPR: EU-West region deployment ensures data stays within the EU. We support data subject access requests (DSARs) and right-to-deletion. DPAs are available on request.
  • ISO 27001: On our roadmap for 2026.

For detailed compliance documentation or to request a BAA/DPA, email compliance@alveare.ai.

Multiple API keys and team access

Alveare supports multiple API keys per account, which is useful for team environments:

  • Up to 5 active keys per account (both test and live combined).
  • Each key can be labeled for identification (e.g., "production", "staging", "alice-dev").
  • All keys share the same plan limits and usage pool.
  • Individual key usage is tracked separately in the Dashboard.

Team management (Professional+ plans):

  • Invite team members via Dashboard > Team.
  • Roles: Owner (full access), Admin (manage keys and specialists), Viewer (read-only dashboard access).
  • Each team member logs in with their own email but shares the organization's plan.

Account security best practices

Follow these best practices to keep your Alveare account secure:

  • Never hardcode API keys in source files. Use environment variables or a secrets manager.
  • Use test keys for development. Only use alv_live_ keys in production environments.
  • Rotate keys regularly — at least every 90 days, or immediately if you suspect a compromise.
  • Enable email alerts for unusual usage spikes in Dashboard > Alerts.
  • Set spending caps to prevent runaway costs from compromised keys.
  • Review active keys monthly. Revoke any keys that are no longer in use.
  • Use labeled keys so you can quickly identify which key is used where.
  • Restrict access — only give Admin roles to team members who need to manage keys.

If you suspect unauthorized access, immediately revoke all API keys and contact security@alveare.ai.
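
The 90-day rotation guideline can be checked with a short script during your monthly key review. This is a sketch under stated assumptions: the function name is hypothetical, and the input format (a mapping from key label to creation date that you maintain yourself) is not something the Dashboard or API is documented to export.

```python
from datetime import date, timedelta

MAX_KEY_AGE = timedelta(days=90)


def keys_due_for_rotation(keys, today):
    """Return the labels of keys that are 90 or more days old, sorted by label.

    `keys` maps a key label (e.g. "production") to its creation date.
    """
    return sorted(
        label for label, created in keys.items() if today - created >= MAX_KEY_AGE
    )
```

Using labels rather than the keys themselves keeps secrets out of the review script and its output.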
