AI Provider Setup · BYOK

Connect Together AI

Open-weight Llama and Qwen on fast serverless infra.

Together AI serves open-weight frontier models — Llama 3.1, Qwen 2.5 — on hosted serverless GPUs. Good pick when you want open-weight characteristics (reproducibility, eventual self-hosting) without running GPUs yourself.

Key details

Where to generate your key
https://api.together.xyz/settings/api-keys
Expected key format
Long hex token — no fixed prefix
Environment variable name (if self-hosting)
TOGETHER_API_KEY
Provider pricing
https://www.together.ai/pricing
Official docs
Together AI docs
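Because Together keys have no fixed prefix, a quick shape check can catch paste errors (truncated keys, a key from another provider) before you hit the API. A minimal sketch — the 32-character floor is an assumption for illustration, not a documented contract:

```python
import os
import re

def looks_like_together_key(key: str) -> bool:
    # Together keys are long hex tokens with no fixed prefix (unlike,
    # say, OpenAI's "sk-"). The {32,} minimum length is an assumption.
    return re.fullmatch(r"[0-9a-fA-F]{32,}", key) is not None

# When self-hosting, the key is read from TOGETHER_API_KEY.
key = os.environ.get("TOGETHER_API_KEY", "")
if key and not looks_like_together_key(key):
    print("warning: TOGETHER_API_KEY does not look like a Together key")
```

A check like this rejects obvious mistakes (an `sk-` prefixed key, a short fragment) without pretending to know Together's exact token length.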

Step-by-step setup

  1. Create a Together account

    Sign up at together.ai. New accounts receive a small amount of trial credit.

  2. Add billing

    Open api.together.xyz/settings/billing and add a card. Trial credit runs out fast with frontier models — add billing before you rely on it.

  3. Open API Keys

    Navigate to api.together.xyz/settings/api-keys. You can create multiple keys per account.

  4. Create a key labeled Admaxxer

    Click Create new key, name it Admaxxer, and create. The key is shown once — copy it immediately.

  5. Paste into Admaxxer

    Open Settings then AI Providers, find the Together AI row, paste the key, and click Connect.

  6. Pick a default model

    meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo is our recommended pick — great price-performance. Use the 405B variant when you need max quality, or Qwen 2.5 for multilingual.

  7. If the connection test fails

    Confirm the key is fresh, billing is active, and the model id matches Together's exact casing (they are case-sensitive, e.g. Meta-Llama not meta-llama).
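If you want to sanity-check a fresh key outside Admaxxer, the verification step boils down to one authenticated GET against Together's model catalog. A sketch that builds the request without sending it (the endpoint path follows Together's OpenAI-compatible API; the placeholder key is obviously not real):

```python
import urllib.request

API_BASE = "https://api.together.xyz/v1"

def build_models_request(api_key: str) -> urllib.request.Request:
    # GET /v1/models lists the catalog and consumes no tokens, so it is
    # a safe way to confirm the key and billing are live.
    return urllib.request.Request(
        f"{API_BASE}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = build_models_request("PASTE-KEY-HERE")
print(req.full_url)  # → https://api.together.xyz/v1/models
```

To run it live, pass your real key and call `urllib.request.urlopen(req, timeout=30)`: a 200 means the key is accepted, a 401 means it is wrong or revoked.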

Test the connection inside Admaxxer

Once you've pasted your key into /settings/ai-providers, hit Test. Admaxxer makes a single no-cost call against the provider's /v1/models (or equivalent) endpoint and reports the result.

Available models

These are the pinned model IDs Admaxxer tests against. The model picker in chat will show these plus any live-catalog entries the provider returns.

meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
  Llama 3.1 70B Instruct Turbo (recommended) · 128K context · tools: yes
  Best price-performance on Together. Strong reasoning, tool use, and speed.

meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
  Llama 3.1 405B Instruct Turbo · 128K context · tools: yes
  Largest Llama Together hosts. Use when you need maximum quality and can accept higher latency.

Qwen/Qwen2.5-72B-Instruct-Turbo
  Qwen 2.5 72B Instruct Turbo · 128K context · tools: yes
  Strongest non-Llama open-weight option. Great for multilingual (esp. Chinese) workloads.
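Since model IDs are case-sensitive, it helps to validate against the pinned list before building a request. A sketch assuming Together's OpenAI-compatible chat-completions payload shape (the `PINNED_MODELS` set and `build_chat_payload` helper are illustrative, not Admaxxer internals):

```python
# The pinned model IDs from the table above, copied verbatim.
PINNED_MODELS = {
    "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    "meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo",
    "Qwen/Qwen2.5-72B-Instruct-Turbo",
}

def build_chat_payload(model: str, prompt: str) -> dict:
    if model not in PINNED_MODELS:
        # IDs are case-sensitive: "meta-llama/meta-llama-3.1-..."
        # would come back as model_not_found.
        raise ValueError(f"unknown or miscased model ID: {model!r}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
```

Failing fast on a miscased ID locally is cheaper than decoding a model_not_found response after the call.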

Common errors and fixes

401

When it happens: On Connect if the key is wrong or revoked.

Fix: Regenerate at api.together.xyz/settings/api-keys and paste the new key.

402

When it happens: When trial credit runs out and no card is on file.

Fix: Add billing at api.together.xyz/settings/billing.

429

When it happens: Under concurrent load — per-model limits vary.

Fix: Space out parallel requests. Together publishes per-model limits in their docs.

model_not_found

When it happens: If the model id casing is wrong (e.g. meta-llama vs Meta-Llama).

Fix: Copy model ids exactly as they appear in the Together catalog — they are case-sensitive.

timeout

When it happens: On 405B during peak demand — very large models get queued.

Fix: Retry with backoff, or drop to the 70B tier for lower latency.
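The retry-with-backoff advice for 429 and timeout can be sketched as exponential backoff with jitter. The base delay, cap, and retry count below are illustrative defaults, not Together's documented limits:

```python
import random
import time

def backoff_delays(retries: int = 5, base: float = 1.0, cap: float = 30.0):
    # Exponential backoff with jitter: roughly 1s, 2s, 4s, ... capped at
    # `cap`, each scaled by a random factor to avoid synchronized retries.
    for attempt in range(retries):
        yield min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)

def call_with_retry(fn, retryable=(TimeoutError,), delays=None):
    # Retry `fn` on the given exception types, sleeping between attempts;
    # the final attempt lets the error propagate to the caller.
    for delay in (delays if delays is not None else backoff_delays()):
        try:
            return fn()
        except retryable:
            time.sleep(delay)
    return fn()
```

Wrapping the 405B call this way rides out queueing at peak demand; if latency matters more than peak quality, dropping to the 70B tier is still the simpler fix.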