InternalAutomation

Custom AI Models · Local AI Build

Custom AI Models

Fine-tune the latest open-weight language and vision models on your own data, using SFT and LoRA adapters and context windows sized to your workflow, so you get production accuracy and keep ownership of the result.

  • open-weight
  • you own the weights
  • self-hostable
  • SFT + LoRA
claimsinvoicesticketstranscripts

Your documents, embedded — drifting from noise into the categories the model learns to tell apart.

Open
Open-weight families
Access the leading open-weight models from the Qwen, Kimi, and GLM families, fine-tuned on your data.
~30
Days to first fine-tune
From your data to a model running in production, then improved from real usage.
Yours
Weights + pipeline
You own the trained weights, adapters, and the retraining pipeline. Self-hostable.

Build pipeline

From your data to a model you own

  • ticket#4821 · refund, late shipment
  • invoiceINV-2231 · net-30 terms
  • contract§7.2 · SLA & credits
  • transcriptcall 14:02 · upgrade ask
refundnet-30SLAcreditSev-1Enterprise§7.2
epoch
14 / 40
train_loss
0.214
eval_acc
94.7%
examples
2,140
accuracy
94.7%
f1
0.93
vs base
+22 pts

acme-support-v3.safetensors

What's our SLA for a Sev-1 outage on the Enterprise plan?

generic modelI don't have your specific SLA. Enterprise plans typically offer around 99.9% uptime and priority support...
your model
self-hostedyour weightsQwen3 · LoRA
  1. 01Your datasetReal tickets, invoices, contracts, and transcripts — prepared and tokenized.
  2. 02Fine-tuneSFT plus LoRA adapters on a frozen open-weight base. Loss falls, accuracy climbs.
  3. 03EvaluateMeasured against held-out targets — not vibes. It beats the base model on your work.
  4. 04Deploy & ownYour weights, self-hosted. Generic answers become in-house ones.

02 / The catalog

Open-weight models, fine-tuned and yours

One place for the models worth building on. Access the leading open-weight families, tune them to your data, and keep the weights.

8 open-weight bases
  • Qwen3.7-7B-InstructLanguageFast, low-cost base for chat, extraction, and classification.
    Qwen7B128K ctxopen-weight
  • Qwen3.7-32B-InstructLanguageBalanced accuracy and cost for most production fine-tunes.
    Qwen32B128K ctxopen-weight
  • Qwen3.7-72B-InstructLanguageFrontier accuracy for the hardest reasoning tasks.
    Qwen72B128K ctxopen-weight
  • Qwen3.7-VL-7BVisionReads images, scans, and document layouts.
    Qwen7B32K ctxopen-weight
  • Qwen3.7-VL-32BVisionHigher-fidelity visual understanding for inspection and OCR.
    Qwen32B32K ctxopen-weight
  • KimiLanguageVery long context for whole-document and full-history reasoning.
    MoonshotMoE256K ctxopen-weight
  • GLMLanguageStrong bilingual performance and tool use.
    Zhipu32B128K ctxopen-weight
  • GLM-VVisionVision-language model for multimodal workflows.
    ZhipuVLM64K ctxopen-weight

03 / Fine-tune

Configure a model, then watch it train

Pick the shape of the build and run an illustrative fine-tune. When it fits, book a build for that exact spec.

Spec the model, then watch it train.

Set the shape of the build and run an illustrative fine-tune right here: the loss falls, the eval climbs, and the log streams. Every number is an estimate, not a promise.

Base size
Recommended approachLoRA adapterA LoRA adapter trains fast and cheap, and you can swap it per task without retraining.
step60/60
loss0.611
eval0.81
tok/s1.8k

Compute band: ~1 to 2 GPU-h. Illustrative: params x examples x 3 epochs.

awaiting run... the curve plots as steps complete
train_config.yaml
base_model: Qwen3.7-7B-Instruct
adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
sequence_len: 8192
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002
optimizer: adamw_bnb_8bit
datasets:
  - path: ./data/your-dataset.jsonl
    type: chat_template
val_set_size: 0.05

Base: Qwen3.7-7B-Instruct. Open-weight, trained on your data, owned by you.

Book a build for this spec

04 / What it changes

What the build is designed to do

  1. 01Fine-tune the latest open-weight models from the Qwen, Kimi, and GLM families on your own data
  2. 02Use SFT for accuracy and LoRA adapters for fast, low-cost iteration across tasks
  3. 03Read long documents in a single pass with context windows sized to your workflow
  4. 04Handle images, scans, and video frames with custom vision-language models
  5. 05Own your trained weights and adapters as proprietary business assets you can self-host

08 / FAQs

Custom AI Models questions

Which models do you fine-tune?

We work with the latest open-weight releases from families like Qwen, Kimi, and GLM, and we choose the specific version per project based on your accuracy, latency, context-length, and hardware needs. Because the weights are open, you own the fine-tuned result and can run it on your own infrastructure instead of depending on a closed API. These families also ship vision-language variants, so we can use one toolchain whether your task is text-only or needs to read images and documents.

What is the difference between SFT and a LoRA adapter?

Supervised fine-tuning (SFT) updates the model's weights on your labeled examples and is the most direct way to lift accuracy on your domain. A LoRA adapter trains a small set of extra parameters that sit on top of the base model, which is faster and cheaper, lets you keep separate adapters for separate tasks, and can be merged into the base weights once it performs well. We often start with LoRA to iterate quickly, then commit to full SFT or merge the adapter for the production build.

Can the model read long documents or images?

Yes. We size the context window to your workflow so the model can read an entire contract, claim history, or knowledge base in a single pass instead of losing detail across chunks. For visual work we fine-tune vision-language models (VLMs) that take images, scans, screenshots, or video frames as input, so the same model can, for example, read a photo of a damaged part or a scanned invoice and respond in your terminology.

How much data do I need to train a custom AI model?

The data requirements depend on the approach. Fine-tuning a pre-trained foundation model often requires as few as 500 to 5,000 high-quality examples to achieve excellent results, since the base model already understands general patterns. Training a model from scratch typically requires tens of thousands of examples. During our discovery phase, we assess your available data and recommend the best approach. If your data is limited, we can supplement it with synthetic data generation or transfer learning techniques.

How long does it take to build a custom AI model?

A typical custom model project takes 4-8 weeks from kickoff to production deployment. The first 1-2 weeks focus on data preparation and exploration. Training and validation usually take 1-3 weeks depending on model complexity. The final phase includes integration, testing, and deployment. Fine-tuning projects on existing foundation models are often faster, sometimes as quick as 2-3 weeks. We provide regular progress updates and intermediate results throughout the process.

What happens if the model's accuracy is not good enough?

Model development is inherently iterative, and we set clear performance benchmarks at the start of every project. If initial results fall short, we have multiple strategies to improve accuracy: collecting additional training data, engineering better features, trying alternative model architectures, or adjusting the problem framing. Our discovery phase includes a feasibility assessment so we can identify potential accuracy challenges before committing to full development. We do not consider a project complete until the model meets agreed-upon performance criteria.

Can custom models be updated as my business changes?

Yes, and this is one of the key advantages of custom models. We build retraining pipelines that allow your models to be updated with new data on a regular schedule, weekly, monthly, or quarterly depending on how quickly your business evolves. This ensures your AI stays current with changing customer preferences, new products, seasonal shifts, and evolving market conditions. We can manage the retraining process for you or train your team to handle it independently.

Turn Custom AI Models into something your team actually uses.

Name the work you want this to handle. We will map the build, show what is worth doing first, and what it costs. If there is no fit, we will say so.