Our USP · Custom SLM

Bring your data. Leave with a model.

A medical SLM, fine-tuned on your proprietary data, packaged to run on the CPUs you already own. No GPU procurement. No vendor lock-in. No ML team required.

Start a fine-tune · Read the security model

Why fine-tune?

RAG retrieves. Fine-tune internalises.

Retrieval lets a generic model look up your data at inference time. Fine-tuning teaches your model the language, structure, and judgement of your domain — so every output is faster, cheaper, and grounded in how your team actually works.

Lower latency — no retrieval round-trip for routine reasoning.
Lower cost — smaller models replace expensive frontier API calls.
Stronger privacy — model weights stay in your environment.
Higher consistency — your tone, your formats, your terminology.

The lifecycle

01 · Minutes
Upload
Drop in PDFs, EMR exports, structured tables, internal SOPs. We fingerprint and version everything.
02 · Minutes
Validate
Automatic PHI detection and redaction. Schema validation. Quality scoring with reject reasons.
03 · ≈ 4–6 hours
Fine-tune
LoRA / QLoRA recipes pre-configured for medical SLM bases. Eval harness scores hallucination, faithfulness, and clinical accuracy.
04 · 30 minutes
Evaluate
Side-by-side outputs vs base model. Manual SME review queue. Automated regression on your golden test set.
05 · Minutes
Package
Quantised GGUF, ONNX, or vLLM-ready bundles. Includes model card, eval report, and deployment manifest.
06 · Same day
Deploy
Pull a Docker image into your VPC, on-prem, or hospital intranet. CPU is enough.
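
As a rough illustration of the Upload and Validate steps, here is a minimal sketch of content fingerprinting, versioning, and redaction. It is not the Evarx pipeline: `UploadRecord`, `register`, `redact`, and the two regex patterns are hypothetical names chosen for this example, and regex matching stands in for real PHI detection, which needs far more than pattern matching.

```python
import hashlib
import re
from dataclasses import dataclass

# Hypothetical, deliberately naive patterns. Real PHI detection covers names,
# dates, record numbers, and free-text context that regexes cannot catch.
PHI_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{10}\b"),
}


@dataclass(frozen=True)
class UploadRecord:
    name: str
    version: int
    fingerprint: str  # SHA-256 hex digest of the raw bytes


def fingerprint(data: bytes) -> str:
    """Return a content fingerprint that changes whenever the bytes change."""
    return hashlib.sha256(data).hexdigest()


def register(registry: dict[str, list[UploadRecord]], name: str, data: bytes) -> UploadRecord:
    """Record a new version only when the content differs from the latest one."""
    history = registry.setdefault(name, [])
    digest = fingerprint(data)
    if history and history[-1].fingerprint == digest:
        return history[-1]  # identical re-upload: keep the existing version
    record = UploadRecord(name, len(history) + 1, digest)
    history.append(record)
    return record


def redact(text: str) -> tuple[str, int]:
    """Replace each pattern match with a [LABEL] tag and count the replacements."""
    total = 0
    for label, pattern in PHI_PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        total += count
    return text, total
```

Keying versions on a content hash means a re-upload of an unchanged file does not create a new version, which keeps the training history reproducible.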

Base models

Pick your size. They all run on CPU.

Three sizes of the Evarx Medical SLM serve as the base for your fine-tune. Quantised builds (Q4_K_M, Q5_K_M) ship for every size — even the 7B fits on a workstation.

Model	Size	Recommended CPU	Latency (Q4_K_M)
Evarx-Med-1B	1.1B params	8 vCPU · 16 GB RAM	~120ms / token
Evarx-Med-3B	3.0B params	16 vCPU · 32 GB RAM	~180ms / token
Evarx-Med-7B	7.2B params	32 vCPU · 64 GB RAM	~280ms / token
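
To read the latency column as throughput, the per-token figures can be converted as follows. This is a back-of-the-envelope sketch: it assumes a single request stream and ignores prompt processing time, and the 200-token reply length is an arbitrary example value.

```python
# Per-token latency at Q4_K_M in milliseconds, copied from the table above.
LATENCY_MS = {"Evarx-Med-1B": 120, "Evarx-Med-3B": 180, "Evarx-Med-7B": 280}


def tokens_per_second(latency_ms: float) -> float:
    """Convert per-token latency into single-stream throughput."""
    return 1000.0 / latency_ms


def generation_seconds(latency_ms: float, output_tokens: int) -> float:
    """Time to generate a reply of the given length, excluding prompt processing."""
    return latency_ms * output_tokens / 1000.0


for model, ms in LATENCY_MS.items():
    print(f"{model}: {tokens_per_second(ms):.1f} tokens/s, "
          f"{generation_seconds(ms, 200):.0f} s for a 200-token reply")
# Evarx-Med-1B: 8.3 tokens/s, 24 s for a 200-token reply
# Evarx-Med-3B: 5.6 tokens/s, 36 s for a 200-token reply
# Evarx-Med-7B: 3.6 tokens/s, 56 s for a 200-token reply
```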

Live demo

Watch a custom fine-tune run end-to-end.

[Interactive demo: plays the Upload, Validate, Fine-tune, Evaluate, and Deploy stages with an overall progress bar and a live training log. Engine shown: evarx-trainer · GPU.]

TCO calculator

What does private actually cost?

Live monthly cost across the three options. Tweak the sliders to see when a fine-tune pays for itself.

Monthly tokens: 50M (5 Cr tokens / month)

Active users: 20 seats

Custom SLM hosting

Runs on your CPU servers

Custom vs Frontier

₹75,750 saved / month

≈ 98% lower than running everything on a frontier hosted API.

Monthly cost · INR

Indicative. Actuals depend on volume and contract.

Option	Monthly cost	Per-token rate	Fixed cost	Notes
Frontier hosted API	₹77,500	₹1,250 / 1M tokens	₹15,000	Per-token pricing dominates at this volume.
Evarx Private (Medical SLM)	₹51,500	₹180 / 1M tokens	₹42,500	GPU-backed inference in your VPC.
Evarx Custom (Fine-tuned · CPU)	₹1,750	₹35 / 1M tokens	None	Runs on hardware you already own.

Custom-tier on-prem deployments hit cost-parity within ~3 months at this volume.
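
The monthly figures above are consistent with a simple model: per-token rate times volume, plus the fixed fee. The sketch below reproduces them under that assumption. The `Option` class and function names are illustrative, and the seat count is assumed not to enter the formula, since the published numbers match without it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    rate_per_million: int  # INR per 1M tokens
    fixed: int             # INR per month


# Rates and fixed fees as listed in the calculator.
OPTIONS = [
    Option("Frontier hosted API", 1250, 15000),
    Option("Evarx Private (Medical SLM)", 180, 42500),
    Option("Evarx Custom (Fine-tuned, CPU)", 35, 0),
]


def monthly_cost(option: Option, tokens_millions: float) -> float:
    """Assumed cost model: rate x volume (in millions of tokens) + fixed fee."""
    return option.rate_per_million * tokens_millions + option.fixed


def savings(baseline: Option, candidate: Option, tokens_millions: float) -> tuple[float, float]:
    """Return (INR saved per month, percentage saved) against the baseline."""
    base = monthly_cost(baseline, tokens_millions)
    saved = base - monthly_cost(candidate, tokens_millions)
    return saved, saved / base * 100


for option in OPTIONS:
    print(f"{option.name}: INR {monthly_cost(option, 50):,.0f}")

saved, pct = savings(OPTIONS[0], OPTIONS[2], 50)
print(f"Saved: INR {saved:,.0f} ({pct:.1f}%)")
```

At 50M tokens this gives ₹77,500, ₹51,500, and ₹1,750, and a saving of ₹75,750, or about 97.7%, which the page rounds to 98%.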

Continuous improvement loop

Every workflow run that gets a thumbs-up — or an SME-edited correction — becomes a labelled training pair. Schedule nightly refresh runs and your model improves while you sleep. Roll back any version with one click.

Versioned weights with git-style diffing
Eval gates prevent regressions from shipping
Per-team feedback isolation
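
One way to picture an eval gate is as a comparison between the current model's scores and a candidate's, with promotion blocked on any regression. The sketch below assumes that shape. `EvalReport`, `regressions`, `promote`, the metric names, and the 0.01 tolerance are all hypothetical and not taken from the Evarx product.

```python
from dataclasses import dataclass


@dataclass
class EvalReport:
    version: str
    scores: dict[str, float]  # metric name -> score, higher is better


def regressions(baseline: EvalReport, candidate: EvalReport, tolerance: float = 0.01) -> list[str]:
    """List metrics where the candidate is missing or worse than baseline by more than tolerance."""
    failed = []
    for metric, base_score in baseline.scores.items():
        cand_score = candidate.scores.get(metric)
        if cand_score is None or cand_score < base_score - tolerance:
            failed.append(metric)
    return failed


def promote(baseline: EvalReport, candidate: EvalReport, tolerance: float = 0.01) -> EvalReport:
    """Return the version that should serve traffic: the candidate only if nothing regressed."""
    return baseline if regressions(baseline, candidate, tolerance) else candidate


# Example scores are made up for illustration.
current = EvalReport("v1", {"faithfulness": 0.90, "clinical_accuracy": 0.85})
nightly = EvalReport("v2", {"faithfulness": 0.95, "clinical_accuracy": 0.80})
print(regressions(current, nightly))       # ['clinical_accuracy']
print(promote(current, nightly).version)   # v1
```

Gating on every metric, rather than on an average, keeps a gain in one area from hiding a loss in another.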

Why CPU-runnable matters

Hospitals, regulated pharma units, and air-gapped research sites can't always procure GPUs. Quantised SLMs let you deploy the same model your data scientists trained onto a 16-core production server, with no infrastructure rewrite.

Runs on existing hospital-grade hardware
Air-gap deployable via signed Docker images
Cost per token approaches zero at steady state

Get started

A fine-tune scoped, signed off, and running this week.

Tell us your highest-leverage workflow. We'll respond with a data spec, a fixed-price scope, and a deployment plan within a business day.

NDA & DPA on request
Fixed-price first fine-tune
Includes one production agent

Book a demo