CoDynamics Lab Corporation — Proprietary Inference Layer
Direct & Aggressive
The end of RAG. Document memory at the model level.
Performance-Led
Kill the context window. 40× faster document intelligence.
Economic-Led
Stop re-reading. Compile once, query forever for 97% less.
Clarity-Led
LATCH: document memory without the RAG artifacts.
Portability-Led
Compile once. Ship the .latchdoc file. Query anywhere in 1.6ms.
I couldn't pick one tagline, because for most enterprise workloads all five are true. LATCH is a fundamental shift in how models remember documents. Not RAG. Not prompt compression. Something new.
Seeing is believing
Same model. Same query. Same documents.
Qwen 2.5 14B — Standard Baseline
23.10s
Illustrative Time to First Token
● Baseline video coming soon
Baseline placeholder: we are intentionally showing only the live LATCH demo video here for now. A matching baseline capture will be added later.
Qwen 2.5 14B — LATCH Compiled
0.11s
Time to First Token
● Live capture
Live LATCH room capture showing compiled cross-document inference. This is the single public demo video on the page for now.

Benchmarked on DOJ antitrust brief, SEC 10-K, credit agreement, commercial lease, and NIST AI RMF on NVIDIA H100 80GB.
The baseline side-by-side video is intentionally deferred; the current public page keeps only the LATCH runtime demo visible.

Hard numbers

Built to be measured, not marketed.

Every claim below was benchmarked on real enterprise documents on NVIDIA H100 80GB with vLLM serving infrastructure.

0.11s
TTFT
vs 23.1s baseline cold start
210×
Faster Cold Start
Time-to-first-token speedup
1.6ms
Cache Reload
From .latch file on disk
91.7%
Multi-Doc Pass
11/12 benchmark gates
97%
Cost Reduction
Amortized after 25 queries
50%
Less VRAM
More instances per node
4
Model Families
Qwen · Mistral · Llama · DeepSeek
5.2×
End-to-End Speedup
Full query cycle improvement
How it works

Compile once. Query indefinitely.

LATCH intercepts the standard inference path and replaces per-query document processing with a persistent representation. The economics compound with every additional query.

01
Compile
Documents processed through proprietary compilation. Persistent representation saved to disk as .latch or .latchdoc.
02
Query
Each query runs against compiled memory. Sub-200ms response. No raw document re-processing. Ever.
03
Amortize
Compilation cost paid once. Shared across teams, workflows, and time. More queries = lower unit cost.
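The amortization step can be made concrete with a short sketch. The unit costs below are illustrative placeholders, not the benchmarked figures: a one-time compile cost is spread across every subsequent query, so the unit cost falls monotonically with query count.

```python
def amortized_cost_per_query(compile_cost: float,
                             query_cost: float,
                             n_queries: int) -> float:
    """One-time compile cost spread over n queries, plus the
    per-query cost of hitting compiled memory."""
    return compile_cost / n_queries + query_cost

# Illustrative unit costs (assumptions, not benchmarked figures):
# one full-context pass = 1.0, one compiled query = 0.01.
for n in (1, 25, 1000):
    print(n, round(amortized_cost_per_query(1.0, 0.01, n), 4))
```

At 25 queries the amortized unit cost here is 0.05 versus 1.0 for re-reading the corpus every time, which is the shape of the "97% cost reduction, amortized after 25 queries" claim above.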
Portable document memory

Stop indexing. Start shipping .latch files.

Compilation is now a tangible, portable asset. Save it. Transfer it. Load it in 1.6ms. No re-computation. No extra cost.

.latch
Privacy-First Variant
A lightweight binary of your compiled corpus. Contains only the model-level memory — no source text included.
  • Compiled document intelligence
  • 1.6ms reload from disk
  • Share analysis without exposing source docs
  • Smallest possible file size
.latchdoc
Full Intelligence Package
Everything in .latch, plus embedded raw text for full-text search and automatic quality fallback. The smart default.
  • Everything in .latch
  • Ctrl+F / needle-in-haystack search
  • Automatic fallback for edge-case queries
  • Negligible size overhead vs .latch
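The split between the two formats can be sketched as a simple predicate on the payload. The field names ("memory", "raw_text") are illustrative assumptions for this sketch, not the actual on-disk layout:

```python
def package_kind(package: dict) -> str:
    """Classify a compiled package. Hypothetical fields: 'memory' is
    the compiled representation (always present); 'raw_text' is the
    embedded source text that only .latchdoc carries."""
    if "memory" not in package:
        raise ValueError("not a compiled LATCH package")
    return ".latchdoc" if "raw_text" in package else ".latch"

print(package_kind({"memory": b"\x00"}))                      # .latch
print(package_kind({"memory": b"\x00", "raw_text": "10-K"}))  # .latchdoc
```

The presence of the embedded raw text is what enables the Ctrl+F search and quality fallback listed above; its absence is what makes .latch the privacy-first, smallest-size variant.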
1. Compile on your H100
2. Save as .latchdoc
3. Ship to your team
4. They query in 0.11s with zero recompute
Quickstart

Show the operator path, not just the benchmark.

Not everyone knows how to provision a GPU room correctly on day one. The product page should still make the operating model legible: start the Docker container, then either call the API directly or open the LATCH Console from your own machine.

Terminal 1 · API Flow
Start LATCH, verify readiness, upload a document, and send a query over the standard local API surface.
$ docker run --gpus all -p 8091:8091 codynamics/latch:latest
[latch] runtime starting on http://0.0.0.0:8091
[latch] status=loading profile=cdlac_latch_qwen14b_locked_20260317
[latch] warmup complete status=ready

$ curl -s http://127.0.0.1:8091/health | jq '.status, .default_memory_tokens'
"ready"
1024

$ curl -s http://127.0.0.1:8091/compile_file \
  -H 'Content-Type: application/json' \
  -d '{"filename":"acme-10k.pdf","content_base64":"<base64>"}'
{ "doc_id":"doc_6f3a59b3f8", "status":"ready", "tokens":95030 }

$ curl -s http://127.0.0.1:8091/query \
  -H 'Content-Type: application/json' \
  -d '{"query":"Summarize the company in 3 bullets.","doc_ids":["doc_6f3a59b3f8"]}'
{ "results":[{ "answer":"Acme provides cloud learning software..." }] }
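The same flow can be scripted. This is a minimal stdlib sketch against the endpoints shown above; the payload shapes mirror the curl calls, and anything beyond them is an assumption:

```python
import base64
import json
import urllib.request

BASE = "http://127.0.0.1:8091"  # default port from the docker run above

def compile_payload(filename: str, raw: bytes) -> dict:
    """Body for POST /compile_file: filename plus base64-encoded bytes."""
    return {"filename": filename,
            "content_base64": base64.b64encode(raw).decode()}

def query_payload(question: str, doc_ids: list) -> dict:
    """Body for POST /query against previously compiled doc_ids."""
    return {"query": question, "doc_ids": doc_ids}

def post(path: str, payload: dict) -> dict:
    """POST JSON to the local LATCH runtime and decode the reply."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# With the container running:
# doc = post("/compile_file",
#            compile_payload("acme-10k.pdf", open("acme-10k.pdf", "rb").read()))
# ans = post("/query",
#            query_payload("Summarize the company in 3 bullets.", [doc["doc_id"]]))
```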
Terminal 2 · Console Flow
Use the same Docker runtime, then open the local LATCH Console in a browser to upload files, query, inspect telemetry, and manage workspaces.
$ docker run --gpus all -p 8091:8091 codynamics/latch:latest
[latch] runtime starting on http://0.0.0.0:8091
[latch] UI available at http://127.0.0.1:8091/

$ curl -s http://127.0.0.1:8091/health | jq '.ready, .service_rev'
true
"latch_product_nomount_20260325"

$ python3 -m webbrowser http://127.0.0.1:8091/
Opening CDLaC-LATCH Console...

# In the console UI:
1. Upload PDFs, DOCX, XLSX, PPTX, TXT, MD, HTML, CSV, JSON, or XML
2. Adjust compile/query controls and inspect runtime defaults from /health
3. Run prompts, inspect telemetry, and save or load workspaces
LATCH occupies a new category

Not an optimization. A replacement.

Every alternative re-reads, re-chunks, or re-embeds on every query. LATCH doesn't.

Capability                     | Full-Context | RAG             | KV Cache      | LATCH
Compile once, reuse forever    |              |                 |               | ✓
Cross-document reasoning       |              | Limited         |               | ✓
Sub-200ms TTFT                 |              |                 | Partial       | ✓
Cost amortization over queries |              | Limited         |               | ✓
Persistent on disk             |              | Embeddings only | Session-bound | ✓
Model-agnostic                 |              |                 | Per-model     | ✓
No chunking artifacts          |              |                 |               | ✓
Portable binary format         |              |                 |               | ✓ .latch/.latchdoc
Get started

Your documents. Your infrastructure.
Sub-200ms answers.

Self-Hosted
Run on your own GPU
$79
One-time license. Docker image, license key, and one-line deploy. Your documents never leave your infrastructure. OpenAI-format API compatible.
Buy on Gumroad →
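"OpenAI-format API compatible" means existing OpenAI-style clients can be pointed at the local runtime. A minimal request-builder sketch; the /v1/chat/completions route and the model name are assumptions about a typical OpenAI-format surface, not confirmed endpoint details:

```python
import json
import urllib.request

def chat_request(base: str, question: str,
                 model: str = "latch-local") -> urllib.request.Request:
    """Build an OpenAI-format chat completion request. The route and
    model name here are illustrative assumptions."""
    body = {"model": model,
            "messages": [{"role": "user", "content": question}]}
    return urllib.request.Request(
        base + "/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )

req = chat_request("http://127.0.0.1:8091", "Summarize the lease term.")
print(req.full_url)  # http://127.0.0.1:8091/v1/chat/completions
```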
Hosted Access
Managed GPU runtime
Coming Soon
For teams that want LATCH on hosted infrastructure without owning the GPU room themselves. Same product direction, managed delivery.
Notify Me
We are keeping the first release self-hosted. Managed hosted access will follow after the operator and licensing path hardens.
Investor Portal
Open the private materials
Restricted Access
Use your investor code to open the private pitch portal. The public landing page still does not link the console directly.

Invalid access code.

Existing investor codes and portal destinations are preserved from the prior landing page.