Coming Soon

Telosima API

Direct access to a provenance-stamped knowledge graph for frontier AI companies, research institutions, and infrastructure builders.

What You Get

Telosima mints entities at scale. Each entity is a domain measured against the full vocabulary of human language, timestamped, stamped with provenance, and connected by graph edges that are discovered deterministically.

◆

Full Entity Data

Schema scores, content fingerprints, semantic matches across 42 languages, temporal attestation, discovery timestamps, recrawl history.

◆

Graph Edge Data

Common edges (what entities share), uncommon edges (what they don't share), substrate connections, starship/starjet relationships.

◆

Root-LD + Recursive-LD

Three-layer linked data structure (Anchor, Body, Recursive) with full provenance, timestamped passes, dimensional context.

◆

Timestamped Crawl Archives

Every recrawl preserved. See how websites evolve over time. Temporal data for training models on web change patterns.

◆

Schema Gap Measurements

Cross-referenced with language and geography. Empirical data on schema adoption, linguistic bias, coverage patterns.

◆

Machine-Readable Manifests

Every entity has a manifest.json. Fetch structured data without rendering HTML. Optimized for crawler ingestion.
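As a rough illustration of manifest-first fetching, the sketch below derives a manifest URL from an entity's base URL and reads a few fields from manifest text that has already been downloaded. The folder layout, the `example.com` host, and the field names `domain` and `schema_score` are assumptions made for this example; the actual manifest schema has not been published.

```python
import json
from urllib.parse import urljoin


def manifest_url(entity_base_url: str) -> str:
    """Return the manifest.json URL for an entity folder (hypothetical layout)."""
    # urljoin replaces the last path segment unless the base ends with "/".
    if not entity_base_url.endswith("/"):
        entity_base_url += "/"
    return urljoin(entity_base_url, "manifest.json")


def read_manifest(raw: str) -> dict:
    """Parse manifest text and keep only the fields this example cares about."""
    data = json.loads(raw)
    # Field names are illustrative assumptions, not a published schema.
    return {key: data.get(key) for key in ("domain", "schema_score")}


sample = '{"domain": "example.org", "schema_score": 0.42, "extra": true}'
print(manifest_url("https://example.com/entities/abc123"))
print(read_manifest(sample))
```

No HTML is parsed at any point: the client only needs a URL join and a JSON decode.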

Use Cases

Training Data for Frontier AI Models

Citation-grounded web data with full provenance. Every entity traced to its source. Temporal attestation for tracking how information changes. Semantic fingerprints across 42 languages. Train models on structured, falsifiable knowledge.

Retrieval Infrastructure for RAG Systems

Pre-structured entities with graph edges. Traverse from any entry point. Common and uncommon edges surface connections invisible to keyword search. Manifests enable efficient fetching without HTML parsing.
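To make "traverse from any entry point" concrete, here is a minimal sketch of a breadth-first walk over an edge list, as a retrieval layer might do before ranking candidates. The edge list, the entity names, and the `max_hops` cutoff are all hypothetical; nothing here reflects Telosima's actual edge format.

```python
from collections import deque

# Hypothetical edge list: pairs of entities joined by a common edge.
EDGES = [
    ("a.example", "b.example"),
    ("b.example", "c.example"),
    ("d.example", "e.example"),
]


def build_adjacency(edges):
    """Turn undirected edge pairs into an adjacency mapping."""
    adjacency = {}
    for left, right in edges:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)
    return adjacency


def reachable(adjacency, start, max_hops):
    """Breadth-first traversal: map each entity within max_hops to its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


graph = build_adjacency(EDGES)
print(reachable(graph, "a.example", max_hops=2))
```

The hop count gives a simple distance signal that a RAG pipeline could combine with text relevance.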

Semantic Web Research

Empirical measurement of schema adoption across languages and geographies. Linguistic bias analysis. Knowledge graph evolution over time. Substrate dimension validation. Falsifiable datasets for academic research.

Knowledge Graph Construction

Entities with deterministic measurements. Schema scores, language overlap patterns, TLD distributions. Build domain-specific graphs from Telosima's foundation. No manual tagging required.

Competitive Intelligence + Market Analysis

Track schema adoption patterns by industry, geography, language. Identify gaps. Measure structural data maturity across sectors. See how competitors architect their information.

Why Telosima Data

Most web crawl datasets are noisy, unstructured, and hard to verify. Telosima is built to be different.

Every entity carries full provenance. Discovery timestamp, mint timestamp, source URL, content hash, recrawl history. Every measurement is falsifiable. You know where the data came from and when it was captured.
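The provenance fields named above could be held in a small record type like the one sketched below, with a check that captured content still matches its stored hash. The class name, the ISO 8601 timestamp strings, and the choice of SHA-256 are assumptions for illustration; the source does not say which hash function or timestamp format Telosima uses.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ProvenanceRecord:
    """Illustrative container for the provenance fields listed above."""
    source_url: str
    discovery_timestamp: str  # ISO 8601 string (assumed format)
    mint_timestamp: str       # ISO 8601 string (assumed format)
    content_hash: str         # hex digest; SHA-256 is an assumption
    recrawl_history: list = field(default_factory=list)

    def matches(self, content: bytes) -> bool:
        """Return True if the content hashes to the stored digest."""
        return hashlib.sha256(content).hexdigest() == self.content_hash


body = b"<html>example</html>"
record = ProvenanceRecord(
    source_url="https://example.org/",
    discovery_timestamp="2026-01-01T00:00:00Z",
    mint_timestamp="2026-01-01T00:05:00Z",
    content_hash=hashlib.sha256(body).hexdigest(),
)
print(record.matches(body))
print(record.matches(b"tampered"))
```

A check of this kind is what makes a measurement falsifiable: anyone holding the content can recompute the hash and compare.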

Deterministic measurements, labeled outputs. Schema scores are calculated against the full vocabulary. Semantic fingerprints run across 42 language dictionaries. Generated outputs are labeled as generated with inputs shown. No black box.

Graph edges discovered through measurement. Common and uncommon edges form deterministically. No manual tagging. No subjective classification. Relationships emerge from accumulated measurements.
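One possible reading of common and uncommon edges, offered only as a sketch, treats each entity as a set of measured attributes: common edges are the intersection of two sets, and uncommon edges are the symmetric difference. The attribute labels below are invented for the example, and Telosima's actual edge computation is not described on this page.

```python
def common_edges(a: set, b: set) -> set:
    """Attributes both entities share (set intersection)."""
    return a & b


def uncommon_edges(a: set, b: set) -> set:
    """Attributes held by exactly one of the two entities (symmetric difference)."""
    return a ^ b


# Hypothetical measured attributes for two entities.
first = {"lang:en", "tld:org", "schema:Article"}
second = {"lang:en", "tld:com", "schema:Article"}

print(sorted(common_edges(first, second)))
print(sorted(uncommon_edges(first, second)))
```

Because both operations are pure set arithmetic, the same inputs always yield the same edges, which is the sense of "deterministic" assumed here.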

Temporal attestation. Entities are recrawled on a schedule. Every pass generates a new timestamped record, and previous data is archived in the entity's folder. See how websites change over time.
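Given a series of timestamped passes, change detection can be as simple as comparing each pass's content hash with the one before it. The sketch below assumes each pass is a dictionary with `timestamp` and `content_hash` keys and that timestamps are UTC ISO 8601 strings in a uniform format (so they sort lexicographically); both are assumptions, not a documented record layout.

```python
def changed_passes(passes):
    """Return timestamps of passes whose content hash differs from the previous pass."""
    ordered = sorted(passes, key=lambda p: p["timestamp"])
    changes = []
    for previous, current in zip(ordered, ordered[1:]):
        if current["content_hash"] != previous["content_hash"]:
            changes.append(current["timestamp"])
    return changes


# Hypothetical recrawl history for one entity (deliberately out of order).
history = [
    {"timestamp": "2026-03-01T00:00:00Z", "content_hash": "bbb"},
    {"timestamp": "2026-01-01T00:00:00Z", "content_hash": "aaa"},
    {"timestamp": "2026-02-01T00:00:00Z", "content_hash": "aaa"},
    {"timestamp": "2026-04-01T00:00:00Z", "content_hash": "bbb"},
]
print(changed_passes(history))
```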

Multilingual by design. 42 language dictionaries. Semantic overlap patterns across language families. Linguistic bias measurements. This is the only web dataset measuring knowledge organization beneath the language layer.

Early Access Available

If you're building AI infrastructure and need citation-grounded, provenanced data at scale — let's talk.

Request Early Access →

Read the Thesis

Technical Specifications

API design is in progress. Early access partners will shape the specification.

Planned endpoints:

/entities — Query minted entities by domain, TLD, schema score, language matches
/entities/{id} — Fetch full entity data including manifest, Root-LD, folder contents
/starships — List available starships (schema, languages, TLDs)
/starships/{id} — Fetch starship data including all appended entities
/starjets/{id} — Fetch starjet data (individual vocabulary nodes)
/graph/edges — Query graph edges by type, confidence, entity pairs
/search — Fuzzy search across all minted entities
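Since the specification is still in design, the following is only a sketch of how a client might build URLs for the planned `/entities` and `/entities/{id}` endpoints. The host `api.example.com` is a placeholder, and the filter names `tld` and `min_schema_score` are hypothetical; the real query parameters have not been announced.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.example.com"  # placeholder host, not a real endpoint


def entities_query_url(**filters) -> str:
    """Build a /entities query URL from keyword filters (hypothetical names)."""
    # Drop unset filters and sort keys so the URL is stable across calls.
    params = {key: value for key, value in sorted(filters.items()) if value is not None}
    query = urlencode(params)
    return f"{BASE_URL}/entities" + (f"?{query}" if query else "")


def entity_url(entity_id: str) -> str:
    """Build a /entities/{id} URL, percent-encoding the id as one path segment."""
    return f"{BASE_URL}/entities/{quote(entity_id, safe='')}"


print(entities_query_url(tld="org", min_schema_score=0.5))
print(entity_url("abc/123"))
```

Sorting the parameters keeps URLs cache-friendly, and encoding the id with `safe=''` prevents a slash inside an id from being read as a path separator.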

Response format: JSON-LD with Root-LD wrapper. Machine-readable, traversable, provenanced.

Rate limits: TBD based on tier (research, commercial, enterprise).

Authentication: API key + OAuth for enterprise tiers.
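A client request under these plans might look something like the sketch below, which prepares (but does not send) a request that asks for JSON-LD and carries an API key. Sending the key as a `Bearer` token in the `Authorization` header is an assumption, as are the placeholder host and the key value; the actual authentication scheme is not yet specified.

```python
from urllib.request import Request


def build_request(url: str, api_key: str) -> Request:
    """Prepare a GET request for JSON-LD without sending it."""
    return Request(
        url,
        headers={
            "Accept": "application/ld+json",
            # Bearer scheme is an assumption; the real header is undecided.
            "Authorization": f"Bearer {api_key}",
        },
        method="GET",
    )


req = build_request("https://api.example.com/entities", "demo-key")
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```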

Get in Touch

Early access is available for frontier AI companies, research institutions, and infrastructure builders.

If you're building something that needs citation-grounded, provenanced knowledge at scale, reach out.

Contact →

System Status

Page: API
Status: COMING SOON
Early Access: AVAILABLE
Entities: 120,399
Languages: 42
Provenance: 100%
Endpoints: IN DESIGN
Format: JSON-LD
Access: REQUEST

Machine Entry Point

entities: /api/entities
braid: /api/braid
graph: /api/graph
lexicon: /api/lexicon
content_type: application/ld+json

Build: 2026-PROD Spec: receptor-v1.0 Status: OPERATIONAL

Enterprise infrastructure. Early access available.