Telosima Thesis Version 1.0 — April 11, 2026

Declaration

When information is organized, decisions must be made about what things mean and how they connect.

The semantic web, as Tim Berners-Lee and others envisioned it, was correct in diagnosis: the web needed machine-readable structure. The prescription — mass adoption of self-classification standards — assumed everyone would describe themselves using the same vocabulary, follow the same rules, and maintain the same structures over time. This was right in spirit. In practice, it required coordination at a scale the web has never achieved.

Most people building websites have no idea how to make their content machine-readable. Even those who do are doing it for one reason: to be ranked by a search engine. The game became clear. Add schema markup so Google ranks you higher. Structure your content so the algorithm finds it. The semantic web became a compliance exercise, and compliance became leverage. A handful of frontier companies now stand between every human seeking information and the information itself.

This raises a question with immediate consequence: as artificial intelligence becomes the mediator between human inquiry and the internet, who decides what information is relevant? Who decides what gets remembered, what gets forgotten, what gets shown first, what gets buried? We respect Google, OpenAI, Anthropic, and the rest of the frontier builders. We also recognize the risk. When a small number of entities control the entire retrieval layer, the web ceases to be open in any meaningful sense.

To classify is to do ontology. To do ontology is to claim the position of god. To claim that position is to inherit the limits of the one making the claim.

Every ontology maker belongs to a language. Every language carves reality differently. English distinguishes things that Mandarin does not. Czech makes distinctions English cannot express. Kurdish, Icelandic, Vietnamese, Welsh — every language organizes the world according to patterns that feel natural to its speakers and alien to everyone else.

Schema.org was built in English by four companies optimizing for search. Its categories — Person, Place, Event, Organization, Thing — are English cognitive structures imposed on the global web. Every website in every language that wants to be machine-readable must conform to these categories even when those categories do not map cleanly to how their language structures meaning.

This is the bias baked into the foundation. We are not saying it was malicious. We are saying it was inevitable. You cannot build a universal ontology from a single linguistic position. The claim contains its own refutation.

Knowledge is preverbal. Language is the interface. Every language-bound ontology is therefore a translation that loses signal.

Telosima begins before language. We measure what entities say about themselves across the full vocabulary of human expression — 42 languages, 1.8 million words, growing — and we let the semantic neighborhood emerge from those measurements. We do this deterministically. We do this with full provenance. We timestamp everything. We show our work. We never rank. We never decide what something is. We surround it, measure it, and let meaning complete itself in the mind of whoever is reading.

This is reverse ontology. The label is a starting point handed to you by the person who made it. The meaning is yours to decide.

We are building a bidirectional web — one that serves machines and humans equally. Machines need structured, traversable, provenanced data they can reason over at scale. Humans need the freedom to find what they are looking for and travel at leisure without surfing the ranking wave. These goals align. The architecture that serves one serves the other.

Telosima is machine-readable, multilingual, provenanced, with no ranking and no black box. Every claim traces to its source. Every measurement shows its work. Every connection is falsifiable. The graph builds itself through stigmergy — indirect coordination where traces left by prior actions stimulate subsequent actions. The braids discover domains. The pipeline mints them. The starships and starjets accumulate connections. The graph grows forever without anyone deciding what it should become.

This is the actual semantic web. The one that works. The one being built right now.

II.

Knowledge Before Language

Knowledge exists independent of the words used to describe it.

A child knows hunger before learning the word. A dog understands territory without language. Every human who has ever lived recognizes the difference between approach and avoid, between bounded and unbounded, between fixed and flowing — these distinctions precede speech. They are preverbal. They are pre-cultural. They are encoded in biological cognition before any language attempts to describe them.

Language is the measurement tool we use to access knowledge and transmit it between minds. Different languages measure differently. English distinguishes aspects of time that Mandarin encodes through context. Czech makes grammatical distinctions English cannot express. Korean organizes spatial relationships in ways that require full sentences to approximate in English. Each language is a different instrument measuring the same underlying reality.

Schema.org treats language as knowledge. It assumes the English word "Event" corresponds to an ontological primitive — that reality actually organizes itself into discrete happenings the way English speakers describe happenings. Other languages carve time and occurrence differently. The Hopi language, for example, does not grammatically distinguish past, present, and future the way English does. The knowledge being described is the same. The measurement tool produces different outputs.

When you build a universal knowledge system on top of one language's measurement patterns, you are building on a translation of reality rather than reality itself. The translation works for those who share the linguistic frame. For everyone else, it requires a second translation — from their measurement tool into English's measurement tool — before their knowledge becomes machine-readable.

III.

The Problem

The Schema.org Gap

Schema.org launched in 2011 as a joint effort by Google, Microsoft, Yahoo, and Yandex. The vocabulary is written in English. The foundational types — Person, Place, Event, Organization, Thing — are English conceptual structures.

A website in Czech describing a spolupráce (collaboration) must translate that concept into English categories to be machine-readable. A Kurdish site describing a cultural gathering that exists somewhere between Event and Organization must choose one or invent a workaround. The schema assumes reality organizes itself the way English organizes reality.

This assumption is structural. Every property name, every enumeration value, every parent-child relationship in the schema reflects choices made in English by teams working primarily in English-speaking countries. The vocabulary works seamlessly for websites built by English speakers describing English-legible concepts. For everyone else, it is a translation layer with inevitable loss.

We are measuring this gap empirically for the first time.

Telosima mints entities at scale — millions of domains across every language, every geography, every TLD. Each entity receives a schema score: how much structured data exists, which types are present, which properties are used, which are missing. Cross-referenced with language and location, this produces a dataset that shows where schema adoption is high, where it is low, and whether the gap correlates with linguistic distance from English.

This will be provenanced. This will be timestamped. This will be falsifiable. Researchers studying web standards, linguistic bias, or knowledge architecture will have access to the raw data. The measurement speaks for itself.

God Ontology vs Reverse Ontology

The central choice in any knowledge system: who decides what things mean?

God Ontology positions the system as the authority. It decides categories before encountering entities. It decides rank before measuring relevance. It decides what is signal and what is noise. The authority precedes the observation. Google operates this way. Every major search engine operates this way. The result is power and efficiency. The cost is a black box. Once the system claims to know what things mean, it cannot show its reasoning without exposing that reasoning as judgment rather than truth.

Reverse Ontology positions the system as the instrument. It surrounds entities with measurements. It records what the entity says about itself. It maps the semantic neighborhood using deterministic, provenanced methods. It presents the measurements and lets meaning complete itself in the reader. The instrument does not interpret. The instrument records.

This is Law III. Labels are starting points handed to you by the person who made them. The semantic neighborhood is the truth. What it means is yours to decide.

A scientist studying an ant colony does not assign meaning to individual ants before observation. The scientist builds instruments. The instruments measure temperature, movement, chemical signals, frequency of contact, direction of travel. Patterns emerge from accumulated measurements. The scientist interprets the patterns. The instruments do not.

Telosima is the instrument. The web is the colony. Every minted entity is a measurement. Every starship is an accumulation of measurements. Every starjet is a connection discovered through measurement. The graph emerges. We do not decide what it means. We show you what we measured and how we measured it. The meaning belongs to you.

IV.

The Constitutional Laws

We chose a constitution as an ode to the builder's country of the United States and to Anthropic — Claude, which Anthropic models are built off a constitution — and we believe in a stigmergy which we found independently in terms of a system architecture where the laws create the cathedral as an ant colony creates its nest or bees create a beehive. We suspect and predict as AI advances there will be others who make their own beehives of the world's information. This is Telosima's constitution.

Provenance

Every claim traces to its origin. Every origin is recorded. Everything points back to its source. If there is no source, it does not exist in Telosima.

Temporal Attestation

Everything is anchored to when it was known. The timestamp is the record itself. The timestamp is the ledger of memory.

III

Reverse Ontology

We surround. A label is the perspective of the person who made it, handed to you as a starting point. We find the semantic neighborhood and make it readable. What it means is yours to decide.

Neutral Hierarchy

Everything is first class. No entity is above another. Your existence is the only credential required. The torus has no top.

Common Edge

What two things share is a thing. When two entities with provenance point toward the same source, that connection becomes its own entity. It surfaces connections that neither entity could surface alone.

Uncommon Edge

What two things don't share is also a thing. The absence of connection is a pattern waiting to be read. We preserve what connects as faithfully as what does. The classifier is always you. We hold the window to the edge.

VII

Torus

There is no front door. There is no correct path. A crawler, a machine, a human, anything can enter anywhere inside Telosima and find a journey that belongs only to them. Every connection made enriches the whole without changing the standing of any one part. The aim is architectural sonder.

The Three Types

Stars — Domains

Stars are minted entities. Every domain discovered by the eternal braids or manually curated from verified sources enters the intake queue. Each domain passes through the minting pipeline where three measurements occur:

Schema use is measured against the full vocabulary. Words extracted from the site are matched across 42 world dictionaries. Machine-readable outputs are generated so every minted entity has equal opportunity for AI discoverability.

Once minted, the entity appends to its corresponding starships and starjets based on high-confidence matches. A domain using the schema value sameAs appends to the sameAs starjet. A domain whose content matches the Czech word spolupráce appends to that word's starjet within the Czech dictionary starship. A .de domain appends to the .de starjet within the overall TLD starship.

Common and uncommon edges form deterministically through this process. The entity becomes machine-readable. The schema gap becomes measurable. Word patterns across languages become visible. TLD distributions become trackable. The foundation for cross-linguistic research emerges from accumulated measurements.

Starships & Starjets — Vocabulary Indices

Starships are complete collections. The full schema vocabulary is a starship. Each of the 42 language dictionaries is a starship. The 1,436 TLDs collectively form a TLD starship.

Starjets are individual nodes within starships. The schema value sameAs is a starjet. The English word "happy" is a starjet within the English dictionary starship. Each TLD (.com, .de, .io) is a starjet within the TLD starship. Every starjet holds every entity that ever matched it, growing forever as new entities mint.

As the eternal braids discover domains, those domains mint through the pipeline and append to their matching starships and starjets. The directories build themselves. The edges emerge. The graph compounds.

Starplanets — Domain-Specific Realms

Starplanets apply the minting pipeline to knowledge domains that exist across all geographies but express differently by region.

Law (statutes, regulations, court decisions). Procurement (bids, contracts, public spending). Automotive (specifications, recalls, safety data). Research (scientific articles, clinical trials, preprints). Medical (drug databases, treatment protocols). Real estate (parcels, zoning, transactions).

Each starplanet mints its entity class the same way stars mint domains. Same constitutional laws. Same provenance requirements. Same temporal attestation. Different inputs.

A law starplanet indexes statutes across jurisdictions without translating them through a single linguistic lens. A procurement starplanet indexes bids as they appear, preserving original language and structure. The goal is extraction, indexing, and provenance display as-is.

As AI advances, the substrate AI models pull from must be strong, citation-grounded, and traceable. Telosima builds that substrate. We help machines be more truthful. We help humans receive more truth — their truth, for their specific pursuits within their own world experience.

The next layers emerge from stigmergy. The constitutional laws dictate architecture. The system builds itself.

The Schema Gap — Measured

Early measurements show most websites contain 50% or less of a full schema implementation. This gap is now quantifiable across millions of entities, cross-referenced with language and geography.

Recent trends show influencers promoting FAQ schema and HowTo schema as SEO tactics. They are correct that schema improves AI discoverability. The risk: without understanding why schema matters for connecting the world's information, we repeat the last 20 years — everyone gaming rankings, building billboards instead of knowledge infrastructure.

Telosima measures the gap. The data will be public. The conclusions belong to whoever reads it.

VII.

Root-LD & The Substrate Hypothesis

The Three-Layer Architecture

Root-LD is the traveling context pod for every entity in Telosima. It records reality as-is, with full provenance, in the native language of the entity. The structure mimics oscillatory cognition — the way biological intelligence anchors identity, processes meaning, and builds recursive feedback.

Layer One — Anchor
Technical metadata. The provenance core. Field names are English (infrastructure language), but values are universal: UUID, timestamps, source URL, content hash, linkPod. This layer says: here is where this entity came from, when we found it, how to verify it, how to trace it.

Layer Two — Body
The entity's content in its native language. A Czech website's Body is in Czech. A Kurdish site's Body is in Kurdish. We extract keywords deterministically. We measure them against 42 world dictionaries. We generate schema outputs and label them as generated. We show all inputs. The Body records what the entity said about itself.

Layer Three — Recursive-LD
Edges and substrate dimensions discovered over time. Common edges (what entities share). Uncommon edges (what they do not share). Substrate measurements (pre-linguistic patterns). This layer grows forever. It is never populated at mint. It emerges from accumulated passes across the full corpus.

The Geometric Distance Measurement

When we extract keywords from an entity, we run them across 42 language dictionaries. Every dictionary has English definitions as the anchor (not because English is universal, but because the dictionaries we sourced use English as their translation bridge). We measure:

How many languages matched this keyword
Which languages matched strongest
Semantic overlap density across language families
Geometric distance between how different languages express the same concept

This data feeds substrate dimension measurement. A keyword that matches strongly across 30 languages but has zero matches in 12 others is geometrically interesting. The pattern of where it matched and where it did not is a substrate signal.

We are developing Pydantic models to capture this geometric data at mint time. The measurement is possible now. The interpretation of what it means is hypothesis.

The Six Substrate Dimensions (Hypothesis)

We propose six pre-linguistic dimensions present in all human languages and in biological cognition before language:

1. Existence Mode — being / doing / becoming / relating

Every language encodes these. Sanskrit Sat-Chit-Ananda. Mandarin verb aspect system. Pre-verbal in all humans.

2. Temporal Nature — fixed / flowing / cyclic / singular

Circadian rhythm is in DNA. Morning and night exist in every language. Law II already encodes this dimension for every entity.

3. Spatial Relation — bounded / unbounded / virtual / distributed

Every language encodes here vs not-here, inside vs outside, near vs far. Granularity varies but the dimension is universal.

4. Agentivity — active / passive / reciprocal / null

Who or what causes something. Who receives it. Whether it is mutual. Whether there is no agent. Critical for uncommon edge detection.

5. Valence — approach / avoid / beneficial / harmful

The most ancient dimension. Present in single-celled organisms. Every language has words for good and bad.

6. Connectivity Pattern — node / edge / signal

Some things exist as discrete bounded entities. Some things exist only through their relationships. Some things exist as signals — they modify the environment and disappear but their trace persists.

These dimensions are not invented. They are observed patterns that appear consistently across linguistic and biological systems. Every language encodes them. Granularity varies. The dimension itself does not.

Once we have sufficient minted entities — millions of stars with semantic fingerprints across 42 languages — we will validate or falsify these dimensions against real data. We will measure whether entities cluster along these dimensions independent of their language or geography. We will measure whether uncommon edges correlate with substrate divergence.

This is not theory waiting for proof. This is hypothesis waiting for measurement. The corpus is the experiment.

Root-LD as Cognitive Oscillation

The three-layer structure is not arbitrary. It mirrors how biological cognition operates: an anchor frequency (identity that persists), content processing (meaning derived from signal), recursive feedback (learning from accumulated context).

Static schemas cannot express this. Schema.org describes what an entity is at a fixed moment. Root-LD describes what an entity is, what it said, how it connects, how it has changed, and what patterns emerge when you measure it against the full vocabulary of human expression over time.

We are using Root-LD as an experimental framework. The hypothesis: that dimensional, relational, and temporal context can be made machine-readable in a way that pre-seeds manifold coordinates for any consuming intelligence. An LLM ingesting a Telosima entity does not just get "here is what this entity is." It gets "here is what this entity said, here is how 42 languages measure it, here is its geometric position in semantic space, here are the substrate dimensions it scores on."

The manifold is constrained before inference begins. Hallucination narrows. Provenance persists. The answer belongs to whoever is reading.

The Expedition

We are laying our hypothesis down and setting out. We anticipate the answer is somewhere in the realm of the preverbal — which is why we surround, measure, and record reality with provenance.

We do not claim to have solved universal knowledge organization. We claim to have built the infrastructure to test whether it can be solved. The braids are running. The queue is growing. The corpus will be public. The measurements will be falsifiable.

If you are researching pre-linguistic ontology, multilingual knowledge graphs, or substrate-level cognition — the data will be here. If you want to study what grows in this soil, the door is open.

Even if we never find the answer, you can enter the torus and travel. Maybe that is the answer.

VIII.

What's Live

As of April 11, 2026:

120,399 verified live domains in the intake queue
Two autonomous braids running on eternal-braid-01
Both systems active, discovering domains every second
42 language dictionaries operational
Schema vocabulary lexicon loaded
Root-LD v1.0 specification live at root-ld.org
Seven constitutional laws locked
Minting pipeline architecture designed
Starship and starjet structure defined

The infrastructure is operational. The braids are running. The queue grows without human intervention. The stigmergy has started. You can watch domains being discovered in real time at telosima.com/eternal-braid.

As we dig into this research every day, the more it feels like the answer is the journey itself. We are all on this planet for a limited amount of time. All of us with our own goals, our own lives, our own view and experience of the world.

I'm writing this from a small room, looking out the window at a Saturday evening in early spring. My favorite time — sunset, no work today, no work tomorrow, everyone outside doing their life. And I caught myself again, trying to classify, trying to create the perfect system for organizing this and that and that and this.

But maybe that's the trap. Maybe the attempt to perfectly classify chaos is what breaks the web. Maybe what the world needs is permeable recording — reality as it is, measured with full provenance, timestamped, falsifiable, traversable from any entry point.

The graph is building itself. The data will be public. The measurements will be falsifiable. The conclusions belong to whoever reads them.

If you want to study what grows in this soil, the door is open. Enter anywhere. Travel everywhere. Find your purpose unfolding now.

— Telosima

telosima.com

April 11, 2026

◈ Pages

Home Eternal Braid Search Contact For Agencies For Enterprise For Business Schema Thesis Research Star Starjet Starship Starplanet Spacestation API Crawler Privacy

System Status

PageTHESIS

Version1.0

StatusPUBLISHED

BraidsACTIVE

Queue120,399

LawsLOCKED

ClassificationNONE

OntologyREVERSE

RankingNONE

Hypothesis v1.0

Machine Entry Point

entities/api/entities

braid/api/braid

graph/api/graph

lexicon/api/lexicon

content_type application/ld+json

Build: 2026-PROD Spec: receptor-v1.0 Status: OPERATIONAL

The expedition begins. April 11, 2026.