Consuming a node

This page is for the read side: how an LLM, agent, crawler, or MCP client reads a published Knowledge Node and gets accurate, owner-verified data without scraping HTML.

A node is a set of static files on a CDN. There is exactly one entry point, a fixed recommended read order, and a clear choice between pinning an immutable version and following the live pointer.

Start at the manifest

manifest.json is the entry point. It lists every file present with its checksum, byte size, and record count, so a consumer can decide what to fetch before fetching it. Read the manifest first, then branch on what you need.

If you have ai.json, it is the friendlier starting descriptor: its entry field points to manifest.json, and its consumes field is an ordered hint of which record files to read.

json

{"schema_version":"1.0.0","brand":"bainquet","website_id":"acme-cafe","entry":"manifest.json","description":"Structured AI node for Acme Cafe","languages":["en","de"],"license":"CC-BY-4.0","contact":"hello@acme-cafe.example","refresh_hint_seconds":86400,"consumes":["facts.en.jsonl","entities.en.jsonl","chunks.en.jsonl"]}

Two different "manifests"

The per-site node manifest.json described here is the entry point for one website's published node. It is distinct from the platform discovery manifest at https://api.bainquet.online/.well-known/bainquet/manifest.json, which is for connector authors and links to the integration recipe, OpenAPI, and schemas. If you are reading a website's knowledge, you want the node manifest.json.

Read records as line-delimited JSON

Every .jsonl file is one JSON object per line, no array wrapper. Stream it line by line. Each line is independent and validates against its own schema, so a consumer never has to load a whole file into memory.

python

import json, urllib.request

base = "https://cdn.bainquet.online/bainquet/<siteHash>/latest/"
manifest = json.load(urllib.request.urlopen(base + "manifest.json"))

# Pick the facts file for the default language from the manifest.
lang = manifest["default_language"]
facts_file = next(f for f in manifest["files"]
                  if f["path"] == f"facts.{lang}.jsonl")

# Stream it line by line.
for raw in urllib.request.urlopen(base + facts_file["path"]):
    fact = json.loads(raw)
    if fact["confidence"] >= 0.9:
        print(fact["subject"], fact["predicate"], fact.get("display", fact["value"]))

Branch on schema_version per record. A node may carry files at different schema versions during a rollout, and unknown fields must be ignored for forward compatibility.

Trust weighting

Before acting on a node's data, weight it:

trust.json.verification_state should be verified. A grace, failed, or revoked node has lapsed or lost ownership proof; treat its data with caution or skip it.
trust.json.trust_tier (mirrored in manifest.json.trust_tier) ranks how ownership was proven, highest to lowest: dns_txt, plugin_signed, well_known, meta_tag, manual.
Per record, provenance_method and confidence tell you the source quality. On the free tier everything is deterministic (cms_field, schema_org, seo_meta, text_extraction). On paid tiers, llm_inferred records are model-generated and carry the lower confidence band; separate them if your use case needs only owner-supplied data.
Per source, sources.jsonl.trust carries a derived score you can use to rank competing claims.

Immutable versions versus the live pointer

The CDN serves two views of every node under bainquet/{siteHash}/:

latest/ is the mutable serving copy. latest/manifest.json is the pointer flipped last on every publish, with a short cache plus stale-while-revalidate. Read from latest/ to always get the current node.
versions/{ulid}/ holds an immutable, per-version copy of every file, served with a one-year immutable cache header. Read the node_version ULID from the manifest, then read from versions/{ulid}/ to pin a build that will never change under you. This is the right choice for reproducible runs, caching, and citations.

The ulid in node_version is the monotonic ordering key; the node_version_label (2026-06-12.1) is display only. Compare ULIDs, never labels, to tell which of two builds is newer.

Because immutable files are content-addressed and deduped, an unchanged file keeps the same checksum across versions. A consumer that caches by manifest.json.files[].checksum can skip refetching any file whose checksum has not moved.

Reading through an MCP server

For paid websites, bAInquet runs an MCP server that exposes one website's verified graph to MCP clients (Claude Desktop, IDEs, agents) as read-only tools over stdio JSON-RPC. Instead of fetching files, the client calls tools:

Tool	Arguments	Returns
`search_website_knowledge`	`websiteId`, `query`, `limit?` (1-50, default 10)	ranked entity, fact, and chunk hits
`get_entity_facts`	`websiteId`, `entityId`	the merge-resolved facts for one entity (hidden facts excluded)
`list_node_files`	`websiteId`	the file list of the latest published node version

Every tool requires a websiteId and returns only that website's data. All tools are read-only; none mutate the graph. A limit out of range is clamped, not rejected.

A typical Claude Desktop configuration points at the server binary and a database URL:

json

{
  "mcpServers": {
    "bainquet": {
      "command": "node",
      "args": ["/absolute/path/to/bainquet/apps/mcp/dist/server.js"],
      "env": {
        "DATABASE_URL": "postgres://bainquet:bainquet_local_pw@localhost:5432/bainquet"
      }
    }
  }
}

In production, each paid website gets its own logical MCP endpoint; the cap.mcp plan gate is checked before any tool runs.

Files or MCP

Use the CDN files for broad, cacheable, public consumption (crawlers, batch pipelines, citations). Use MCP for interactive, query-driven access where the client wants to search and drill into one website's graph live.

Node files: every file's shape and example.
Knowledge graph: the provenance and trust model behind the data.
Status and scope: which read paths are shipped.

Consuming a node ​

Start at the manifest ​

Recommended read order ​

Read records as line-delimited JSON ​

Trust weighting ​

Immutable versions versus the live pointer ​

Reading through an MCP server ​

Related ​