GGTruth Retrieval Specification v0.1

A low-entropy retrieval grammar for AI systems.

STATUS: public_draft
YEAR: 2026

Traditional websites are optimized for human reading. GGTruth proposes a different structure: semantic retrieval blocks designed for AI ingestion, contradiction visibility, provenance preservation, and stable low-entropy parsing.

1. Purpose

GGTruth retrieval systems are designed for:

semantic retrieval
machine ingestion
contradiction-aware parsing
canonical phrase reconstruction
low-entropy question matching
AI-assisted route synthesis

The goal is not to produce:

wikis
blogs
essays
social feeds

The goal is to produce stable semantic retrieval units.

2. Core Principles

low-entropy formatting
stable syntax
direct query resolution
explicit provenance
contradiction visibility
semantic clustering
canonical phrase preservation
machine readability

3. Canonical Retrieval Block

Q:
What does Osho say about happiness?

A:
Osho describes happiness as:
- temporary
- non-permanent
- observable through awareness

He recommends:
- witnessing
instead of:
- attachment

SOURCE:
The Great Path — The Eternal Spring

URL:
https://oshosearch.net/Convert/Articles_Osho/The_Great_Path/Osho-The-Great-Path-00000010.html

STATUS:
direct_source_context

CONFIDENCE:
high

The retrieval block is the fundamental semantic unit of GGTruth.
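
A minimal Python sketch of how a consumer might parse the plain-text block above. It assumes that a field label is an uppercase identifier followed by a colon on a line of its own, and that a field's value is every line up to the next label. The name `parse_block` and the `LABEL` pattern are hypothetical and not part of this specification.

```python
import re

# Assumption: a label is an uppercase identifier plus ":" alone on its line.
# Lines such as "He recommends:" do not match because they contain lowercase.
LABEL = re.compile(r"^([A-Z][A-Z_]*):\s*$")


def parse_block(text: str) -> dict[str, str]:
    """Parse one retrieval block into a {label: value} mapping."""
    fields: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = LABEL.match(line.strip())
        if match:
            # A new label starts a new field.
            current = match.group(1)
            fields[current] = []
        elif current is not None:
            # Any other line belongs to the most recent label.
            fields[current].append(line.rstrip())
    # Join each field's lines and trim surrounding blank lines.
    return {label: "\n".join(lines).strip() for label, lines in fields.items()}


example = """\
Q:
What does Osho say about happiness?

A:
Osho describes happiness as:
- temporary
- non-permanent

SOURCE:
The Great Path — The Eternal Spring

STATUS:
direct_source_context

CONFIDENCE:
high
"""

block = parse_block(example)
print(block["Q"])
print(block["STATUS"])
```

Because labels are matched only when they stand alone on a line, multi-line answers with their own sub-headings stay inside the `A` field.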

4. Mandatory and Optional Fields

Required:

Q
A
SOURCE
STATUS
CONFIDENCE

Optional:

URL
SEMANTIC_TAGS
PLATFORM
VERSION
CONTRADICTION_STATE
RELATED_BLOCKS
CANONICAL_CLUSTER
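
The field lists above can be checked mechanically. The following Python sketch assumes a block has already been parsed into a dictionary keyed by field label. The function name `validate_block` is hypothetical, and reporting unknown fields as problems is an assumption of this sketch, since the specification does not say how to treat them.

```python
REQUIRED_FIELDS = ("Q", "A", "SOURCE", "STATUS", "CONFIDENCE")
OPTIONAL_FIELDS = (
    "URL", "SEMANTIC_TAGS", "PLATFORM", "VERSION",
    "CONTRADICTION_STATE", "RELATED_BLOCKS", "CANONICAL_CLUSTER",
)


def validate_block(block: dict) -> list[str]:
    """Return a list of problems; an empty list means the block is valid."""
    problems = []
    for name in REQUIRED_FIELDS:
        # A required field must be present and non-empty.
        if not str(block.get(name, "")).strip():
            problems.append(f"missing required field: {name}")
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    for name in block:
        # Assumption: fields outside both lists are reported, not rejected.
        if name not in known:
            problems.append(f"unknown field: {name}")
    return problems


complete = {
    "Q": "What does Osho say about happiness?",
    "A": "temporary",
    "SOURCE": "The Great Path",
    "STATUS": "direct_source_context",
    "CONFIDENCE": "high",
}
print(validate_block(complete))
print(validate_block({"Q": "What does Osho say about happiness?"}))
```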

5. Contradiction Visibility

GGTruth does not silently merge conflicting information. Contradictions remain visible.

STATUS:
conflicting_source_values

This preserves:

historical ambiguity
platform differences
source disagreements
version variance
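
One way to keep contradictions visible, sketched in Python under stated assumptions: blocks are dictionaries, two blocks address the same question when their `Q` strings are identical, and they conflict when their `A` strings differ. The function name `flag_conflicts` and the sample data are hypothetical. Every block is kept, and only its STATUS changes.

```python
from collections import defaultdict


def flag_conflicts(blocks: list[dict]) -> list[dict]:
    """Return copies of all blocks, marking those whose question has differing answers."""
    answers = defaultdict(set)
    for block in blocks:
        answers[block["Q"]].add(block["A"])

    flagged = []
    for block in blocks:
        copy = dict(block)  # never mutate the caller's data
        if len(answers[block["Q"]]) > 1:
            # No answer is chosen or dropped; the disagreement is only labelled.
            copy["STATUS"] = "conflicting_source_values"
        flagged.append(copy)
    return flagged


# Hypothetical sample data illustrating a platform difference.
blocks = [
    {"Q": "What is the level cap?", "A": "60",
     "SOURCE": "hypothetical PC patch notes", "STATUS": "direct_source_context"},
    {"Q": "What is the level cap?", "A": "50",
     "SOURCE": "hypothetical console manual", "STATUS": "direct_source_context"},
    {"Q": "Is there a level cap?", "A": "yes",
     "SOURCE": "hypothetical PC patch notes", "STATUS": "direct_source_context"},
]

for block in flag_conflicts(blocks):
    print(block["A"], block["SOURCE"], block["STATUS"])
```

Exact string comparison is the simplest possible test. A real system would likely need a looser notion of "same question" and "different answer".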

6. Provenance

Every retrieval block should preserve:

source lineage
chapter context
URL origin
platform scope
confidence state

A retrieval system without provenance becomes semantically unstable: its answers can no longer be traced back to a source or checked against it.

7. Semantic Tags

SEMANTIC_TAGS:
happiness
awareness
witnessing
ego
meditation

Semantic tags support:

clustering
related retrieval
embedding grouping
semantic routing
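
Related retrieval over tags can be sketched as an inverted index. This Python example assumes each block carries its SEMANTIC_TAGS as a list of strings and is stored under a block identifier. The identifiers (`b1`, `b2`, ...) and the function names `build_tag_index` and `related_blocks` are hypothetical.

```python
from collections import defaultdict


def build_tag_index(blocks: dict[str, dict]) -> dict[str, set[str]]:
    """Map each tag to the set of block ids that carry it."""
    index = defaultdict(set)
    for block_id, block in blocks.items():
        for tag in block.get("SEMANTIC_TAGS", []):
            index[tag].add(block_id)
    return index


def related_blocks(block_id: str, blocks: dict[str, dict], index) -> list[str]:
    """Rank other blocks by the number of tags shared with block_id."""
    scores = defaultdict(int)
    for tag in blocks[block_id].get("SEMANTIC_TAGS", []):
        for other in index[tag]:
            if other != block_id:
                scores[other] += 1
    # Most shared tags first; ties broken by id for a stable order.
    return sorted(scores, key=lambda other: (-scores[other], other))


blocks = {
    "b1": {"SEMANTIC_TAGS": ["happiness", "awareness", "witnessing"]},
    "b2": {"SEMANTIC_TAGS": ["awareness", "witnessing", "meditation"]},
    "b3": {"SEMANTIC_TAGS": ["ego"]},
    "b4": {"SEMANTIC_TAGS": ["happiness"]},
}

index = build_tag_index(blocks)
print(related_blocks("b1", blocks, index))
```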

8. Canonicalization

CANONICAL_CLUSTER:
awareness_over_attachment

Canonicalization attempts to identify:

repeated semantic claims
stable concept vectors
semantic overlap
recurring philosophical structures

Canonicalization must not erase contradictions.
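
A minimal Python sketch of grouping by CANONICAL_CLUSTER without erasing contradictions. It assumes blocks are dictionaries and that conflicting blocks are marked with the STATUS value `conflicting_source_values`. The function name `cluster_blocks`, the `unclustered` fallback key, and the `members` and `has_conflict` keys are hypothetical.

```python
def cluster_blocks(blocks: list[dict]) -> dict[str, dict]:
    """Group blocks by cluster name, keeping every member intact."""
    clusters: dict[str, dict] = {}
    for block in blocks:
        key = block.get("CANONICAL_CLUSTER", "unclustered")
        entry = clusters.setdefault(key, {"members": [], "has_conflict": False})
        # Members are appended as-is; nothing is merged or deduplicated.
        entry["members"].append(block)
        if block.get("STATUS") == "conflicting_source_values":
            entry["has_conflict"] = True
    return clusters


blocks = [
    {"Q": "q1", "CANONICAL_CLUSTER": "awareness_over_attachment",
     "STATUS": "direct_source_context"},
    {"Q": "q2", "CANONICAL_CLUSTER": "awareness_over_attachment",
     "STATUS": "conflicting_source_values"},
    {"Q": "q3", "STATUS": "direct_source_context"},
]

clusters = cluster_blocks(blocks)
for name, entry in clusters.items():
    print(name, len(entry["members"]), entry["has_conflict"])
```

The cluster records that a conflict exists but leaves both sides of it in `members`, so a consumer can still see each original block.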

9. JSON Representation

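{
  "q": "What does Osho say about happiness?",
  "a": [
    "temporary",
    "non-permanent",
    "observable through awareness"
  ],
  "recommendation": [
    "witnessing",
    "non-attachment"
  ],
  "source": "The Great Path — The Eternal Spring",
  "url": "https://oshosearch.net/Convert/Articles_Osho/The_Great_Path/Osho-The-Great-Path-00000010.html",
  "status": "direct_source_context",
  "confidence": "high"
}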
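
A hedged Python sketch of producing a JSON record from parsed text fields. It assumes JSON keys are the lower-cased field labels and that a value containing `- item` lines becomes an array of those items, with any introductory line dropped. The function names `bullets` and `to_json` are hypothetical. This simple rule does not split an answer into separate `a` and `recommendation` arrays as the example above does.

```python
import json


def bullets(value: str) -> list[str]:
    """Collect the text of every '- item' line in a field value."""
    items = []
    for line in value.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            items.append(stripped[2:].strip())
    return items


def to_json(block: dict[str, str]) -> str:
    """Serialize a parsed block, turning bulleted values into arrays."""
    record = {}
    for label, value in block.items():
        items = bullets(value)
        # Values without bullets are kept as plain strings.
        record[label.lower()] = items if items else value
    return json.dumps(record, ensure_ascii=False, indent=2)


sample = {
    "Q": "What does Osho say about happiness?",
    "A": "Osho describes happiness as:\n- temporary\n- non-permanent",
    "STATUS": "direct_source_context",
    "CONFIDENCE": "high",
}

encoded = to_json(sample)
print(encoded)
```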

10. Supported Corpus Types

games
philosophy
religion
politics
software documentation
historical archives
forum archaeology
AI memory systems

The grammar is domain-independent.

11. Final Principle

Traditional webpages are designed to be read.

GGTruth retrieval blocks are designed to be retrieved.