A data dictionary is a markdown document bundled with a service that describes its real-world data patterns, field conventions, relationships, and sample data. It provides context that tool schemas alone cannot — actual enum values, ID format patterns, nullable fields in practice, and relationship cardinality.

Why it matters

Tool schemas define how to call a service, but not what the data looks like in practice. The gap varies by service type:

- API services (Shopify, GitHub, Slack): tool schemas describe the available resources, but the data dictionary adds real-world patterns. A status field is actually an enum with values active, draft, and archived; product IDs use Shopify GID format like gid://shopify/Product/123456789; the metadata field contains parseable JSON.
- Data platform services (Snowflake, PostgreSQL, Elasticsearch): tool schemas only describe how to run queries. They contain no information about what tables exist in the account, what columns those tables have, how tables relate to each other, or what the data means. The data dictionary bridges this gap entirely by documenting the account’s schema, table relationships, column semantics, and sample data so that AI agents can construct meaningful queries.

AI agents and apps call get_data_dictionary before using a service to understand these patterns.
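To illustrate why a documented ID convention is useful, here is a minimal sketch of how a consumer might check the GID format mentioned above once it is known. The helper name parseShopifyGid and the ParsedGid shape are hypothetical and are not part of any SDK; the pattern is inferred only from the example ID gid://shopify/Product/123456789.

```typescript
// Hypothetical helper for illustration only. It assumes IDs follow the
// documented shape gid://shopify/<Resource>/<numeric_id>.
interface ParsedGid {
  resource: string;
  // Kept as a string so that large numeric IDs are never rounded.
  numericId: string;
}

const GID_PATTERN = /^gid:\/\/shopify\/([A-Za-z]+)\/(\d+)$/;

function parseShopifyGid(id: string): ParsedGid | null {
  const match = GID_PATTERN.exec(id);
  if (match === null) {
    // Not in the documented format (for example, a bare numeric ID).
    return null;
  }
  return { resource: match[1], numericId: match[2] };
}
```

A type such as string in a tool schema cannot express this constraint, which is the kind of detail the data dictionary is meant to carry.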

Structure

The data dictionary is defined in src/dataDictionary.ts:
export const DATA_DICTIONARY = `
# Data Dictionary: My Service

> Generated on: 2026-01-28

## Overview
Brief description of the service and its data model.

## Quick Reference
| Resource | Description | Key Fields | Notable Patterns |
|----------|-------------|------------|------------------|
| Products | Product catalog | id, title, status | ID format: \`gid://shopify/Product/{id}\` |
| Orders   | Customer orders | id, status, total | Status: open, closed, cancelled |

## Resources

### Products
| Field | Type | Nullable | Description | Example Values | Conventions |
|-------|------|----------|-------------|----------------|-------------|
| id | string | No | Product identifier | \`"gid://shopify/Product/123"\` | Shopify GID format |
| title | string | No | Product title | \`"Classic T-Shirt"\` | |
| status | string | No | Publication status | \`"active"\`, \`"draft"\`, \`"archived"\` | Enum |

## Relationships
Products have many Variants. Orders have many Line Items, each referencing a Variant.

## Field Conventions
### ID Fields
- Products: \`gid://shopify/Product/{numeric_id}\`
- Orders: \`gid://shopify/Order/{numeric_id}\`

### Status Fields
- Product status: active, draft, archived
- Order financial status: pending, paid, refunded, voided
`;

export const DATA_DICTIONARY_GENERATED_AT = "2026-01-28T00:00:00Z";
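
As a rough sketch, a build step or test could check that a dictionary string contains the top-level sections used in the example above. The function name missingSections is hypothetical, and treating these five sections as required is an assumption made for illustration, not a rule stated by the tooling.

```typescript
// Section titles taken from the example dictionary above. Treating them as
// required is an assumption of this sketch.
const EXPECTED_SECTIONS = [
  "Overview",
  "Quick Reference",
  "Resources",
  "Relationships",
  "Field Conventions",
];

function missingSections(markdown: string): string[] {
  // Collect the titles of all level-2 headings ("## Title").
  // Level-3 headings such as "### Products" do not start with "## ".
  const found = new Set(
    markdown
      .split("\n")
      .filter((line) => line.startsWith("## "))
      .map((line) => line.slice(3).trim()),
  );
  return EXPECTED_SECTIONS.filter((section) => !found.has(section));
}
```

Calling missingSections(DATA_DICTIONARY) on the example above should return an empty array, since all five headings are present.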

The get_data_dictionary tool

The data dictionary is exposed at runtime through a get_data_dictionary tool so that AI agents and apps can retrieve it before interacting with the service.

Tool definition

get_data_dictionary: {
  name: "get_data_dictionary",
  description: "Get the data dictionary documentation for this service. Use this to understand the data structure, field conventions, relationships, and sample data patterns before querying the service.",
  inputSchema: {
    type: "object",
    additionalProperties: false,
    properties: {},
  },
  outputSchema: {
    type: "object",
    additionalProperties: true,
    properties: {
      success: { type: "boolean" },
      content: { type: "string", description: "The full data dictionary markdown content" },
      generatedAt: { type: "string", description: "When the data dictionary was generated" },
    },
    required: ["success", "content"],
  },
},
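
A caller that receives the tool result as untyped JSON may want to check it against the output schema before using it. The sketch below mirrors the schema's required list (success and content) as a TypeScript type guard. The names DataDictionaryOutput and isDataDictionaryOutput are hypothetical and chosen for this example.

```typescript
// Mirrors the outputSchema above: success and content are required,
// generatedAt is optional. Extra properties are allowed, as in the schema.
interface DataDictionaryOutput {
  success: boolean;
  content: string;
  generatedAt?: string;
}

function isDataDictionaryOutput(value: unknown): value is DataDictionaryOutput {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return (
    typeof record["success"] === "boolean" &&
    typeof record["content"] === "string" &&
    (record["generatedAt"] === undefined || typeof record["generatedAt"] === "string")
  );
}
```

Note that a result without a content string fails this check, because the schema lists content as required.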

Handler

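private async getDataDictionary(): Promise<any> {
  try {
    // Load the dictionary module lazily; if it cannot be loaded, the import
    // rejects and the catch block below reports the failure.
    const { DATA_DICTIONARY, DATA_DICTIONARY_GENERATED_AT } = await import("./dataDictionary");
    return {
      success: true,
      content: DATA_DICTIONARY,
      generatedAt: DATA_DICTIONARY_GENERATED_AT,
    };
  } catch {
    // The module could not be loaded (for example, it has not been generated
    // yet): return a structured error and point the caller at the generator.
    return {
      success: false,
      error: "Data dictionary not found",
      details: "Run the data_explorer tool to generate the data dictionary.",
    };
  }
}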
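
The handler above returns one of two shapes: a success result with content, or a failure result with error and details. The following sketch shows one way a consumer might branch on them. The DictionaryResult type and the dictionaryContext function are hypothetical names introduced here, and the wording of the returned strings is illustrative only.

```typescript
// The two result shapes produced by the handler above, written as a
// discriminated union on the success flag.
type DictionaryResult =
  | { success: true; content: string; generatedAt?: string }
  | { success: false; error: string; details?: string };

// Turns a handler result into text that could be added to an agent's context.
function dictionaryContext(result: DictionaryResult): string {
  if (result.success) {
    const stamp = result.generatedAt ? ` (generated ${result.generatedAt})` : "";
    return `Data dictionary${stamp}:\n${result.content}`;
  }
  // Failure: surface the error and, when present, the remediation hint.
  const hint = result.details ? ` ${result.details}` : "";
  return `No data dictionary available: ${result.error}.${hint}`;
}
```

Modelling the result as a union lets the compiler confirm that content is only read when success is true.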

Content guidelines

A good data dictionary documents what schemas cannot:
| Category | API services | Data platform services |
|----------|--------------|------------------------|
| ID formats | Prefixes, GID patterns, UUID vs numeric | Primary key types and naming |
| Enum values | Actual values for status, type, and category fields | Column value ranges and categories |
| Schema discovery | Which fields are nullable in practice | Tables, columns, types, and descriptions |
| Relationships | Foreign key patterns and cardinality | Joins, foreign keys, and dimensional models |
| Sample data | Representative records showing real-world values | Example rows illustrating column semantics |
| Conventions | Date formats, currency handling, pagination | Naming conventions, partitioning, warehouses |