Fetching Documents from your Knowledge Base | Dcipher Analytics Help Center

Overview

Use the GET /kb/{kb_id} endpoint to retrieve documents from a Knowledge Base. Each document is returned with its text, source, timestamp, and metadata, ready for use in external applications or analysis tools.

1. Authentication

Every API request must carry a bearer token in the Authorization HTTP header.

Header: Authorization: Bearer <your_access_token>
Error Note: If the token is missing or invalid, the API will return a 401 Unauthorized status.
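
The header format and the 401 case above can be sketched in Python as follows. This is a minimal illustration, not part of the Dcipher API: the helper names auth_headers and is_unauthorized are hypothetical.

```python
def auth_headers(access_token: str) -> dict[str, str]:
    """Build the Authorization header in the 'Bearer <token>' form."""
    if not access_token:
        # Catch an empty token locally instead of waiting for a 401.
        raise ValueError("access token must not be empty")
    return {"Authorization": f"Bearer {access_token}"}


def is_unauthorized(status_code: int) -> bool:
    """True when the API rejected the token (401 Unauthorized)."""
    return status_code == 401
```

A caller would pass auth_headers(token) as the headers of each request and refresh or replace the token whenever is_unauthorized(response.status_code) is true.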

2. Request Parameters

Parameter	Location	Required	Description
`kb_id`	Path	Yes	The unique identifier for your Knowledge Base.
`page_size`	Query	No	Number of results per page (Default: 100, Max: 1000).
`next_token`	Query	No	The cursor for the next page of results.
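
As a rough sketch of how these three parameters combine into a request URL, the helper below validates page_size against the documented maximum and only adds next_token when one is supplied. The function name build_kb_url is hypothetical, the lower bound of 1 is an assumption, and the host is the one used in the implementation examples of this article.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Host taken from the implementation examples in this article.
BASE_URL = "https://rb-backend.dcipheranalytics.com"


def build_kb_url(kb_id: str, page_size: int = 100,
                 next_token: Optional[str] = None) -> str:
    """Assemble the GET /kb/{kb_id} URL with its query parameters."""
    # Documented maximum is 1000; the minimum of 1 is an assumption.
    if not 1 <= page_size <= 1000:
        raise ValueError("page_size must be between 1 and 1000")
    query = {"page_size": page_size}
    if next_token:
        # Only sent on follow-up requests.
        query["next_token"] = next_token
    return f"{BASE_URL}/kb/{quote(kb_id, safe='')}?{urlencode(query)}"
```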

3. Filtering Results

You can filter the document list by adding query parameters of the form field[operator]=value, where field is the name of a document field such as timestamp.

Common Operators:

[gte] / [lte]: Greater than or equal to / less than or equal to.
[in]: Matches any value in a comma-separated list.
[exists]: Check if a field is present (true/false).
[regex]: Search using a regular expression pattern.
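
The field[operator]=value syntax can be generated programmatically. The sketch below is one possible approach: filter_param is a hypothetical helper, and using source as a filterable field is an assumption based on the response example rather than something the API is confirmed to support.

```python
def filter_param(field: str, operator: str, value) -> tuple[str, str]:
    """Return one (name, value) pair in field[operator]=value form."""
    if isinstance(value, bool):
        # [exists] takes true/false, so render booleans in lowercase.
        rendered = "true" if value else "false"
    elif isinstance(value, (list, tuple)):
        # [in] takes a comma-separated list.
        rendered = ",".join(str(item) for item in value)
    else:
        rendered = str(value)
    return f"{field}[{operator}]", rendered


# Example: combine several filters into one query-parameter dict.
params = dict([
    filter_param("timestamp", "gte", "2025-01-01"),
    filter_param("source", "in", ["Internal Report", "Memo"]),
    filter_param("source", "exists", True),
])
```

The resulting dict can be passed as the params argument of requests.get, which percent-encodes the square brackets in the parameter names.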

Pro Tip (Date Filtering): Use the YYYY-MM-DD format. For example, timestamp[gte]=2025-01-01 matches every document timestamped at or after midnight UTC on January 1, 2025.
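
To avoid hand-formatting dates, a date range can be produced from Python date objects, whose isoformat() output is already YYYY-MM-DD. This is only a sketch: date_range_filter is a hypothetical helper, and whether [lte] with a bare date includes the whole of the final day is not stated here, so treat the upper bound as something to verify.

```python
from datetime import date


def date_range_filter(start: date, end: date,
                      field: str = "timestamp") -> dict[str, str]:
    """Build [gte]/[lte] query parameters covering start..end."""
    if start > end:
        raise ValueError("start must not be after end")
    return {
        f"{field}[gte]": start.isoformat(),  # e.g. "2025-01-01"
        f"{field}[lte]": end.isoformat(),
    }


january = date_range_filter(date(2025, 1, 1), date(2025, 1, 31))
```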

4. Handling Pagination

The API uses cursor-based pagination. When more documents match than fit in a single page of page_size results, the response sets has_next to true and includes a next_token cursor.

First Request: Omit the next_token parameter.
Check has_next: If true, copy the next_token string.
Subsequent Request: Pass that token into your next call to get the next batch.
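
The three steps above can be wrapped in a generator so that calling code sees one continuous stream of documents. In this sketch the HTTP call is injected as fetch_page, a hypothetical callable that takes the current token (None on the first request) and returns the decoded JSON page. The stub pages stand in for real API responses.

```python
from typing import Any, Callable, Iterator, Optional

Page = dict[str, Any]


def iter_documents(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every document, following next_token until has_next is false."""
    next_token = None  # First request: no token.
    while True:
        page = fetch_page(next_token)
        yield from page.get("documents", [])
        next_token = page.get("next_token")
        # Stop on the last page, or if the token is unexpectedly missing.
        if not page.get("has_next") or not next_token:
            return


# Stub pages keyed by the token that requests them.
pages = {
    None: {"documents": [{"text": "a"}, {"text": "b"}],
           "has_next": True, "next_token": "t1"},
    "t1": {"documents": [{"text": "c"}],
           "has_next": False, "next_token": None},
}
texts = [doc["text"] for doc in iter_documents(lambda token: pages[token])]
print(texts)  # ['a', 'b', 'c']
```

Injecting the fetch function keeps the paging logic separate from the HTTP client, which makes it straightforward to test without network access.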

5. Response Example

{
    "documents": [
        {
            "text": "Example document content...",
            "source": "Internal Report",
            "timestamp": "2025-02-02T12:00:00Z",
            "metadata": {
                "department": "Analysis"
            }
        }
    ],
    "next_token": "abc123xyz",
    "has_next": true
}
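
For downstream analysis it can help to turn a response like the one above into typed objects. The sketch below assumes every document carries the text, source, and timestamp fields shown in the example. The Document class and parse_page function are hypothetical names, not part of the API.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass
class Document:
    text: str
    source: str
    timestamp: datetime
    metadata: dict[str, Any] = field(default_factory=dict)


def parse_page(payload: dict[str, Any]) -> tuple[list[Document], Optional[str]]:
    """Convert one response page into Document objects plus the next cursor."""
    documents = [
        Document(
            text=item["text"],
            source=item["source"],
            # Replace the trailing "Z" so fromisoformat accepts it on
            # Python versions older than 3.11.
            timestamp=datetime.fromisoformat(
                item["timestamp"].replace("Z", "+00:00")
            ),
            metadata=item.get("metadata", {}),
        )
        for item in payload.get("documents", [])
    ]
    # Only hand back a cursor when the API says another page exists.
    next_token = payload.get("next_token") if payload.get("has_next") else None
    return documents, next_token
```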

Implementation Examples

Bash (cURL)

# Basic request
curl -X GET "https://rb-backend.dcipheranalytics.com/kb/${KB_ID}?page_size=100" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Accept: application/json"

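# Filtering example (numeric and list operators).
# "field" is a placeholder: substitute the name of a real document field.
# -G appends the --data-urlencode pairs to the URL as query parameters.
curl -G "https://rb-backend.dcipheranalytics.com/kb/${KB_ID}" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  --data-urlencode "field[gte]=3" \
  --data-urlencode "field[in]=a,b,c"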

Python (Pagination Loop)

import requests

BASE_URL = "https://rb-backend.dcipheranalytics.com"
KB_ID = "YOUR_KB_ID"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"

page_size = 100
next_token = None

while True:
    params = {"page_size": page_size}
    if next_token:
        params["next_token"] = next_token

    resp = requests.get(
        f"{BASE_URL}/kb/{KB_ID}",
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Accept": "application/json",
        },
        params=params,
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()

    documents = data.get("documents", [])
    print(f"Fetched {len(documents)} documents")

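    # Stop when the API reports no further pages. Also stop if the token is
    # missing, otherwise the loop would re-request the first page forever.
    next_token = data.get("next_token")
    if not data.get("has_next") or not next_token:
        break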

Node.js (Fetch)

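// Requires Node.js 18+ (global fetch) and an ES module context
// (.mjs file or "type": "module") for top-level await.
const BASE_URL = "https://rb-backend.dcipheranalytics.com";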
const KB_ID = "YOUR_KB_ID";
const ACCESS_TOKEN = "YOUR_ACCESS_TOKEN";

let nextToken = null;
const pageSize = 100;

while (true) {
  const url = new URL(`${BASE_URL}/kb/${KB_ID}`);
  url.searchParams.set("page_size", String(pageSize));
  if (nextToken) url.searchParams.set("next_token", nextToken);

  const resp = await fetch(url, {
    method: "GET",
    headers: {
      Authorization: `Bearer ${ACCESS_TOKEN}`,
      Accept: "application/json",
    },
  });

  if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
  const data = await resp.json();
  
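  // Fall back to 0 so a missing documents array does not print "undefined".
  console.log(`Fetched ${data.documents?.length ?? 0} documents`);

  // Stop on the last page, or if the cursor is unexpectedly missing.
  if (!data.has_next || !data.next_token) break;
  nextToken = data.next_token;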
}