Fetching Documents from your Knowledge Base

This article provides the technical specifications for retrieving documents from Knowledge Bases via the public API.

Written by Zafer Çavdar
Updated this week

Overview

To retrieve documents, you will use the GET /kb/{kb_id} endpoint. This allows you to pull text, sources, and metadata for use in external applications or analysis tools.

1. Authentication

All API requests must include a bearer token in the Authorization HTTP header.

  • Header: Authorization: Bearer <your_access_token>

  • Error Note: If the token is missing or invalid, the API will return a 401 Unauthorized status.
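As a minimal sketch, the header described above could be assembled in Python as follows. The helper name build_auth_headers is hypothetical and not part of the API; the Accept header simply mirrors the implementation examples in this article.

```python
from typing import Dict


def build_auth_headers(access_token: str) -> Dict[str, str]:
    """Return the HTTP headers for a Knowledge Base request.

    build_auth_headers is an illustrative helper, not part of the API.
    """
    token = access_token.strip()
    if not token:
        # Per the article, a missing token leads to 401 Unauthorized,
        # so fail early instead of sending a request that cannot succeed.
        raise ValueError("access token is empty")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }


headers = build_auth_headers("YOUR_ACCESS_TOKEN")
print(headers["Authorization"])
```

Checking the token locally only catches an empty value; an expired or otherwise invalid token is still reported by the API itself with a 401 status.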

2. Request Parameters

Parameter | Type | Required | Description
kb_id | Path | Yes | The unique identifier for your Knowledge Base.
page_size | Query | No | Number of results per page (Default: 100, Max: 1000).
next_token | Query | No | The cursor for the next page of results.
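To illustrate how the three parameters fit together, here is a small sketch that builds a request URL. The function name build_request_url is hypothetical, the host is taken from the implementation examples in this article, and the lower bound of 1 for page_size is an assumption (only the maximum of 1000 is documented).

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Host taken from the implementation examples in this article.
BASE_URL = "https://rb-backend.dcipheranalytics.com"
MAX_PAGE_SIZE = 1000  # documented maximum


def build_request_url(kb_id: str, page_size: int = 100,
                      next_token: Optional[str] = None) -> str:
    """Build the GET /kb/{kb_id} URL (illustrative helper, not part of the API)."""
    # Assumption: page_size must be at least 1; only the maximum is documented.
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    query = {"page_size": page_size}
    if next_token:
        # Only send next_token when continuing from a previous page.
        query["next_token"] = next_token
    # kb_id is a path parameter, so it is percent-encoded into the path.
    return f"{BASE_URL}/kb/{quote(kb_id, safe='')}?{urlencode(query)}"


print(build_request_url("YOUR_KB_ID", page_size=250))
```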

3. Filtering Results

You can filter the document list by adding query parameters that use the syntax field[operator]=value.

Common Operators:

  • [gte] / [lte]: Greater than or equal to / Less than or equal to.

  • [in]: Matches any value in a comma-separated list.

  • [exists]: Check if a field is present (true/false).

  • [regex]: Search using a regular expression pattern.

Pro Tip (Date Filtering): Use the YYYY-MM-DD format. For example, timestamp[gte]=2025-01-01 matches everything timestamped at or after 00:00 UTC on January 1, 2025.
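The field[operator]=value syntax can be generated from a nested dictionary, as in the sketch below. The helper build_filter_params is hypothetical, and filtering on the source field is an assumption made for illustration; only timestamp appears as a filter field in this article.

```python
from typing import Any, Dict
from urllib.parse import urlencode


def build_filter_params(filters: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
    """Turn {field: {operator: value}} into field[operator]=value pairs.

    Illustrative helper, not part of the API.
    """
    params: Dict[str, str] = {}
    for field_name, conditions in filters.items():
        for operator, value in conditions.items():
            if isinstance(value, bool):
                # [exists] takes true/false.
                rendered = "true" if value else "false"
            elif isinstance(value, (list, tuple)):
                # [in] takes a comma-separated list.
                rendered = ",".join(str(item) for item in value)
            else:
                rendered = str(value)
            params[f"{field_name}[{operator}]"] = rendered
    return params


params = build_filter_params({
    "timestamp": {"gte": "2025-01-01"},
    # Assumption: "source" is filterable; it is used here only as an example.
    "source": {"in": ["Internal Report", "Memo"]},
})
# urlencode percent-encodes the brackets and commas for use in a URL.
print(urlencode(params))
```

The resulting dictionary can also be passed directly as the params argument of requests.get, which performs the same URL encoding.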

4. Handling Pagination

The API uses cursor-based pagination. When more documents match than fit in a single page (page_size), the response sets has_next to true and includes a next_token.

  1. First Request: Leave next_token empty.

  2. Check has_next: If true, copy the next_token string.

  3. Subsequent Request: Pass that token into your next call to get the next batch.
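The three steps above can be sketched as a generator that does not depend on any particular HTTP library. Here fetch_page is a hypothetical callable that performs one request for a given next_token and returns the decoded JSON; the in-memory pages stand in for real API responses.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_documents(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield every document across all pages (illustrative helper)."""
    next_token: Optional[str] = None  # Step 1: the first request has no token.
    while True:
        page = fetch_page(next_token)
        yield from page.get("documents", [])
        if not page.get("has_next"):  # Step 2: stop when has_next is false.
            return
        next_token = page.get("next_token")  # Step 3: reuse the cursor.
        if not next_token:
            # Guard against looping forever on a malformed response.
            raise RuntimeError("has_next was true but no next_token was returned")


# Fake pages keyed by the token that requests them, standing in for the API.
FAKE_PAGES: Dict[Optional[str], Page] = {
    None: {"documents": [{"text": "a"}, {"text": "b"}], "next_token": "t1", "has_next": True},
    "t1": {"documents": [{"text": "c"}], "has_next": False},
}


def fake_fetch(token: Optional[str]) -> Page:
    return FAKE_PAGES[token]


texts = [doc["text"] for doc in iter_documents(fake_fetch)]
print(texts)
```

Keeping the HTTP call behind a callable makes the paging logic easy to test without network access.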

5. Response Example

JSON

{
  "documents": [
    {
      "text": "Example document content...",
      "source": "Internal Report",
      "timestamp": "2025-02-02T12:00:00Z",
      "metadata": {
        "department": "Analysis"
      }
    }
  ],
  "next_token": "abc123xyz",
  "has_next": true
}
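For downstream analysis it can help to convert each entry into a typed object. The sketch below parses the sample response above; the Document class and parse_documents function are hypothetical names, and it assumes every document carries text, source, and timestamp fields as in the sample.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List

SAMPLE_RESPONSE = """
{
  "documents": [
    {
      "text": "Example document content...",
      "source": "Internal Report",
      "timestamp": "2025-02-02T12:00:00Z",
      "metadata": {"department": "Analysis"}
    }
  ],
  "next_token": "abc123xyz",
  "has_next": true
}
"""


@dataclass
class Document:
    """Illustrative container for one entry of the "documents" array."""
    text: str
    source: str
    timestamp: datetime
    metadata: Dict[str, Any] = field(default_factory=dict)


def parse_timestamp(value: str) -> datetime:
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_documents(payload: Dict[str, Any]) -> List[Document]:
    # Assumption: text, source, and timestamp are always present, as in the sample.
    return [
        Document(
            text=item["text"],
            source=item["source"],
            timestamp=parse_timestamp(item["timestamp"]),
            metadata=item.get("metadata", {}),
        )
        for item in payload.get("documents", [])
    ]


docs = parse_documents(json.loads(SAMPLE_RESPONSE))
print(docs[0].source, docs[0].timestamp.isoformat())
```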

Implementation Examples

Bash (cURL)

Bash

# Basic request
curl -X GET "https://rb-backend.dcipheranalytics.com/kb/${KB_ID}?page_size=100" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Accept: application/json"

# Filtering example (numeric and list operators).
# "field" is a placeholder; replace it with a real field name.
# -G sends the --data-urlencode values as query parameters on a GET request.
curl -G "https://rb-backend.dcipheranalytics.com/kb/${KB_ID}" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  --data-urlencode "field[gte]=3" \
  --data-urlencode "field[in]=a,b,c"

Python (Pagination Loop)

import requests

BASE_URL = "https://rb-backend.dcipheranalytics.com"
KB_ID = "YOUR_KB_ID"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"

page_size = 100
next_token = None

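while True:
    # Only send next_token when continuing from a previous page.
    params = {"page_size": page_size}
    if next_token:
        params["next_token"] = next_token

    resp = requests.get(
        f"{BASE_URL}/kb/{KB_ID}",
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Accept": "application/json",
        },
        params=params,
        timeout=30,
    )
    # Raises requests.HTTPError on 4xx/5xx responses (for example, 401).
    resp.raise_for_status()
    data = resp.json()

    documents = data.get("documents", [])
    print(f"Fetched {len(documents)} documents")

    # Stop when there are no more pages.
    if not data.get("has_next"):
        break

    next_token = data.get("next_token")
    if not next_token:
        # Avoid re-requesting the first page forever if the cursor is missing.
        break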

Node.js (Fetch)

const BASE_URL = "https://rb-backend.dcipheranalytics.com";
const KB_ID = "YOUR_KB_ID";
const ACCESS_TOKEN = "YOUR_ACCESS_TOKEN";

let nextToken = null;
const pageSize = 100;

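// Top-level await requires an ES module (for example, a .mjs file);
// the global fetch function requires Node.js 18 or later.
while (true) {
  const url = new URL(`${BASE_URL}/kb/${KB_ID}`);
  url.searchParams.set("page_size", String(pageSize));
  // Only send next_token when continuing from a previous page.
  if (nextToken) url.searchParams.set("next_token", nextToken);

  const resp = await fetch(url, {
    method: "GET",
    headers: {
      Authorization: `Bearer ${ACCESS_TOKEN}`,
      Accept: "application/json",
    },
  });

  if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
  const data = await resp.json();

  console.log(`Fetched ${(data.documents ?? []).length} documents`);

  // Stop when there are no more pages or the cursor is missing.
  if (!data.has_next || !data.next_token) break;
  nextToken = data.next_token;
}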
