2 – Data Source
Type: Document
Status: Published
Created: Sep 5, 2025
Updated: Mar 2, 2026
Updated by: Dosu Bot

A Data Source is a platform that Dosu integrates with to read and write documentation. Data sources provide the context Dosu needs to give accurate, relevant responses about your project. With bidirectional sync, Dosu can also maintain documentation directly on these platforms, keeping your docs up to date where they already live.

You can connect as many data sources as you need. Available data sources include:

  • Coda: Workspaces and documents (Beta)
  • Confluence: Spaces and pages (supports OAuth, email + API token, or scoped API token authentication; supports bidirectional sync) (Beta)
  • GitHub: Repositories (including Markdown and reStructuredText files), issues, PRs, discussions, wikis (supports bidirectional sync)
  • GitLab: Projects, including Markdown and reStructuredText files (supports bidirectional sync)
  • Notion: Workspaces and pages (supports bidirectional sync)
  • Slack: Channel messages and threads
  • Web: Public websites and documentation

How Data Sources Work

When you connect a data source, Dosu indexes its content so it can retrieve relevant information when responding to questions. For code repositories, Dosu monitors changes and re-indexes automatically to keep its knowledge up to date.

Note: Web data sources are marked as "synced" in the UI for consistency with other data source types, but no website crawling or indexing occurs. Dosu retrieves information from the web using search capabilities at query time. This intentional design avoids expensive and unreliable website crawling while maintaining a consistent user experience across all data source types.

Data sources are connected at the organization level, then linked to individual deployments. This lets you control which context Dosu draws from when responding in different locations. Once connected, all members of your organization can access and use the data from these sources.
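
The relationship between organization-level connections and deployment-level links can be pictured with a small data model. This is an illustrative sketch only: the class names (`DataSource`, `Deployment`, `Organization`) and their methods are hypothetical and do not reflect Dosu's actual API or schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataSource:
    """A connected platform, e.g. a GitHub repository or a Notion workspace."""
    name: str
    kind: str  # e.g. "github", "notion", "slack"


@dataclass
class Deployment:
    """A location where Dosu responds, drawing only on its linked sources."""
    name: str
    linked: list[DataSource] = field(default_factory=list)


@dataclass
class Organization:
    """Owns the connections; each deployment links to a subset of them."""
    sources: list[DataSource] = field(default_factory=list)
    deployments: list[Deployment] = field(default_factory=list)

    def connect(self, source: DataSource) -> None:
        self.sources.append(source)

    def link(self, deployment: Deployment, source: DataSource) -> None:
        # A source must be connected at the organization level first.
        if source not in self.sources:
            raise ValueError(f"{source.name} is not connected to this organization")
        if source not in deployment.linked:
            deployment.linked.append(source)


org = Organization()
repo = DataSource("acme/api", "github")          # hypothetical names
wiki = DataSource("Engineering Wiki", "notion")
org.connect(repo)
org.connect(wiki)

support_bot = Deployment("support-channel")
org.deployments.append(support_bot)
org.link(support_bot, repo)  # this deployment draws on the repository only
```

In this model, both sources are available to the organization, but the `support-channel` deployment only uses the repository as context.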

File Filtering for GitHub and GitLab

For GitHub and GitLab repositories, you can control which files are indexed using file pattern filters in the data source configuration:

  • Include file patterns: Acts as an allowlist. Only files matching these patterns are indexed. Uses glob pattern syntax (e.g., docs/**, src/**/*.ts, backend/**). Leave empty to index all files (default behavior).
  • Ignored file patterns: Files matching these patterns are excluded from indexing.

The filtering process works in two stages:

  1. If include patterns are specified, only files matching those patterns are considered.
  2. Ignored patterns are then applied to remove unwanted files.

A file is indexed only if it matches the include patterns (if any) and does not match any ignored pattern.
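
The two-stage rule can be sketched in a few lines of Python. This is a minimal illustration, not Dosu's implementation: the helper names (`glob_to_regex`, `should_index`) are hypothetical, the glob dialect is an assumption (`**` spans directories, `*` stays within one path segment), and a file is assumed to pass the include stage if it matches at least one include pattern.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simple glob into a regex (assumed dialect, see lead-in)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")         # anything, including "/"
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")      # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")       # a single non-separator character
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def matches(path: str, pattern: str) -> bool:
    return glob_to_regex(pattern).fullmatch(path) is not None


def should_index(path: str, include: list[str], ignored: list[str]) -> bool:
    # Stage 1: with include patterns set, the path must match one of them.
    if include and not any(matches(path, p) for p in include):
        return False
    # Stage 2: any ignored pattern removes the path.
    return not any(matches(path, p) for p in ignored)


include = ["docs/**", "src/**/*.ts"]
ignored = ["**/*.test.ts"]

print(should_index("docs/guide.md", include, ignored))      # True
print(should_index("src/app/main.ts", include, ignored))    # True
print(should_index("src/app/main.test.ts", include, ignored))  # False
print(should_index("README.md", include, ignored))          # False
```

With an empty include list, stage 1 is skipped and only the ignored patterns apply, which mirrors the default of indexing all files.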

Import and Sync

Dosu can maintain documentation directly on supported platforms (GitHub, GitLab, Notion, and Confluence). You can import existing documents from these platforms at app.dosu.dev/documents, and Dosu will keep them up to date as your code changes.

For GitHub and GitLab, both Markdown (.md) and reStructuredText (.rst) file formats are supported. The format is detected automatically from the file extension, and documents keep their original format throughout the import and update lifecycle; no conversion occurs.
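
Extension-based detection of this kind can be sketched as follows. The function name `detect_format` and the returned labels are hypothetical, and treating extensions case-insensitively is an assumption rather than documented Dosu behavior.

```python
from pathlib import PurePosixPath
from typing import Optional

# The two formats named above, keyed by file extension.
FORMATS_BY_EXTENSION = {
    ".md": "markdown",
    ".rst": "restructuredtext",
}


def detect_format(path: str) -> Optional[str]:
    """Return the document format for a repository path, or None if unsupported."""
    suffix = PurePosixPath(path).suffix.lower()  # assumption: case-insensitive
    return FORMATS_BY_EXTENSION.get(suffix)


print(detect_format("docs/index.rst"))  # restructuredtext
print(detect_format("README.md"))       # markdown
print(detect_format("notes.txt"))       # None
```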

reStructuredText Support

When you import .rst files from GitHub or GitLab, Dosu provides comprehensive rendering capabilities:

  • RST Rendering Pipeline: Uses rst-compiler with rehype plugins to render reStructuredText content
  • Sphinx-Compatible Directives: Supports common Sphinx directives including:
    • Admonitions (note, warning, tip, caution, danger, error, hint, important, attention, seealso) with styled display and proper title positioning
    • Version directives (versionchanged, versionadded, deprecated) with semantic labels
    • Invisible directives (index, highlight, default-role, meta) that produce no visible output
  • Graceful Handling: Unknown roles and directives render as plain text rather than breaking the document
  • Format Preservation: Documents remain in their original format (.rst or .md) and are not converted during import or updates

How Updates Work

Updates work differently depending on the platform:

  • GitHub and GitLab: Dosu creates a pull request or merge request with proposed documentation changes. You can review and merge these changes through your normal code review process.
  • Notion and Confluence: Dosu updates documents directly when code changes are shipped.
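
The platform split above amounts to a simple lookup. The sketch below is illustrative only: the enum, the mapping, and the function name `update_mechanism` are hypothetical, and platforms outside the four listed are assumed to be rejected because they do not support documentation updates.

```python
from enum import Enum


class UpdateMechanism(Enum):
    PULL_REQUEST = "pull request"
    MERGE_REQUEST = "merge request"
    DIRECT_UPDATE = "direct update"


# Code hosts go through review; Notion and Confluence are updated in place.
UPDATE_MECHANISMS = {
    "github": UpdateMechanism.PULL_REQUEST,
    "gitlab": UpdateMechanism.MERGE_REQUEST,
    "notion": UpdateMechanism.DIRECT_UPDATE,
    "confluence": UpdateMechanism.DIRECT_UPDATE,
}


def update_mechanism(platform: str) -> UpdateMechanism:
    """Look up how documentation changes reach the given platform."""
    try:
        return UPDATE_MECHANISMS[platform.lower()]
    except KeyError:
        raise ValueError(
            f"{platform!r} does not support documentation updates"
        ) from None


print(update_mechanism("GitHub").value)      # pull request
print(update_mechanism("Confluence").value)  # direct update
```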

You can control update behavior through the Auto-Accept Review option in Space settings. When enabled, Dosu automatically applies documentation updates. When disabled, all updates require manual review before being applied.

Managing Data Sources

To view and manage your organization's data sources, go to Settings > Data Sources.

For installation instructions, see GitHub Installation or Confluence Installation.