Core Concepts

Core Concepts#

ChartBook uses a hierarchical system for organizing and sharing data science work. Understanding these concepts will help you structure your projects effectively.

Terminology Overview#

Term	Description
Pipeline	A single reproducible analytics project with charts, dataframes, and documentation
Catalog	A collection of pipelines aggregated into a unified, browsable documentation site
ChartBook	A curated narrative document combining charts and text from one or more pipelines
ChartHub	The global platform for sharing ChartBooks and Catalogs publicly

System Hierarchy#

ChartHub (global sharing platform)
│
├── Organization A's Catalog
│   ├── Pipeline 1 → produces → ChartBook(s)
│   ├── Pipeline 2 → produces → ChartBook(s)
│   └── Pipeline 3 → produces → ChartBook(s)
│
└── Organization B's Catalog
    └── ...

Pipeline#

A Pipeline is the fundamental unit in ChartBook. It represents a single, self-contained analytics project.

What a Pipeline Contains#

Source code for data processing and analysis
Dataframes with documented schemas and metadata
Charts with interactive visualizations
Notebooks for exploratory analysis
Documentation generated from your configuration

When to Use a Pipeline#

Use a Pipeline when you have:

A single analytics project or workflow
One team working on related analyses
Data that flows through a defined process
Charts and outputs that belong together logically

Pipeline Configuration#

Pipelines are configured with chartbook.toml:

[config]
type = "pipeline"
chartbook_format_version = "0.1.0"

[site]
title = "My Analytics Pipeline"
author = "Data Team"

[pipeline]
id = "ANALYTICS"
pipeline_name = "Quarterly Analytics"
pipeline_description = "Quarterly business metrics and analysis"

See Pipelines for detailed configuration options.

Catalog#

A Catalog aggregates multiple Pipelines into a unified documentation site, making it easy to discover and connect related work across teams.

What a Catalog Provides#

Unified search across all pipelines
Cross-pipeline navigation and discovery
Centralized documentation for an organization or team
Data connectivity between pipelines via the Python API

When to Use a Catalog#

Use a Catalog when you have:

Multiple related pipelines to organize
Teams that need to discover each other’s work
Data that flows between different pipelines
An organization-wide analytics portal

Catalog Configuration#

[config]
type = "catalog"
chartbook_format_version = "0.1.0"

[site]
title = "Analytics Catalog"
author = "Data Science Team"

[pipelines.quarterly]
path_to_pipeline = "./pipelines/quarterly"

[pipelines.monthly]
path_to_pipeline = "./pipelines/monthly"

See Catalog Projects for detailed configuration options.

Connecting Pipelines#

One of the key benefits of a Catalog is the ability to load data from any pipeline programmatically:

from chartbook import data

# Load data from any pipeline in your catalog
quarterly_df = data.load(pipeline="QUARTERLY", dataframe="summary")
monthly_df = data.load(pipeline="MONTHLY", dataframe="metrics")

ChartBook (Future Feature)#

A ChartBook is a curated narrative document that combines charts, data, and explanatory text from one or more pipelines into a polished, shareable report.

Note

ChartBook creation is a planned feature. Currently, pipelines generate documentation automatically. The ChartBook feature will allow you to create custom narrative documents by selecting and arranging content from your pipelines.

Vision for ChartBooks#

Select charts from multiple pipelines
Add narrative text between visualizations
Create executive summaries and reports
Share as standalone documents

ChartHub (Future Feature)#

ChartHub is envisioned as a global platform for sharing ChartBooks and Catalogs publicly—similar to how GitHub enables sharing code repositories.