BoxLab Documentation¶

Welcome to BoxLab, a Python toolkit for managing, converting, and annotating object detection datasets in COCO and YOLO formats.

What is BoxLab?¶

BoxLab covers the common tasks involved in working with object detection datasets. It provides:

Dataset Management - Load, merge, split, and analyze datasets
Format Conversion - Seamlessly convert between COCO and YOLO formats
GUI Annotator - Interactive desktop application for viewing and editing annotations
CLI Tools - Powerful command-line interface for batch operations
PyTorch Integration - Direct integration with PyTorch training pipelines
Plugin System - Extensible architecture for custom formats

Quick Links¶

Getting Started

New to BoxLab? Start here!

Installation Guide
Quick Start

CLI Reference

Command-line interface documentation

CLI Overview
Dataset Commands

API Reference

Complete Python API documentation

API Reference
Dataset API

GUI Annotator

Interactive annotation application

CLI Command

Installation¶

From PyPI (Recommended)¶

pip install boxlab

From Source¶

# Clone repository
git clone https://github.com/6ixGODD/boxlab.git
cd boxlab

# Install with Poetry
poetry install

# Or with pip
pip install -e .

See the Installation Guide for detailed instructions.

Quick Start¶

View Dataset Info¶

boxlab dataset info data/coco/annotations.json --format coco

Convert Format¶

# COCO to YOLO
boxlab dataset convert input.json -if coco output -of yolo

# YOLO to COCO
boxlab dataset convert data/yolo -if yolo output -of coco
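
To make the conversion concrete, the sketch below shows the box arithmetic that any COCO/YOLO converter has to perform. COCO stores a box as [x_min, y_min, width, height] in pixels, while YOLO stores (x_center, y_center, width, height) normalized by the image size. The function names here are illustrative and are not taken from BoxLab's API.

```python
def coco_to_yolo(box, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels
    to a YOLO box (cx, cy, w, h) normalized to the image size."""
    x, y, w, h = box
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def yolo_to_coco(box, img_w, img_h):
    """Convert a normalized YOLO box (cx, cy, w, h) back to a
    COCO box [x_min, y_min, width, height] in pixels."""
    cx, cy, w, h = box
    box_w, box_h = w * img_w, h * img_h
    return [cx * img_w - box_w / 2, cy * img_h - box_h / 2, box_w, box_h]


# A 200x100 box at (100, 50) in a 400x200 image is centered and
# covers half of each dimension.
print(coco_to_yolo([100, 50, 200, 100], 400, 200))  # (0.5, 0.5, 0.5, 0.5)
```

Because YOLO coordinates are relative, converting from YOLO back to COCO requires the image dimensions, which is why a YOLO input is typically a dataset directory rather than a single annotation file.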

Launch Annotator¶

boxlab annotator

Python API¶

from boxlab.dataset.io import load_dataset, export_dataset

# Load dataset
dataset = load_dataset("annotations.json", format="coco")

# Export to different format
export_dataset(dataset, "output/yolo", format="yolo")

See the Quick Start Guide for more examples.

Features¶

Dataset Management¶

Multi-format Support - COCO JSON and YOLO formats
Source Tracking - Track dataset origins in merged datasets
Statistics - Comprehensive dataset analysis
Visualization - Generate distribution plots and sample images
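
As an illustration of the kind of analysis the Statistics item refers to, the sketch below counts annotations per class directly from a COCO-style dictionary. It relies only on the standard COCO keys ("categories", "annotations", "category_id"); the function name is hypothetical and this is not BoxLab's own statistics code.

```python
from collections import Counter


def class_distribution(coco):
    """Count annotations per category name in a COCO-style dict."""
    # Map numeric category ids to readable names.
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    return Counter(names[ann["category_id"]] for ann in coco["annotations"])


example = {
    "categories": [{"id": 1, "name": "car"}, {"id": 2, "name": "person"}],
    "annotations": [{"category_id": 1}, {"category_id": 1}, {"category_id": 2}],
}
print(class_distribution(example).most_common())  # [('car', 2), ('person', 1)]
```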

Format Conversion¶

Bidirectional - Convert between COCO and YOLO
Flexible Splitting - Custom train/val/test ratios
Naming Strategies - Multiple file naming options
Validation - Automatic format validation

Annotation Tools¶

Visual Editor - Interactive bounding box editing
Audit Workflow - Approve/reject images systematically
Tagging System - Organize images with custom tags
Workspace Persistence - Save and restore work sessions

Command Line Interface¶

Intuitive Commands - Easy-to-use CLI structure
Rich Output - Formatted tables and progress indicators
Batch Operations - Process multiple datasets
Scriptable - Integration with automation workflows

PyTorch Integration¶

Dataset Adapter - Direct PyTorch Dataset compatibility
Transform Support - Built-in augmentation pipelines
DataLoader Ready - Custom collate functions
Training Workflows - Seamless integration with training loops

Use Cases¶

Format Conversion¶

Convert your existing datasets to the format required by your training framework:

boxlab dataset convert coco_annotations.json -if coco yolo_output -of yolo

Dataset Merging¶

Combine multiple annotation sources into a single unified dataset:

boxlab dataset merge \
  -i manual_labels.json coco manual \
  -i auto_labels.json coco automatic \
  -o merged_dataset
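
Merging annotation files is mostly an exercise in re-numbering: image and annotation ids collide between sources, and the same class may carry different ids in each file. The sketch below shows one way this could be done for COCO-style dictionaries, tagging each image with its origin. The "source" key and the function name are assumptions made for illustration; they are not part of the COCO format, and BoxLab's merge may work differently.

```python
def merge_coco(sources):
    """Merge COCO-style dicts given as (coco_dict, source_name) pairs."""
    merged = {"images": [], "annotations": [], "categories": []}
    cat_ids = {}  # category name -> id in the merged dataset
    next_img, next_ann = 1, 1
    for coco, source in sources:
        # Unify categories by name so "cat" gets one id across sources.
        cat_map = {}
        for cat in coco["categories"]:
            if cat["name"] not in cat_ids:
                cat_ids[cat["name"]] = len(cat_ids) + 1
                merged["categories"].append(
                    {"id": cat_ids[cat["name"]], "name": cat["name"]}
                )
            cat_map[cat["id"]] = cat_ids[cat["name"]]
        # Re-number images and record where each one came from.
        img_map = {}
        for img in coco["images"]:
            img_map[img["id"]] = next_img
            merged["images"].append({**img, "id": next_img, "source": source})
            next_img += 1
        # Re-number annotations and point them at the new ids.
        for ann in coco["annotations"]:
            merged["annotations"].append({
                **ann,
                "id": next_ann,
                "image_id": img_map[ann["image_id"]],
                "category_id": cat_map[ann["category_id"]],
            })
            next_ann += 1
    return merged
```

Keying categories by name is a design choice: it assumes that identically named classes in different sources mean the same thing.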

Quality Assurance¶

Use the annotator to review and audit dataset quality:

boxlab annotator
# Enable Audit Mode → Review images → Export report

Training Preparation¶

Prepare datasets for model training with PyTorch:

from boxlab.dataset.io import load_dataset
from boxlab.dataset.torchadapter import build_torchdataset
from torch.utils.data import DataLoader

dataset = load_dataset("train.json", format="coco")
torch_dataset = build_torchdataset(dataset, image_size=640, augment=True)
loader = DataLoader(torch_dataset, batch_size=16, collate_fn=torch_dataset.collate)
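
A custom collate function is needed because each image can have a different number of boxes, so per-image targets cannot be stacked into one tensor the way PyTorch's default collation expects. The sketch below shows the usual pattern of keeping a batch as parallel lists. It is a generic illustration and may differ from what torch_dataset.collate actually does.

```python
def detection_collate(batch):
    """Collate (image, target) pairs into parallel lists.

    Targets stay as a list because the number of boxes varies
    from image to image and cannot be stacked.
    """
    images, targets = zip(*batch)
    return list(images), list(targets)


# Two samples with one and two boxes respectively (plain lists stand in
# for tensors so the example runs without PyTorch installed).
batch = [
    ("img0", {"boxes": [[0, 0, 10, 10]]}),
    ("img1", {"boxes": [[5, 5, 20, 20], [1, 1, 2, 2]]}),
]
images, targets = detection_collate(batch)
print(len(images), [len(t["boxes"]) for t in targets])  # 2 [1, 2]
```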

Documentation Structure¶

Guides¶

Step-by-step tutorials and conceptual guides:

Installation - Setup and installation
Quick Start - Get started in 5 minutes
CLI Overview - Command-line interface basics
Dataset Commands - Dataset operation reference
Annotator Command - GUI application usage

API Reference¶

Complete technical documentation:

API Overview - API documentation index
Dataset Core - Dataset management
I/O Operations - Loading and exporting
Types - Data structures
Plugin System - Extensibility
PyTorch Adapter - Training integration
Exceptions - Error handling

Examples¶

Convert COCO to YOLO with Split¶

boxlab dataset convert \
  annotations.json \
  -if coco \
  output/yolo \
  -of yolo \
  --train-ratio 0.7 \
  --val-ratio 0.2 \
  --test-ratio 0.1 \
  --seed 42
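
The ratio and seed options describe a reproducible random split. The sketch below shows one way such a split could be computed over image ids: shuffle with a seeded generator, then cut the list at the ratio boundaries. The function name and defaults are illustrative assumptions, not BoxLab's implementation.

```python
import random


def split_ids(image_ids, train=0.7, val=0.2, seed=42):
    """Split ids into train/val/test lists; test receives the remainder."""
    ids = sorted(image_ids)  # fixed starting order for reproducibility
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train)
    # Clamp so that train + val never exceeds the number of ids.
    n_val = min(round(len(ids) * val), len(ids) - n_train)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


train_ids, val_ids, test_ids = split_ids(range(10))
print(len(train_ids), len(val_ids), len(test_ids))  # 7 2 1
```

Using a dedicated random.Random(seed) instance keeps the split independent of any other code that touches the global random state.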

Merge Three Datasets¶

boxlab dataset merge \
  -i dataset1/ann.json coco source1 \
  -i dataset2/ann.json coco source2 \
  -i dataset3 yolo source3 \
  -o merged_output \
  --output-format coco

Visualize Dataset¶

boxlab dataset visualize \
  data/yolo \
  --format yolo \
  -o visualizations \
  --samples 10 \
  --show-heatmap

PyTorch Training Loop¶

from boxlab.dataset.io import load_dataset
from boxlab.dataset.torchadapter import build_torchdataset
from torch.utils.data import DataLoader
import torch

# Prepare dataset
dataset = load_dataset("train_annotations.json", format="coco")
train_dataset = build_torchdataset(
    dataset,
    image_size=640,
    augment=True,
    normalize=True,
    return_format="xyxy"
)

# Create DataLoader
train_loader = DataLoader(
    train_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4,
    collate_fn=train_dataset.collate
)

# Training setup
# YourDetectionModel is a placeholder: substitute a detection model that
# returns a dict of losses when called with (images, targets) in training mode.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
num_epochs = 10  # example value
model = YourDetectionModel().to(device)
model.train()
optimizer = torch.optim.Adam(model.parameters())

# Training loop
for epoch in range(num_epochs):
    for images, targets in train_loader:
        # Move each image and each target tensor to the training device
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        # Sum the individual loss terms into one scalar
        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()
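
The return_format="xyxy" setting in the example refers to corner-style boxes, [x_min, y_min, x_max, y_max], which many detection models expect as training targets. For reference, the sketch below converts a COCO-style [x_min, y_min, width, height] box into that layout. The helper name is illustrative and not part of BoxLab.

```python
def xywh_to_xyxy(box):
    """Convert [x_min, y_min, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


print(xywh_to_xyxy([10, 20, 30, 40]))  # [10, 20, 40, 60]
```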

System Requirements¶

Python: 3.10 or higher
Operating System: Linux, macOS, Windows
RAM: 2GB minimum, 4GB recommended
Disk Space: 500MB for installation

Optional Dependencies¶

PyTorch: For training integration (pip install torch torchvision)
CUDA: For GPU acceleration (with PyTorch GPU version)

Project Information¶

GitHub: github.com/6ixGODD/boxlab
PyPI: pypi.org/project/boxlab
License: MIT
Author: BoChenSHEN (6ixGODD)

Getting Help¶

Documentation¶

Browse the Guides for tutorials
Check the API Reference for detailed documentation

Community¶

GitHub Issues: Report bugs or request features
GitHub Discussions: Ask questions and share ideas

Contributing¶

Contributions are welcome! Please see:

Development Setup

What's Next?¶

New Users

Follow the installation and quick start guides

Installation
Quick Start

CLI Users

Learn the command-line interface

CLI Overview
Dataset Commands

Python Developers

Explore the Python API

API Reference
PyTorch Integration

Annotators

Use the GUI application

Annotator Command