BoxLab Documentation¶
Welcome to BoxLab, a Python toolkit for managing, converting, and annotating object detection datasets, with support for the COCO and YOLO formats.
What is BoxLab?¶
BoxLab is a toolkit for working with object detection datasets. It provides:
- Dataset Management - Load, merge, split, and analyze datasets
- Format Conversion - Convert between COCO and YOLO formats in either direction
- GUI Annotator - Interactive desktop application for viewing and editing annotations
- CLI Tools - Command-line interface for batch operations
- PyTorch Integration - Direct integration with PyTorch training pipelines
- Plugin System - Extensible architecture for custom formats
Quick Links¶
- Getting Started - New to BoxLab? Start here!
- CLI Reference - Command-line interface documentation
- API Reference - Complete Python API documentation
- GUI Annotator - Interactive annotation application
Installation¶
From PyPI (Recommended)¶
pip install boxlab
From Source¶
# Clone repository
git clone https://github.com/6ixGODD/boxlab.git
cd boxlab
# Install with Poetry
poetry install
# Or with pip
pip install -e .
See the Installation Guide for detailed instructions.
Quick Start¶
View Dataset Info¶
Convert Format¶
# COCO to YOLO
boxlab dataset convert input.json -if coco output -of yolo
# YOLO to COCO
boxlab dataset convert data/yolo -if yolo output -of coco
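To make the conversion concrete, here is a minimal sketch of the box arithmetic involved. COCO stores each box as [x_min, y_min, width, height] in pixels, while YOLO stores (x_center, y_center, width, height) normalized by the image size. The function names below are hypothetical and are not part of the BoxLab API; this only illustrates the coordinate change, not how BoxLab implements it.

```python
def coco_to_yolo_bbox(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, w, h] in pixels to a
    normalized YOLO box (x_center, y_center, w, h)."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,  # box center, as a fraction of image width
        (y_min + h / 2) / img_h,  # box center, as a fraction of image height
        w / img_w,
        h / img_h,
    )


def yolo_to_coco_bbox(bbox, img_w, img_h):
    """Convert a normalized YOLO box back to COCO pixel coordinates."""
    x_c, y_c, w, h = bbox
    box_w = w * img_w
    box_h = h * img_h
    return [x_c * img_w - box_w / 2, y_c * img_h - box_h / 2, box_w, box_h]


# A 200x100 box at (100, 50) in a 400x200 image
yolo_box = coco_to_yolo_bbox([100, 50, 200, 100], img_w=400, img_h=200)
coco_box = yolo_to_coco_bbox(yolo_box, img_w=400, img_h=200)
```

Because YOLO coordinates are relative, the image width and height must be known whenever a box is converted in either direction.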
Launch Annotator¶
Python API¶
from boxlab.dataset.io import load_dataset, export_dataset
# Load dataset
dataset = load_dataset("annotations.json", format="coco")
# Export to different format
export_dataset(dataset, "output/yolo", format="yolo")
See the Quick Start Guide for more examples.
Features¶
Dataset Management¶
- Multi-format Support - COCO JSON and YOLO formats
- Source Tracking - Track dataset origins in merged datasets
- Statistics - Comprehensive dataset analysis
- Visualization - Generate distribution plots and sample images
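As a rough illustration of the kind of statistics a dataset analysis produces, the sketch below counts images and annotations per class from a COCO-style dictionary. It uses only the standard COCO keys (images, annotations, categories); the function name summarize_coco is hypothetical and does not reflect BoxLab's own statistics API.

```python
from collections import Counter


def summarize_coco(coco):
    """Return basic counts for a COCO-style dict (hypothetical helper)."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    per_class = Counter(id_to_name[a["category_id"]] for a in coco["annotations"])
    annotated = {a["image_id"] for a in coco["annotations"]}
    return {
        "images": len(coco["images"]),
        "annotations": len(coco["annotations"]),
        "per_class": dict(per_class),
        # Images that carry no boxes at all are often worth reviewing
        "images_without_annotations": sum(
            1 for img in coco["images"] if img["id"] not in annotated
        ),
    }


example = {
    "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}],
    "images": [{"id": 1}, {"id": 2}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 2},
        {"image_id": 1, "category_id": 1},
    ],
}
stats = summarize_coco(example)
```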
Format Conversion¶
- Bidirectional - Convert between COCO and YOLO
- Flexible Splitting - Custom train/val/test ratios
- Naming Strategies - Multiple file naming options
- Validation - Automatic format validation
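To give a sense of what format validation can involve, here is a minimal sketch that checks a single line of a YOLO detection label file, which holds a class index followed by four normalized values (x_center, y_center, width, height). The function name and the exact checks are assumptions for illustration; the rules BoxLab applies may differ.

```python
def parse_yolo_label_line(line, num_classes):
    """Parse and validate one YOLO label line (hypothetical helper).

    Raises ValueError if the line is malformed.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")

    class_id = int(parts[0])  # raises ValueError for non-integer text
    coords = tuple(float(p) for p in parts[1:])

    if not 0 <= class_id < num_classes:
        raise ValueError(f"class id {class_id} out of range")
    if any(not 0.0 <= v <= 1.0 for v in coords):
        raise ValueError("coordinates must be normalized to [0, 1]")
    if coords[2] <= 0 or coords[3] <= 0:
        raise ValueError("box width and height must be positive")
    return class_id, coords


parsed = parse_yolo_label_line("2 0.5 0.5 0.25 0.125", num_classes=3)
```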
Annotation Tools¶
- Visual Editor - Interactive bounding box editing
- Audit Workflow - Approve/reject images systematically
- Tagging System - Organize images with custom tags
- Workspace Persistence - Save and restore work sessions
Command Line Interface¶
- Intuitive Commands - Easy-to-use CLI structure
- Rich Output - Formatted tables and progress indicators
- Batch Operations - Process multiple datasets
- Scriptable - Integration with automation workflows
PyTorch Integration¶
- Dataset Adapter - Direct PyTorch Dataset compatibility
- Transform Support - Built-in augmentation pipelines
- DataLoader Ready - Custom collate functions
- Training Workflows - Seamless integration with training loops
Use Cases¶
Format Conversion¶
Convert your existing datasets to the format required by your training framework with boxlab dataset convert, as shown under Quick Start.
Dataset Merging¶
Combine multiple annotation sources into a single unified dataset:
boxlab dataset merge \
    -i manual_labels.json coco manual \
    -i auto_labels.json coco automatic \
    -o merged_dataset
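Merging annotation sources is mostly a matter of reassigning IDs so they do not collide and unifying categories by name. The sketch below shows one way to do that for COCO-style dictionaries. It is an illustration under stated assumptions, not BoxLab's implementation: the merge_coco function and the "source" key written onto each image are hypothetical, and "source" is not part of the COCO specification.

```python
def merge_coco(sources):
    """Merge COCO-style dicts given as (coco_dict, source_name) pairs."""
    merged = {"images": [], "annotations": [], "categories": []}
    name_to_cat_id = {}
    next_image_id = 1
    next_ann_id = 1

    for coco, source in sources:
        # Unify categories by name so "cat" in two files shares one id
        cat_map = {}
        for cat in coco["categories"]:
            if cat["name"] not in name_to_cat_id:
                new_id = len(name_to_cat_id) + 1
                name_to_cat_id[cat["name"]] = new_id
                merged["categories"].append({"id": new_id, "name": cat["name"]})
            cat_map[cat["id"]] = name_to_cat_id[cat["name"]]

        # Give every image a fresh id and remember where it came from
        img_map = {}
        for img in coco["images"]:
            img_map[img["id"]] = next_image_id
            merged["images"].append({**img, "id": next_image_id, "source": source})
            next_image_id += 1

        # Re-point annotations at the new image and category ids
        for ann in coco["annotations"]:
            merged["annotations"].append({
                **ann,
                "id": next_ann_id,
                "image_id": img_map[ann["image_id"]],
                "category_id": cat_map[ann["category_id"]],
            })
            next_ann_id += 1

    return merged
```

Matching categories by name is a design choice: it assumes that identically named classes in different sources mean the same thing.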
Quality Assurance¶
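Use the annotator to review and audit dataset quality: approve or reject images, organize them with custom tags, and save the workspace to resume a session later.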
Training Preparation¶
Prepare datasets for model training with PyTorch:
from boxlab.dataset.io import load_dataset
from boxlab.dataset.torchadapter import build_torchdataset
from torch.utils.data import DataLoader
dataset = load_dataset("train.json", format="coco")
torch_dataset = build_torchdataset(dataset, image_size=640, augment=True)
loader = DataLoader(torch_dataset, batch_size=16, collate_fn=torch_dataset.collate)
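A custom collate function is needed because PyTorch's default collation tries to stack samples into a single tensor, which fails when images contain different numbers of boxes. The sketch below shows the common idiom of keeping images and targets as per-sample lists. It is an assumption about what a detection collate function typically does, not a description of the collate method BoxLab provides; detection_collate is a hypothetical name.

```python
def detection_collate(batch):
    """Group a list of (image, target) pairs into two parallel lists.

    Nothing is stacked, so each target may hold a different number of boxes.
    """
    images, targets = zip(*batch)
    return list(images), list(targets)


# Plain Python stand-ins for tensors: two samples with 1 and 3 boxes
batch = [
    ("image_0", {"boxes": [[0, 0, 10, 10]], "labels": [1]}),
    ("image_1", {"boxes": [[1, 1, 5, 5], [2, 2, 6, 6], [3, 3, 7, 7]], "labels": [0, 1, 1]}),
]
images, targets = detection_collate(batch)
```

A function like this can be passed to DataLoader through its collate_fn argument.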
Documentation Structure¶
Guides¶
Step-by-step tutorials and conceptual guides:
- Installation - Setup and installation
- Quick Start - Get started in 5 minutes
- CLI Overview - Command-line interface basics
- Dataset Commands - Dataset operation reference
- Annotator Command - GUI application usage
API Reference¶
Complete technical documentation:
- API Overview - API documentation index
- Dataset Core - Dataset management
- I/O Operations - Loading and exporting
- Types - Data structures
- Plugin System - Extensibility
- PyTorch Adapter - Training integration
- Exceptions - Error handling
Examples¶
Convert COCO to YOLO with Split¶
boxlab dataset convert \
    annotations.json \
    -if coco \
    output/yolo \
    -of yolo \
    --train-ratio 0.7 \
    --val-ratio 0.2 \
    --test-ratio 0.1 \
    --seed 42
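The ratio and seed options above describe a reproducible random split. A minimal sketch of that idea follows, assuming the items are shuffled once with a seeded generator and then sliced by ratio. The split_items function is hypothetical, and BoxLab's own splitting logic (for example, how it rounds or whether it stratifies by class) may differ.

```python
import random


def split_items(items, train_ratio, val_ratio, test_ratio, seed=None):
    """Shuffle items with a fixed seed and slice them into three subsets."""
    if abs(train_ratio + val_ratio + test_ratio - 1.0) > 1e-6:
        raise ValueError("ratios must sum to 1.0")

    shuffled = list(items)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_ratio)
    n_val = round(len(shuffled) * val_ratio)
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # remainder goes to test
    }


image_ids = list(range(100))
splits = split_items(image_ids, 0.7, 0.2, 0.1, seed=42)
```

Using a dedicated random.Random instance keeps the split independent of any other random state in the program, so the same seed always yields the same subsets.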
Merge Three Datasets¶
boxlab dataset merge \
    -i dataset1/ann.json coco source1 \
    -i dataset2/ann.json coco source2 \
    -i dataset3 yolo source3 \
    -o merged_output \
    --output-format coco
Visualize Dataset¶
boxlab dataset visualize \
    data/yolo \
    --format yolo \
    -o visualizations \
    --samples 10 \
    --show-heatmap
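This page does not specify what the heatmap shows. One common reading is a spatial density map of where boxes fall in the image, and the sketch below illustrates that idea by counting box centers on a coarse grid. Both the interpretation and the center_heatmap function are assumptions for illustration, not BoxLab's implementation.

```python
def center_heatmap(yolo_boxes, grid_size=4):
    """Count normalized box centers per cell of a grid_size x grid_size grid."""
    grid = [[0] * grid_size for _ in range(grid_size)]
    for x_c, y_c, _w, _h in yolo_boxes:
        # Clamp so that a center exactly at 1.0 lands in the last cell
        col = min(int(x_c * grid_size), grid_size - 1)
        row = min(int(y_c * grid_size), grid_size - 1)
        grid[row][col] += 1
    return grid


boxes = [(0.1, 0.1, 0.2, 0.2), (0.9, 0.9, 0.1, 0.1), (1.0, 1.0, 0.1, 0.1)]
heatmap = center_heatmap(boxes, grid_size=2)
```

A grid like this can reveal positional bias, such as objects that appear almost exclusively near the image center.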
PyTorch Training Loop¶
from boxlab.dataset.io import load_dataset
from boxlab.dataset.torchadapter import build_torchdataset
from torch.utils.data import DataLoader
import torch

# Prepare dataset
dataset = load_dataset("train_annotations.json", format="coco")
train_dataset = build_torchdataset(
    dataset,
    image_size=640,
    augment=True,
    normalize=True,
    return_format="xyxy",
)

# Create DataLoader
train_loader = DataLoader(
    train_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4,
    collate_fn=train_dataset.collate,
)

# Training loop
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
num_epochs = 10  # example value; adjust for your task
model = YourDetectionModel().to(device)  # placeholder: substitute your own model
model.train()
optimizer = torch.optim.Adam(model.parameters())

for epoch in range(num_epochs):
    for images, targets in train_loader:
        # Move every image and every target tensor to the training device
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        # The model is expected to return a dict of losses in training mode
        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()
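The return_format="xyxy" argument above presumably selects corner coordinates (x_min, y_min, x_max, y_max), the box convention that torchvision detection models expect, rather than COCO's (x_min, y_min, width, height). That reading is an assumption. The helper names below are hypothetical and only illustrate the difference between the two layouts.

```python
def xywh_to_xyxy(box):
    """Convert (x_min, y_min, width, height) to (x_min, y_min, x_max, y_max)."""
    x_min, y_min, w, h = box
    return (x_min, y_min, x_min + w, y_min + h)


def xyxy_to_xywh(box):
    """Convert (x_min, y_min, x_max, y_max) to (x_min, y_min, width, height)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)


corners = xywh_to_xyxy((100, 50, 200, 100))
```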
System Requirements¶
- Python: 3.10 or higher
- Operating System: Linux, macOS, Windows
- RAM: 2GB minimum, 4GB recommended
- Disk Space: 500MB for installation
Optional Dependencies¶
- PyTorch: For training integration (pip install torch torchvision)
- CUDA: For GPU acceleration (with PyTorch GPU version)
Project Information¶
- GitHub: github.com/6ixGODD/boxlab
- PyPI: pypi.org/project/boxlab
- License: MIT
- Author: BoChenSHEN (6ixGODD)
Getting Help¶
Documentation¶
- Browse the Guides for tutorials
- Check the API Reference for detailed documentation
Community¶
- GitHub Issues: Report bugs or request features
- GitHub Discussions: Ask questions and share ideas
Contributing¶
Contributions are welcome! Please see the project repository on GitHub (github.com/6ixGODD/boxlab).
What's Next?¶
- New Users - Follow the installation and quick start guides
- CLI Users - Learn the command-line interface
- Python Developers - Explore the Python API
- Annotators - Use the GUI application