Plugins¶

plugins ¶

Classes¶

NamingStrategy ¶

Bases: Protocol

Protocol for file naming strategies.

This protocol defines the interface for generating file names when exporting datasets. Custom naming strategies can be implemented by creating classes that follow this protocol.

Example

Implementing a custom naming strategy:

class CustomNamingStrategy:
    def gen_name(
        self, origin: str, source: str | None, image_id: str
    ) -> str:
        # Generate name with source prefix
        if source:
            return f"{source}_{image_id}_{origin}"
        return f"{image_id}_{origin}"


# Use with exporter
strategy = CustomNamingStrategy()
exporter.export(
    dataset, output_dir="output/", naming_strategy=strategy
)

Example

Simple naming strategy that preserves original names:

class OriginalNamingStrategy:
    def gen_name(
        self, origin: str, source: str | None, image_id: str
    ) -> str:
        return origin
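
A further sketch, under stated assumptions: because the interface is a protocol, a strategy does not need to inherit from anything. `NamingStrategyLike` below is a local stand-in that mirrors the documented `gen_name` signature, and `StemSuffixNamingStrategy` is a hypothetical strategy; neither name is part of BoxLab.

```python
from pathlib import PurePath
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class NamingStrategyLike(Protocol):
    """Local stand-in mirroring the documented gen_name interface."""

    def gen_name(self, origin: str, source: Optional[str], image_id: str, /) -> str: ...


class StemSuffixNamingStrategy:
    """Hypothetical strategy: prefix with source and id, keep the extension last."""

    def gen_name(self, origin: str, source: Optional[str], image_id: str, /) -> str:
        path = PurePath(origin)
        # Skip the source when it is None so no empty segment appears
        parts = [p for p in (source, image_id, path.stem) if p]
        return "_".join(parts) + path.suffix


strategy = StemSuffixNamingStrategy()
# No inheritance is needed; the structural check only looks for gen_name
conforms = isinstance(strategy, NamingStrategyLike)
named = strategy.gen_name("image001.jpg", "camera1", "img_12345")
unnamed = strategy.gen_name("image001.jpg", None, "img_12345")
```

Building the name from the stem and re-attaching the suffix keeps the file extension at the end, which tools that dispatch on extension generally expect.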

Functions¶

gen_name ¶

gen_name(origin: str, source: str | None, image_id: str, /) -> str

Generate a new file name.

Parameters:

Name	Type	Description	Default
`origin`	`str`	Original file name (e.g., "image001.jpg").	required
`source`	`str \| None`	Source name if available (e.g., "camera1"), None otherwise.	required
`image_id`	`str`	Unique image identifier (e.g., "img_12345").	required

Returns:

Type	Description
`str`	Generated file name as a string.

Example

strategy = MyNamingStrategy()
# The protocol declares these parameters positional-only, so pass them by position
new_name = strategy.gen_name("photo.jpg", "camera_front", "001")
print(new_name)  # Output depends on implementation

Source code in boxlab/dataset/plugins/__init__.py

def gen_name(self, origin: str, source: str | None, image_id: str, /) -> str:
    """Generate a new file name.

    Args:
        origin: Original file name (e.g., "image001.jpg").
        source: Source name if available (e.g., "camera1"), None otherwise.
        image_id: Unique image identifier (e.g., "img_12345").

    Returns:
        Generated file name as a string.

    Example:
        ```python
        strategy = MyNamingStrategy()
        new_name = strategy.gen_name(
            origin="photo.jpg",
            source="camera_front",
            image_id="001",
        )
        print(new_name)  # Output depends on implementation
        ```
    """

LoaderPlugin ¶

Bases: ABC

Base class for dataset loaders.

LoaderPlugin provides the abstract interface for implementing dataset loaders that can read various object detection dataset formats. Subclasses must implement the abstract methods to support specific formats like COCO, YOLO, etc.

This class handles dataset loading with validation and format detection capabilities. Each loader plugin should focus on a specific dataset format and implement the necessary parsing logic.

Example

Implementing a custom loader:

from boxlab.dataset import Dataset
from boxlab.dataset.plugins import LoaderPlugin
import json


class CustomLoader(LoaderPlugin):
    @property
    def name(self) -> str:
        return "custom"

    @property
    def description(self) -> str:
        return "Custom JSON format loader"

    @property
    def supported_extensions(self) -> list[str]:
        return [".json", ".jsonl"]

    def load(self, path, name=None, **kwargs):
        # Honor the caller-supplied name, falling back to a default
        dataset = Dataset(name=name or "custom_dataset")

        with open(path, "r") as f:
            data = json.load(f)

        # Parse and populate dataset
        for item in data["images"]:
            # Add images, annotations, categories
            pass

        return dataset


# Use the loader
loader = CustomLoader()
dataset = loader.load("path/to/dataset.json")

Example

Using a loader with validation:

loader = CustomLoader()

# Validate before loading
if loader.validate("dataset.json"):
    dataset = loader.load("dataset.json")
    print(f"Loaded {len(dataset)} images")
else:
    print("Invalid dataset file")

Attributes¶

name `abstractmethod` `property` ¶

name: str

Plugin name (e.g., 'coco', 'yolo').

Returns:

Type	Description
`str`	A unique lowercase string identifying this loader.

Example

class COCOLoader(LoaderPlugin):
    @property
    def name(self) -> str:
        return "coco"

description `abstractmethod` `property` ¶

description: str

Plugin description.

Returns:

Type	Description
`str`	A human-readable description of what this loader does.

Example

class COCOLoader(LoaderPlugin):
    @property
    def description(self) -> str:
        return "Load datasets in COCO JSON format"

supported_extensions `property` ¶

supported_extensions: list[str]

List of supported file extensions (e.g., ['.json', '.yaml']).

Returns:

Type	Description
`list[str]`	List of file extensions this loader can handle, including the dot. Returns an empty list if not applicable.

Example

class COCOLoader(LoaderPlugin):
    @property
    def supported_extensions(self) -> list[str]:
        return [".json"]


class YOLOLoader(LoaderPlugin):
    @property
    def supported_extensions(self) -> list[str]:
        return [".yaml", ".yml", ".txt"]

Functions¶

load `abstractmethod` ¶

load(path: str | PathLike[str], name: str | None = None, **kwargs: Any) -> Dataset

Load dataset from path.

This method should parse the dataset file(s) at the given path and construct a Dataset object with all images, annotations, and categories.

Parameters:

Name	Type	Description	Default
`path`	`str \| PathLike[str]`	Path to dataset file or directory. Can be a JSON file, YAML file, or directory containing dataset files.	required
`name`	`str \| None`	Name to assign to the loaded Dataset instance.	`None`
`**kwargs`	`Any`	Additional loader-specific parameters. Common options: `image_root` (str), root directory for image files; `source_name` (str), name to tag this data source; `strict` (bool), whether to fail on parse errors.	`{}`

Returns:

Type	Description
`Dataset`	A populated Dataset instance containing all loaded data.

Raises:

Type	Description
`FileNotFoundError`	If the specified path doesn't exist.
`ValueError`	If the dataset format is invalid or corrupted.
`PermissionError`	If files cannot be read due to permissions.
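
One way a loader might honor the exception contract above is sketched here for a JSON file. This is only an illustration: `read_annotations` and `raises` are hypothetical helpers, the required `"images"` key is an assumed format, and the non-strict fallback is an assumed reading of the `strict` option rather than documented BoxLab behavior.

```python
import json
import pathlib
import tempfile


def read_annotations(path, strict=True):
    """Read a JSON annotation file, raising the documented exception types."""
    path = pathlib.Path(path)
    if not path.exists():
        raise FileNotFoundError(f"Dataset path does not exist: {path}")

    try:
        # open() raises PermissionError by itself if the file is unreadable
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid JSON in {path}: {exc}") from exc

    if not isinstance(data, dict) or "images" not in data:
        if strict:
            raise ValueError(f"Missing 'images' key in {path}")
        # Assumed non-strict behavior: fall back to an empty image list
        return {"images": []}
    return data


def raises(exc_type, func, *args, **kwargs):
    """Return True if calling func raises exc_type."""
    try:
        func(*args, **kwargs)
    except exc_type:
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    incomplete = pathlib.Path(tmp) / "incomplete.json"
    incomplete.write_text("{}", encoding="utf-8")

    lenient = read_annotations(incomplete, strict=False)
    strict_fails = raises(ValueError, read_annotations, incomplete)
    missing_fails = raises(
        FileNotFoundError, read_annotations, pathlib.Path(tmp) / "missing.json"
    )
```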

Example

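import json

from boxlab.dataset import Dataset
from boxlab.dataset.plugins import LoaderPlugin


class COCOLoader(LoaderPlugin):
    # The name and description properties are omitted here for brevity

    def load(self, path, name=None, **kwargs):
        dataset = Dataset(name=name or "coco")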
        image_root = kwargs.get("image_root", ".")

        with open(path) as f:
            data = json.load(f)

        # Load categories
        for cat in data["categories"]:
            dataset.add_category(cat["id"], cat["name"])

        # Load images and annotations
        # ... implementation details ...

        return dataset


# Usage
loader = COCOLoader()
dataset = loader.load(
    "annotations.json",
    image_root="/data/images",
    source_name="train2017",
)

Source code in boxlab/dataset/plugins/__init__.py

@abc.abstractmethod
def load(
    self,
    path: str | os.PathLike[str],
    name: str | None = None,
    **kwargs: t.Any,
) -> Dataset:
    """Load dataset from path.

    This method should parse the dataset file(s) at the given path and
    construct a Dataset object with all images, annotations, and categories.

    Args:
        path: Path to dataset file or directory. Can be a JSON file,
            YAML file, or directory containing dataset files.
        name: Name to assign to the loaded Dataset instance.
        **kwargs: Additional loader-specific parameters. Common options:
            - image_root (str): Root directory for image files
            - source_name (str): Name to tag this data source
            - strict (bool): Whether to fail on parse errors

    Returns:
        A populated Dataset instance containing all loaded data.

    Raises:
        FileNotFoundError: If the specified path doesn't exist.
        ValueError: If the dataset format is invalid or corrupted.
        PermissionError: If files cannot be read due to permissions.

    Example:
        ```python
        class COCOLoader(LoaderPlugin):
            def load(self, path, **kwargs):
                dataset = Dataset(name="coco")
                image_root = kwargs.get("image_root", ".")

                with open(path) as f:
                    data = json.load(f)

                # Load categories
                for cat in data["categories"]:
                    dataset.add_category(cat["id"], cat["name"])

                # Load images and annotations
                # ... implementation details ...

                return dataset


        # Usage
        loader = COCOLoader()
        dataset = loader.load(
            "annotations.json",
            image_root="/data/images",
            source_name="train2017",
        )
        ```
    """

validate ¶

validate(path: str | PathLike[str]) -> bool

Check if this loader can handle the given path.

Performs basic validation to determine if the file or directory at the given path appears to be in a format this loader can handle. This is typically used for automatic format detection.

Parameters:

Name	Type	Description	Default
`path`	`str \| PathLike[str]`	Path to validate. Can be a file or directory.	required

Returns:

Type	Description
`bool`	True if this loader can likely handle the path, False otherwise.

Note

This method performs basic checks (existence, extension). It does not guarantee that load() will succeed, as it doesn't validate the full file contents.

Example

loader = COCOLoader()

if loader.validate("dataset.json"):
    dataset = loader.load("dataset.json")
else:
    print("Not a valid COCO format file")
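
The automatic format detection mentioned above could be sketched as a loop that asks each candidate loader in turn. `StubLoader` and `detect_loader` are hypothetical names used only for this illustration; the stub exposes just `name` and `validate()` so the example runs without BoxLab installed.

```python
import pathlib
import tempfile


class StubLoader:
    """Hypothetical stand-in exposing only name and validate()."""

    def __init__(self, name, extensions):
        self.name = name
        self.extensions = extensions

    def validate(self, path):
        path = pathlib.Path(path)
        return path.exists() and path.suffix.lower() in self.extensions


def detect_loader(path, loaders):
    """Return the first loader whose validate() accepts the path, else None."""
    for loader in loaders:
        if loader.validate(path):
            return loader
    return None


loaders = [StubLoader("coco", [".json"]), StubLoader("yolo", [".yaml", ".yml"])]

with tempfile.TemporaryDirectory() as tmp:
    yaml_file = pathlib.Path(tmp) / "data.yaml"
    yaml_file.write_text("names: []\n")
    text_file = pathlib.Path(tmp) / "notes.md"
    text_file.write_text("readme\n")

    detected = detect_loader(yaml_file, loaders)
    undetected = detect_loader(text_file, loaders)
```

Since `validate()` only promises a likely match, the order of `loaders` decides which plugin wins when several accept the same path.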

Example

Custom validation logic:

import pathlib

import yaml  # PyYAML

from boxlab.dataset.plugins import LoaderPlugin


class YOLOLoader(LoaderPlugin):
    def validate(self, path):
        # Call parent validation first
        if not super().validate(path):
            return False

        # Additional YOLO-specific checks
        path = pathlib.Path(path)
        if path.is_file() and path.suffix.lower() in (".yaml", ".yml"):
            # Check for YOLO-specific keys
            with open(path) as f:
                data = yaml.safe_load(f)
            # safe_load returns None for an empty file, so confirm it is a mapping
            return isinstance(data, dict) and "names" in data and "path" in data

        return False

Source code in boxlab/dataset/plugins/__init__.py

def validate(self, path: str | os.PathLike[str]) -> bool:
    """Check if this loader can handle the given path.

    Performs basic validation to determine if the file or directory at the
    given path appears to be in a format this loader can handle. This is
    typically used for automatic format detection.

    Args:
        path: Path to validate. Can be a file or directory.

    Returns:
        True if this loader can likely handle the path, False otherwise.

    Note:
        This method performs basic checks (existence, extension). It does not
        guarantee that load() will succeed, as it doesn't validate the full
        file contents.

    Example:
        ```python
        loader = COCOLoader()

        if loader.validate("dataset.json"):
            dataset = loader.load("dataset.json")
        else:
            print("Not a valid COCO format file")
        ```

    Example:
        Custom validation logic:

        ```python
        class YOLOLoader(LoaderPlugin):
            def validate(self, path):
                # Call parent validation first
                if not super().validate(path):
                    return False

                # Additional YOLO-specific checks
                path = pathlib.Path(path)
                if path.is_file() and path.suffix in [
                    ".yaml",
                    ".yml",
                ]:
                    # Check for YOLO-specific keys
                    with open(path) as f:
                        data = yaml.safe_load(f)
                        return "names" in data and "path" in data

                return False
        ```
    """
    import pathlib

    path = pathlib.Path(path)

    if not path.exists():
        return False

    # Check file extension if supported_extensions is defined
    if self.supported_extensions:
        return path.suffix.lower() in self.supported_extensions

    return True
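
To make the default behavior concrete, the check in the source above can be exercised on its own. `basic_validate` is a hypothetical standalone copy of that logic, taking the extension list as an argument instead of reading `self.supported_extensions`.

```python
import pathlib
import tempfile


def basic_validate(path, supported_extensions):
    """Standalone copy of the default check: existence, then extension."""
    path = pathlib.Path(path)
    if not path.exists():
        return False
    if supported_extensions:
        return path.suffix.lower() in supported_extensions
    return True


with tempfile.TemporaryDirectory() as tmp:
    upper = pathlib.Path(tmp) / "DATA.JSON"
    upper.write_text("{}")

    results = {
        "uppercase_suffix": basic_validate(upper, [".json"]),
        "wrong_extension": basic_validate(upper, [".yaml"]),
        "missing_file": basic_validate(pathlib.Path(tmp) / "nope.json", [".json"]),
        "directory": basic_validate(tmp, [".json"]),
        "no_extensions_declared": basic_validate(tmp, []),
    }
```

Two consequences follow from the logic as written. The path suffix is lowercased before comparison, so entries in `supported_extensions` need to be lowercase to match. A directory has no suffix, so a loader that declares extensions and also accepts directories would need to override `validate()`.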

ExporterPlugin ¶

Bases: ABC

Base class for dataset exporters.

ExporterPlugin provides the abstract interface for implementing dataset exporters that can write datasets to various object detection formats. Subclasses must implement the abstract methods to support specific formats like COCO, YOLO, etc.

This class handles dataset export with support for train/val/test splits, custom naming strategies, and optional image copying. Each exporter plugin should focus on a specific output format.

Example

Implementing a custom exporter:

from boxlab.dataset import Dataset, SplitRatio
from boxlab.dataset.plugins import ExporterPlugin
import json
import shutil
from pathlib import Path


class CustomExporter(ExporterPlugin):
    @property
    def name(self) -> str:
        return "custom"

    @property
    def description(self) -> str:
        return "Export to custom JSON format"

    @property
    def default_extension(self) -> str:
        return ".json"

    def export(
        self,
        dataset,
        output_dir,
        split_ratio=None,
        seed=None,
        naming_strategy=None,
        copy_images=True,
        **kwargs,
    ):
        output_dir = Path(output_dir)
        output_dir.mkdir(parents=True, exist_ok=True)

        # Handle splits if requested
        if split_ratio:
            splits = dataset.split(split_ratio, seed=seed)
        else:
            splits = {"all": list(dataset.images.keys())}

        # Export each split
        for split_name, image_ids in splits.items():
            split_data = {"images": [], "annotations": []}

            # Export logic here
            # ...

            # Write JSON file
            output_file = output_dir / f"{split_name}.json"
            with open(output_file, "w") as f:
                json.dump(split_data, f, indent=2)


# Use the exporter
exporter = CustomExporter()
exporter.export(
    dataset,
    output_dir="output/custom",
    split_ratio=SplitRatio(train=0.7, val=0.2, test=0.1),
    seed=42,
)

Example

Using an exporter with custom configuration:

from boxlab.dataset import Dataset
from boxlab.dataset.plugins import ExporterPlugin

exporter = MyExporter()

# Get default configuration
config = exporter.get_default_config()
print(config)  # {'copy_images': True, 'naming_strategy': 'original'}

# Export with custom settings
exporter.export(
    dataset=my_dataset,
    output_dir="output/",
    copy_images=False,
    indent=4,  # Custom parameter
)

Attributes¶

name `abstractmethod` `property` ¶

name: str

Plugin name (e.g., 'coco', 'yolo').

Returns:

Type	Description
`str`	A unique lowercase string identifying this exporter.

Example

class COCOExporter(ExporterPlugin):
    @property
    def name(self) -> str:
        return "coco"

description `abstractmethod` `property` ¶

description: str

Plugin description.

Returns:

Type	Description
`str`	A human-readable description of what this exporter does.

Example

class COCOExporter(ExporterPlugin):
    @property
    def description(self) -> str:
        return "Export datasets to COCO JSON format"

default_extension `property` ¶

default_extension: str

Default file extension for exported files.

Returns:

Type	Description
`str`	File extension string including the dot (e.g., ".json", ".txt"). Returns an empty string if not applicable.

Example

class COCOExporter(ExporterPlugin):
    @property
    def default_extension(self) -> str:
        return ".json"


class YOLOExporter(ExporterPlugin):
    @property
    def default_extension(self) -> str:
        return ".txt"

Functions¶

export `abstractmethod` ¶

export(dataset: Dataset, output_dir: str | PathLike[str], split_ratio: SplitRatio | None = None, seed: int | None = None, naming_strategy: NamingStrategy | None = None, copy_images: bool = True, **kwargs: Any) -> None

Export dataset to output directory.

This method should write the dataset to disk in the format supported by this exporter. It should handle creating output directories, optionally splitting the dataset, copying images, and writing annotation files.

Parameters:

Name	Type	Description	Default
`dataset`	`Dataset`	The Dataset instance to export.	required
`output_dir`	`str \| PathLike[str]`	Path to the output directory. Will be created if it doesn't exist.	required
`split_ratio`	`SplitRatio \| None`	Optional SplitRatio object defining train/val/test proportions. If None, exports entire dataset without splitting.	`None`
`seed`	`int \| None`	Random seed for reproducible splits. Only used if split_ratio is provided.	`None`
`naming_strategy`	`NamingStrategy \| None`	Optional NamingStrategy instance for generating image file names. If None, uses original file names.	`None`
`copy_images`	`bool`	If True, copies image files to output directory. If False, only writes annotation files.	`True`
`**kwargs`	`Any`	Additional exporter-specific parameters. Common options: `indent` (int), JSON indentation level; `include_metadata` (bool), whether to include extra metadata; `compress` (bool), whether to compress output files.	`{}`

Raises:

Type	Description
`ValueError`	If export parameters are invalid (e.g., invalid split_ratio).
`OSError`	If file operations fail (permission denied, disk full, etc.).
`DatasetError`	If dataset is empty or malformed.
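
The `naming_strategy` fallback described in the parameters can be sketched as follows. `output_name` and `IdPrefixStrategy` are hypothetical helpers for illustration; how BoxLab's built-in exporters resolve names internally, and whether they detect collisions, is not stated here.

```python
class IdPrefixStrategy:
    """Hypothetical strategy that prefixes the image id."""

    def gen_name(self, origin, source, image_id, /):
        return f"{image_id}_{origin}"


def output_name(origin, source, image_id, naming_strategy=None):
    """Use the original file name unless a strategy is supplied."""
    if naming_strategy is None:
        return origin
    return naming_strategy.gen_name(origin, source, image_id)


# Two images from different sources that share a file name
images = [("a.jpg", "cam1", "001"), ("a.jpg", "cam2", "002")]

default_names = [output_name(*img) for img in images]
custom_names = [
    output_name(*img, naming_strategy=IdPrefixStrategy()) for img in images
]
```

With the default, both images map to `a.jpg`, which is the situation a custom strategy helps avoid when a dataset merges several sources.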

Example

import json
from pathlib import Path

from boxlab.dataset import Dataset, SplitRatio
from boxlab.dataset.plugins import ExporterPlugin


class COCOExporter(ExporterPlugin):
    def export(
        self,
        dataset,
        output_dir,
        split_ratio=None,
        seed=None,
        naming_strategy=None,
        copy_images=True,
        **kwargs,
    ):
        output_dir = Path(output_dir)
        output_dir.mkdir(parents=True, exist_ok=True)

        # Handle splits
        if split_ratio:
            splits = dataset.split(split_ratio, seed=seed)
        else:
            splits = {"all": list(dataset.images.keys())}

        # Export each split
        for split_name, image_ids in splits.items():
            # Create COCO format dictionary
            coco_data = {
                "images": [],
                "annotations": [],
                "categories": [],
            }

            # Add categories
            for cat_id, cat_name in dataset.categories.items():
                coco_data["categories"].append({
                    "id": cat_id,
                    "name": cat_name,
                })

            # Add images and annotations
            # ... implementation ...

            # Write JSON
            output_file = output_dir / f"{split_name}.json"
            with open(output_file, "w") as f:
                json.dump(coco_data, f, indent=2)

            # Copy images if requested
            if copy_images:
                # ... copy logic ...
                pass


# Usage
exporter = COCOExporter()
exporter.export(
    dataset=my_dataset,
    output_dir="output/coco",
    split_ratio=SplitRatio(train=0.8, val=0.1, test=0.1),
    seed=42,
    copy_images=True,
    indent=4,
)

Example

Exporting without splits:

exporter = COCOExporter()
exporter.export(
    dataset=my_dataset,
    output_dir="output/full_dataset",
    copy_images=False,  # Only export annotations
)
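
For intuition about why `seed` makes splits reproducible, here is a minimal sketch of a seeded split. It does not show how `Dataset.split` is implemented; `split_ids` is a hypothetical helper, and both the rounding rule and the requirement that proportions sum to 1 are assumptions.

```python
import math
import random


def split_ids(image_ids, train, val, test, seed=None):
    """Shuffle ids with a seeded generator and cut them into three parts."""
    if not math.isclose(train + val + test, 1.0):
        raise ValueError("split proportions must sum to 1.0")

    # Sort first so the result depends only on the seed, not on input order
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)

    n_train = round(len(ids) * train)
    n_val = round(len(ids) * val)
    return {
        "train": ids[:n_train],
        "val": ids[n_train : n_train + n_val],
        "test": ids[n_train + n_val :],
    }


ids = [f"img_{i:03d}" for i in range(10)]
first = split_ids(ids, 0.7, 0.2, 0.1, seed=42)
second = split_ids(ids, 0.7, 0.2, 0.1, seed=42)
```

Using a dedicated `random.Random(seed)` instance rather than the module-level generator keeps the split independent of any other random calls in the program.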

Source code in boxlab/dataset/plugins/__init__.py

@abc.abstractmethod
def export(
    self,
    dataset: Dataset,
    output_dir: str | os.PathLike[str],
    split_ratio: SplitRatio | None = None,
    seed: int | None = None,
    naming_strategy: NamingStrategy | None = None,
    copy_images: bool = True,
    **kwargs: t.Any,
) -> None:
    """Export dataset to output directory.

    This method should write the dataset to disk in the format supported by
    this exporter. It should handle creating output directories, optionally
    splitting the dataset, copying images, and writing annotation files.

    Args:
        dataset: The Dataset instance to export.
        output_dir: Path to the output directory. Will be created if it
            doesn't exist.
        split_ratio: Optional SplitRatio object defining train/val/test
            proportions. If None, exports entire dataset without splitting.
        seed: Random seed for reproducible splits. Only used if split_ratio
            is provided.
        naming_strategy: Optional NamingStrategy instance for generating
            image file names. If None, uses original file names.
        copy_images: If True, copies image files to output directory.
            If False, only writes annotation files.
        **kwargs: Additional exporter-specific parameters. Common options:
            - indent (int): JSON indentation level
            - include_metadata (bool): Whether to include extra metadata
            - compress (bool): Whether to compress output files

    Raises:
        ValueError: If export parameters are invalid (e.g., invalid split_ratio).
        OSError: If file operations fail (permission denied, disk full, etc.).
        DatasetError: If dataset is empty or malformed.

    Example:
        ```python
        from boxlab.dataset import Dataset, SplitRatio
        from pathlib import Path


        class COCOExporter(ExporterPlugin):
            def export(
                self,
                dataset,
                output_dir,
                split_ratio=None,
                seed=None,
                naming_strategy=None,
                copy_images=True,
                **kwargs,
            ):
                output_dir = Path(output_dir)
                output_dir.mkdir(parents=True, exist_ok=True)

                # Handle splits
                if split_ratio:
                    splits = dataset.split(split_ratio, seed=seed)
                else:
                    splits = {"all": list(dataset.images.keys())}

                # Export each split
                for split_name, image_ids in splits.items():
                    # Create COCO format dictionary
                    coco_data = {
                        "images": [],
                        "annotations": [],
                        "categories": [],
                    }

                    # Add categories
                    for (
                        cat_id,
                        cat_name,
                    ) in dataset.categories.items():
                        coco_data["categories"].append({
                            "id": cat_id,
                            "name": cat_name,
                        })

                    # Add images and annotations
                    # ... implementation ...

                    # Write JSON
                    output_file = output_dir / f"{split_name}.json"
                    with open(output_file, "w") as f:
                        json.dump(coco_data, f, indent=2)

                    # Copy images if requested
                    if copy_images:
                        # ... copy logic ...
                        pass


        # Usage
        exporter = COCOExporter()
        exporter.export(
            dataset=my_dataset,
            output_dir="output/coco",
            split_ratio=SplitRatio(train=0.8, val=0.1, test=0.1),
            seed=42,
            copy_images=True,
            indent=4,
        )
        ```

    Example:
        Exporting without splits:

        ```python
        exporter = COCOExporter()
        exporter.export(
            dataset=my_dataset,
            output_dir="output/full_dataset",
            copy_images=False,  # Only export annotations
        )
        ```
    """

get_default_config ¶

get_default_config() -> dict[str, Any]

Get default configuration for this exporter.

Returns:

Type	Description
`dict[str, Any]`	Dictionary of default configuration values that will be used if not overridden in the export() call.

Note

Subclasses can override this to provide format-specific defaults.

Example

import typing as t

from boxlab.dataset.plugins import ExporterPlugin


class YOLOExporter(ExporterPlugin):
    def get_default_config(self) -> dict[str, t.Any]:
        return {
            "copy_images": True,
            "naming_strategy": "original",
            "normalize_coords": True,
            "include_yaml": True,
        }


# Usage
exporter = YOLOExporter()
config = exporter.get_default_config()
print(config["normalize_coords"])  # True

Example

Using default config in export:

import typing as t

from boxlab.dataset.plugins import ExporterPlugin


class CustomExporter(ExporterPlugin):
    def get_default_config(self) -> dict[str, t.Any]:
        return {
            "copy_images": True,
            "naming_strategy": "original",
            "compression": "zip",
        }

    def export(self, dataset, output_dir, **kwargs):
        # Merge with defaults
        config = self.get_default_config()
        config.update(kwargs)

        # Use configuration
        if config["compression"] == "zip":
            # ... compression logic ...
            pass
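
A subclass can also extend the base defaults instead of restating them. The sketch below uses stand-in classes so it runs without BoxLab: `BaseExporterStub` returns the same two defaults as the base implementation shown in the source, while `YOLOExporterStub` and `resolve_config` are hypothetical names.

```python
from typing import Any


class BaseExporterStub:
    """Hypothetical stand-in returning the base defaults shown in the source."""

    def get_default_config(self) -> dict[str, Any]:
        return {"copy_images": True, "naming_strategy": "original"}


class YOLOExporterStub(BaseExporterStub):
    def get_default_config(self) -> dict[str, Any]:
        # Start from the base defaults, then add format-specific keys
        return {**super().get_default_config(), "normalize_coords": True}

    def resolve_config(self, **kwargs: Any) -> dict[str, Any]:
        # Caller-supplied values take precedence over defaults
        return {**self.get_default_config(), **kwargs}


config = YOLOExporterStub().resolve_config(copy_images=False)
```

Building a new dictionary on each call avoids sharing one mutable defaults object between exports.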

Source code in boxlab/dataset/plugins/__init__.py

def get_default_config(self) -> dict[str, t.Any]:
    """Get default configuration for this exporter.

    Returns:
        Dictionary of default configuration values that will be used
        if not overridden in export() call.

    Note:
        Subclasses can override this to provide format-specific defaults.

    Example:
        ```python
        class YOLOExporter(ExporterPlugin):
            def get_default_config(self) -> dict[str, t.Any]:
                return {
                    "copy_images": True,
                    "naming_strategy": "original",
                    "normalize_coords": True,
                    "include_yaml": True,
                }


        # Usage
        exporter = YOLOExporter()
        config = exporter.get_default_config()
        print(config["normalize_coords"])  # True
        ```

    Example:
        Using default config in export:

        ```python
        class CustomExporter(ExporterPlugin):
            def get_default_config(self) -> dict[str, t.Any]:
                return {
                    "copy_images": True,
                    "naming_strategy": "original",
                    "compression": "zip",
                }

            def export(self, dataset, output_dir, **kwargs):
                # Merge with defaults
                config = self.get_default_config()
                config.update(kwargs)

                # Use configuration
                if config["compression"] == "zip":
                    # ... compression logic ...
                    pass
        ```
    """
    return {
        "copy_images": True,
        "naming_strategy": "original",
    }


Overview¶

The plugin system provides extensible interfaces for loading and exporting datasets in various formats. BoxLab comes with built-in plugins for popular formats like COCO and YOLO, and allows custom plugin development.

Architecture¶

The plugin system consists of three main components:

NamingStrategy: Protocol for generating file names during export
LoaderPlugin: Abstract base class for dataset loaders
ExporterPlugin: Abstract base class for dataset exporters

NamingStrategy Protocol¶

Define custom file naming strategies when exporting datasets:

class CustomNamingStrategy:
    def gen_name(self, origin: str, source: str | None, image_id: str) -> str:
        if source:
            return f"{source}_{image_id}_{origin}"
        return f"{image_id}_{origin}"

# Use with exporter
strategy = CustomNamingStrategy()
exporter.export(dataset, output_dir="output/", naming_strategy=strategy)

LoaderPlugin¶

Create custom dataset loaders by implementing the LoaderPlugin abstract class:

import json

from boxlab.dataset import Dataset
from boxlab.dataset.plugins import LoaderPlugin

class CustomLoader(LoaderPlugin):
    @property
    def name(self) -> str:
        return "custom"

    @property
    def description(self) -> str:
        return "Custom JSON format loader"

    @property
    def supported_extensions(self) -> list[str]:
        return [".json", ".jsonl"]

    def load(self, path, name=None, **kwargs):
        # Honor the caller-supplied name, falling back to a default
        dataset = Dataset(name=name or "custom_dataset")

        with open(path, "r") as f:
            data = json.load(f)

        # Parse and populate dataset
        for item in data["images"]:
            # Add images, annotations, categories
            pass

        return dataset

# Register and use the loader
# (get_loader is assumed to be importable from the same registry module)
from boxlab.dataset.plugins.registry import get_loader, register_loader

register_loader("custom", CustomLoader)

loader = get_loader("custom")
dataset = loader.load("path/to/dataset.json")

ExporterPlugin¶

Create custom dataset exporters by implementing the ExporterPlugin abstract class:

from boxlab.dataset import Dataset, SplitRatio
from boxlab.dataset.plugins import ExporterPlugin
import json
from pathlib import Path

class CustomExporter(ExporterPlugin):
    @property
    def name(self) -> str:
        return "custom"

    @property
    def description(self) -> str:
        return "Export to custom JSON format"

    @property
    def default_extension(self) -> str:
        return ".json"

    def export(
        self,
        dataset,
        output_dir,
        split_ratio=None,
        seed=None,
        naming_strategy=None,
        copy_images=True,
        **kwargs,
    ):
        output_dir = Path(output_dir)
        output_dir.mkdir(parents=True, exist_ok=True)

        # Handle splits if requested
        if split_ratio:
            splits = dataset.split(split_ratio, seed=seed)
        else:
            splits = {"all": list(dataset.images.keys())}

        # Export each split
        for split_name, image_ids in splits.items():
            split_data = {"images": [], "annotations": []}

            # Export logic here
            # ...

            # Write JSON file
            output_file = output_dir / f"{split_name}.json"
            with open(output_file, "w") as f:
                json.dump(split_data, f, indent=2)

# Register and use the exporter
# (get_exporter is assumed to be importable from the same registry module)
from boxlab.dataset.plugins.registry import get_exporter, register_exporter

register_exporter("custom", CustomExporter)

exporter = get_exporter("custom")
exporter.export(
    dataset,
    output_dir="output/custom",
    split_ratio=SplitRatio(train=0.7, val=0.2, test=0.1),
    seed=42,
)

Built-in Plugins¶

BoxLab includes the following built-in plugins:

COCO: Load and export COCO format datasets
YOLO: Load and export YOLOv5/YOLOv8 format datasets

Plugin Registry¶

Manage plugins using the registry system:

Registry: Register, retrieve, and discover plugins
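
As a rough mental model only, registering and retrieving plugins by name can be pictured as a mapping from names to classes. `PluginRegistry` below is a hypothetical sketch and not BoxLab's registry; the duplicate check and the choice to instantiate on retrieval are assumptions.

```python
class PluginRegistry:
    """Hypothetical name-to-class registry; not BoxLab's implementation."""

    def __init__(self):
        self._plugins = {}

    def register(self, name, plugin_cls):
        if name in self._plugins:
            raise ValueError(f"Plugin already registered: {name!r}")
        self._plugins[name] = plugin_cls

    def get(self, name):
        try:
            plugin_cls = self._plugins[name]
        except KeyError:
            raise KeyError(f"Unknown plugin: {name!r}") from None
        # Instantiate on retrieval so each caller gets a fresh plugin
        return plugin_cls()

    def names(self):
        """List registered plugin names in sorted order."""
        return sorted(self._plugins)


class DummyLoader:
    """Placeholder plugin class used only for this sketch."""


registry = PluginRegistry()
registry.register("custom", DummyLoader)
loader = registry.get("custom")
available = registry.names()
```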

Key Methods¶

LoaderPlugin¶

name: Unique plugin identifier
description: Human-readable description
supported_extensions: List of supported file extensions
load(): Load dataset from path
validate(): Check if loader can handle a path

ExporterPlugin¶

name: Unique plugin identifier
description: Human-readable description
default_extension: Default file extension
export(): Export dataset to directory
get_default_config(): Get default configuration
