Version: v0.12.0

API Modules

The Analytics Data Plane API provides specialized functionality for distributed analytics and computation across multiple dataspace participants. This document provides a technical overview of each module and its responsibilities.

Note: The Analytics Data Plane is currently in development and not ready for production use. This documentation reflects the planned architecture and current implementation status.

Core Analytics Modules

AnalysesModule

Purpose: Manages analytics workflows, computation definitions, and analysis execution within the distributed analytics framework.

Key Components:

  • AnalysesController - Provides endpoints for analytics workflow management and execution
  • AnalysesService - Core business logic for analysis orchestration and lifecycle management

Data Entities: AnalysisDao

Responsibilities:

  • Define and manage analytics workflows
  • Execute distributed computations across multiple parties
  • Track analysis progress and results
  • Coordinate with the orchestration layer (OrchestrationModule) for multi-party execution
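
The progress tracking described above could be modelled as a small state machine. The following is a minimal sketch, not the actual AnalysesService implementation: the state names, the `AnalysisRecord` shape, and the `transitionAnalysis` helper are all assumptions made for illustration, since the fields of AnalysisDao are not documented here.

```typescript
// Hypothetical lifecycle states; the real AnalysisDao fields are not documented here.
type AnalysisState = "draft" | "queued" | "running" | "completed" | "failed";

interface AnalysisRecord {
  id: string;
  state: AnalysisState;
  progress: number; // fraction in [0, 1]
}

// Allowed transitions between lifecycle states (assumed for illustration).
const allowedTransitions: Record<AnalysisState, AnalysisState[]> = {
  draft: ["queued"],
  queued: ["running", "failed"],
  running: ["completed", "failed"],
  completed: [],
  failed: ["queued"], // a failed analysis may be queued again
};

// Returns an updated copy of the record, or throws if the transition is not allowed.
function transitionAnalysis(record: AnalysisRecord, next: AnalysisState): AnalysisRecord {
  if (allowedTransitions[record.state].indexOf(next) === -1) {
    throw new Error(`Invalid transition: ${record.state} -> ${next}`);
  }
  // Progress resets when (re)queued and is pinned to 1 on completion.
  const progress = next === "completed" ? 1 : next === "queued" ? 0 : record.progress;
  return { ...record, state: next, progress };
}
```

Keeping the transition table as data makes it easy to reject out-of-order updates, which matters when status changes arrive from several parties.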

OrchestrationModule

Purpose: Orchestrates distributed analytics across multiple dataspace participants, managing coordination, synchronization, and result aggregation.

Key Components:

  • OrchestrationManagementController - Management endpoints for orchestration configuration and monitoring
  • OrchestrationService - Core orchestration logic for distributed analytics coordination

Dependencies: AuthModule, LoggingModule, FilesModule, AnalysesModule

Responsibilities:

  • Coordinate multi-party analytics execution
  • Manage participant synchronization
  • Handle distributed computation workflows
  • Aggregate and manage distributed results
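
To illustrate the coordination and aggregation responsibilities above, here is a minimal sketch of a fan-out over participants followed by a merge of partial results. It is not the OrchestrationService API: `ParticipantOutcome`, `aggregateOutcomes`, `runAcrossParticipants`, and the `runLocal` callback are hypothetical names, and the choice of count and sum as partial aggregates is an assumption.

```typescript
// Outcome reported by one participant; the shape is assumed for illustration.
type ParticipantOutcome =
  | { participantId: string; status: "fulfilled"; count: number; sum: number }
  | { participantId: string; status: "rejected"; error: string };

interface AggregatedResult {
  participants: number;
  failed: string[];
  count: number;
  mean: number | null; // null when no participant contributed data
}

// Combine partial aggregates (count and sum) so raw records never leave a participant.
function aggregateOutcomes(outcomes: ParticipantOutcome[]): AggregatedResult {
  let count = 0;
  let sum = 0;
  const failed: string[] = [];
  for (const outcome of outcomes) {
    if (outcome.status === "fulfilled") {
      count += outcome.count;
      sum += outcome.sum;
    } else {
      failed.push(outcome.participantId);
    }
  }
  return { participants: outcomes.length, failed, count, mean: count > 0 ? sum / count : null };
}

// Fan out one task per participant and wait for all of them. A failing
// participant is recorded in the result instead of aborting the whole run.
async function runAcrossParticipants(
  participantIds: string[],
  runLocal: (participantId: string) => Promise<{ count: number; sum: number }>,
): Promise<AggregatedResult> {
  const outcomes = await Promise.all(
    participantIds.map((participantId): Promise<ParticipantOutcome> =>
      runLocal(participantId).then(
        (partial): ParticipantOutcome => ({
          participantId,
          status: "fulfilled",
          count: partial.count,
          sum: partial.sum,
        }),
        (err: unknown): ParticipantOutcome => ({
          participantId,
          status: "rejected",
          error: String(err),
        }),
      ),
    ),
  );
  return aggregateOutcomes(outcomes);
}
```

Separating the pure merge step from the asynchronous fan-out keeps the aggregation logic easy to test without any network or participant setup.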

Data Management Modules

FilesModule

Purpose: Manages file operations, data storage, and data exchange for analytics workflows.

Key Capabilities:

  • File upload and download management
  • Data staging for analytics operations
  • Secure file handling and access control
  • Integration with distributed storage systems

Integration: Used by OrchestrationModule and AnalysesModule for data management
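
As a rough illustration of staging combined with access control, the sketch below keeps staged files in memory and checks an access list on every read. The `StagingArea` class, the `StagedFile` shape, and the idea of a per-file participant allow-list are assumptions for this example. The storage backend and access model of the real FilesModule are not described in this document.

```typescript
// Hypothetical in-memory staging area; not the FilesModule storage backend.
interface StagedFile {
  fileId: string;
  ownerId: string;
  allowedParticipants: Set<string>;
  content: string;
}

class StagingArea {
  private readonly files = new Map<string, StagedFile>();

  // Stage a file for an analysis and record who may read it.
  // The owner is always on the access list.
  stage(fileId: string, ownerId: string, content: string, allowed: string[] = []): void {
    this.files.set(fileId, {
      fileId,
      ownerId,
      allowedParticipants: new Set<string>([ownerId, ...allowed]),
      content,
    });
  }

  // Return the content only if the requester is on the access list.
  read(fileId: string, requesterId: string): string {
    const file = this.files.get(fileId);
    if (file === undefined) {
      throw new Error(`Unknown file: ${fileId}`);
    }
    if (!file.allowedParticipants.has(requesterId)) {
      throw new Error(`Access denied for ${requesterId}`);
    }
    return file.content;
  }

  // Remove a staged file; returns false if it was not staged.
  discard(fileId: string): boolean {
    return this.files.delete(fileId);
  }
}
```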

DataPlaneTestModule

Purpose: Provides data plane coordination and testing functionality specific to analytics scenarios.

Integration: Coordinates with the Control Plane for analytics-specific transfer protocols

Infrastructure Modules

EventsModule

Purpose: Manages event-driven coordination for distributed analytics operations.

Key Capabilities:

  • Event publishing and subscription for distributed coordination
  • Analytics workflow state management
  • Cross-participant communication handling
  • Real-time updates and notifications

Integration: Enables asynchronous coordination between analytics components
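
The publish and subscribe capability can be sketched as a small typed event bus. This is an illustrative, in-process example only: the event names, payload shapes, and the `AnalyticsEventBus` class are assumptions, and the real EventsModule may rely on a different transport for cross-participant delivery.

```typescript
// Event names and payloads are assumptions made for this sketch.
interface AnalyticsEventMap {
  "analysis.started": { analysisId: string };
  "analysis.progress": { analysisId: string; progress: number };
  "analysis.completed": { analysisId: string; resultFileId: string };
}

type AnalyticsEventName = keyof AnalyticsEventMap;

class AnalyticsEventBus {
  private readonly handlers = new Map<string, Array<(payload: unknown) => void>>();

  // Register a handler and return a function that removes it again.
  subscribe<K extends AnalyticsEventName>(
    eventName: K,
    handler: (payload: AnalyticsEventMap[K]) => void,
  ): () => void {
    const list = this.handlers.get(eventName) ?? [];
    const wrapped = (payload: unknown): void => handler(payload as AnalyticsEventMap[K]);
    list.push(wrapped);
    this.handlers.set(eventName, list);
    return () => {
      const index = list.indexOf(wrapped);
      if (index !== -1) {
        list.splice(index, 1);
      }
    };
  }

  // Deliver the payload to every handler registered for the event name.
  publish<K extends AnalyticsEventName>(eventName: K, payload: AnalyticsEventMap[K]): void {
    const list = this.handlers.get(eventName) ?? [];
    // Iterate over a copy so handlers may unsubscribe during delivery.
    for (const handler of list.slice()) {
      handler(payload);
    }
  }
}
```

Mapping each event name to its payload type lets the compiler reject a publisher that sends the wrong payload for an event.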

LoggingModule

Purpose: Provides specialized logging and audit capabilities for analytics operations.

Key Capabilities:

  • Analytics-specific logging patterns
  • Distributed operation tracking
  • Compliance and audit trail management
  • Performance monitoring for analytics workflows
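
One way to picture distributed operation tracking and an audit trail is an append-only list of structured entries keyed by analysis. The sketch below is an assumption-laden illustration: `AuditEntry`, `AuditTrail`, and their fields are hypothetical and do not reflect the actual LoggingModule interface.

```typescript
// Hypothetical audit entry; field names are chosen for this example only.
interface AuditEntry {
  readonly sequence: number;
  readonly timestamp: string; // ISO 8601
  readonly analysisId: string;
  readonly participantId: string;
  readonly action: string;
  readonly durationMs: number | null;
}

class AuditTrail {
  private readonly entries: AuditEntry[] = [];
  private readonly now: () => Date;

  // The clock is injectable so that timestamps are deterministic in tests.
  constructor(now: () => Date = () => new Date()) {
    this.now = now;
  }

  // Append an entry; existing entries are never modified or removed.
  record(
    analysisId: string,
    participantId: string,
    action: string,
    durationMs: number | null = null,
  ): AuditEntry {
    const entry: AuditEntry = {
      sequence: this.entries.length + 1,
      timestamp: this.now().toISOString(),
      analysisId,
      participantId,
      action,
      durationMs,
    };
    this.entries.push(entry);
    return entry;
  }

  // All entries for one analysis, in the order they were recorded.
  forAnalysis(analysisId: string): AuditEntry[] {
    return this.entries.filter((entry) => entry.analysisId === analysisId);
  }

  // Sum of recorded durations for one analysis, ignoring entries without one.
  totalDurationMs(analysisId: string): number {
    return this.forAnalysis(analysisId).reduce(
      (total, entry) => total + (entry.durationMs ?? 0),
      0,
    );
  }
}
```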

Support Components

AuthModule

Purpose: Provides the authentication and authorization framework that secures analytics endpoints.

Source: Shared from the @tsg-dsp/common-api library

ConfigController

Purpose: Manages Analytics Data Plane settings and operational parameters.

Analytics Architecture Patterns

The Analytics Data Plane implements several specialized patterns for distributed computation:

Distributed Computing

  • Multi-Party Coordination: Orchestrates analytics across multiple independent participants
  • Workflow Management: Manages complex analytics pipelines with dependencies and scheduling
  • Result Aggregation: Collects and combines results from distributed computations
  • Privacy Preservation: Ensures data privacy while enabling collaborative analytics
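
Workflow management with dependencies usually comes down to ordering steps so that each one runs after everything it depends on. The sketch below uses Kahn's algorithm for a topological sort. The `WorkflowStep` shape and the `orderWorkflowSteps` function are hypothetical and are not taken from the Analytics Data Plane codebase.

```typescript
// Hypothetical pipeline step: an id plus the ids of the steps it depends on.
interface WorkflowStep {
  id: string;
  dependsOn: string[];
}

// Kahn's algorithm: returns step ids in an order where every step
// appears after all of its dependencies. Throws on cycles or unknown ids.
function orderWorkflowSteps(steps: WorkflowStep[]): string[] {
  const unmet = new Map<string, number>(); // step id -> unmet dependency count
  const dependents = new Map<string, string[]>(); // step id -> steps waiting on it

  for (const step of steps) {
    unmet.set(step.id, step.dependsOn.length);
    for (const dependency of step.dependsOn) {
      const waiting = dependents.get(dependency) ?? [];
      waiting.push(step.id);
      dependents.set(dependency, waiting);
    }
  }

  // Steps without dependencies can run immediately.
  const ready = steps.filter((step) => step.dependsOn.length === 0).map((step) => step.id);
  const ordered: string[] = [];

  while (ready.length > 0) {
    const current = ready.shift() as string;
    ordered.push(current);
    for (const next of dependents.get(current) ?? []) {
      const left = (unmet.get(next) ?? 0) - 1;
      unmet.set(next, left);
      if (left === 0) {
        ready.push(next);
      }
    }
  }

  if (ordered.length !== steps.length) {
    throw new Error("Workflow has a cycle or references an unknown step");
  }
  return ordered;
}
```

A scheduler could run every step that is ready at the same time in parallel. This sketch only produces one valid sequential order.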

Event-Driven Coordination

  • Asynchronous Processing: Non-blocking coordination for long-running analytics operations
  • State Synchronization: Maintains consistent state across distributed participants
  • Real-Time Updates: Provides live status updates for analytics workflows
  • Error Handling: Robust error propagation and recovery mechanisms
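
State synchronization over asynchronous events has to cope with duplicated and out-of-order messages. One common approach, shown here as a sketch under assumptions, is to attach a per-participant sequence number to each update and ignore anything that is not newer than what has already been applied. `StatusUpdate` and `ParticipantStateView` are hypothetical names. The document does not specify how the Analytics Data Plane actually keeps state consistent.

```typescript
// Hypothetical status message sent by a participant.
interface StatusUpdate {
  participantId: string;
  sequence: number; // increases with every update from the same participant
  state: string;
}

class ParticipantStateView {
  private readonly latest = new Map<string, StatusUpdate>();

  // Apply an update unless an equal or newer one was already seen.
  // Returns true when the view changed, which makes replays harmless.
  apply(update: StatusUpdate): boolean {
    const current = this.latest.get(update.participantId);
    if (current !== undefined && current.sequence >= update.sequence) {
      return false;
    }
    this.latest.set(update.participantId, update);
    return true;
  }

  // Latest known state of a participant, or undefined if none was received.
  stateOf(participantId: string): string | undefined {
    return this.latest.get(participantId)?.state;
  }

  // True when every listed participant has reported the given state.
  allInState(participantIds: string[], state: string): boolean {
    return participantIds.every((id) => this.stateOf(id) === state);
  }
}
```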

Data Management

  • Secure File Handling: Encrypted and access-controlled data operations
  • Staging Management: Efficient data preparation for analytics execution
  • Result Storage: Secure storage and retrieval of analytics outputs
  • Cleanup Operations: Automated cleanup of temporary and intermediate data
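
Automated cleanup can be expressed as a pure selection step followed by the actual deletion. The sketch below covers only the selection: it splits artifacts into those to keep and those to remove, based on age and kind. The `TempArtifact` shape, the three artifact kinds, and the rule that results are never removed automatically are assumptions made for this example.

```typescript
// Hypothetical artifact metadata; the kinds are assumed for illustration.
interface TempArtifact {
  path: string;
  createdAtMs: number; // creation time in milliseconds since the epoch
  kind: "staged" | "intermediate" | "result";
}

interface CleanupPlan {
  keep: TempArtifact[];
  remove: TempArtifact[];
}

// Select staged and intermediate artifacts older than maxAgeMs for removal.
// Results are always kept in this sketch, regardless of age.
function planCleanup(artifacts: TempArtifact[], nowMs: number, maxAgeMs: number): CleanupPlan {
  const plan: CleanupPlan = { keep: [], remove: [] };
  for (const artifact of artifacts) {
    const expired = nowMs - artifact.createdAtMs > maxAgeMs;
    if (artifact.kind !== "result" && expired) {
      plan.remove.push(artifact);
    } else {
      plan.keep.push(artifact);
    }
  }
  return plan;
}
```

Passing the current time in as a parameter keeps the function deterministic, so the retention rule can be checked without touching the file system.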

Development Status

Current Implementation

  • Basic module structure and interfaces
  • Core analytics workflow definitions
  • File management capabilities
  • Event-driven coordination framework

Planned Features

  • Advanced Orchestration: Enhanced multi-party coordination capabilities
  • Privacy-Preserving Analytics: Secure computation protocols for sensitive data
  • ML/AI Integration: Native support for machine learning and AI workloads
  • Performance Optimization: High-performance distributed computing optimizations

Integration Points

  • Control Plane: Enhanced protocol support for analytics transfers
  • Data Sources: Native integration with various data sources and formats
  • External Analytics: Integration with existing analytics platforms and tools
  • Result Distribution: Secure and efficient result sharing mechanisms

The Analytics Data Plane is intended to enable secure, privacy-preserving collaborative analysis across dataspace participants within the TSG ecosystem, while each participant retains control over data sovereignty and access policies.