
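Lifecycle Management Architecture

This document describes the internal architecture, component design, and working principles of MCPStore lifecycle management.

🏗️ Overall Architecture Diagram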

graph TB
    subgraph "User Interface Layer"
        UserAPI[User API]
        Context[MCPStoreContext]
    end

    subgraph "Lifecycle Management Layer"
        LifecycleManager[ServiceLifecycleManager<br/>Lifecycle manager]
        StateMachine[ServiceStateMachine<br/>State machine]
        HealthManager[HealthManager<br/>Health manager]
        ReconnectionManager[SmartReconnectionManager<br/>Smart reconnection manager]
    end

    subgraph "State Processors"
        InitProcessor[InitializingStateProcessor<br/>Initialization processor]
        EventProcessor[StateChangeEventProcessor<br/>Event processor]
        ContentManager[ContentManager<br/>Content manager]
    end

    subgraph "Monitoring System"
        HealthCheck[Health check<br/>every 30 seconds]
        ToolsUpdate[Tools update<br/>every 2 hours]
        StateMonitor[State monitoring<br/>real time]
        PerformanceTracker[Performance tracker]
    end

    subgraph "Data Storage"
        Registry[ServiceRegistry<br/>State storage]
        Metadata[ServiceStateMetadata<br/>Metadata]
        Config[LifecycleConfig<br/>Configuration]
        HealthHistory[Health history]
    end

    subgraph "External Interfaces"
        Orchestrator[MCPOrchestrator<br/>Orchestrator]
        FastMCP[FastMCP Client<br/>MCP client]
        Services[MCP Services<br/>External services]
    end

    %% Core flow
    UserAPI --> Context
    Context --> LifecycleManager

    LifecycleManager --> StateMachine
    LifecycleManager --> HealthManager
    LifecycleManager --> ReconnectionManager

    StateMachine --> InitProcessor
    StateMachine --> EventProcessor

    HealthManager --> HealthCheck
    HealthCheck --> StateMonitor
    ToolsUpdate --> ContentManager
    StateMonitor --> PerformanceTracker

    %% Data flow
    LifecycleManager --> Registry
    Registry --> Metadata
    Config --> StateMachine
    HealthManager --> HealthHistory

    %% External interaction
    LifecycleManager --> Orchestrator
    Orchestrator --> FastMCP
    FastMCP --> Services

    %% Feedback loops
    Services -.->|Health status| HealthCheck
    StateMonitor -.->|State change| StateMachine
    ReconnectionManager -.->|Reconnect trigger| Orchestrator
    PerformanceTracker -.->|Performance data| Registry

    %% Styles
    classDef user fill:#e3f2fd
    classDef lifecycle fill:#f3e5f5
    classDef processor fill:#e8f5e8
    classDef monitor fill:#fff3e0
    classDef storage fill:#fce4ec
    classDef external fill:#f1f8e9

    class UserAPI,Context user
    class LifecycleManager,StateMachine,HealthManager,ReconnectionManager lifecycle
    class InitProcessor,EventProcessor,ContentManager processor
    class HealthCheck,ToolsUpdate,StateMonitor,PerformanceTracker monitor
    class Registry,Metadata,Config,HealthHistory storage
    class Orchestrator,FastMCP,Services external

🔄 Seven-State Lifecycle State Machine

stateDiagram-v2
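    [*] --> INITIALIZING : Service registered

    INITIALIZING --> HEALTHY : Connection succeeded<br/>Tool list fetched
    INITIALIZING --> RECONNECTING : Initialization failed<br/>Connection timed out

    HEALTHY --> WARNING : Occasional failures<br/>Slower responses
    HEALTHY --> RECONNECTING : Consecutive failures<br/>Reconnect threshold reached
    HEALTHY --> DISCONNECTING : Manual stop<br/>User action

    WARNING --> HEALTHY : Back to normal<br/>Response time improved
    WARNING --> RECONNECTING : Sustained failures<br/>Reconnect threshold reached

    RECONNECTING --> HEALTHY : Reconnect succeeded<br/>Service recovered
    RECONNECTING --> UNREACHABLE : Reconnect failed<br/>Maximum retries exceeded

    UNREACHABLE --> RECONNECTING : Retry reconnect<br/>Periodic attempt
    UNREACHABLE --> DISCONNECTED : Reconnect abandoned<br/>Manual stop

    DISCONNECTING --> DISCONNECTED : Disconnect complete<br/>Resources cleaned up

    DISCONNECTED --> [*] : Service deleted<br/>Fully removed
    DISCONNECTED --> INITIALIZING : Service restarted<br/>Re-registered

    note right of INITIALIZING
        • Configuration validated
        • First connection attempted
        • Tool list fetched
        • Initial state set
    end note

    note right of HEALTHY
        • Connection is stable
        • Heartbeat checks succeed
        • Tool calls are available
        • Response time is normal
    end note

    note right of WARNING
        • Occasional heartbeat failures
        • Slower response time
        • Reconnect threshold not reached
        • Service still usable
    end note

    note right of RECONNECTING
        • Consecutive failures reached the threshold
        • Reconnect in progress
        • Service temporarily unavailable
        • Automatic recovery under way
    end note

    note right of UNREACHABLE
        • Reconnect failed
        • Long-interval retries begin
        • Service completely unavailable
        • Manual intervention required
    end note

    note right of DISCONNECTING
        • Graceful shutdown in progress
        • Connection resources being cleaned up
        • State information being saved
        • Preparing to stop completely
    end note

    note right of DISCONNECTED
        • Service fully terminated
        • Waiting for manual deletion
        • Or ready to be restarted
        • State information retained
    end note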
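
The transitions in the diagram above can be written down as a lookup table. The following is a minimal sketch, not MCPStore's actual implementation: the state names come from the diagram, while `ServiceState`, `ALLOWED_TRANSITIONS`, `can_transition`, and `transition` are hypothetical names chosen for illustration.

```python
from enum import Enum


class ServiceState(Enum):
    """The seven lifecycle states named in the diagram."""
    INITIALIZING = "initializing"
    HEALTHY = "healthy"
    WARNING = "warning"
    RECONNECTING = "reconnecting"
    UNREACHABLE = "unreachable"
    DISCONNECTING = "disconnecting"
    DISCONNECTED = "disconnected"


S = ServiceState

# Allowed transitions, copied edge by edge from the state diagram.
# (Hypothetical structure; the real state machine may store rules differently.)
ALLOWED_TRANSITIONS = {
    S.INITIALIZING: {S.HEALTHY, S.RECONNECTING},
    S.HEALTHY: {S.WARNING, S.RECONNECTING, S.DISCONNECTING},
    S.WARNING: {S.HEALTHY, S.RECONNECTING},
    S.RECONNECTING: {S.HEALTHY, S.UNREACHABLE},
    S.UNREACHABLE: {S.RECONNECTING, S.DISCONNECTED},
    S.DISCONNECTING: {S.DISCONNECTED},
    S.DISCONNECTED: {S.INITIALIZING},
}


def can_transition(current: ServiceState, target: ServiceState) -> bool:
    """Return True if the diagram contains an edge current -> target."""
    return target in ALLOWED_TRANSITIONS[current]


def transition(current: ServiceState, target: ServiceState) -> ServiceState:
    """Return the new state, or raise if the diagram has no such edge."""
    if not can_transition(current, target):
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

With this table, `transition(ServiceState.HEALTHY, ServiceState.WARNING)` succeeds, while a jump from `WARNING` straight to `DISCONNECTED` raises `ValueError` because the diagram has no such edge.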

🧩 Core Component Architecture

ServiceLifecycleManager

graph TB
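    subgraph "ServiceLifecycleManager"
        Core[Core manager]
        TaskScheduler[Task scheduler]
        EventDispatcher[Event dispatcher]
    end

    subgraph "Managed Components"
        StateMachine[State machine]
        HealthManager[Health manager]
        ReconnectionManager[Reconnection manager]
        ContentManager[Content manager]
    end

    subgraph "Scheduled Tasks"
        HealthCheckTask[Health check task<br/>every 30 seconds]
        ToolsUpdateTask[Tools update task<br/>every 2 hours]
        CleanupTask[Cleanup task<br/>every 24 hours]
        ReconnectionTask[Reconnection task<br/>dynamic interval]
    end

    subgraph "Event Handling"
        StateChangeEvent[State change event]
        HealthChangeEvent[Health change event]
        ReconnectionEvent[Reconnection event]
        ToolsUpdateEvent[Tools update event]
    end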

    Core --> TaskScheduler
    Core --> EventDispatcher

    TaskScheduler --> HealthCheckTask
    TaskScheduler --> ToolsUpdateTask
    TaskScheduler --> CleanupTask
    TaskScheduler --> ReconnectionTask

    EventDispatcher --> StateChangeEvent
    EventDispatcher --> HealthChangeEvent
    EventDispatcher --> ReconnectionEvent
    EventDispatcher --> ToolsUpdateEvent

    Core --> StateMachine
    Core --> HealthManager
    Core --> ReconnectionManager
    Core --> ContentManager

    HealthCheckTask --> HealthManager
    ToolsUpdateTask --> ContentManager
    ReconnectionTask --> ReconnectionManager

    StateMachine --> StateChangeEvent
    HealthManager --> HealthChangeEvent
    ReconnectionManager --> ReconnectionEvent
    ContentManager --> ToolsUpdateEvent
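
The event dispatcher in the diagram above routes events from the managed components to interested handlers. As a rough illustration only, a publish/subscribe dispatcher could look like the sketch below. The class name matches the diagram node, but its methods (`subscribe`, `dispatch`) and the event type strings are assumptions, not MCPStore's API.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the event payload as a plain dict (assumed shape).
Handler = Callable[[dict], None]


class EventDispatcher:
    """Hypothetical in-process dispatcher keyed by event type."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        """Register a handler for one event type."""
        self._handlers[event_type].append(handler)

    def dispatch(self, event_type: str, payload: dict) -> int:
        """Call every handler registered for event_type; return how many ran."""
        # Copy the list so handlers may subscribe or unsubscribe while running.
        handlers = list(self._handlers.get(event_type, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)
```

A component such as the state machine would call `dispatch("state_change", {...})`, and any number of listeners could react without the state machine knowing about them.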

ServiceStateMachine

graph TB
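    subgraph "State Transition Engine"
        TransitionEngine[Transition engine]
        RuleValidator[Rule validator]
        ThresholdManager[Threshold manager]
    end

    subgraph "Transition Rules"
        SuccessRules[Success transition rules]
        FailureRules[Failure transition rules]
        TimeoutRules[Timeout transition rules]
        ManualRules[Manual transition rules]
    end

    subgraph "State Handlers"
        InitializingHandler[Initializing handler]
        HealthyHandler[Healthy handler]
        WarningHandler[Warning handler]
        ReconnectingHandler[Reconnecting handler]
        UnreachableHandler[Unreachable handler]
        DisconnectingHandler[Disconnecting handler]
        DisconnectedHandler[Disconnected handler]
    end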

    TransitionEngine --> RuleValidator
    TransitionEngine --> ThresholdManager

    RuleValidator --> SuccessRules
    RuleValidator --> FailureRules
    RuleValidator --> TimeoutRules
    RuleValidator --> ManualRules

    TransitionEngine --> InitializingHandler
    TransitionEngine --> HealthyHandler
    TransitionEngine --> WarningHandler
    TransitionEngine --> ReconnectingHandler
    TransitionEngine --> UnreachableHandler
    TransitionEngine --> DisconnectingHandler
    TransitionEngine --> DisconnectedHandler

    ThresholdManager -.-> WarningHandler
    ThresholdManager -.-> ReconnectingHandler
    ThresholdManager -.-> UnreachableHandler
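
The threshold manager decides when accumulated failures should push a service from HEALTHY to WARNING or RECONNECTING. This document does not state the threshold values, so the sketch below uses placeholder numbers, and `FailureThresholds` and `state_for_failures` are hypothetical names. It only illustrates the idea of mapping a consecutive-failure count to a target state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureThresholds:
    """Placeholder thresholds; the real values are not given in this document."""
    warning: int = 1        # assumed: first failure moves the service to WARNING
    reconnecting: int = 3   # assumed: three consecutive failures trigger reconnect


def state_for_failures(
    consecutive_failures: int,
    thresholds: FailureThresholds = FailureThresholds(),
) -> str:
    """Map a consecutive-failure count to the state it should lead to."""
    if consecutive_failures >= thresholds.reconnecting:
        return "RECONNECTING"
    if consecutive_failures >= thresholds.warning:
        return "WARNING"
    return "HEALTHY"
```

A successful check would reset the counter to zero, which maps back to HEALTHY and matches the WARNING to HEALTHY edge in the state diagram.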

HealthManager

graph TB
    subgraph "Health Check Engine"
        CheckEngine[Check engine]
        StatusEvaluator[Status evaluator]
        TimeoutManager[Timeout manager]
    end

    subgraph "Check Strategies"
        PingCheck[Ping check]
        ToolsCheck[Tools check]
        ResponseTimeCheck[Response time check]
        AvailabilityCheck[Availability check]
    end

    subgraph "Health Levels"
        HealthyLevel["HEALTHY<br/>under 1 s, over 95%"]
        WarningLevel["WARNING<br/>1-3 s, 90-95%"]
        SlowLevel["SLOW<br/>3-10 s, 80-90%"]
        UnhealthyLevel["UNHEALTHY<br/>over 10 s, under 80%"]
    end

    subgraph "Data Collection"
        ResponseTracker[Response tracker]
        FailureTracker[Failure tracker]
        PerformanceTracker[Performance tracker]
        HistoryTracker[History tracker]
    end

    CheckEngine --> StatusEvaluator
    CheckEngine --> TimeoutManager

    CheckEngine --> PingCheck
    CheckEngine --> ToolsCheck
    CheckEngine --> ResponseTimeCheck
    CheckEngine --> AvailabilityCheck

    StatusEvaluator --> HealthyLevel
    StatusEvaluator --> WarningLevel
    StatusEvaluator --> SlowLevel
    StatusEvaluator --> UnhealthyLevel

    CheckEngine --> ResponseTracker
    CheckEngine --> FailureTracker
    CheckEngine --> PerformanceTracker
    CheckEngine --> HistoryTracker

    ResponseTracker --> StatusEvaluator
    FailureTracker --> StatusEvaluator
    PerformanceTracker --> StatusEvaluator
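
The health levels in the diagram above pair a response-time band with a percentage band. The sketch below shows one way a status evaluator could combine them. It rests on two assumptions that the diagram does not confirm: the percentage is treated as a success rate, and the final level is the worse of the two bands. The function name and the handling of exact boundary values are also illustrative choices.

```python
LEVELS = ["HEALTHY", "WARNING", "SLOW", "UNHEALTHY"]  # ordered best to worst


def _latency_level(response_time_s: float) -> int:
    """Index into LEVELS from the response-time bands (<1, 1-3, 3-10, >10 s)."""
    if response_time_s < 1.0:
        return 0
    if response_time_s <= 3.0:
        return 1
    if response_time_s <= 10.0:
        return 2
    return 3


def _rate_level(success_rate_pct: float) -> int:
    """Index into LEVELS from the percentage bands (>95, 90-95, 80-90, <80)."""
    if success_rate_pct > 95.0:
        return 0
    if success_rate_pct >= 90.0:
        return 1
    if success_rate_pct >= 80.0:
        return 2
    return 3


def classify_health(response_time_s: float, success_rate_pct: float) -> str:
    """Return the worse of the two levels (assumed combination rule)."""
    return LEVELS[max(_latency_level(response_time_s), _rate_level(success_rate_pct))]
```

Under these assumptions a service that answers in 0.4 s but succeeds only 85% of the time is classified as SLOW, because the percentage band dominates.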

SmartReconnectionManager

graph TB
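    subgraph "Reconnection Strategy Engine"
        StrategyEngine[Strategy engine]
        BackoffCalculator[Backoff calculator]
        PriorityManager[Priority manager]
    end

    subgraph "Reconnection Queues"
        CriticalQueue[Critical service queue<br/>0.5x delay]
        HighQueue[High-priority queue<br/>0.7x delay]
        NormalQueue[Normal queue<br/>1.0x delay]
        LowQueue[Low-priority queue<br/>1.5x delay]
    end

    subgraph "Reconnection Policies"
        ExponentialBackoff[Exponential backoff<br/>60s → 600s]
        MaxAttempts[Maximum attempts<br/>10]
        CircuitBreaker[Circuit breaker<br/>failure protection]
        HealthyReset[Healthy reset<br/>counters cleared on success]
    end

    subgraph "Monitoring Metrics"
        AttemptCounter[Attempt counter]
        SuccessRate[Success rate statistics]
        AverageDelay[Average delay]
        QueueLength[Queue length]
    end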

    StrategyEngine --> BackoffCalculator
    StrategyEngine --> PriorityManager

    PriorityManager --> CriticalQueue
    PriorityManager --> HighQueue
    PriorityManager --> NormalQueue
    PriorityManager --> LowQueue

    BackoffCalculator --> ExponentialBackoff
    BackoffCalculator --> MaxAttempts
    BackoffCalculator --> CircuitBreaker
    BackoffCalculator --> HealthyReset

    StrategyEngine --> AttemptCounter
    StrategyEngine --> SuccessRate
    StrategyEngine --> AverageDelay
    StrategyEngine --> QueueLength
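
The backoff bounds (60 s to 600 s), the limit of 10 attempts, and the per-queue delay multipliers come from the diagram above. The sketch below combines them into a single delay calculation. The growth factor of 2 and the choice to apply the priority multiplier after capping are assumptions, and `reconnect_delay` is a hypothetical function name.

```python
from typing import Optional

BASE_DELAY_S = 60.0    # first reconnect delay, from the diagram
MAX_DELAY_S = 600.0    # backoff ceiling, from the diagram
MAX_ATTEMPTS = 10      # maximum attempts, from the diagram

# Delay multipliers per queue, from the diagram.
PRIORITY_MULTIPLIER = {"critical": 0.5, "high": 0.7, "normal": 1.0, "low": 1.5}


def reconnect_delay(
    attempt: int, priority: str = "normal", factor: float = 2.0
) -> Optional[float]:
    """Delay in seconds before the given 1-based attempt, or None when exhausted.

    Assumption: the delay grows by `factor` per attempt, is capped at
    MAX_DELAY_S, and is then scaled by the queue's priority multiplier.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if attempt > MAX_ATTEMPTS:
        return None  # caller should move the service to UNREACHABLE
    capped = min(BASE_DELAY_S * factor ** (attempt - 1), MAX_DELAY_S)
    return capped * PRIORITY_MULTIPLIER[priority]
```

With these assumptions, a normal-priority service waits 60 s, 120 s, 240 s, 480 s, and then 600 s for every later attempt, while a critical service waits half as long at each step.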

📊 Data Flow Architecture

Health Check Data Flow

sequenceDiagram
    participant Timer as Timer
    participant HealthManager as Health manager
    participant FastMCP as FastMCP client
    participant Service as MCP service
    participant StateMachine as State machine
    participant Registry as Registry

    Timer->>HealthManager: Trigger health check
    HealthManager->>FastMCP: ping_service()

    FastMCP->>Service: MCP Ping
    Service-->>FastMCP: Pong + response time

    FastMCP-->>HealthManager: Check result
    HealthManager->>HealthManager: Evaluate health status

    alt State change needed
        HealthManager->>StateMachine: Trigger state transition
        StateMachine->>Registry: Update service state
        StateMachine->>HealthManager: Confirm state change
    end

    HealthManager->>Registry: Update health metadata
    Registry-->>HealthManager: Update complete
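
One pass through the sequence above can be sketched as a single function. This is an illustration under stated assumptions, not MCPStore code: the registry is a plain dict, `ping` and `evaluate` are injected callables standing in for the FastMCP client and the status evaluator, and `run_health_check` is a hypothetical name.

```python
import time
from typing import Callable, Dict


def run_health_check(
    name: str,
    ping: Callable[[str], None],
    registry: Dict[str, dict],
    evaluate: Callable[[bool, float], str],
) -> str:
    """Ping one service, evaluate the result, and record it in the registry."""
    started = time.perf_counter()
    try:
        ping(name)  # stand-in for the ping issued through the MCP client
        ok = True
    except Exception:
        ok = False  # any error counts as a failed check in this sketch
    elapsed = time.perf_counter() - started

    new_state = evaluate(ok, elapsed)  # "evaluate health status" step

    entry = registry.setdefault(name, {"state": "INITIALIZING"})
    if entry["state"] != new_state:
        entry["state"] = new_state  # "state change needed" branch
    # Health metadata is updated on every check, changed state or not.
    entry["last_check_ok"] = ok
    entry["last_response_time"] = elapsed
    return new_state
```

Injecting `ping` and `evaluate` keeps the sketch testable without a live MCP service; a real health manager would call the client and the state machine instead.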

Reconnection Data Flow

sequenceDiagram
    participant StateMachine as State machine
    participant ReconnectionManager as Reconnection manager
    participant Orchestrator as Orchestrator
    participant FastMCP as FastMCP client
    participant Service as MCP service
    participant Registry as Registry

    StateMachine->>ReconnectionManager: Add reconnection task
    ReconnectionManager->>ReconnectionManager: Compute reconnect delay

    loop Reconnection loop
        ReconnectionManager->>Orchestrator: Trigger reconnect
        Orchestrator->>FastMCP: Close old connection
        Orchestrator->>FastMCP: Create new connection

        FastMCP->>Service: Establish connection

        alt Connection succeeded
            Service-->>FastMCP: Connection acknowledged
            FastMCP->>Service: Request tool list
            Service-->>FastMCP: Tool list
            FastMCP-->>Orchestrator: Reconnect succeeded
            Orchestrator-->>ReconnectionManager: Success notification
            ReconnectionManager->>StateMachine: Trigger success transition
            StateMachine->>Registry: Set state to HEALTHY
        else Connection failed
            Service-->>FastMCP: Connection failed
            FastMCP-->>Orchestrator: Reconnect failed
            Orchestrator-->>ReconnectionManager: Failure notification
            ReconnectionManager->>ReconnectionManager: Increment failure count
            ReconnectionManager->>ReconnectionManager: Compute next reconnect time
        end
    end
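
The reconnection loop above can be reduced to a small retry function. The sketch below is illustrative only: `connect` stands in for the orchestrator's "close old connection, create new connection, fetch tool list" sequence, the list of delays is supplied by the caller, and `reconnect_loop` is a hypothetical name.

```python
import time
from typing import Callable, Iterable, Tuple


def reconnect_loop(
    connect: Callable[[], None],
    delays: Iterable[float],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, int]:
    """Try to reconnect once per delay; return (resulting state, failure count)."""
    failures = 0
    for delay in delays:
        sleep(delay)  # wait out the computed reconnect delay
        try:
            connect()  # raises ConnectionError when the attempt fails
        except ConnectionError:
            failures += 1  # "increment failure count" branch
            continue
        return "HEALTHY", failures  # "connection succeeded" branch
    # Every scheduled attempt failed: the service becomes UNREACHABLE.
    return "UNREACHABLE", failures
```

Passing `sleep` as a parameter lets a test replace real waiting with a recorder, so the retry schedule can be checked without delays.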

🔧 Configuration Architecture

Lifecycle Configuration Hierarchy

graph TB
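    subgraph "Global Configuration"
        GlobalConfig[Global lifecycle configuration]
        DefaultThresholds[Default threshold configuration]
        DefaultTimeouts[Default timeout configuration]
    end

    subgraph "Service Type Configuration"
        CriticalConfig[Critical service configuration]
        NormalConfig[Normal service configuration]
        BackgroundConfig[Background service configuration]
    end

    subgraph "Runtime Configuration"
        DynamicThresholds[Dynamic threshold adjustment]
        AdaptiveTimeouts[Adaptive timeouts]
        LoadBasedConfig[Load-based configuration]
    end

    subgraph "Service-Specific Configuration"
        ServiceOverrides[Service-specific overrides]
        CustomHealthChecks[Custom health checks]
        SpecialHandling[Special handling rules]
    end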

    GlobalConfig --> DefaultThresholds
    GlobalConfig --> DefaultTimeouts

    GlobalConfig --> CriticalConfig
    GlobalConfig --> NormalConfig
    GlobalConfig --> BackgroundConfig

    CriticalConfig --> DynamicThresholds
    NormalConfig --> AdaptiveTimeouts
    BackgroundConfig --> LoadBasedConfig

    DynamicThresholds --> ServiceOverrides
    AdaptiveTimeouts --> CustomHealthChecks
    LoadBasedConfig --> SpecialHandling
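
The four configuration layers in the diagram above suggest a lookup in which more specific layers override more general ones. The sketch below shows that idea with `collections.ChainMap` from the standard library. The precedence order (service-specific, then runtime, then service type, then global) is an assumption read off the arrows, and the function name and configuration keys are hypothetical.

```python
from collections import ChainMap
from typing import Any, Dict, Mapping


def resolve_config(
    global_cfg: Mapping[str, Any],
    type_cfg: Mapping[str, Any],
    runtime_cfg: Mapping[str, Any],
    service_cfg: Mapping[str, Any],
) -> Dict[str, Any]:
    """Merge the four layers; earlier ChainMap arguments win on conflicts."""
    return dict(ChainMap(service_cfg, runtime_cfg, type_cfg, global_cfg))


# Example with made-up keys: the 30-second interval is the default mentioned
# in this document, the other values are purely illustrative.
effective = resolve_config(
    global_cfg={"health_check_interval_s": 30, "ping_timeout_s": 10},
    type_cfg={"ping_timeout_s": 5},           # e.g. a critical-service layer
    runtime_cfg={},                           # no dynamic adjustment right now
    service_cfg={"health_check_interval_s": 15},
)
```

Here `effective` keeps the service-specific interval of 15 seconds and the type-level timeout of 5 seconds, while any key defined only globally falls through unchanged.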

📈 Performance Optimization Architecture

Concurrent Processing Architecture

graph TB
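    subgraph "Task Schedulers"
        MainScheduler[Main scheduler]
        HealthScheduler[Health check scheduler]
        ReconnectionScheduler[Reconnection scheduler]
        CleanupScheduler[Cleanup scheduler]
    end

    subgraph "Worker Thread Pools"
        HealthWorkers[Health check workers<br/>run concurrently]
        ReconnectionWorkers[Reconnection workers<br/>priority queue]
        CleanupWorkers[Cleanup workers<br/>run in background]
    end

    subgraph "Cache Layer"
        StateCache[State cache<br/>fast access]
        HealthCache[Health cache<br/>fewer checks]
        MetadataCache[Metadata cache<br/>performance optimization]
    end

    subgraph "Batch Optimization"
        BatchHealthCheck[Batch health checks]
        BatchStateUpdate[Batch state updates]
        BatchNotification[Batch notifications]
    end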

    MainScheduler --> HealthScheduler
    MainScheduler --> ReconnectionScheduler
    MainScheduler --> CleanupScheduler

    HealthScheduler --> HealthWorkers
    ReconnectionScheduler --> ReconnectionWorkers
    CleanupScheduler --> CleanupWorkers

    HealthWorkers --> StateCache
    ReconnectionWorkers --> HealthCache
    CleanupWorkers --> MetadataCache

    HealthWorkers --> BatchHealthCheck
    HealthWorkers --> BatchStateUpdate
    HealthWorkers --> BatchNotification
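
Batch health checks on a worker pool, as drawn above, can be illustrated with the standard library's `ThreadPoolExecutor`. This is a sketch under assumptions: the document does not say whether MCPStore uses threads or asyncio internally, and `batch_health_check`, its `check` callable, and the default pool size are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def batch_health_check(
    names: Iterable[str],
    check: Callable[[str], bool],
    max_workers: int = 8,
) -> Dict[str, bool]:
    """Run check(name) for every service concurrently and collect the results."""
    ordered = list(names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(check, ordered))
    return dict(zip(ordered, results))
```

Returning one dict per batch makes it straightforward to apply all state updates and notifications in a single step afterwards, which is the point of the batch optimization layer.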
