Optimization of Nuclear Power Accident Diagnosis Procedures Based on SAC Reinforcement Learning
Abstract: This paper proposes a method for optimizing nuclear power accident diagnosis procedures based on the Soft Actor-Critic (SAC) reinforcement learning algorithm. Starting from a decision tree model, the method optimizes the judgment strategy of the accident detection procedure, significantly improving detection performance while preserving the interpretability of the decision model. The state is defined as a combination of current operating data and historical data, the action is an adjustment to the decision thresholds of the diagnostic procedure, and the reward reflects diagnostic accuracy. Using SAC, the system repeatedly adjusts the thresholds to optimize the policy and obtain the best diagnostic performance. In a simulated Main Steam Line Break (MSLB) accident scenario, the model adapts better to complex high-dimensional data and finds the optimal control strategy under the specified performance indicators, with accuracy steadily approaching 1. The proposed method significantly reduces the misjudgment rate: it detects nuclear power accidents more accurately and also performs well in reducing false alarms, thereby improving the safety of nuclear power plant operation.
Key words:
- Nuclear power accident
- Reinforcement learning
- Procedure optimization
- MSLB
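
The state, action, and reward formulation described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class `ThresholdEnv`, the helper `run_episode`, the single-threshold rule standing in for the decision tree, the 0/1 reward, and the random policy standing in for a trained SAC actor are all hypothetical and are not taken from the paper.

```python
import random


class ThresholdEnv:
    """Toy diagnosis environment (hypothetical; not the paper's simulator)."""

    def __init__(self, readings, labels, threshold=0.5, window=2, max_delta=0.05):
        self.readings = readings      # one sensor value per time step
        self.labels = labels          # True where an accident is in progress
        self.threshold = threshold    # decision threshold of a one-node "tree"
        self.window = window          # number of past readings kept in the state
        self.max_delta = max_delta    # largest threshold change per action
        self.t = 0

    def state(self):
        # State = current reading followed by the previous `window` readings,
        # padded with the first reading at the start of the episode.
        past = self.readings[max(0, self.t - self.window):self.t]
        pad = [self.readings[0]] * (self.window - len(past))
        return [self.readings[self.t]] + pad + past

    def step(self, delta):
        # Action = bounded adjustment of the decision threshold.
        delta = max(-self.max_delta, min(self.max_delta, delta))
        self.threshold += delta
        # Diagnose the current sample with the adjusted threshold.
        predicted = self.readings[self.t] > self.threshold
        # Reward = 1 for a correct diagnosis, 0 otherwise (an assumed choice).
        reward = 1.0 if predicted == self.labels[self.t] else 0.0
        self.t += 1
        done = self.t >= len(self.readings)
        return (None if done else self.state()), reward, done


def run_episode(env, policy):
    """Roll out one episode and return the mean reward, i.e. the accuracy."""
    state, total, steps, done = env.state(), 0.0, 0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
        steps += 1
    return total / steps


_rng = random.Random(0)


def random_policy(state):
    # Stand-in for a trained SAC actor: a random threshold adjustment.
    return _rng.uniform(-0.05, 0.05)
```

Calling `run_episode(ThresholdEnv(readings, labels), random_policy)` rolls out one episode. In an actual study the random policy would be replaced by an SAC agent trained on the reward signal, and the single threshold by the thresholds of the full decision tree.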