Research on the Importance of Feature Parameters in Seismic Sequence Type Determination in the Sichuan-Yunnan Region Based on Decision Tree
ZHAO Xiaoyan1, JIANG Haikun2, MENG Lingyuan2, SU Youjin1, HE Suge1

(1. Yunnan Earthquake Agency, Kunming 650224, Yunnan, China; 2. China Earthquake Networks Center, Beijing 100045, China)

earthquake sequence type; machine learning; characteristic parameters; decision tree

DOI: 10.20015/j.cnki.ISSN1000-0666.2024.0039

Using the catalog of 225 earthquakes of magnitude 5 or above in the Sichuan-Yunnan region from 1966 to 2021, together with the corresponding earthquake sequence catalogs and historical focal mechanism data, and drawing on previous research and practical experience in post-earthquake trend prediction, 10 machine-learning feature sample datasets for sequence type determination were constructed from earthquake observation data. Following the definitions of earthquake sequence classification, three sample labels were set: swarm type, mainshock-aftershock type, and isolated type. After the class imbalance among the samples and the missing values in the feature parameters were handled, a decision tree model was used to study the importance of the feature parameters. The results show that the categories of important feature parameters differ somewhat between time periods: as more sequence data accumulate, the sequence type judgment relies more heavily on dynamic sequence data. Parameters related to the mainshock focal mechanism and the mainshock parameters contribute strongly to sequence classification, whereas the sequence parameters contribute comparatively little. Overall, the results given by the model are broadly consistent with the empirical prediction methods used in practice. These results can offer some ideas for the preliminary screening, exclusion, and selection of the numerous and complex feature parameters.
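As a rough illustration of how a decision tree ranks feature parameters, the sketch below computes the largest Gini impurity decrease that a single threshold split on each feature can achieve for three-class labels. It rests on several assumptions: the abstract does not state the splitting criterion, so Gini impurity is assumed; the feature names and toy values are hypothetical and not taken from the study's datasets; and a full tree would sum sample-weighted decreases over all of its nodes rather than scoring one root split.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split_gain(values, labels):
    """Largest Gini impurity decrease over all threshold splits on one feature."""
    pairs = sorted(zip(values, labels))
    ys = [y for _, y in pairs]
    n = len(pairs)
    parent = gini(ys)
    best = 0.0
    for i in range(1, n):
        # A threshold can only sit between two distinct feature values.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left, right = ys[:i], ys[i:]
        gain = parent - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)
        best = max(best, gain)
    return best

# Toy labels using the three sequence types named in the abstract.
labels = ["swarm", "swarm",
          "mainshock-aftershock", "mainshock-aftershock",
          "isolated", "isolated"]

# Hypothetical features: one ordered with the classes, one unrelated to them.
features = {
    "informative_feature": [1, 2, 3, 4, 5, 6],
    "uninformative_feature": [1, 2, 1, 2, 1, 2],
}

gains = {name: best_split_gain(vals, labels) for name, vals in features.items()}
for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: impurity decrease = {gain:.3f}")
```

A feature whose best split separates the classes well receives a large impurity decrease, while a feature unrelated to the labels receives a decrease near zero; normalizing such decreases across features gives the kind of importance ranking discussed above.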