
[1]JIANG Haikun,WANG Jinhong.Discussion on the Importance of the Features for the Judgement of Earthquake Sequence Types Applicable to Machine Learning[J].Journal of Seismological Research,2023,46(02):155-172.[doi:10.20015/j.cnki.ISSN1000-0666.2023.0034]

Discussion on the Importance of the Features for the Judgement of Earthquake Sequence Types Applicable to Machine Learning

Journal of Seismological Research [ISSN:1000-0666/CN:53-1062/P]

Volume:
46
Issue:
No. 02, 2023
Pages:
155-172
Column:
Publication date:
2023-06-01

Article Info

Title:
Discussion on the Importance of the Features for the Judgement of Earthquake Sequence Types Applicable to Machine Learning
Author(s):
JIANG Haikun (蒋海昆)1, WANG Jinhong (王锦红)2
(1. China Earthquake Networks Center, Beijing 100045, China; 2. Institute of Earthquake Forecasting, China Earthquake Administration, Beijing 100036, China)
Keywords:
earthquake sequence type; machine learning; feature; mutual information
Classification number:
P315.7
DOI:
10.20015/j.cnki.ISSN1000-0666.2023.0034
Abstract:
Based on the earthquake catalog, the earthquake sequence catalog, and historical focal mechanism data for the Chinese mainland and adjacent areas from 1970 to 2021, and with reference to previous research and to the practice of post-earthquake trend prediction, a feature sample dataset for machine-learning-based judgement of earthquake sequence types has been constructed from seismic observation data. Following the classification of earthquake sequences, three sample labels are set: multiple-mainshock type, mainshock-aftershock type, and isolated-earthquake type. Forty-four candidate features that can be used for machine-learning judgement of sequence types are preliminarily proposed, including parameters related to the mainshock and its focal mechanism, historical earthquake sequence types, parameters related to sequence decay and the G-R relationship, and parameters related to magnitude and frequency. Starting from these 44 candidate features, more candidate features can be derived by varying parameters such as the lower magnitude threshold and the statistical period. The importance of each feature for sequence classification is evaluated from the mutual information between features and labels. Overall, magnitude-related parameters, parameters related to the G-R relationship and sequence decay, historical earthquake sequence types, and focal-mechanism-related parameters all contribute to sequence classification; among them, the mutual information between magnitude-related features and the labels is clearly larger, and its ranking is stable. Filling in missing features not only increases the number of samples available for model training and testing but also clearly strengthens the association between features and sequence types, which means that appropriate data preprocessing may, to some extent, improve the ability of features to classify sequences. Adding interaction features of the original data is one of the important ways to expand the number of available features. Non-independent features show a stronger association with the sequence labels after information-interaction processing, which means that feature selection should be based on a comprehensive evaluation of the model's predictive performance, and the independence of feature parameters should not be overemphasized.
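The abstract describes three steps that a small example may make concrete: ranking features by their mutual information with the sequence-type label, filling in missing feature values, and building an interaction feature. The sketch below is illustrative only. It uses synthetic data, a simple equal-width-bin plug-in estimator, median filling, and a product interaction; all of these, along with names such as `delta_m`, `mutual_information`, and `fill_missing_with_median`, are assumptions made for the example, since this page does not state which estimator, filling rule, or interaction form the authors used.

```python
import math
import random
from collections import Counter


def mutual_information(feature, labels, bins=4):
    """Plug-in estimate of I(X; Y) in nats for a continuous feature X
    (discretized into equal-width bins) and discrete class labels Y."""
    lo, hi = min(feature), max(feature)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature
    binned = [min(int((x - lo) / width), bins - 1) for x in feature]
    n = len(labels)
    joint = Counter(zip(binned, labels))
    px, py = Counter(binned), Counter(labels)
    # Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed cells.
    return sum(
        (c / n) * math.log(c * n / (px[b] * py[y]))
        for (b, y), c in joint.items()
    )


def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = sorted(v for v in values if v is not None)
    mid = len(observed) // 2
    if len(observed) % 2:
        median = observed[mid]
    else:
        median = (observed[mid - 1] + observed[mid]) / 2
    return [median if v is None else v for v in values]


random.seed(0)

# Synthetic stand-in data (hypothetical, not the paper's dataset).
class_means = {"multiple": 0.3, "mainshock-aftershock": 1.2, "isolated": 2.8}
labels = [random.choice(list(class_means)) for _ in range(600)]
delta_m = [random.gauss(class_means[y], 0.3) for y in labels]  # informative
noise = [random.random() for _ in labels]                      # uninformative

# Drop every tenth value to mimic missing features, then fill the gaps.
delta_m_gappy = [None if i % 10 == 0 else v for i, v in enumerate(delta_m)]
delta_m_filled = fill_missing_with_median(delta_m_gappy)

scores = {
    "delta_m": mutual_information(delta_m_filled, labels),
    "noise": mutual_information(noise, labels),
}
ranking = sorted(scores, key=scores.get, reverse=True)
print("ranking by mutual information:", ranking)

# Interaction feature: the label depends on the sign of a * b, so neither
# a nor b alone is informative, while their product is.
a = [random.uniform(-1, 1) for _ in range(600)]
b = [random.uniform(-1, 1) for _ in range(600)]
sign_labels = ["same" if x * y > 0 else "opposite" for x, y in zip(a, b)]
product = [x * y for x, y in zip(a, b)]
mi_single = mutual_information(a, sign_labels)
mi_product = mutual_information(product, sign_labels)
print("single feature:", mi_single, "product feature:", mi_product)
```

A binned estimator is sensitive to the number of bins and is biased upward for small samples, so in practice a library estimator (for example scikit-learn's `mutual_info_classif`) may be preferable; the sketch only shows the shape of the computation.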


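Memo

Received: 2022-09-19.
Funding: Open Fund of the State Key Laboratory of Earthquake Dynamics (LED2022B05).
About the first author: JIANG Haikun (蒋海昆) (born 1964), research professor, Ph.D., mainly engaged in research on aftershock statistics, aftershock mechanisms, and aftershock forecasting. E-mail: jianghaikun@seis.ac.cn.
Last Update: 2023-03-10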