Volume 40 Issue 3
May  2023

LIU Tingting, YANG Jinfan, ZHOU Ruliang, LIU Lin. Prediction of spread risk of pine wilt disease based on geographic raster variables and machine learning models[J]. Journal of Zhejiang A&F University, 2023, 40(3): 617-626. doi: 10.11833/j.issn.2095-0756.20220470

Prediction of spread risk of pine wilt disease based on geographic raster variables and machine learning models

doi: 10.11833/j.issn.2095-0756.20220470
  • Received Date: 2022-07-19
  • Accepted Date: 2023-02-14
  • Rev Recd Date: 2023-01-12
  • Publish Date: 2023-05-20
  • Abstract:   Objective  Pine wilt disease (PWD) is the most serious disease threatening forest ecosystems in China. This study simulates and expresses the driving variables with a geographic grid model and map algebra operations, with the aim of constructing a PWD measurement and forecasting system that gives spatially continuous measurement and prediction at the dual scale of county and geographic grid cell.   Method  First, datasets were formed by integrating geographic raster data on factors that affect PWD dispersal, including topography, meteorology, hosts, human activities and land use. Then, models were built with two machine learning methods, random forest and support vector machine, and the predictions were overlaid with a susceptibility map of Pinaceae plants by map algebra to obtain the infection probability. Finally, the risk level of PWD spread across the country was analysed by geographic grid cell.   Result  (1) The prediction accuracy was 83.95% for the random forest model and 77.97% for the support vector machine model. (2) Altitude, average annual minimum precipitation, average annual precipitation and average annual low temperature were the main factors affecting the occurrence of PWD, with contribution rates to model construction of 0.151, 0.303, 0.258 and 0.194, respectively, whereas human activity variables were the decisive variables affecting the diffusion of PWD, with a contribution rate of 0.194. (3) The potential dispersal areas were low-altitude areas with dense human activity, forest areas adjacent to roads, urban areas and plantation areas, while the highest risk areas were mainly Zhejiang, Jiangxi and Fujian in East China, Guangxi and Guangdong in South China, and Hunan in Central China.   
Conclusion  Using spatial simulation and machine learning methods, a mapping model was constructed to predict the spatial transmission pattern of PWD and the diffusion risk for specific geographic grid cells. It provides a reference for the accurate supervision of forest and grassland disasters and has practical guiding significance for the prevention and control of the PWD epidemic in China. [Ch, 2 fig. 7 tab. 36 ref.]
  • References
    [1] MOTA M M, FUTAI K, VIEIRA P. Pine wilt disease and the pinewood nematode, Bursaphelenchus xylophilus [M]// CIANCIO A, MUKERJI K G. Integrated Management of Fruit Crops Nematodes Vol 4. Dordrecht: Springer, 2009: 253-274.
    [2] VICENTE C, ESPADA M, VIEIRA P, et al. Pine wilt disease: a threat to European forestry [J]. European Journal of Plant Pathology, 2012, 133(1): 89-99.
    [3] SHIN S C. Pine wilt disease in Korea [M]// ZHAO B G, FUTAI K, SUTHERLAND J R, et al. Pine Wilt Disease. Tokyo: Springer, 2008: 26-32.
    [4] National Forestry and Grassland Administration. Five-year Action Plan for Prevention and Control of Pine Wood Disease (2021-2025) [EB/OL]. 2021-07-07[2022-03-07]. http://www.forestry.gov.cn.
    [5] CHENG Hurui, LIN Maosong, LI Weiqiang, et al. Wilt nematode disease on black pine in Nanjing [J]. Forest Pest and Disease, 1983, 2(4): 1-5.
    [6] LI Jishun, PAN Jialiang, LIU Chao, et al. Analysis of the epidemic situation of pine wilt disease in China in 2020 [J]. Forest Pest and Disease, 2021, 40(4): 1-4.
    [7] National Forestry and Grassland Administration. Announcement of National Forestry and Grassland Administration (No. 5, 2020) [EB/OL]. 2021-10-18[2022-03-07].
    [8] YU Zhijun, LI Shuo, ZHOU Yantao, et al. Spatial estimation and prediction of suitable distribution of Bursaphelenchus xylophilus with different warming modes in China [J]. Journal of Northeast Forestry University, 2018, 46(1): 85-91.
    [9] LINIT M J. Nematode-vector relationships in the pine wilt disease system [J]. Journal of Nematology, 1988, 20(2): 227-235.
    [10] IL C W, SONG H J, SOO K D, et al. Dispersal patterns of pine wilt disease in the early stage of its invasion in South Korea [J]. Forests, 2017, 8(11): 411.
    [11] WANG Xinrong, ZHU Xiaowei, HU Yueqing, et al. A PCR-based method for detecting Bursaphelenchus xylophilus from Monochamus alternatus [J]. Scientia Silvae Sinicae, 2009, 45(7): 70-75.
    [12] SHEN Peng, LI Gongquan. Risk assessment of Bursaphelenchus xylophilus in Hubei Province based on ecological niche factor analysis model [J]. Journal of Zhejiang A&F University, 2021, 38(3): 560-566.
    [13] HAN Bing, PIAO Chungen, WANG Laifa, et al. Development status of pinewood nematode disease and its management strategies in China [J]. Chinese Agricultural Science Bulletin, 2007, 23(2): 146-150.
    [14] RYSS A Y, KULINICH O A, SUTHERLAND J R. Pine wilt disease: a short review of worldwide research [J]. Forestry Studies in China, 2011, 13(2): 132-138.
    [15] PAN Hongyang, YE Jianren, WU Xiaoqin. Spatial distribution patterns of pine wilt disease in China [J]. Acta Ecologica Sinica, 2009, 29(8): 4325-4331.
    [16] TU Yegou, YU Ailin, QUE Shengquan, et al. Effects of different host plants on longevity and reproduction of Monochamus alternatus Hope adults [J]. Southwest China Journal of Agricultural Sciences, 2019, 32(8): 1801-1804.
    [17] TU Yegou, LI Yi, YU Ailin, et al. Feeding and oviposition preferences of Monochamus alternatus adults among different host plants [J]. China Plant Protection, 2019, 39(5): 50-52, 57.
    [18] XI Yan, NIU Shukui. The effects of climatic factors on pine wilt disease [J]. Forest Resources Management, 2008(4): 70-76.
    [19] YU Haiying, WU Hao. New host plants and new vector insects for Bursaphelenchus xylophilus found in Liaoning [J]. Forest Pest and Disease, 2018, 37(5): 61.
    [20] ZHANG Chao. Spread Trend of Pine Wilt Disease in China and the Impact of Climate on the Epidemic [D]. Beijing: Beijing Forestry University, 2020.
    [21] LEE D S, CHOI W I, NAM Y, et al. Predicting potential occurrence of pine wilt disease based on environmental factors in South Korea using machine learning algorithms [J/OL]. Ecological Informatics, 2021, 64: 101378[2022-08-09]. doi: 10.1016/J.ECOINF.2021.101378.
    [22] YANG Hongyan, DU Jianmin, RUAN Peiying, et al. Vegetation classification of desert steppe based on Unmanned Aerial Vehicle Remote Sensing and Random Forest [J]. Transactions of the Chinese Society for Agricultural Machinery, 2021, 52(6): 186-194.
    [23] LI Hao, FANG Weiquan, LI Langlang, et al. Recognition of pine wood infected with pine nematode disease based on deep learning [J]. Journal of Forestry Engineering, 2021, 6(6): 142-147.
    [24] YANG Baojun, WANG Qiuli, ZOU Weidong, et al. The resistance of species to pine wood nematode, Bursaphelenchus xylophilus [J]. Acta Phytopathologica Sinica, 1987, 17(4): 211-214.
    [25] YAN Yunfeng. Research and Application of Regression Modeling Methods Based on Decision Forests [D]. Hangzhou: Zhejiang University, 2019.
    [26] LIANG Huiling, LIN Yurui, YANG Guang, et al. Application of Random Forest Algorithm on the forest fire prediction in Tahe area based on meteorological factors [J]. Scientia Silvae Sinicae, 2016, 52(1): 89-98.
    [27] HE Yun, HUANG Chong, LI He, et al. Land-cover classification of random forest based on Sentinel-2A image feature optimization [J]. Resources Science, 2019, 41(5): 992-1001.
    [28] ZHOU Yun. Multi-scale Spatialization of Urban Population Based on Random Forest Algorithm [D]. Chongqing: Southwest University, 2021.
    [29] YIN Hua, HU Yuping. An imbalanced feature selection algorithm based on Random Forest [J]. Acta Scientiarum Naturalium Universitatis Sunyatseni, 2014, 53(5): 59-65.
    [30] VAPNIK V N. The Nature of Statistical Learning Theory [M]. New York: Springer, 1996.
    [31] LIU Fangyuan, WANG Shuihua, ZHANG Yidong. Overview on models and applications of Support Vector Machine [J]. Computer Systems & Applications, 2018, 27(4): 1-9.
    [32] WANG Yanguang, ZHU Hongbin, XU Weichao. A review on ROC curve and analysis [J]. Journal of Guangdong University of Technology, 2021, 38(1): 46-53.
    [33] LIU Huihe, XU Weichao, LIU Shun. Dimension reduction method applying in three-class ROC analysis based on SVM [J]. Computer and Modernization, 2016(7): 49-54.
    [34] ZHANG Liping. Automatic Synthesis Study of Map Graphics Based on Raster Patterns [D]. Wuhan: Wuhan University, 2009.
    [35] YE Jianren. Epidemic status of pine wilt disease in China and its prevention and control techniques and counter measures [J]. Scientia Silvae Sinicae, 2019, 55(9): 1-10.
    [36] YE Jiangxia, WANG Jingwen, ZHANG Mingsha, et al. Risk pattern analysis of Hyphantria cunea based on spatial matrix model and 0-1 measure [J]. Scientia Silvae Sinicae, 2021, 57(1): 140-152.
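  • Pine wilt disease (PWD), caused by the pinewood nematode Bursaphelenchus xylophilus, is currently one of the most destructive diseases of pine trees. Since it was first reported in Japan in 1905, epidemic areas have appeared successively in other Asian and European countries[1-2]. In China, PWD was first reported in 1982 on Japanese black pine (Pinus thunbergii) at the Sun Yat-sen Mausoleum in Nanjing, after which the disease spread rapidly. According to statistics from the National Forestry and Grassland Administration, China now has 726 county-level epidemic areas and 5 479 township-level epidemic sites covering 1.809 million hm2, including 206 lightly affected epidemic areas (counting areas with new outbreaks in early 2021) and 518 severely affected epidemic areas[3-4]. PWD has become one of the most serious forest diseases in China[5-6]. To achieve the goal of "early detection, early reporting, early treatment and early eradication" of new outbreaks set out in the Five-year Action Plan for Prevention and Control of Pine Wilt Disease (2021-2025), and to meet the overall target of controlling new increases, it is necessary to grasp the overall epidemic trend, locate and quantify the epidemic in each area precisely, and make prevention more targeted[7].

    Previous studies have found that, under global warming, rising temperatures affect the geographic distribution of PWD[8]; that human activity is an important driver of its long-distance, jump-like, patchy spread[9-11]; that average annual temperature and average annual precipitation are the main climatic factors governing establishment of the pinewood nematode[12]; that the occurrence of PWD in China is strongly affected by the distribution of host plants[13-14]; and that the host preferences of vector insects also affect its occurrence[15-17]. Current research on PWD spread risk mostly applies traditional geostatistical methods to analyse risk patterns at provincial and municipal scales[18-20], whereas machine learning has rarely been applied to disease risk forecasting. Among machine learning methods, random forest and support vector machine require relatively few samples, can use many variables simultaneously for nonlinear regression and classification, and give high accuracy, which makes precise risk prediction possible[21-23].

    This study considers the factors influencing PWD spread together and builds a geographic raster dataset from the geographic environment, the climatic environment, satellite remote sensing and the road traffic network. Random forest and support vector machine are compared for building a PWD spread risk model, in order to explore the feasibility and accuracy of machine learning algorithms for predicting PWD spread and to find the best prediction scheme. Potential epidemic areas are divided into risk zones by level and analysed on a raster basis, bringing national PWD forecasting down to the individual hill and plot, with the aim of providing an effective means for dynamic monitoring of Pinaceae plants in forest ecosystems.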

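  • Taking China's counties and districts as administrative units, and based on the National Forestry and Grassland Administration's announcements of PWD epidemic areas for 2018-2021[4], 731 counties with PWD and 2 131 counties without PWD were compiled. Occurrence (1) or non-occurrence (0) of PWD was used as the model response variable, and 2 862 sample points were constructed by extending from points to areas. Of these, 30% of the sample points (954) were randomly selected as independent validation samples. The distribution of all sample points is shown in Table 1.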

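| Provincial-level region | Counties | Epidemic counties | Share/% |
|---|---|---|---|
| Chongqing | 38 | 35 | 92.1 |
| Jiangxi | 100 | 85 | 85.0 |
| Zhejiang | 89 | 71 | 79.8 |
| Hubei | 103 | 77 | 74.8 |
| Fujian | 85 | 57 | 67.1 |
| Hunan | 122 | 76 | 62.3 |
| Guangdong | 125 | 76 | 60.8 |
| Anhui | 105 | 50 | 47.6 |
| Guangxi | 111 | 39 | 35.1 |
| Jiangsu | 97 | 23 | 23.7 |
| Shaanxi | 107 | 25 | 23.4 |
| Sichuan | 183 | 42 | 23.0 |
| Liaoning | 100 | 20 | 20.0 |
| Guizhou | 88 | 16 | 18.2 |
| Shandong | 137 | 24 | 17.5 |
| Tianjin | 16 | 1 | 6.2 |
| Henan | 158 | 9 | 5.7 |
| Jilin | 60 | 2 | 3.3 |
| Yunnan | 129 | 2 | 1.5 |
| Gansu | 87 | 1 | 1.1 |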

    Table 1.  Pine wilt disease endemic area

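  • Following previous studies[8, 12], monthly meteorological data for the 30 years from 1970 to 2000 were used to derive average annual temperature, average annual precipitation, average annual maximum temperature, average annual minimum temperature, average annual minimum precipitation and average annual maximum precipitation. The climate dataset, with a spatial resolution of 250 m, was obtained from https://www.worldclim.org/.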

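  • Geographic factors affecting the insect vector, Monochamus, were selected, including altitude and the slope and aspect index derived from the elevation data. Altitude, slope, vegetation cover and temperature were used to build a resistance surface representing the capacity for natural spread. Based on transport networks comprising national expressways, national roads, provincial roads and first-order river systems, Euclidean distance was used to simulate a measure of the human-mediated spatial spread of the pest[9]. Vector data on human influence, such as roads and river systems, were obtained from the Bigemap platform and processed in ArcGIS into a composite transport accessibility dataset; land-use type data were obtained from the GLOBELAND 30 platform.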
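The Euclidean distance step can be illustrated on a toy grid. The sketch below is not the ArcGIS workflow used in the study: it is a brute-force, pure-Python stand-in, and the function names, the 1 000 m cut-off and the linear rescaling of distance into a 0-1 influence score are all assumptions made for illustration.

```python
import math

def euclidean_distance_raster(source_mask, cell_size=250.0):
    """Distance in metres from each cell centre to the nearest source cell.

    source_mask: 2D list where truthy cells mark roads or rivers.
    cell_size: raster resolution in metres (250 m, as in the text).
    """
    sources = [(r, c) for r, row in enumerate(source_mask)
               for c, value in enumerate(row) if value]
    if not sources:
        raise ValueError("source_mask contains no source cells")
    return [[cell_size * min(math.hypot(r - sr, c - sc) for sr, sc in sources)
             for c in range(len(row))]
            for r, row in enumerate(source_mask)]

def influence(distance, max_distance):
    """Hypothetical linear rescaling: 1 on the road, 0 at max_distance or beyond."""
    return max(0.0, 1.0 - distance / max_distance)

# Toy raster: one road running along the second row.
roads = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
dist = euclidean_distance_raster(roads)
score = [[influence(d, 1000.0) for d in row] for row in dist]
```

For national rasters a brute-force search is far too slow; a dedicated distance transform would be used instead, but the quantity computed per cell is the same.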

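  • National vegetation distribution data at a scale of 1∶1 000 000 were obtained from the Resource and Environment Data Cloud Platform (http://www.Resdc.cn/), from which coniferous forest vegetation susceptible to PWD was extracted. Following earlier findings[24], each tree species was assigned a host susceptibility index from 0 to 100, with higher values indicating higher susceptibility. The grading criteria are given in Table 2.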

| Family and genus | Vegetation type | Susceptibility index |
|---|---|---|
| Pinaceae, Pinus | Pinus densiflora, P. thunbergii, P. massoniana, P. kesiya var. langbianensis | 100 |
| Pinaceae, Pinus | Pinus yunnanensis, P. armandii, P. tabuliformis, P. densata, P. taiwanensis | 80 |
| Pinaceae, Pinus | Pinus bungeana, P. henryi, P. palustris, P. wallichiana, P. fenzeliana, P. pumila, P. sylvestris var. mongolica, P. sibirica | 70 |
| Pinaceae, Larix | Larix gmelinii, L. sibirica, L. olgensis, L. kaempferi, L. gmelinii var. principis-rupprechtii, L. potaninii var. chinensis, L. potaninii, L. var. macrocarpa | 60 |
| Pinaceae, Abies | Abies firma, A. georgei, A. forrestii, A. nephrolepis, A. kawakamii, A. fargesii, A. recurvata, A. squamata, A. delavayi, A. delavayi var. motuoensis, A. spectabilis | 50 |
| Cupressaceae | Cupressus funebris, Cunninghamia lanceolata, Platycladus orientalis, Juniperus rigida, J. przewalskii, J. komarovii, J. morrisonicola, J. tibetica, J. saltuaria, J. convallium | 40 |
| Pinaceae, Picea and Tsuga | Picea asperata, P. jezoensis, P. meyeri, P. wilsonii, P. crassifolia, P. schrenkiana, P. purpurea, P. brachytyla, P. likiangensis, P. likiangensis var. rubescens, P. linzhiensis, P. morrisonicola, Tsuga formosana, T. dumosa | 20 |
| Mixed coniferous and broad-leaved forest | | 10 |
| Other non-vegetation types | | 0 |

    Table 2.  Quantification of pine wilt disease hosts
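The susceptibility grading in Table 2 amounts to a lookup that reclassifies a vegetation-class raster into index values. A minimal sketch follows; the integer class codes 0-8 are hypothetical placeholders, since the legend codes of the vegetation map are not given in the text.

```python
# Susceptibility index per host class, following Table 2.
# The integer keys are hypothetical class codes, not the vegetation map's real legend.
HOST_INDEX = {
    1: 100,  # Pinus: P. densiflora, P. thunbergii, P. massoniana, P. kesiya var. langbianensis
    2: 80,   # Pinus: P. yunnanensis, P. armandii and others
    3: 70,   # Pinus: P. bungeana, P. henryi and others
    4: 60,   # Larix
    5: 50,   # Abies
    6: 40,   # Cupressaceae
    7: 20,   # Picea and Tsuga
    8: 10,   # mixed coniferous and broad-leaved forest
    0: 0,    # other non-vegetation types
}

def reclassify(class_raster, table=HOST_INDEX, default=0):
    """Map every class code in a 2D list to its susceptibility index (0-100)."""
    return [[table.get(code, default) for code in row] for row in class_raster]

# Unknown codes (here 99) fall back to the default of 0.
host_index = reclassify([[1, 4], [8, 99]])
```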

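    In summary, 16 environmental variables were selected as the main environmental data (Table 3). All environmental datasets were resampled to a common spatial resolution of 250 m. The geographic base data are 1∶3 200 000 administrative maps of China's national, provincial and county boundaries from the Standard Map Service of the Ministry of Natural Resources (http://bzdt.ch.mnr.gov.cn/).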

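| Variable type | Code | Variable name |
|---|---|---|
| Bioclimatic variables | Bio1 | Average annual temperature |
| | Bio2 | Average annual high temperature |
| | Bio3 | Average annual low temperature |
| | Bio4 | Average annual precipitation |
| | Bio5 | Average annual maximum precipitation |
| | Bio6 | Average annual minimum precipitation |
| Biological factors | Hst | Host distribution |
| | LU | Land use |
| Human-mediated dispersal factors | Wspct | River system influence |
| | Hwpct | Expressway influence |
| | Nrpct | National road influence |
| | Pvrpct | Provincial road influence |
| | Rawpct | Railway influence |
| Geographic environment factors | Ele | Altitude |
| | Slp | Slope |
| | Asp | Aspect index |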

    Table 3.  Independent variables of the prediction model of pine wilt disease

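    • Random forest (Random Forest Classifier, RF) is a supervised machine learning algorithm that combines multiple decision tree classifiers to classify data or perform regression[25-27]. Drawing samples at random from the original data and building each training set with fewer features than the full set increase the diversity of the individual trees' results, offset overfitting and improve the model's generalization, giving it high classification accuracy[28-29]. The support vector machine (Support Vector Machine, SVM) algorithm addresses problems in machine learning such as small sample sizes, nonlinearity, and the "curse of dimensionality" and "overfitting" in high-dimensional pattern recognition[30-31]. The method selects a suitable kernel function, maps the input space to a high-dimensional space through a nonlinear transformation, and then finds the optimal separating hyperplane in that space, establishing a nonlinear relationship between the input and output variables.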

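    • Receiver operating characteristic (ROC) curve analysis with the area under the curve (AUC) is used to evaluate binary classifiers[32]. The ROC curve reflects the accuracy of the output probabilities; each point on the curve reflects the response to the same signal stimulus under a different decision threshold. The horizontal axis is the false positive rate (FPR, equal to 1 - specificity) and the vertical axis is the true positive rate (TPR, or sensitivity); both range over [0, 1]. An ROC curve below the diagonal indicates a classifier that performs worse than a random classifier, and a curve above it indicates one that performs better[33]. The area under the ROC curve (AUC) gives a single number for judging classifier quality directly.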
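The construction of the ROC curve and its area can be made concrete with a short pure-Python sketch. The labels and scores below are made-up illustrative values, not data from the study, and the function names are chosen here for illustration.

```python
def roc_curve(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Illustrative labels (1 = PWD present) and predicted probabilities.
y = [1, 1, 0, 1, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_curve(y, s))  # 8 of the 9 positive-negative pairs are ranked correctly
```

An AUC of 0.5 corresponds to the diagonal (a random classifier) and 1.0 to a perfect ranking, which matches the interpretation given above.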

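    • Both the RF and SVM models use the presence (1) or absence (0) of PWD at the county scale nationwide as the response variable. All environmental variables were normalized before input to remove differences caused by units of measurement. The dataset was randomly split 7∶3 into two parts for model training and testing. During modelling, some low-contribution variables among the 16 environmental variables were screened out according to their importance and influence in the model, and the coniferous forest distribution variable was set aside as a key variable for overlay analysis with the predicted risk areas. Prediction accuracy and the ROC-AUC curve were used to evaluate each model, and the predictions were checked against the independent validation samples using a confusion matrix. The outputs were converted in Python to occurrence probabilities between 0 and 1 and imported into ArcGIS for risk visualization. In addition, a national raster scale better reflects continuous spatial variation and is better suited than county-level vector data to computation and overlay analysis[34]. Host Pinaceae plants extracted from the national 1∶1 000 000 vegetation distribution types were reclassified by susceptibility, rasterized, and overlaid with the potential PWD occurrence areas predicted by the model.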
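The modelling steps described above (normalization, a 7∶3 random split, RF and SVM classifiers, probability output) can be sketched with scikit-learn. This is an illustration under stated assumptions rather than the authors' code: the data are synthetic, and the number of trees, the RBF kernel and the choice to fit the scaler on the training portion only are all assumptions not stated in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for the county samples: 300 rows x 16 environmental variables.
X = rng.normal(size=(300, 16))
# Hypothetical rule so that presence/absence depends on the first two variables.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(int)

# 7:3 random split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Min-max normalization; fitting on the training part only is an assumption.
scaler = MinMaxScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),  # assumed settings
    "SVM": SVC(kernel="rbf", probability=True, random_state=0),      # assumed kernel
}
results = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    prob = model.predict_proba(X_test_s)[:, 1]  # occurrence probability in [0, 1]
    pred = (prob >= 0.5).astype(int)            # 0.5 cut-off, as in the text
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "auc": roc_auc_score(y_test, prob),
        "prob": prob,
    }

# Variable contributions of the RF model (they sum to 1).
importances = models["RF"].feature_importances_
```

The `feature_importances_` array is the kind of quantity a contribution ranking of variables would be read from; which importance measure the study used is not specified.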

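    • The model accuracy and ROC-AUC test results (Table 4) show that both models predict well, with prediction accuracy and model stability both above 75%. Model accuracy was assessed by using the RF and SVM models to predict the probability of PWD occurrence for the independent validation samples. With a cut-off of 0.5, counties above the cut-off were classed as PWD "present" (y=1) and those below as "absent" (y=0), and the classification was compared with the observed values of the validation samples. The overall test-set accuracy was 83.95% for the RF model and 77.97% for the SVM model.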

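| Observed y | RF predicted y=0 | RF predicted y=1 | RF correct/% | RF overall/% | SVM predicted y=0 | SVM predicted y=1 | SVM correct/% | SVM overall/% |
|---|---|---|---|---|---|---|---|---|
| 0 | 384 | 59 | 86.68 | 83.95 | 360 | 106 | 77.25 | 77.97 |
| 1 | 96 | 415 | 81.21 | | 104 | 384 | 78.68 | |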

      Table 4.  Model accuracy verification data sheet
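The percentages in Table 4 can be recomputed from the four counts of each confusion matrix. The sketch below does this for both models; the function name and dictionary keys are chosen here for illustration. Arithmetically, the reported overall figures (83.95% and 77.97%) coincide with the unweighted mean of the two per-class rates, while the share of all 954 validation samples classified correctly is slightly different. This is an observation from the numbers, not a statement made by the authors.

```python
def class_rates(tn, fp, fn, tp):
    """Per-class and overall percentages from a 2x2 confusion matrix."""
    rate0 = tn / (tn + fp)  # observed 0 predicted as 0
    rate1 = tp / (tp + fn)  # observed 1 predicted as 1
    return {
        "class0_pct": 100 * rate0,
        "class1_pct": 100 * rate1,
        "mean_class_pct": 100 * (rate0 + rate1) / 2,
        "overall_pct": 100 * (tn + tp) / (tn + fp + fn + tp),
    }

# Counts taken from Table 4.
rf = class_rates(tn=384, fp=59, fn=96, tp=415)
svm = class_rates(tn=360, fp=106, fn=104, tp=384)

for name, r in (("RF", rf), ("SVM", svm)):
    print(name, {k: round(v, 2) for k, v in r.items()})
```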

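      The ROC-AUC values of the RF and SVM models (Fig. 1) were 0.89 and 0.77, respectively (mean standard error 0.01). This shows that both models are logically feasible for predicting the county-level risk distribution of PWD and that their predictions are credible, but RF outperforms SVM in effectiveness, model stability and generalization of the classification results.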

      Figure 1.  ROC-AUC curve

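    • According to the variable contributions in the RF model (Fig. 2), climatic variables rank highest in importance, including average annual minimum precipitation (0.302 59), average annual precipitation (0.170 33) and average annual low temperature (0.102 98), followed by altitude (0.150 84); these two categories are the important environmental variables affecting PWD. The probability of PWD occurrence is higher at low altitudes and increases as the average annual low temperature rises, while hot, dry, low-humidity weather directly affects PWD outbreaks. Among transmission pathways, the influence of the transport network at all levels stands out among human factors (0.067 58), indicating that dense flows of people and goods are an important medium for long-distance spread of PWD. The contribution of the Pinaceae vegetation type data in the importance ranking did not match its actual effect on PWD; because hosts are a necessary condition for the existence and development of PWD, the coniferous forest distribution was extracted separately, rasterized, and overlaid with the predicted risk areas to analyse its influence.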

      Figure 2.  Contribution of RF model variables

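    • Python was used to extract the predictions as probabilities between 0 and 1, which were divided by degree of dispersion into five levels, from high to low: very high, high, medium, low and very low risk. Because of sample limitations, Hong Kong, Macao and Taiwan of China are not included. Comparing the RF and SVM results (Tables 5 and 6), the SVM's blurred representation of important variables led to small differences between predicted levels, no clear key epidemic areas, low prediction accuracy and most of the country falling in a single level, which makes monitoring and control of PWD epidemic areas more difficult. By contrast, the RF predictions are more accurate: the potential epidemic areas coincide closely with existing ones, the risk levels agree, and the influence of cities, roads and terrain on the distribution of potential epidemic areas is clearly expressed.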
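Grading probabilities into five risk levels and tabulating the share of counties per level can be sketched as follows. The text says the levels were set by degree of dispersion but does not give the break values, so the equal-width breaks used here (0.2, 0.4, 0.6, 0.8) are purely an assumption for illustration, as are the function names.

```python
from bisect import bisect_right

LEVELS = ["very low", "low", "medium", "high", "very high"]
BREAKS = [0.2, 0.4, 0.6, 0.8]  # hypothetical class breaks, not taken from the study

def risk_level(p, breaks=BREAKS):
    """Return the risk level for an occurrence probability p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return LEVELS[bisect_right(breaks, p)]

def level_shares(probs):
    """Percentage of counties falling in each risk level."""
    counts = {level: 0 for level in LEVELS}
    for p in probs:
        counts[risk_level(p)] += 1
    return {level: 100 * n / len(probs) for level, n in counts.items()}

# Illustrative county probabilities for one provincial-level region.
shares = level_shares([0.9, 0.85, 0.5, 0.1])
```

Applying `level_shares` to the counties of each provincial-level region would give one row of a table in the layout of Tables 5 and 6.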

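| Provincial-level region | Counties | Very high/% | High/% | Medium/% | Low/% | Very low/% |
|---|---|---|---|---|---|---|
| Beijing | 16 | 12.50 | 18.75 | 12.50 | 25.00 | 31.25 |
| Tianjin | 16 | 0 | 6.25 | 25.00 | 43.75 | 25.00 |
| Hebei | 168 | 4.76 | 7.74 | 27.38 | 29.76 | 30.36 |
| Shanxi | 117 | 0 | 0 | 0.85 | 3.42 | 95.73 |
| Liaoning | 100 | 8.00 | 37.00 | 23.00 | 25.00 | 7.00 |
| Jilin | 60 | 0 | 10.00 | 41.67 | 35.00 | 13.33 |
| Heilongjiang | 133 | 0 | 6.77 | 10.52 | 36.09 | 46.62 |
| Shanghai | 16 | 12.50 | 25.00 | 18.75 | 6.25 | 37.50 |
| Jiangsu | 97 | 27.84 | 38.14 | 10.31 | 3.09 | 20.62 |
| Zhejiang | 89 | 74.16 | 20.22 | 2.25 | 1.12 | 2.25 |
| Anhui | 105 | 67.62 | 16.19 | 8.57 | 0 | 7.62 |
| Fujian | 85 | 63.53 | 21.18 | 11.76 | 2.35 | 1.18 |
| Jiangxi | 100 | 95.00 | 5.00 | 0 | 0 | 0 |
| Shandong | 137 | 27.00 | 18.98 | 23.36 | 22.63 | 8.03 |
| Henan | 158 | 13.93 | 37.97 | 22.78 | 13.93 | 11.39 |
| Hubei | 103 | 65.05 | 17.48 | 10.68 | 2.91 | 3.88 |
| Hunan | 122 | 82.79 | 13.93 | 2.46 | 0.82 | 0 |
| Guangdong | 125 | 82.40 | 5.60 | 1.60 | 0.80 | 9.60 |
| Hainan | 28 | 50.00 | 21.42 | 14.29 | 0 | 14.29 |
| Chongqing | 38 | 36.84 | 39.48 | 18.42 | 5.26 | 0 |
| Sichuan | 183 | 7.10 | 13.66 | 20.22 | 14.76 | 44.26 |
| Guizhou | 88 | 6.82 | 13.64 | 30.68 | 40.91 | 7.95 |
| Yunnan | 129 | 0 | 0.78 | 12.40 | 27.13 | 59.69 |
| Shaanxi | 107 | 0 | 7.48 | 10.28 | 23.36 | 58.88 |
| Gansu | 87 | 0 | 0 | 0 | 5.75 | 94.25 |
| Qinghai | 45 | 0 | 0 | 0 | 6.67 | 93.33 |
| Ningxia | 22 | 0 | 0 | 0 | 4.55 | 95.45 |
| Xinjiang | 105 | 0 | 0 | 0 | 100.00 | 0 |
| Inner Mongolia | 103 | 0 | 0 | 0 | 13.59 | 86.41 |
| Guangxi | 111 | 78.38 | 9.91 | 8.11 | 2.70 | 0.90 |
| Tibet | 74 | 0 | 0 | 0 | 0 | 100.00 |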

      Table 5.  Potential risk levels of pine wilt disease in counties based on RF

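| Provincial-level region | Counties | Very high/% | High/% | Medium/% | Low/% | Very low/% |
|---|---|---|---|---|---|---|
| Beijing | 16 | 0 | 0 | 56.00 | 44.00 | 0 |
| Tianjin | 16 | 0 | 0 | 13.00 | 68.00 | 19.00 |
| Hebei | 168 | 0 | 0.60 | 48.80 | 44.05 | 6.55 |
| Shanxi | 117 | 0 | 0 | 13.68 | 86.32 | 0 |
| Liaoning | 100 | 2.00 | 12.00 | 72.00 | 14.00 | 0 |
| Jilin | 60 | 0 | 0 | 78.00 | 22.00 | 0 |
| Heilongjiang | 133 | 0 | 2.26 | 51.88 | 45.86 | 0 |
| Shanghai | 16 | 0 | 0 | 81.25 | 18.75 | 0 |
| Jiangsu | 97 | 0 | 7.22 | 82.47 | 10.31 | 0 |
| Zhejiang | 89 | 10.11 | 13.48 | 68.54 | 7.87 | 0 |
| Anhui | 105 | 5.71 | 18.10 | 62.86 | 13.33 | 0 |
| Fujian | 85 | 7.06 | 15.29 | 63.53 | 14.12 | 0 |
| Jiangxi | 100 | 10.00 | 30.00 | 48.00 | 12.00 | 0 |
| Shandong | 137 | 0 | 5.11 | 61.31 | 31.39 | 2.19 |
| Henan | 158 | 0 | 6.33 | 72.78 | 20.89 | 0 |
| Hubei | 103 | 0.97 | 22.33 | 63.11 | 13.59 | 0 |
| Hunan | 122 | 7.38 | 20.49 | 50.82 | 21.31 | 0 |
| Guangdong | 125 | 8.00 | 28.80 | 48.80 | 13.60 | 0.80 |
| Hainan | 28 | 3.57 | 14.29 | 42.85 | 25.00 | 14.29 |
| Chongqing | 38 | 2.63 | 2.63 | 94.74 | 0 | 0 |
| Sichuan | 183 | 0 | 1.09 | 50.82 | 45.36 | 2.73 |
| Guizhou | 88 | 0 | 0 | 40.91 | 59.09 | 0 |
| Yunnan | 129 | 0 | 0 | 19.38 | 71.32 | 9.30 |
| Shaanxi | 107 | 0 | 0 | 32.71 | 66.36 | 0.93 |
| Gansu | 87 | 0 | 0 | 9.19 | 83.91 | 6.90 |
| Qinghai | 45 | 0 | 0 | 4.44 | 75.56 | 20.00 |
| Ningxia | 22 | 0 | 0 | 9.09 | 81.82 | 9.09 |
| Xinjiang | 105 | 0 | 0 | 18.00 | 80.00 | 2.00 |
| Inner Mongolia | 103 | 0 | 0.97 | 32.04 | 66.02 | 0.97 |
| Guangxi | 111 | 7.21 | 11.71 | 64.87 | 14.41 | 1.80 |
| Tibet | 74 | 0 | 0 | 0 | 100 | 0 |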

      Table 6.  Potential risk levels of pine wilt disease in counties based on SVM

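    • Host plants were graded by susceptibility. PWD-susceptible Pinaceae plants are mostly distributed in Zhejiang, Fujian, Hunan and Guangxi, with large host populations also in Yunnan, Guizhou, Sichuan and the northeastern forest region. The RF model was overlaid with the host susceptibility distribution to delineate and analyse epidemic areas at the geographic raster scale for 31 provincial-level regions; raster cells with a probability of 0.8 or higher were classed as extremely susceptible. The results (Table 7) show the following. The very high risk areas are Zhejiang, Jiangxi and Fujian in East China, Guangxi and Guangdong in South China, and Hunan in Central China; the distribution of highly susceptible hosts coincides closely with the predicted very high risk areas, where the hosts are mostly P. densiflora, P. thunbergii and P. massoniana. The high risk areas include Anhui in East China, Hubei in Central China, and Chongqing and Guizhou in Southwest China; hosts there are also highly susceptible but scattered and mostly mixed with other vegetation, and forest types are diverse. The medium risk areas include Henan and Jiangsu in Central China, Liaoning in Northeast China and Shandong in East China; host susceptibility is lower, mostly moderate, and hosts cover small, scattered areas, and many of these provinces are major grain producers dominated by crops, with little forestry. The no-risk areas include Gansu, Ningxia, Qinghai and Xinjiang in Northwest China, Inner Mongolia and Shanxi in North China, and Tibet in Southwest China, which lack host plants and have less movement of people and goods than other regions, so potential PWD risk areas are unlikely to form. The remaining provinces are classed as low risk; notably, most of Yunnan and Sichuan have dominant, highly susceptible species such as P. kesiya var. langbianensis and P. yunnanensis, so their risk level could easily rise.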

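| Provincial-level region | 1000 m grid cells | Mean | Standard deviation | Coefficient of variation/% | Cells with predicted value ≥0.8 | Share of extremely susceptible area/% |
|---|---|---|---|---|---|---|
| Fujian | 132 413 | 0.72 | 0.25 | 34.72 | 62 349 | 47.09 |
| Zhejiang | 113 295 | 0.64 | 0.25 | 39.06 | 37 881 | 33.44 |
| Hunan | 220 312 | 0.62 | 0.23 | 37.10 | 61 780 | 28.04 |
| Guangxi | 242 297 | 0.52 | 0.23 | 44.23 | 46 207 | 19.07 |
| Jiangxi | 178 486 | 0.58 | 0.19 | 32.76 | 30 787 | 17.25 |
| Guangdong | 186 705 | 0.55 | 0.18 | 32.73 | 27 318 | 14.63 |
| Guizhou | 178 808 | 0.40 | 0.22 | 55.00 | 18 343 | 10.26 |
| Anhui | 150 885 | 0.45 | 0.20 | 44.44 | 14 622 | 9.69 |
| Hubei | 193 550 | 0.47 | 0.16 | 34.04 | 16 639 | 8.60 |
| Chongqing | 83 845 | 0.43 | 0.15 | 34.88 | 5 520 | 6.58 |
| Shandong | 166 885 | 0.2 | 0.19 | 95.00 | 4 726 | 2.83 |
| Henan | 172 746 | 0.24 | 0.19 | 79.17 | 3 699 | 2.14 |
| Jiangsu | 110 980 | 0.31 | 0.11 | 35.48 | 1 358 | 1.22 |
| Sichuan | 488 113 | 0.16 | 0.16 | 100.00 | 2 394 | 0.49 |
| Shaanxi | 209 615 | 0.13 | 0.17 | 130.77 | 648 | 0.31 |
| Liaoning | 159 784 | 0.27 | 0.15 | 55.56 | 438 | 0.27 |
| Yunnan | 383 006 | 0.16 | 0.16 | 100.00 | 35 | 0.01 |
| Jilin | 214 242 | 0.18 | 0.11 | 61.11 | 0 | 0 |
| Heilongjiang | 505 985 | 0.12 | 0.09 | 75.00 | 0 | 0 |
| Shanghai | 7 464 | 0.24 | 0.04 | 16.67 | 0 | 0 |
| Hebei | 197 881 | 0.04 | 0.06 | 150.00 | 0 | 0 |
| Hainan | 35 107 | 0.39 | 0.06 | 15.38 | 0 | 0 |
| Beijing | 17 322 | 0.05 | 0.03 | 60.00 | 0 | 0 |
| Tianjin | 12 079 | 0.06 | 0.03 | 50.00 | 0 | 0 |
| Gansu | 427 332 | 0.02 | 0.06 | 300.00 | 0 | 0 |
| Qinghai | 699 262 | 0 | 0.02 | 0 | 0 | 0 |
| Ningxia | 52 440 | 0 | 0.01 | 0 | 0 | 0 |
| Xinjiang | 1 693 157 | 0 | 0.02 | 0 | 0 | 0 |
| Inner Mongolia | 1 190 105 | 0.03 | 0.08 | 266.67 | 0 | 0 |
| Shanxi | 162 173 | 0.03 | 0.09 | 300.00 | 0 | 0 |
| Tibet | 1 237 077 | 0.02 | 0.07 | 350.00 | 0 | 0 |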

      Table 7.  Prediction of the potential risk of geographic grids for RF and host susceptibility to pine wilt disease
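The statistics in Table 7 follow from simple raster operations: overlay the RF probability with the host susceptibility index, then summarise the cells of each region. The text does not give the exact overlay formula, so the cell-wise product with the index rescaled to 0-1 is an assumption made here for illustration, as is the use of the population standard deviation; function names are hypothetical.

```python
import statistics

def overlay(prob_raster, host_index_raster):
    """Assumed overlay: RF probability times host susceptibility rescaled to 0-1."""
    return [[p * h / 100.0 for p, h in zip(prow, hrow)]
            for prow, hrow in zip(prob_raster, host_index_raster)]

def zonal_summary(cells, threshold=0.8):
    """Mean, standard deviation, coefficient of variation and share of cells >= threshold."""
    mean = statistics.fmean(cells)
    sd = statistics.pstdev(cells)
    n_high = sum(1 for v in cells if v >= threshold)
    return {
        "n": len(cells),
        "mean": mean,
        "sd": sd,
        "cv_pct": 100 * sd / mean if mean else float("nan"),
        "n_high": n_high,
        "high_pct": 100 * n_high / len(cells),
    }

# Toy 2x2 rasters: RF probabilities and susceptibility indices (0-100).
prob = [[0.9, 0.9], [0.5, 0.2]]
host = [[100, 80], [100, 0]]
combined = overlay(prob, host)
summary = zonal_summary([v for row in combined for v in row])
```

The same two ratios reproduce the Fujian row of Table 7: 0.25 / 0.72 gives a coefficient of variation of 34.72%, and 62 349 / 132 413 gives a share of 47.09%.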

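    • Compared with traditional geostatistical methods at the county administrative-unit scale, describing the driving variables with geographic raster information and forecasting risk with an RF model expresses the forecasting variables as numerical matrices and resolves the difficulty of describing the spatial heterogeneity and continuity of forecast results. National raster-scale PWD risk forecasting better reflects the influence of human activity, changes in the geographic environment and host susceptibility on the spread of PWD, and guides and supports quarantine, prevention and control work by local units.

      Precipitation, temperature and altitude are the main factors affecting the survival of PWD. Statistical analysis gave an average annual minimum precipitation range for PWD survival of 110-121 mm. The average annual low temperature range affecting PWD occurrence was 10-13 ℃, broadly consistent with YE Jianren[35], who classed areas with an average annual temperature of 10-14 ℃ as areas where PWD can occur and areas above 14 ℃ as epidemic areas. The altitude range affecting PWD occurrence was 4-247 m, broadly consistent with the optimum altitude of below 200 m predicted by LEE et al.[21]. Regarding spread, in addition to natural spread by vector insects, human activity also has an important influence on the diffusion of PWD. Early PWD occurrences were mostly in areas of active urbanization and intense human activity, because the expansion of urban and road land and the replacement of native vegetation by plantations reduced ecological stability[36], creating a serious latent risk of PWD outbreak and spread.

      Analysis of the national raster-scale spread risk pattern shows that PWD continues to spread across the North China Plain and the Middle-Lower Yangtze Plain, and that there are three transmission corridors. The first follows the Yangtze River system toward Southwest China and the Sichuan Basin, where Chongqing, with favourable water and heat conditions and high urbanization, currently has the highest potential risk level. The second follows coastal cities along the Yellow Sea and Bohai Sea toward the Northeast China Plain; in this corridor Beijing and Tianjin have prominent potential risk levels, possibly because of high urbanization and dense transport networks. The third follows the Jinsha River and Pearl River basins, forming a route from Guangdong, Hainan and Guangxi toward the Yunnan-Guizhou Plateau; although Southwest China is at higher altitude, its abundant Pinaceae plants still provide a basis for the survival and spread of PWD.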

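    • The RF and SVM models built on multiple data types both performed well (ROC-AUC>75%). The RF model was more accurate, at 83.95%, and its predicted very high risk areas coincided closely with the distribution of highly susceptible hosts, showing that the RF classification better matches reality.

      The variables to which PWD occurrence is most sensitive are average annual precipitation, average annual low temperature and altitude, while gentle slopes and intense human activity are important factors in its rapid spread. Potential spread areas are low-altitude areas with dense human activity, forest areas with good road access, urban and town areas, and plantation areas. The very high risk areas are mainly Zhejiang, Jiangxi and Fujian in East China, Guangxi and Guangdong in South China, and Hunan in Central China. The spread of the pinewood nematode shows clear transmission corridors and geographic barriers. Its spread across the North China Plain is blocked by the Qinling and Taihang Mountains and it has not invaded the Loess Plateau in Northwest China, while high risk epidemic areas are concentrated south of the Yellow River system.

      This study used only two machine learning algorithms, RF and SVM; the sample data were limited to the National Forestry and Grassland Administration's PWD epidemic area data for 2018-2021; and changes in PWD risk caused by a changing environment were not considered. Future research should be based on the actual locations of diseased trees, compare more machine learning or deep learning algorithms, and take full account of spatiotemporal data on the natural environment and human activity suited to the study area, in order to provide more accurate and finer-grained risk forecasts for PWD epidemic areas.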
