Application of classification trees-J48 to model the presence of roach (Rutilus rutilus) in rivers

Author

Dept. of Environmental Science, Faculty of Natural Resources, University of Guilan, P.O.Box 1144, Sowmehsara, Guilan, Iran. E-mail: rzarkami2002@yahoo.co.uk

Abstract

In the present study, classification trees (CTs; J48 algorithm) were used to study the occurrence of roach in rivers in Flanders (Belgium). The presence/absence of roach was modelled based on a set of river characteristics. The predictive performance of the CT models was assessed using the percentage of Correctly Classified Instances (CCI) and Cohen's kappa statistic. To obtain a robust estimate of model performance, 3-fold cross-validation was applied to the dataset. The effect of the Pruning Confidence Factor (PCF) on model reliability and complexity was also examined. The induced models predicted the presence/absence of roach in the rivers well, and the highest overall means of the two performance criteria indicated that the models were reliable. Analysis of the ecological relevance of the CTs suggested that structural-habitat variables were more important predictors of roach occurrence than water quality variables. In particular, distance from the source and river width contributed most to the prediction of roach, while among the water quality variables only electrical conductivity was relatively important.
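The two performance criteria named in the abstract can be illustrated with a minimal sketch. It assumes a 2x2 presence/absence confusion matrix; the function name `cci_and_kappa` and the counts in the usage example are hypothetical and are not taken from the study. CCI is the percentage of instances on the diagonal of the matrix, and Cohen's kappa corrects that observed agreement for the agreement expected by chance.

```python
import math


def cci_and_kappa(tp, fp, fn, tn):
    """Return (CCI in percent, Cohen's kappa) for a 2x2 confusion matrix.

    tp: presence predicted, presence observed
    fp: presence predicted, absence observed
    fn: absence predicted, presence observed
    tn: absence predicted, absence observed
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of correctly classified instances.
    observed = (tp + tn) / n

    # Chance agreement: product of the marginal totals for each class,
    # summed over both classes (Cohen, 1960).
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)

    if math.isclose(expected, 1.0):
        # Kappa is undefined when only one class occurs in both
        # the predictions and the observations.
        return 100.0 * observed, float("nan")

    kappa = (observed - expected) / (1.0 - expected)
    return 100.0 * observed, kappa


# Hypothetical counts, for illustration only (not results from the study).
cci, kappa = cci_and_kappa(tp=40, fp=10, fn=5, tn=45)
print(f"CCI = {cci:.1f}%, kappa = {kappa:.2f}")  # CCI = 85.0%, kappa = 0.70
```

In a k-fold cross-validation such as the 3-fold scheme mentioned above, a function of this kind would typically be applied to the confusion matrix of each held-out fold and the resulting values averaged.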
 
 
REFERENCES
 Belpaire, C., Smolders, R., Vanden Auweele, I., Ercken, D., Breine, J., Van Thuyne, G. and Ollevier, F. (2000) An Index of Biotic Integrity characterizing fish populations and the ecological quality of Flandrian water bodies. Hydrobiologia. 434, 17-33.
Brabrand, A. (1985) Food of roach, Rutilus rutilus and ide, Leuciscus idus: significance of diet shift for interspecific competition in omnivorous fishes. Oecologia. 66, 461-467.
Brabrand, A. and Faafeng, B. (1994) Habitat shift in roach, Rutilus rutilus induced by the introduction of pikeperch, Stizostedion lucioperca. Limnologie. 25, 21-23.
Breine, J., Simoens, I., Goethals, P.L.M., Quataert, P., Ercken, D., Chris, V. L. and Belpaire, C. (2004) A fish-based index of biotic integrity for upstream brooks in Flanders (Belgium). Hydrobiologia. 522, 133-148.
Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984) Classification and Regression Trees. Wadsworth, Pacific Grove, CA, USA.
Brosse, S. and Lek, S. (2000) Modelling roach, Rutilus rutilus microhabitat using linear and nonlinear techniques. Freshwater Biol. 44, 34-41.
Cohen, J. (1960) A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20 (1), 37-46.
Copp, G. H. (1990) Shifts in the microhabitat of larval and juvenile roach, Rutilus rutilus (L.) in a floodplain channel. J. Fish Biol. 36, 683-692.
Copp, G. H. (1992) An empirical model for predicting microhabitat of 0+ juvenile fishes in a lowland river catchment. Oecologia. 91, 338-345.
Dakou, E., Goethals, P.L.M., D’heygere, T., Dedecker, A.P., Gabriels, W. and De Pauw, N. (2006) Development of artificial neural network models predicting macroinvertebrate taxa in the river Axios (Northern Greece). Annales de Limnologie, Ann. Limnol., Int. J. Lim. 42, 241-250.
Dakou, E., D’heygere, T., Dedecker, A.P., Goethals, P.L.M., Lazaridou-Dimitriadou, M. and De Pauw, N. (2007) Decision tree models for prediction of macroinvertebrate taxa in the river Axios (Northern Greece). Aquat. Ecol. 41, 399-411.
Dedecker, A.P., Goethals, P.L.M., Gabriels, W. and De Pauw, N. (2002) Comparison of Artificial Neural Network (ANN) model development methods for prediction of macroinvertebrate communities in the Zwalm river basin in Flanders, Belgium. The ScientificWorldJo. 2, 96-104.
D’heygere, T., Goethals, P. L. M. and De Pauw, N. (2003) Use of genetic algorithms to select input variables in decision tree models for the prediction of benthic macroinvertebrates. Ecol. Model. 160, 291-300.
D’heygere, T., Goethals, P. L. M. and De Pauw, N. (2006) Genetic algorithms for optimization of predictive ecosystems models based on decision trees and neural networks. Ecol. Model. 195, 20-29.
Dzeroski, S., Grobovic, J. and Walley, W.J. (1997) Machine learning applications in biological classification of river water quality, pp. 429-448. In: Michalski, R.S., Bratko, I. and Kubat, M. Machine learning and data mining: methods and applications. John Wiley and Sons Ltd., New York.
Eklov, P. (1997) Effects of habitat complexity and prey abundance on the spatial and temporal distributions of perch, Perca fluviatilis and pike, Esox lucius. Can. J. Fish. Aquat. Sci. 54, 1520-1531.
Fielding, A.H. and Bell, J.F. (1997) A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ. Conserv. 24, 38-49.
Garner, P. (1995) Suitability indices for juvenile 0+ roach, Rutilus rutilus (L.) using point abundance sampling data. Regul. River. 10, 99-104.
Goethals, P.L.M. and De Pauw, N. (2001) Development of a concept for integrated ecological river assessment in Flanders, Belgium. J. Limnol. 60, 7-16.
Goethals, P. L. M. (2005) Data driven development of predictive ecological models for benthic macroinvertebrates in rivers. PhD thesis. University of Ghent, 377 p.
Goethals, P.L.M., Dedecker, A.P., Gabriels, W., Lek, S. and De Pauw, N. (2007) Applications of artificial neural networks predicting macroinvertebrates in freshwaters. Aquat. Ecol. 41, 491-508.
Horppila, J. (1994) The diet and growth of roach, Rutilus rutilus (L.) in Lake Vesijarvi and possible changes in the course of biomanipulation. Hydrobiologia. 294, 35-41.
Kahl, U., Dorner, H., Radke, R.J., Wagner, A. and Benndorf, J. (2001) The roach population in the hypertrophic Bautzen Reservoir: structure, diet and impact on Daphnia galeata. Limnologica. 31, 61-68.
Kahl, U. and Radke, R. J. (2006) Habitat and food resource use of perch and roach in a deep mesotrophic reservoir: enough space to avoid competition? Ecol. Freshw. Fish. 15, 48-56.
Jackson, D.A. and Harvey, H.H. (1997) Qualitative and quantitative sampling of lake fish communities. Can. J. Fish. Aquat. Sci. 54, 2807-2813.
Johansson, L. and Persson, L. (1986) The fish community of temperate, eutrophic lakes. In: Riemann, M.B.S. ed. Carbon dynamics of eutrophic, temperate lakes: the structure and functions of the pelagic environment. Amsterdam: Elsevier, pp. 237-266.
Lawton, J. (1996) Patterns in ecology. Oikos. 75, 145-147.
Hoang, T.H., Lock, K., Mouton, A. and Goethals, P.L.M. (2010) Application of classification trees and support vector machines to model the presence of macroinvertebrates in rivers in Vietnam. Ecol. Inform. 5, 140-146.
Manel, S., Dias, J.M., Buckton, S.T. and Ormerod, S.J. (1999) Alternative methods for predicting species distribution: an illustration with Himalayan river birds. J. Appl. Ecol. 36, 734-747.
Manel, S., Williams, H.C. and Ormerod, S.J. (2001) Evaluating presence-absence models in ecology: the need to account for prevalence. J. Appl. Ecol. 38, 921-931.
Olden, J.D. and Jackson, D.A. (2002) A comparison of statistical approaches for modelling fish species distributions. Freshwater Biol. 47, 1976-1995.
Quinlan, J.R. (1986) Induction of decision trees. Mach. Learn. 1, 81-106.
Quinlan, J.R. (1993) C4.5: Programs for machine learning. Morgan Kaufmann, San Francisco, USA.
Persson, L. and Greenberg, L.A. (1990) Juvenile competitive bottlenecks- the perch, Perca fluviatilis- roach, Rutilus rutilus interaction. Ecology. 71, 44-56.
Persson, L. (1983) Effects of intraspecific and interspecific competition on dynamics and size structure of a perch, Perca fluviatilis and a roach, Rutilus rutilus population. Oikos. 41, 126-132.
Poizat, G. and Pont, D. (1996) Multi-scale approach to species-habitat relationships: juvenile fish in a large river section. Freshwater Biol. 36, 611-622.
Ricciardi, A. and Rasmussen, J.B. (1999) Extinction rates of North American freshwater fauna. Conserv. Biol. 13, 1220-1222.
Rossier, O., Castella, E. and Lachavanne, J.B. (1996) Influence of submerged aquatic vegetation on size class distribution of perch, Perca fluviatilis and roach, Rutilus rutilus in the littoral zone of Lake Geneva (Switzerland). Aquat. Sci. 58, 1-14.
Schoener, T. (1974) Resource partitioning in ecological communities. Science. 185, 27-39.
Schulze, T., Dörner, H., Hölker, F. and Mehner, T. (2006) Determinants of habitat use in large roach. J. Fish Biol. 69, 1136-1150.
Sharma, C.M. and Borgstrøm, R. (2008) Shift in density, habitat use, and diet of perch and roach: An effect of changed predation pressure after manipulation of pike. Fish. Res. 91, 98-106.
Skov, C., Berg, S., Jacobsen, L. and Jepsen, N. (2002) Habitat use and foraging success of 0+ Pike, Esox lucius (L.) in experimental ponds related to prey fish, water transparency and light intensity. Ecol. Freshw. Fish. 11, 65-73.
Vinni, M., Horppila, J., Olin, M., Ruuhijarvi, J. and Nyberg, K. (2000) The food, growth and abundance of five co-existing cyprinids in lake basins of different morphometry and water quality. Aquat. Ecol. 34, 421-431.
Werner, E.E., Hall, D.J., Laughlin, D.R., Wagner, D.J., Wilsmann, L.A. and Funk, F.C. (1977) Habitat partitioning in a freshwater fish community. J. Fish. Res. Board Can. 34, 360-370.
Witten, I.H. and Frank, E. (2000) Data mining: practical machine learning tools and techniques with Java implementations. Morgan Kaufmann Publishers, San Francisco. 369 p.

Keywords

