Epidemiology and Geography

Principles, Methods and Tools of Spatial Analysis

Marc Souris

Foreword

This book is the result of a long series of scientific works that the author has conducted for over 30 years. With an initial background in mathematics and computer science, Marc Souris is one of the few researchers who have focused their research efforts on the methodological development applied to spatial data, which he realized through many research programs involving various disciplines (geology, geography, epidemiology, etc.). Due to his quite unique positioning, his capacity to go beyond the frontiers of his initial academic training and his ability to clearly and objectively present principles that may seem complicated at first glance, this work has a particularly remarkable and unique character.

This book offers a very rich state of the art of the concepts, methods and tools of spatial analysis currently used in epidemiology and in certain works related to health geography, which the author has intelligently organized into coherent chapters. This type of book is all the more valuable as overviews covering such a wide range of methods are rare. The author devotes particular attention to describing the formalism, the terminology and the scientific approach to be adopted by anyone willing to apply spatial analysis in the health field. The author warns, with good reason, about the numerous pitfalls (confounding factors, ecological fallacy, layout of spatial substratum, edge effect, etc.) and limits (oversimplification of reality, mismatch between the level of analysis and the spatial scale of the processes, data reliability, uncertainty in localization, etc.) that have to be dealt with and for which solutions are proposed. The author uses compelling examples, particularly in relation to vector-borne infectious diseases, without, however, omitting other categories of diseases (notably chronic or degenerative diseases, such as long-term disorders or diabetes), although the latter are less often mentioned in this book. The examples refer to study sites predominantly located in southern countries (Latin America, Southeast Asia and Africa).

A further very significant aspect is that the methods are presented in a highly didactic manner, by means of simply formulated questions that they make it possible to answer. Whenever possible, several software solutions are suggested for implementing the advocated methods. Furthermore, it is worth noting that Marc Souris has himself optimized a number of the methods presented in this book and has also developed new ones, together with an operational implementation in the free SavGIS software.

Finally, this book features a balanced integration of theoretical and methodological issues, practical examples and elements related to the software to be used. There are summaries at the end of each chapter, numerous illustrations (maps and graphic representations), many boxed texts, a glossary, a rich bibliography and two detailed practical cases, all presented in a very accessible style, which facilitates the reading experience. Although it primarily addresses students enrolled in Master's and PhD programs, researchers, research analysts or managers working in the healthcare sector, it is also a more far-reaching resource that can prove valuable for anyone willing to acquire knowledge of spatial analysis methods, regardless of the field of application.

Florent DEMORAES

Lecturer and Researcher – University of Rennes 2,
Deputy Director of UMR 6590 ESO
(Spaces and Societies – CNRS)

Preface

“I lie only to tell the truth”

Chinese proverb

This book gives an overview of the objectives, principles and methods of spatial analysis and of geographic information systems used in the healthcare sector, and particularly in the study of infectious diseases and of health–environment relations. It is designed as a practical introduction to spatial and space-time analysis for epidemiology and health geography. Its objective is to offer a detailed description of the objectives, concepts, and most of the methods and techniques available in this field, with a didactic approach illustrated by concrete examples. It is aimed at students and public health professionals, epidemiologists, public health inspectors, health geographers and experts in (human or animal) health–environment relations, who are interested in a comprehensive overview of the subject that does not require in-depth mathematics or statistics knowledge. Finally, the book also aims to be a tool that can be used by all of those interested in an introduction to the general methods of spatial analysis.

Spatial analysis includes any technique that studies objects and their attributes using topological or geometric properties, generally in a two- or three-dimensional metric space. This is a very general definition that applies to many domains. Spatial analysis is not a recent discipline; it has been used for many years in biology, botany, epidemiology, image processing, network analysis, electronic design, chemistry, cosmology, climatology, hydrology, economics, etc. Obviously, it has been used in geography, where spatial analysis is defined as “formalized analysis of the configuration and properties of the geographic space, as it is produced and experienced by human societies” [PUM 97].

In epidemiology (study of the factors influencing a population’s health and diseases) and in health geography (geographic analysis of the health system and of the spatial distribution of diseases), the term spatial analysis will be used to describe the analysis techniques applied to the “objects” described or used in epidemiology or geography, insofar as they are localized in space and the analysis uses this localization: individuals, vectors, reservoirs, populations, territories, natural, urban or rural environment, etc. Spatial analysis uses the topological or geometric relations of individuals with their environment and with one another. It is not concerned with what happens “inside” the sick person (in the organ, in the cell, or in terms of the biology of the pathogenic agent). For example, this book does not cover medical imaging and the techniques associated with image processing, although some of these techniques are sometimes very close to those described here.

Spatial distribution of health phenomena is rarely random: a health phenomenon often involves risk factors related to geographic factors, mesological factors and spatial relations among individuals. The use of localization is therefore essential in the analysis and comprehension of a health phenomenon and of its mechanisms. Spatial analysis facilitates the identification and comprehension of the mechanisms and processes that underlie the health phenomenon, by considering the spatial relations and interactions between the actors of the disease perceived as a complex system.

In epidemiology, spatial analysis also provides elements that contribute to the consolidation of “traditional” epidemiology and feed the research and parameterization of models. It also enhances the analytical approach in health geography, whose methodological body also integrates a whole set of qualitative approaches. Descriptive spatial analysis includes cartographic analysis, the search for geometric and space-time characteristics, analysis of the spatial variability of a value, cluster detection, spatial scale analysis, environmental correlation analysis, etc. Explanatory spatial analysis is essentially statistical, with the search for statistical models that include spatial relations between individuals. Modeling of spatial processes is only briefly touched upon, this subject being beyond the scope of this book.

In health studies, spatial analysis is not only used for studies conducted in epidemiology or in geography. It also plays a role in public policies, with the development of new applications in public health: early warning systems, crisis management systems, risk analysis and prevention systems, preparation of vaccination campaigns, surveys and polls.

This book aims to present the general concepts that underlie spatial analysis and to explain and clarify the principles used in methods of analysis. Practical use of these methods is also highlighted: many concrete examples based on real data are provided throughout this book. These examples cover situations that are often encountered in practice.

In recent years, spatial analysis has been increasingly used in the health sector due to the development of geomatics and geographic information systems (GIS). In health, as well as in other fields of application, spatial analysis has benefited from the spread of GIS use, the development of their technical functionalities and the growing availability of geographic data, despite their often inadequate quality.

It is difficult, if not impossible, to manage, transform, handle, analyze and represent spatial information without using GIS.

Finally, I would like to thank all of those who have contributed to this book. Firstly, Jean-Paul Gonzalez, physician and virologist, who offered unequalled inspiration, motivation and management with unrelenting enthusiasm; Florent Demoraes, geographer, who contributed to the reinforcement, consolidation and completion of these reflections; Bernard Lortic, engineer, whose highly demanding approach was unparalleled; all the colleagues, students, PhD students and interns who have directly or indirectly contributed to the improvement of this book, and in particular Nitin Tripati, José Tupiza, Somsakun Maneerat, Julie Vallée, Jothiganesh Sundaram and Tania Serrano. I am taking this opportunity to express my sincere gratitude to all of them.

Marc SOURIS

December 2018

Introduction
Software and Databases

The reader will find throughout the text information on how to apply the methods presented in this book using several pieces of software that have been selected for this purpose. Several databases or file sets that can be downloaded and used for the replication of the examples mentioned in this book are also presented.

An appendix presenting the principles and diverse functionalities of geographic information systems (GIS) is available for the reader to download at www.iste.co.uk/souris/epidemiology.zip.

I.1. Software

Several software programs that can be used to apply the methods presented in this book have been selected: general geographic information systems (QGIS, ArcGIS, SavGIS) and more specific software programs (R, GeoDa, SaTScan™, GWR4). Alongside the descriptions of the methods of spatial analysis, the procedures to be used and, whenever available, links to information on these procedures will be briefly presented for each software program, without further details. If needed, the reader can refer to the software user manuals.

I.1.1. QGIS

Quantum GIS (QGIS) is a free and open-source geographic information system. It runs on Linux, Unix, Mac OS X, Windows and Android, and supports numerous data formats (vector and raster) and databases. QGIS offers a continuously increasing number of functionalities, provided by its basic functions and by plugins. Detailed information, documentation, downloads and tutorials are available at http://www.qgis.org.

I.1.2. ArcGIS

The ArcGIS geographic information system is a commercial product from the Environmental Systems Research Institute (ESRI). This software is quite comprehensive and offers a large number of functionalities. The system’s infrastructure makes it possible to share maps and geographic information within an enterprise, within a community or on the Web. Further information on the ArcGIS software can be accessed at https://www.arcgis.com.

I.1.3. SavGIS

SavGIS is a free geographic information system running under Windows. This complete and powerful software is the result of research and is constantly evolving, providing innovative solutions for processing of localized information, with many developments related to spatial analysis and modeling for epidemiology. Besides being freely accessible, it has many advantages: rigorous data management, data sharing, powerful analysis, advanced functions for spatial analysis and statistical analysis functions. Further information can be found at http://www.savgis.org.

I.1.4. R

R (free and open-source software) is a programming language for statistical analysis of data, and also an environment for data analysis and graphic visualization. Scientists and researchers have created a large number of specialized procedures for a wide variety of applications that are directly integrated in R. R-GIS.net is a website that aims to discuss spatial data manipulation and analysis in R. Several packages are available for procedures related to spatialized data: sp, spdep, etc. Further information can be found at http://r-gis.net and framabook.org/r-et-espace.

I.1.5. GeoDa

GeoDa is a free and open-source software for spatial analysis, developed since 2003 by Arizona State University (USA). GeoDa is a software tool focused on spatial analysis and spatial models. The program provides a user-friendly graphical interface for exploratory spatial data analysis (ESDA) methods, such as spatial autocorrelation statistics for aggregated data and basic spatial regression analysis for point and areal data. Further information can be found at http://geodacenter.github.io.

I.1.6. SaTScan™

SaTScan™ is a free software that analyzes spatial data by means of spatial, temporal or space-time statistics. The main objective of SaTScan™ is the detection of clusters and the implementation of early warning or early disease detection systems. The software can also be used for similar problems in other scientific fields. Further information can be found at https://www.satscan.org.

I.1.7. GWR4

Geographically weighted regression (GWR) is a spatial analysis technique that takes into account variables exhibiting autocorrelation and local variations. This regression technique allows the modeling of local relations between the predictive variables and the variable to be explained. Several software programs can run geographically weighted regressions (ArcGIS, SpaceStat, SAM, and the spgwr, gwrr and GWmodel packages of R), whereas GWR4 is a standalone Windows application. Further information on GWR4 can be found at http://gwr.maynoothuniversity.ie/gwr4-software/.
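To make the idea of a "local relation" concrete, the following is a minimal sketch of a geographically weighted fit with a single predictive variable, written in plain Python. It is not the algorithm implemented in GWR4 or in any of the programs listed above: the function name `gwr_local_fit`, the Gaussian kernel, the fixed bandwidth and the small synthetic data set are all assumptions chosen for illustration. At a given target location, each observation receives a weight that decreases with its distance to the target, and an ordinary weighted least-squares line is fitted with these weights.

```python
import math


def gwr_local_fit(coords, x, y, target, bandwidth):
    """Fit y = intercept + slope * x at `target` by weighted least squares.

    Weights follow a Gaussian kernel of the distance to `target`
    (an illustrative choice; other kernels are possible).
    """
    weights = [
        math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2)
        for c in coords
    ]
    w_sum = sum(weights)
    # Weighted means of the predictor and of the response.
    x_mean = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_mean = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    # Weighted covariance and variance give the local slope.
    s_xy = sum(w * (xi - x_mean) * (yi - y_mean)
               for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_mean) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Synthetic data (hypothetical): the relation is y = x in the west
# and y = 3x in the east, so a single global regression would blur it.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0),
          (10.0, 0.0), (11.0, 0.0), (12.0, 0.0), (13.0, 0.0)]
x = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 4.0, 3.0, 6.0, 9.0, 12.0]

print(gwr_local_fit(coords, x, y, (1.5, 0.0), 1.0))   # local slope close to 1
print(gwr_local_fit(coords, x, y, (11.5, 0.0), 1.0))  # local slope close to 3
```

With a small bandwidth, each local fit is dominated by nearby observations, which is why the two target locations recover different slopes. In practice, the bandwidth is a key parameter and is usually selected by the software through a criterion such as cross-validation.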

I.1.8. GAMA

GAMA is a free agent-based modeling and simulation platform, developed since 2007 (http://gama-platform.org) and dedicated more specifically to the simulation of spatialized phenomena. GAMA offers a number of advanced functionalities: advanced management of geographic data; a set of structures and controls that facilitate the definition of multilevel models; automated tools for model exploration, which allow experimental designs to be defined and executed on high-performance computing resources (cluster, grid); a plug-in system for extending the GAML language to specific needs; and bridges and coupling possibilities with other tools used in the modeling of complex systems.

I.2. Data for the examples

The methods presented in this book are illustrated with examples drawn from real situations and databases. The data related to these examples are available as Excel files for non-localized data, as Shapefiles for geolocalized data, or as complete geographic databases that can be used directly with the SavGIS software. These files, as well as the SavGIS databases, can be downloaded from www.savgis.org.