In the waters off the Caribbean island nation of Dominica, whale habitat and marine traffic overlap spatially, putting whale populations in danger. This analysis identifies a speed reduction zone off the island of Dominica for the purpose of reducing the occurrence of ships striking whales and quantifies the impact of reduced travel speeds on marine traffic.
Setup
Load libraries and packages
library(reticulate)
import pandas as pd
import geopandas as gpd
/Library/Frameworks/R.framework/Versions/4.1/Resources/library/reticulate/python/rpytools/loader.py:39: UserWarning: Shapely 2.0 is installed, but because PyGEOS is also installed, GeoPandas will still use PyGEOS by default for now. To force to use and test Shapely 2.0, you have to set the environment variable USE_PYGEOS=0. You can do this before starting the Python process, or in your code before importing geopandas:
import os
os.environ['USE_PYGEOS'] = '0'
import geopandas
In a future release, GeoPandas will switch to using Shapely by default. If you are using PyGEOS directly (calling PyGEOS functions on geometries from GeoPandas), this will then stop working and you are encouraged to migrate from PyGEOS to Shapely 2.0 (https://shapely.readthedocs.io/en/latest/migration_pygeos.html).
module = _import(
import os
import matplotlib.pyplot as plt
import numpy as np
import shapely
import contextily as ctx
import pygeos
Dominica outline
Read in a shapefile that contains the outline of the country of Dominica.
# Define path to folder
= r"data/dominica"
input_folder
# Join folder path and filename
= os.path.join(input_folder, "dma_admn_adm0_py_s1_dominode_v2.shp") fp
data/dominica/dma_admn_adm0_py_s1_dominode_v2.shp
# Read file using gpd.read_file()
= gpd.read_file(fp) dominica
# check the object type
type(dominica)
<class 'geopandas.geodataframe.GeoDataFrame'>
dominica.head()
ADM0_PCODE ADM0_EN geometry
0 DM Dominica POLYGON ((-61.43023 15.63952, -61.43019 15.639...
From the head()
function, we can see that the geodataframe consists of one row - a polygon representing the outline of Dominica. Next, check the coordinate reference system and project the Dominica geodataframe to an appropriate projection for the study area.
dominica.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
# change the projection to Dominica 1945 / British West Indies Grid (metric units)
= 2002 # project area coordinate system EPSG code
proj_area_crs = dominica.to_crs(epsg=proj_area_crs) dominica
# check the coordinate reference system again to make sure the conversion worked
dominica.crs
<Derived Projected CRS: EPSG:2002>
Name: Dominica 1945 / British West Indies Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Dominica - onshore.
- bounds: (-61.55, 15.14, -61.2, 15.69)
Coordinate Operation:
- name: British West Indies Grid
- method: Transverse Mercator
Datum: Dominica 1945
- Ellipsoid: Clarke 1880 (RGS)
- Prime Meridian: Greenwich
Now that we have the correct project, plot the Dominica outline against a basemap.
= dominica.plot(color="none", figsize=(9, 9))
ax =dominica.crs.to_string())
ctx.add_basemap(ax, crsTrue)
ax.grid( plt.show()
Whale sighting data
Whale habitat was approximated by using data from ~5,000 whale sightings between 2008 and 2015.
# Define path to folder
= r"data"
input_folder2
# Join folder path and filename
= os.path.join(input_folder2, "sightings2005_2018.csv") fp2
# Read file using gpd.read_file()
= gpd.read_file(fp2) whales
Again, check the object type and view the first few rows. Use the columns from latitude and longitude to create a geometry column. Each whale sighting is represented as a point.
type(whales)
<class 'geopandas.geodataframe.GeoDataFrame'>
whales.head()
field_1 GPStime Lat Long geometry
0 0 2005-01-15 07:43:27 15.36977117 -61.49328433 None
1 1 2005-01-15 08:07:13 15.3834075 -61.503702 None
2 2 2005-01-15 08:31:17 15.38106333 -61.50486067 None
3 3 2005-01-15 09:19:10 15.33532083 -61.46858117 None
4 4 2005-01-15 10:08:00 15.294224 -61.45318517 None
# bootstrap the geometries
= gpd.points_from_xy(whales['Long'], whales['Lat'])
whale_points = gpd.GeoDataFrame(whales, geometry=whale_points) whale_gdf
whale_gdf.head()
field_1 GPStime ... Long geometry
0 0 2005-01-15 07:43:27 ... -61.49328433 POINT (-61.49328 15.36977)
1 1 2005-01-15 08:07:13 ... -61.503702 POINT (-61.50370 15.38341)
2 2 2005-01-15 08:31:17 ... -61.50486067 POINT (-61.50486 15.38106)
3 3 2005-01-15 09:19:10 ... -61.46858117 POINT (-61.46858 15.33532)
4 4 2005-01-15 10:08:00 ... -61.45318517 POINT (-61.45319 15.29422)
[5 rows x 5 columns]
type(whale_gdf)
<class 'geopandas.geodataframe.GeoDataFrame'>
Set the coordinate reference system then project to the study area projection.
# project the dataset into an appropriate CRS
= whale_gdf.set_crs(epsg=4326)
whale_gdf = whale_gdf.to_crs(epsg=proj_area_crs) whale_gdf
whale_gdf.crs
<Derived Projected CRS: EPSG:2002>
Name: Dominica 1945 / British West Indies Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Dominica - onshore.
- bounds: (-61.55, 15.14, -61.2, 15.69)
Coordinate Operation:
- name: British West Indies Grid
- method: Transverse Mercator
Datum: Dominica 1945
- Ellipsoid: Clarke 1880 (RGS)
- Prime Meridian: Greenwich
This figure shows the distribution of whale sightings.
= plt.subplots(figsize=(15,15), dpi=100)
fig, ax =ax) whale_gdf.plot(ax
Create grid
Create grid cells over the project study area to help quantify the spatial density of whale sightings.
# this function returns a tuple of the minimum and maximum bounds of the whale sighting data
= whale_gdf.total_bounds
xmin, ymin, xmax, ymax xmin, ymin, xmax, ymax
(408480.65208368783, 1532792.7459409237, 498500.3049570159, 1796964.3997029224)
= 2000
cell_size = 2000
length = 2000
width
= list(np.arange(xmin, xmax + width, width))
xs # return evenly spaced values starting at xmin, stopping (but not including) at xmax+width, by a step of width
= list(np.arange(ymin, ymax + length, length)) ys
# function to convert corner points into a cell polygon
def make_cell(x, y, cell_size):
= [
ring
(x, y),+ cell_size, y),
(x + cell_size, y + cell_size),
(x + cell_size)
(x, y
]= shapely.geometry.Polygon(ring)
cell return cell
# iterate over each combination of x and y coordinates in two nested for loops
= []
cells for x in xs:
for y in ys:
= make_cell(x, y, cell_size)
cell cells.append(cell)
Convert the grid to a geodataframe with the correct projection.
= gpd.GeoDataFrame({'geometry':cells}, crs=proj_area_crs)
grid #grid.to_file("grid.shp")
grid.head()
geometry
0 POLYGON ((408480.652 1532792.746, 410480.652 1...
1 POLYGON ((408480.652 1534792.746, 410480.652 1...
2 POLYGON ((408480.652 1536792.746, 410480.652 1...
3 POLYGON ((408480.652 1538792.746, 410480.652 1...
4 POLYGON ((408480.652 1540792.746, 410480.652 1...
grid.crs
<Derived Projected CRS: EPSG:2002>
Name: Dominica 1945 / British West Indies Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Dominica - onshore.
- bounds: (-61.55, 15.14, -61.2, 15.69)
Coordinate Operation:
- name: British West Indies Grid
- method: Transverse Mercator
Datum: Dominica 1945
- Ellipsoid: Clarke 1880 (RGS)
- Prime Meridian: Greenwich
= plt.subplots(figsize=(10,10), dpi=100)
fig, ax =ax, linewidth = 0.5)
grid.boundary.plot(ax plt.show()
Now visualize the island of Dominica, whale sighting locations, and grid.
= dominica.plot(facecolor='none', edgecolor='black', linewidth=2, figsize=(15, 15))
base =base, facecolor='red', markersize=2)
whale_gdf.plot(ax=base, linewidth = 0.5)
grid.boundary.plot(ax=base, crs=dominica.crs.to_string())
ctx.add_basemap(ax plt.show()
Extract whale habitat
To identify an approximate whale habitata area, spatially join the grid with whale sighting data, then count the number of sightings in each cell.
# use an inner join since we're not interested in grid cells without any sightings
= grid.sjoin(whale_gdf, how="inner")
whale_grid whale_grid
geometry ... Long
124 POLYGON ((408480.652 1780792.746, 410480.652 1... ... -61.896866
124 POLYGON ((408480.652 1780792.746, 410480.652 1... ... -61.900766
124 POLYGON ((408480.652 1780792.746, 410480.652 1... ... -61.903366
125 POLYGON ((408480.652 1782792.746, 410480.652 1... ... -61.900116
125 POLYGON ((408480.652 1782792.746, 410480.652 1... ... -61.897716
... ... ... ...
5171 POLYGON ((484480.652 1690792.746, 486480.652 1... ... -61.194134
5172 POLYGON ((484480.652 1692792.746, 486480.652 1... ... -61.19188
6030 POLYGON ((498480.652 1532792.746, 500480.652 1... ... -61.0794355
6030 POLYGON ((498480.652 1532792.746, 500480.652 1... ... -61.0794355
6030 POLYGON ((498480.652 1532792.746, 500480.652 1... ... -61.0794355
[4893 rows x 6 columns]
=(5,10)) whale_grid.boundary.plot(figsize
whale_grid.crs
<Derived Projected CRS: EPSG:2002>
Name: Dominica 1945 / British West Indies Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Dominica - onshore.
- bounds: (-61.55, 15.14, -61.2, 15.69)
Coordinate Operation:
- name: British West Indies Grid
- method: Transverse Mercator
Datum: Dominica 1945
- Ellipsoid: Clarke 1880 (RGS)
- Prime Meridian: Greenwich
'count'] = whale_grid.groupby(whale_grid.index).count()['index_right']
grid[# the column name spacified after .count() is arbitrary, since we are only counting rows
grid
geometry count
0 POLYGON ((408480.652 1532792.746, 410480.652 1... NaN
1 POLYGON ((408480.652 1534792.746, 410480.652 1... NaN
2 POLYGON ((408480.652 1536792.746, 410480.652 1... NaN
3 POLYGON ((408480.652 1538792.746, 410480.652 1... NaN
4 POLYGON ((408480.652 1540792.746, 410480.652 1... NaN
... ... ...
6293 POLYGON ((500480.652 1790792.746, 502480.652 1... NaN
6294 POLYGON ((500480.652 1792792.746, 502480.652 1... NaN
6295 POLYGON ((500480.652 1794792.746, 502480.652 1... NaN
6296 POLYGON ((500480.652 1796792.746, 502480.652 1... NaN
6297 POLYGON ((500480.652 1798792.746, 502480.652 1... NaN
[6298 rows x 2 columns]
# subset the grid dataframe to cells that have more than 20 sightings
= (grid['count'] > 20)
whale_mask whale_mask
0 False
1 False
2 False
3 False
4 False
...
6293 False
6294 False
6295 False
6296 False
6297 False
Name: count, Length: 6298, dtype: bool
= grid[whale_mask] whale_habitat
The speed reduction zone represents the area with frequent whale sightings.
= whale_habitat.unary_union.convex_hull
speed_reduction_zone speed_reduction_zone
<POLYGON ((460480.652 1680792.746, 456480.652 1682792.746, 454480.652 168479...>
type(speed_reduction_zone)
# need to create a new GeoDataFrame with the speed_reduction_zone as a single feature
<class 'shapely.geometry.polygon.Polygon'>
= gpd.GeoDataFrame(index=[0], crs='epsg:2002', geometry=[speed_reduction_zone]) speed_reduction_zone
speed_reduction_zone.crs
<Derived Projected CRS: EPSG:2002>
Name: Dominica 1945 / British West Indies Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Dominica - onshore.
- bounds: (-61.55, 15.14, -61.2, 15.69)
Coordinate Operation:
- name: British West Indies Grid
- method: Transverse Mercator
Datum: Dominica 1945
- Ellipsoid: Clarke 1880 (RGS)
- Prime Meridian: Greenwich
type(speed_reduction_zone)
<class 'geopandas.geodataframe.GeoDataFrame'>
= dominica.plot(facecolor='none', edgecolor='black', linewidth=3, figsize=(15, 15))
base =base, facecolor='lightgray', edgecolor='red', alpha=0.5, linewidth=5)
speed_reduction_zone.plot(ax=base, crs=dominica.crs.to_string())
ctx.add_basemap(ax plt.show()
Vessel data
Vessel data was obtained from Automatic Identification System (AIS) transceivers from 2015.
Load data
# Join folder path and filename
= os.path.join(input_folder2, "station1249.csv") fp3
# Read file using gpd.read_file()
= gpd.read_file(fp3) vessels
type(vessels)
<class 'geopandas.geodataframe.GeoDataFrame'>
vessels.head()
field_1 MMSI LON LAT TIMESTAMP geometry
0 0 233092000 -61.84788 15.23238 2015-05-22 13:53:26 None
1 1 255803280 -61.74397 15.96114 2015-05-22 13:52:57 None
2 2 329002300 -61.38968 15.29744 2015-05-22 13:52:32 None
3 3 257674000 -61.54395 16.2334 2015-05-22 13:52:24 None
4 4 636092006 -61.52401 15.81954 2015-05-22 13:51:23 None
Similar to the whale sighting data, set the geometry and coordinate reference system for the vessel data.
# bootstrap the geometries
= gpd.points_from_xy(vessels['LON'], vessels['LAT'])
vessel_points = gpd.GeoDataFrame(vessels, geometry=vessel_points) vessel_gdf
# project the dataset into an appropriate CRS
= vessel_gdf.set_crs(epsg=4326)
vessel_gdf = vessel_gdf.to_crs(epsg=proj_area_crs) vessel_gdf
vessel_gdf.crs
<Derived Projected CRS: EPSG:2002>
Name: Dominica 1945 / British West Indies Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Dominica - onshore.
- bounds: (-61.55, 15.14, -61.2, 15.69)
Coordinate Operation:
- name: British West Indies Grid
- method: Transverse Mercator
Datum: Dominica 1945
- Ellipsoid: Clarke 1880 (RGS)
- Prime Meridian: Greenwich
'TIMESTAMP'] = pd.to_datetime(vessel_gdf['TIMESTAMP']) vessel_gdf[
vessel_gdf.head()
field_1 MMSI ... TIMESTAMP geometry
0 0 233092000 ... 2015-05-22 13:53:26 POINT (415373.315 1683307.035)
1 1 255803280 ... 2015-05-22 13:52:57 POINT (426434.345 1763918.193)
2 2 329002300 ... 2015-05-22 13:52:32 POINT (464555.392 1690588.725)
3 3 257674000 ... 2015-05-22 13:52:24 POINT (447770.634 1794068.620)
4 4 636092006 ... 2015-05-22 13:51:23 POINT (450006.361 1748297.844)
[5 rows x 6 columns]
# plot vessel points against Dominca outline and speed reduction zone
= dominica.plot(facecolor='none', edgecolor='black', linewidth=3, figsize=(15, 15))
base =base, markersize = 3)
vessel_gdf.plot(ax=base, edgecolor='red', linewidth=2) speed_reduction_zone.plot(ax
# spatially subset AIS data to only include vessels within identified whale habitat
= vessel_gdf.sjoin(speed_reduction_zone, how="inner")
vessels_in_whale_habitat vessels_in_whale_habitat
field_1 MMSI ... geometry index_right
2 2 329002300 ... POINT (464555.392 1690588.725) 0
7 7 338143127 ... POINT (463892.452 1694650.397) 0
13 13 329002300 ... POINT (464555.389 1690589.831) 0
15 15 338143015 ... POINT (463910.683 1694655.978) 0
16 16 338143127 ... POINT (463697.964 1694341.275) 0
... ... ... ... ... ...
617252 238717 329002300 ... POINT (453901.647 1709712.916) 0
617253 238718 338143015 ... POINT (463915.972 1694683.643) 0
617255 238720 338143127 ... POINT (463905.177 1694705.734) 0
617259 238724 377907247 ... POINT (464023.288 1691613.624) 0
617261 238726 329002300 ... POINT (454741.236 1707161.130) 0
[167411 rows x 7 columns]
Calculate distance and speed
# plot of only vessel points within speed reduction zone
= dominica.plot(facecolor='none', linewidth=3, figsize=(15, 15))
base =base, facecolor='none', edgecolor='red', linewidth=3)
speed_reduction_zone.plot(ax=base, markersize = 0.5, facecolor='black')
vessels_in_whale_habitat.plot(ax=base, crs=dominica.crs.to_string())
ctx.add_basemap(ax plt.show()
# sort vessel dataframe by MMSI and time
= vessels_in_whale_habitat.sort_values(by=['MMSI', 'TIMESTAMP'])
vessels_in_whale_habitat vessels_in_whale_habitat
field_1 MMSI ... geometry index_right
235025 235025 203106200 ... POINT (462476.396 1680935.224) 0
235018 235018 203106200 ... POINT (462283.995 1681393.698) 0
235000 235000 203106200 ... POINT (461936.769 1682722.187) 0
234989 234989 203106200 ... POINT (461798.818 1683708.377) 0
234984 234984 203106200 ... POINT (461654.150 1683997.765) 0
... ... ... ... ... ...
259103 259103 983191049 ... POINT (465250.372 1690066.434) 0
259094 259094 983191049 ... POINT (465243.965 1690054.249) 0
258954 258954 983191049 ... POINT (465226.597 1690121.667) 0
258930 258930 983191049 ... POINT (465242.895 1690053.140) 0
258206 258206 983191049 ... POINT (465272.964 1690049.908) 0
[167411 rows x 7 columns]
# create a copy of the vessel dataframe and shift each observation down one row using `shift()`
= vessels_in_whale_habitat.copy(deep=True).shift(periods=1) vessels_shift
/Users/marierivers/Library/r-miniconda/lib/python3.9/site-packages/geopandas/array.py:1406: UserWarning: CRS not set for some of the concatenation inputs. Setting output's CRS as Dominica 1945 / British West Indies Grid (the single non-null crs provided).
warnings.warn(
# rename shifted column names
= vessels_shift.rename(columns={"field_1": "field_1_shift", "MMSI": "MMSI_shift", "LON": "LON_shift", "LAT": "LAT_shift", "TIMESTAMP": "TIMESTAMP_shift", "geometry": "geometry_shift", "index_right": "index_right_shift"}) vessels_shift
# join original dataframe with the shifted copy using `join()`
= vessels_in_whale_habitat.join(vessels_shift).sort_values(by=['MMSI', 'TIMESTAMP']) vessels_shift_join
# drop all rows in the joined dataframe in which the MMSI of the left is not the same as the one on the right
= vessels_shift_join.drop(vessels_shift_join[vessels_shift_join['MMSI'] != vessels_shift_join['MMSI_shift']].index) vessels_keep
# set the geometry column
= vessels_keep.set_geometry("geometry")
vessels_keep = vessels_keep.set_geometry("geometry_shift") vessels_keep2
# calculate distance between each observation
'distance_m'] = vessels_keep.distance(vessels_keep2) vessels_keep[
# calculate time difference between each observation to the next
'time'] = vessels_keep['TIMESTAMP'] - vessels_keep['TIMESTAMP_shift'] vessels_keep[
# calculate speed
= 1852
meters_per_nm
'speed_m_per_sec'] = vessels_keep['distance_m'] / vessels_keep['time'].dt.total_seconds()
vessels_keep['speed_knots'] = vessels_keep['speed_m_per_sec'] * 60 * 60 / meters_per_nm
vessels_keep['time_10knots_minutes'] = (vessels_keep['distance_m'] * 60 ) / ( meters_per_nm * 10 )
vessels_keep['time_dif_minutes'] = vessels_keep['time_10knots_minutes'] - (vessels_keep['time'].dt.total_seconds() / 60 )
vessels_keep[ vessels_keep
field_1 MMSI ... time_10knots_minutes time_dif_minutes
235018 235018 203106200 ... 1.610828 -0.889172
235000 235000 203106200 ... 4.448540 -3.034793
234989 234989 203106200 ... 3.226109 -1.773891
234984 234984 203106200 ... 1.048164 -1.468503
234972 234972 203106200 ... 1.394116 -3.589217
... ... ... ... ... ...
259103 259103 983191049 ... 0.043139 -5.940194
259094 259094 983191049 ... 0.044599 -4.355401
258954 258954 983191049 ... 0.225548 -55.824452
258930 258930 983191049 ... 0.228202 -11.471798
258206 258206 983191049 ... 0.097976 -248.585358
[166255 rows x 20 columns]
= vessels_keep.sort_values(by=['speed_knots'], ascending=False) vessels_keep
# look at the vessels that would be affected by the speed reduction zone
= vessels_keep.drop(vessels_keep[vessels_keep['time_dif_minutes'] < 0].index)
vessels_going_too_fast vessels_going_too_fast
field_1 MMSI ... time_10knots_minutes time_dif_minutes
585844 207309 341387000 ... 0.209101 0.209101
67091 67091 227528210 ... 7.979167 6.245834
66925 66925 228008600 ... 13.481637 10.498303
499754 121219 329002300 ... 8.728323 6.711656
546817 168282 329002300 ... 12.866650 9.849984
... ... ... ... ... ...
616429 237894 636091437 ... 0.000000 0.000000
616427 237892 636091437 ... 0.000000 0.000000
616422 237887 636091437 ... 0.000000 0.000000
616419 237884 636091437 ... 0.000000 0.000000
616408 237873 636091437 ... 0.000000 0.000000
[21410 rows x 20 columns]
= vessels_going_too_fast['time_dif_minutes'].sum()
shipping_impact_minutes = round(shipping_impact_minutes / ( 60 * 24), 2)
shipping_impact_days shipping_impact_days
27.88
A 10-knot reduced speed zone in the identified whale habitat will increase travel time by approximately 27.88 days.
Citation
@online{rivers,
author = {Rivers, Marie},
title = {Protecting {Whales} from {Ships}},
url = {https://marierivers.github.io/code_samples/protecting-whales-from-ships/},
langid = {en}
}