AIS Example#
This example uses AIS data from a CSV file containing a full day of observations provided by the Danish Maritime Authority to show how to read, manage, and write information using PyMEOS. It is divided in 3 sections, each corresponding to one MEOS example:
This example uses the plotting and database capabilities of PyMEOS. To get the necessary dependencies for this demo, you can run the following command:
pip install pymeos[dbp,plot] pandas tqdm contextily
If you don’t have MobilityDB set up, you can use the Docker image:
docker run -d -p 5432:5432 --name pymeos-demo-db mobilitydb/mobilitydb:16-3.4-1.1-beta1
Note that due to its size, the data file used is not provided. It can be downloaded using the following URL:
https://web.ais.dk/aisdata/aisdk-2023-08-01.zip
Store the file in the data directory. There is no need to decompress it, since we will use pandas
to read the file and it supports reading compressed files.
import matplotlib.pyplot as plt
import pandas as pd
from pymeos import *
from tqdm.notebook import tqdm
import contextily as cx
tqdm.pandas()
pymeos_initialize()
Reading from File (MEOS Example)#
In this section, AIS data is read from a CSV file. We then use the read information to recreate the trajectories of the ships.
First, let’s read the CSV file using pandas. We will read only the columns we’re interested in, and remove the observations that are either erroneous or not in Denmark.
%%time
ais = pd.read_csv(
"./data/aisdk-2023-08-01.zip",
usecols=["# Timestamp", "MMSI", "Latitude", "Longitude", "SOG"],
)
ais.columns = ["t", "mmsi", "lat", "lon", "sog"]
ais = ais[ais["t"] != 0]
ais["t"] = pd.to_datetime(ais["t"], format='%d/%m/%Y %H:%M:%S')
ais = ais[ais["mmsi"] != 0]
ais = ais.drop_duplicates(["t", "mmsi"])
ais = ais[(ais["lat"] >= 40.18) & (ais["lat"] <= 84.17)]
ais = ais[(ais["lon"] >= -16.1) & (ais["lon"] <= 32.88)]
ais = ais[(ais["sog"] >= 0) & (ais["sog"] <= 1022)]
ais.dropna()
ais.head()
CPU times: user 27.5 s, sys: 954 ms, total: 28.5 s
Wall time: 28.5 s
t | mmsi | lat | lon | sog | |
---|---|---|---|---|---|
0 | 2023-08-01 | 219020187 | 56.795448 | 8.864770 | 0.0 |
1 | 2023-08-01 | 219000873 | 56.990862 | 10.304587 | 0.0 |
2 | 2023-08-01 | 219008746 | 56.609180 | 10.300487 | 0.0 |
4 | 2023-08-01 | 219005496 | 57.754300 | 10.151212 | 0.2 |
5 | 2023-08-01 | 219014072 | 54.918928 | 9.597852 | 0.0 |
Now, we will create the PyMEOS object representing the position and the SOG.
ais["instant"] = ais.progress_apply(
lambda row: TGeogPointInst(point=(row["lon"], row["lat"]), timestamp=row["t"]),
axis=1,
)
ais["sog"] = ais.progress_apply(
lambda row: TFloatInst(value=row["sog"], timestamp=row["t"]), axis=1
)
ais.drop(["lat", "lon"], axis=1, inplace=True)
ais.head()
t | mmsi | sog | instant | |
---|---|---|---|---|
0 | 2023-08-01 | 219020187 | 0@2023-08-01 00:00:00+05 | POINT(8.86477 56.795448)@2023-08-01 00:00:00+05 |
1 | 2023-08-01 | 219000873 | 0@2023-08-01 00:00:00+05 | POINT(10.304587 56.990862)@2023-08-01 00:00:00+05 |
2 | 2023-08-01 | 219008746 | 0@2023-08-01 00:00:00+05 | POINT(10.300487 56.60918)@2023-08-01 00:00:00+05 |
4 | 2023-08-01 | 219005496 | 0.2@2023-08-01 00:00:00+05 | POINT(10.151212 57.7543)@2023-08-01 00:00:00+05 |
5 | 2023-08-01 | 219014072 | 0@2023-08-01 00:00:00+05 | POINT(9.597852 54.918928)@2023-08-01 00:00:00+05 |
Assembling Trips (MEOS Example)#
Now, we will create the trajectory (TGeogPointSeq) and the SOG evolution (TFloatSeq) for every ship (identified by the mmsi) using the instants we have created.
%%time
trajectories = (
ais.groupby("mmsi")
.aggregate(
{
"instant": lambda x: TGeogPointSeq.from_instants(x, upper_inc=True),
"sog": lambda x: TFloatSeq.from_instants(x, upper_inc=True),
}
)
.rename({"instant": "trajectory"}, axis=1)
)
trajectories["distance"] = trajectories["trajectory"].apply(lambda t: t.length())
trajectories.head()
CPU times: user 34.7 s, sys: 444 ms, total: 35.1 s
Wall time: 35.2 s
trajectory | sog | distance | |
---|---|---|---|
mmsi | |||
148 | [POINT(12.606613 55.684455)@2023-08-01 00:01:3... | [0.8@2023-08-01 00:01:34+05, 0@2023-08-01 00:0... | 31191.067480 |
5322 | [POINT(12.606042 55.684293)@2023-08-01 00:00:0... | [0.1@2023-08-01 00:00:00+05, 0.2@2023-08-01 00... | 39797.594133 |
100046 | [POINT(4.164757 58.76664)@2023-08-01 19:20:52+... | [148@2023-08-01 19:20:52+05, 148@2023-08-01 19... | 46476.358036 |
2190045 | [POINT(8.423332 55.47179)@2023-08-01 00:00:07+... | [0@2023-08-01 00:00:07+05, 0@2023-08-01 02:51:... | 2436.943448 |
9132759 | [POINT(8.310862 56.552482)@2023-08-01 17:32:00... | [0@2023-08-01 17:32:00+05, 0@2023-08-01 17:44:... | 368.171260 |
Here we can see that PyMEOS has been able to reduce the number of points stored (and thus memory used) without losing any information.
comparison = pd.concat(
[
ais.groupby("mmsi")["t"].count().rename("original #points"),
trajectories["trajectory"]
.apply(lambda t: t.num_instants())
.rename("PyMEOS #points"),
],
axis=1,
)
comparison["Points kept (%)"] = (
comparison["PyMEOS #points"] / comparison["original #points"] * 100
)
comparison
original #points | PyMEOS #points | Points kept (%) | |
---|---|---|---|
mmsi | |||
148 | 3882 | 3625 | 93.379701 |
5322 | 8224 | 8008 | 97.373541 |
100046 | 3 | 3 | 100.000000 |
2190045 | 8824 | 4732 | 53.626473 |
9132759 | 159 | 103 | 64.779874 |
... | ... | ... | ... |
970010921 | 638 | 378 | 59.247649 |
970021666 | 2 | 2 | 100.000000 |
970021790 | 3 | 2 | 66.666667 |
970100094 | 8 | 3 | 37.500000 |
972609012 | 6 | 3 | 50.000000 |
6064 rows × 3 columns
We can visualize the trajectories and the SOG evolutions by plotting them. We will plot only 100 of the trajectories.
fig, axes = plt.subplots(1, 2, figsize=(15, 8))
for _, ship in trajectories[100:200].iterrows():
ship["trajectory"].plot(axes=axes[0])
ship["sog"].plot(axes=axes[1])
cx.add_basemap(axes[0], crs=4326, source=cx.providers.CartoDB.Voyager)
plt.show()
Storing in MobilityDB (MEOS Example)#
Now we will store the temporal objects we have created in MobilityDB.
To connect to MobilityDB (PostgreSQL), psycopg
is used. However, asyncpg
and psycopg2
are also supported.
First, set up the connection parameters. Change any of the following values according to your configuration. If you are using the docker image, you don’t need to change anything (except maybe the port).
from pymeos.db.psycopg import MobilityDB
host = "localhost"
port = 5432
db = "mobilitydb"
user = "docker"
password = "docker"
connection = MobilityDB.connect(
host=host, port=port, dbname=db, user=user, password=password
)
cursor = connection.cursor()
Now, we will create the table where we will write our objects.
cursor.execute("DROP TABLE IF EXISTS public.PyMEOS_demo;")
cursor.execute(
"CREATE TABLE public.PyMEOS_demo"
"(MMSI integer, trajectory public.tgeogpoint, SOG public.tfloat);"
)
connection.commit()
Let’s insert now the rows of the DataFrame into the MobilityDB table. First, we’ll create a SQL INSERT query with all the values, and then we will execute it in MobilityDB.
query = "INSERT INTO public.PyMEOS_demo(MMSI, trajectory, SOG) VALUES"
for mmsi, row in trajectories.iterrows():
query += f"\n({mmsi}, '{row.trajectory}', '{row.sog}'),"
query = query[:-1] + ";"
%%time
cursor.execute(query)
connection.commit()
CPU times: user 1.68 s, sys: 626 ms, total: 2.31 s
Wall time: 54.7 s
Let’s check how many rows we just added
cursor.execute("SELECT COUNT(*) FROM public.PyMEOS_demo;")
cursor.fetchone()[0]
6064
Now, we will read one of the records that we just wrote
cursor.execute("SELECT * FROM public.PyMEOS_demo WHERE MMSI = 97000050;")
mmsi, trajectory, sog = cursor.fetchone()
fig, axes = plt.subplots(1, 2, figsize=(20, 10))
trajectory.plot(axes=axes[0])
sog.plot(axes=axes[1])
plt.suptitle(f"Ship {mmsi}")
plt.show()
Finally, let’s close the connection and wrap everything up
connection.commit()
cursor.close()
pymeos_finalize()