AIS Example#

This example uses AIS data from a CSV file containing a full day of observations provided by the Danish Maritime Authority to show how to read, manage, and write information using PyMEOS. It is divided in 3 sections, each corresponding to one MEOS example:

This example uses the plotting and database capabilities of PyMEOS. To get the necessary dependencies for this demo, you can run the following command:

pip install pymeos[dbp,plot] pandas tqdm contextily

If you don’t have MobilityDB set up, you can use the Docker image:

docker run -d -p 5432:5432 --name pymeos-demo-db mobilitydb/mobilitydb:16-3.4-1.1-beta1

Note that due to its size, the data file used is not provided. It can be downloaded using the following URL:

https://web.ais.dk/aisdata/aisdk-2023-08-01.zip

Store the file in the data directory. There is no need to decompress it, since we will use pandas to read the file and it supports reading compressed files.

import matplotlib.pyplot as plt
import pandas as pd
from pymeos import *
from tqdm.notebook import tqdm
import contextily as cx

tqdm.pandas()

pymeos_initialize()

Reading from File (MEOS Example)#

In this section, AIS data is read from a CSV file. We then use the read information to recreate the trajectories of the ships.

First, let’s read the CSV file using pandas. We will read only the columns we’re interested in, and remove the observations that are either erroneous or not in Denmark.

%%time
ais = pd.read_csv(
    "./data/aisdk-2023-08-01.zip",
    usecols=["# Timestamp", "MMSI", "Latitude", "Longitude", "SOG"],
)
ais.columns = ["t", "mmsi", "lat", "lon", "sog"]
ais = ais[ais["t"] != 0]
ais["t"] = pd.to_datetime(ais["t"], format='%d/%m/%Y %H:%M:%S')
ais = ais[ais["mmsi"] != 0]
ais = ais.drop_duplicates(["t", "mmsi"])
ais = ais[(ais["lat"] >= 40.18) & (ais["lat"] <= 84.17)]
ais = ais[(ais["lon"] >= -16.1) & (ais["lon"] <= 32.88)]
ais = ais[(ais["sog"] >= 0) & (ais["sog"] <= 1022)]
ais.dropna()
ais.head()
CPU times: user 27.5 s, sys: 954 ms, total: 28.5 s
Wall time: 28.5 s
t mmsi lat lon sog
0 2023-08-01 219020187 56.795448 8.864770 0.0
1 2023-08-01 219000873 56.990862 10.304587 0.0
2 2023-08-01 219008746 56.609180 10.300487 0.0
4 2023-08-01 219005496 57.754300 10.151212 0.2
5 2023-08-01 219014072 54.918928 9.597852 0.0

Now, we will create the PyMEOS object representing the position and the SOG.

ais["instant"] = ais.progress_apply(
    lambda row: TGeogPointInst(point=(row["lon"], row["lat"]), timestamp=row["t"]),
    axis=1,
)
ais["sog"] = ais.progress_apply(
    lambda row: TFloatInst(value=row["sog"], timestamp=row["t"]), axis=1
)
ais.drop(["lat", "lon"], axis=1, inplace=True)
ais.head()
t mmsi sog instant
0 2023-08-01 219020187 0@2023-08-01 00:00:00+05 POINT(8.86477 56.795448)@2023-08-01 00:00:00+05
1 2023-08-01 219000873 0@2023-08-01 00:00:00+05 POINT(10.304587 56.990862)@2023-08-01 00:00:00+05
2 2023-08-01 219008746 0@2023-08-01 00:00:00+05 POINT(10.300487 56.60918)@2023-08-01 00:00:00+05
4 2023-08-01 219005496 0.2@2023-08-01 00:00:00+05 POINT(10.151212 57.7543)@2023-08-01 00:00:00+05
5 2023-08-01 219014072 0@2023-08-01 00:00:00+05 POINT(9.597852 54.918928)@2023-08-01 00:00:00+05

Assembling Trips (MEOS Example)#

Now, we will create the trajectory (TGeogPointSeq) and the SOG evolution (TFloatSeq) for every ship (identified by the mmsi) using the instants we have created.

%%time
trajectories = (
    ais.groupby("mmsi")
    .aggregate(
        {
            "instant": lambda x: TGeogPointSeq.from_instants(x, upper_inc=True),
            "sog": lambda x: TFloatSeq.from_instants(x, upper_inc=True),
        }
    )
    .rename({"instant": "trajectory"}, axis=1)
)
trajectories["distance"] = trajectories["trajectory"].apply(lambda t: t.length())
trajectories.head()
CPU times: user 34.7 s, sys: 444 ms, total: 35.1 s
Wall time: 35.2 s
trajectory sog distance
mmsi
148 [POINT(12.606613 55.684455)@2023-08-01 00:01:3... [0.8@2023-08-01 00:01:34+05, 0@2023-08-01 00:0... 31191.067480
5322 [POINT(12.606042 55.684293)@2023-08-01 00:00:0... [0.1@2023-08-01 00:00:00+05, 0.2@2023-08-01 00... 39797.594133
100046 [POINT(4.164757 58.76664)@2023-08-01 19:20:52+... [148@2023-08-01 19:20:52+05, 148@2023-08-01 19... 46476.358036
2190045 [POINT(8.423332 55.47179)@2023-08-01 00:00:07+... [0@2023-08-01 00:00:07+05, 0@2023-08-01 02:51:... 2436.943448
9132759 [POINT(8.310862 56.552482)@2023-08-01 17:32:00... [0@2023-08-01 17:32:00+05, 0@2023-08-01 17:44:... 368.171260

Here we can see that PyMEOS has been able to reduce the number of points stored (and thus memory used) without losing any information.

comparison = pd.concat(
    [
        ais.groupby("mmsi")["t"].count().rename("original #points"),
        trajectories["trajectory"]
        .apply(lambda t: t.num_instants())
        .rename("PyMEOS #points"),
    ],
    axis=1,
)
comparison["Points kept (%)"] = (
    comparison["PyMEOS #points"] / comparison["original #points"] * 100
)
comparison
original #points PyMEOS #points Points kept (%)
mmsi
148 3882 3625 93.379701
5322 8224 8008 97.373541
100046 3 3 100.000000
2190045 8824 4732 53.626473
9132759 159 103 64.779874
... ... ... ...
970010921 638 378 59.247649
970021666 2 2 100.000000
970021790 3 2 66.666667
970100094 8 3 37.500000
972609012 6 3 50.000000

6064 rows × 3 columns

We can visualize the trajectories and the SOG evolutions by plotting them. We will plot only 100 of the trajectories.

fig, axes = plt.subplots(1, 2, figsize=(15, 8))
for _, ship in trajectories[100:200].iterrows():
    ship["trajectory"].plot(axes=axes[0])
    ship["sog"].plot(axes=axes[1])
cx.add_basemap(axes[0], crs=4326, source=cx.providers.CartoDB.Voyager)
plt.show()
../../_images/0fb8ab28cd74ca7075442e41a84af1fe6c6099025fab3aa9e7e28310e24891b5.png

Storing in MobilityDB (MEOS Example)#

Now we will store the temporal objects we have created in MobilityDB. To connect to MobilityDB (PostgreSQL), psycopg is used. However, asyncpg and psycopg2 are also supported.

First, set up the connection parameters. Change any of the following values according to your configuration. If you are using the docker image, you don’t need to change anything (except maybe the port).

from pymeos.db.psycopg import MobilityDB

host = "localhost"
port = 5432
db = "mobilitydb"
user = "docker"
password = "docker"

connection = MobilityDB.connect(
    host=host, port=port, dbname=db, user=user, password=password
)
cursor = connection.cursor()

Now, we will create the table where we will write our objects.

cursor.execute("DROP TABLE IF EXISTS public.PyMEOS_demo;")
cursor.execute(
    "CREATE TABLE public.PyMEOS_demo"
    "(MMSI integer, trajectory public.tgeogpoint, SOG public.tfloat);"
)
connection.commit()

Let’s insert now the rows of the DataFrame into the MobilityDB table. First, we’ll create a SQL INSERT query with all the values, and then we will execute it in MobilityDB.

query = "INSERT INTO public.PyMEOS_demo(MMSI, trajectory, SOG) VALUES"
for mmsi, row in trajectories.iterrows():
    query += f"\n({mmsi}, '{row.trajectory}', '{row.sog}'),"
query = query[:-1] + ";"
%%time
cursor.execute(query)
connection.commit()
CPU times: user 1.68 s, sys: 626 ms, total: 2.31 s
Wall time: 54.7 s

Let’s check how many rows we just added

cursor.execute("SELECT COUNT(*) FROM public.PyMEOS_demo;")
cursor.fetchone()[0]
6064

Now, we will read one of the records that we just wrote

cursor.execute("SELECT * FROM public.PyMEOS_demo WHERE MMSI = 97000050;")
mmsi, trajectory, sog = cursor.fetchone()
fig, axes = plt.subplots(1, 2, figsize=(20, 10))
trajectory.plot(axes=axes[0])
sog.plot(axes=axes[1])
plt.suptitle(f"Ship {mmsi}")
plt.show()
../../_images/cb8c2e6be736001a02efce8419818d55a38644ddfa6aa9236c75854144bd3b54.png

Finally, let’s close the connection and wrap everything up

connection.commit()
cursor.close()
pymeos_finalize()