BerlinMOD Example#

This example uses synthetic trip data in Brussels generated by the MobilityDB-BerlinMOD generator to show some data manipulations available in PyMEOS. It is divided in 5 sections, each corresponding to one MEOS example:

This example uses the plotting capabilities of PyMEOS. To get the necessary dependencies for this example, you can run the following command:

pip install pymeos[plot, pandas] contextily distinctipy
from datetime import timedelta
from functools import partial

import contextily as cx
import distinctipy
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
import shapely.geometry as shp
from pymeos import *
from pymeos.plotters import (
    TemporalPointSequenceSetPlotter,
    TemporalPointSequencePlotter,
)
from shapely import wkb

pymeos_initialize()

In every following section we use the data from the BerlinMOD Generator, so we will go ahead and read it.

trips = pd.read_csv(
    "./data/trips.csv",
    converters={
        "trip": TGeomPointSeq,
    },
)
trips.head()
vehicle day seq trip
0 1 2020-06-01 1 [POINT(496253.84080706234 6601691.244869352)@2...
1 1 2020-06-01 2 [POINT(481241.17182724166 6588272.126511315)@2...
2 1 2020-06-02 1 [POINT(496253.84080706234 6601691.244869352)@2...
3 1 2020-06-02 2 [POINT(481241.17182724166 6588272.126511315)@2...
4 1 2020-06-03 1 [POINT(496253.84080706234 6601691.244869352)@2...

Disassembling Trips (MEOS Example)#

In this section, we disassemble the trips into individual observations and write them in a CSV file ordered by timestamp.

First, we’ll obtain every timestamp and value from the trajectory object (TGeogPointSeq) using the timestamps and values functions respectively

records = trips.drop("trip", axis=1).copy()
records["geom"] = trips["trip"].apply(lambda tr: tr.values())
records["t"] = trips["trip"].apply(lambda tr: tr.timestamps())
records.head()
vehicle day seq geom t
0 1 2020-06-01 1 [POINT (496253.84080706234 6601691.244869352),... [2020-06-01 11:48:50.886000+02:00, 2020-06-01 ...
1 1 2020-06-01 2 [POINT (481241.17182724166 6588272.126511315),... [2020-06-01 18:00:27.208000+02:00, 2020-06-01 ...
2 1 2020-06-02 1 [POINT (496253.84080706234 6601691.244869352),... [2020-06-02 11:31:07.888000+02:00, 2020-06-02 ...
3 1 2020-06-02 2 [POINT (481241.17182724166 6588272.126511315),... [2020-06-02 18:47:26.738000+02:00, 2020-06-02 ...
4 1 2020-06-03 1 [POINT (496253.84080706234 6601691.244869352),... [2020-06-03 11:30:29.334000+02:00, 2020-06-03 ...

Now we “explode” the t and geom columns to get a row per record. Then, we sort the records by timestamp.

records = records.explode(["geom", "t"], ignore_index=True)
records = records.sort_values("t")
records.head()
vehicle day seq geom t
110254 8 2020-06-01 1 POINT (493442.7231019071 6595901.46838472) 2020-06-01 10:22:51.025000+02:00
110255 8 2020-06-01 1 POINT (493443.8028284878 6595906.350412016) 2020-06-01 10:22:52.525000+02:00
110256 8 2020-06-01 1 POINT (493444.404026218 6595909.068751807) 2020-06-01 10:22:54.574117+02:00
110257 8 2020-06-01 1 POINT (493441.7878750556 6595913.329708636) 2020-06-01 10:22:55.639765+02:00
110258 8 2020-06-01 1 POINT (493428.7071192439 6595934.634492779) 2020-06-01 10:23:00.139765+02:00

Finally, we will write it in a CSV file, encoding first the geometries as HexWKB

records["geom"] = records["geom"].apply(partial(wkb.dumps, hex=True))
records.head()
vehicle day seq geom t
110254 8 2020-06-01 1 01010000008BD374E40A1E1E41E803FA5D4F295941 2020-06-01 10:22:51.025000+02:00
110255 8 2020-06-01 1 0101000000CEAB18360F1E1E4185266D9650295941 2020-06-01 10:22:52.525000+02:00
110256 8 2020-06-01 1 0101000000840CB99D111E1E41FB6D664451295941 2020-06-01 10:22:54.574117+02:00
110257 8 2020-06-01 1 0101000000F5B7C826071E1E4140F2195552295941 2020-06-01 10:22:55.639765+02:00
110258 8 2020-06-01 1 01010000002C1117D4D21D1E419A879BA857295941 2020-06-01 10:23:00.139765+02:00
records.to_csv("./data/trip_instants.csv", index=False)

Clipping Trips to Geometries (MEOS Example)#

In this section, we compute the distance traversed by the trips in the 19 Brussels municipalities (communes in French)

First, we will read the files containing the information about the Brussels region and its municipalities and take a look at their shape

brussels = pd.read_csv(
    "./data/brussels_region.csv", converters={"geom": partial(wkb.loads, hex=True)}
)
brussels = gpd.GeoDataFrame(brussels, geometry="geom")
brussels_geom = brussels["geom"][0]
brussels
name geom
0 Région de Bruxelles-Capitale - Brussels Hoofds... POLYGON ((472413.848 6589441.686, 472482.354 6...
communes = pd.read_csv(
    "./data/communes.csv", converters={"geom": partial(wkb.loads, hex=True)}
)
communes["color"] = [
    "#00ff00",
    "#ff00ff",
    "#007fff",
    "#ff7f00",
    "#7fbf7f",
    "#47139e",
    "#ad0414",
    "#d981f9",
    "#138014",
    "#f4fc2c",
    "#00ffff",
    "#00ff7f",
    "#c04d7c",
    "#85eaf3",
    "#85d601",
    "#fca880",
    "#0000ff",
    "#019d92",
    "#907817",
]
communes = gpd.GeoDataFrame(communes, geometry="geom")
communes
id name population geom color
0 1 Anderlecht 118241 POLYGON ((476959.746 6593981.357, 476965.401 6... #00ff00
1 2 Auderghem - Oudergem 33313 POLYGON ((497620.844 6584185.646, 497856.652 6... #ff00ff
2 3 Berchem-Sainte-Agathe - Sint-Agatha-Berchem 24701 POLYGON ((477788.720 6599192.996, 477782.631 6... #007fff
3 4 Etterbeek 176545 POLYGON ((489804.602 6593678.230, 489685.067 6... #ff7f00
4 5 Evere 47414 POLYGON ((492771.667 6597070.173, 492633.519 6... #7fbf7f
5 6 Forest - Vorst 40394 POLYGON ((479909.268 6585570.952, 479881.003 6... #47139e
6 7 Ganshoren 55746 POLYGON ((478477.209 6600450.089, 478499.863 6... #ad0414
7 8 Ixelles - Elsene 24596 MULTIPOLYGON (((487496.626 6592340.790, 487544... #d981f9
8 9 Jette 86244 POLYGON ((481479.217 6597749.534, 481526.350 6... #138014
9 10 Koekelberg 51933 POLYGON ((480236.024 6597825.520, 480178.160 6... #f4fc2c
10 11 Molenbeek-Saint-Jean - Sint-Jans-Molenbeek 21609 POLYGON ((477542.960 6595837.350, 477533.621 6... #00ffff
11 12 Saint-Gilles - Sint-Gillis 96629 POLYGON ((483153.140 6593066.827, 483122.204 6... #00ff7f
12 13 Saint-Josse-ten-Noode - Sint-Joost-ten-Node 50471 POLYGON ((485033.738 6596581.156, 484882.700 6... #c04d7c
13 14 Schaerbeek - Schaarbeek 27115 POLYGON ((488501.707 6600305.815, 488394.028 6... #85eaf3
14 15 Uccle - Ukkel 133042 POLYGON ((486268.683 6588562.809, 486253.744 6... #85d601
15 16 Ville de Bruxelles - Stad Brussel 82307 POLYGON ((483153.140 6593066.827, 483186.758 6... #fca880
16 17 Watermael-Boitsfort - Watermaal-Bosvoorde 24871 POLYGON ((491240.935 6581131.551, 491340.822 6... #0000ff
17 18 Woluwe-Saint-Lambert - Sint-Lambrechts-Woluwe 55216 POLYGON ((496682.744 6594589.884, 496653.033 6... #019d92
18 19 Woluwe-Saint-Pierre - Sint-Pieters-Woluwe 41217 POLYGON ((496135.453 6589271.827, 496258.928 6... #907817
_, axes = plt.subplots(1, 2, figsize=(20, 10))
communes.plot(color=communes["color"], ax=axes[0])
communes.exterior.plot(ax=axes[0])
communes.plot(color=communes["color"], ax=axes[1], alpha=0.4)
communes.exterior.plot(ax=axes[1])
cx.add_basemap(axes[1])
_ = plt.suptitle("Brussels Municipalities", size=25, y=0.92)
../../_images/0fe3f3eaf0911b367d8f40b9896e0e474f1fdc564a9241d0b2fb94150afacd37.png

Now, let’s take a look at the BerlinMOD trajectories plotted over Brussels

_, axes = plt.subplots(1, 2, figsize=(20, 10))
communes.plot(color=communes["color"], ax=axes[0], alpha=0.4)

TemporalPointSequencePlotter.plot_sequences_xy(
    trips["trip"], axes=axes[0], show_markers=False, show_grid=False
)

communes.plot(color=communes["color"], ax=axes[1], alpha=0.1)
communes.exterior.plot(ax=axes[1])
TemporalPointSequencePlotter.plot_sequences_xy(
    trips["trip"], axes=axes[1], color="black", show_markers=False, show_grid=False
)
cx.add_basemap(axes[1])

_ = plt.suptitle("BerlinMod Trajectories", size=25, y=0.92)
../../_images/b5c153bfea3f97756e7a7907cb186794e74d3a748f6debcddb99cc76632830d7.png

First, let’s split the trajectories into sections inside and outside of Brussels

brussels_inout_trajectories = trips.copy()
brussels_inout_trajectories["inside"] = brussels_inout_trajectories["trip"].apply(
    lambda tr: tr.at(brussels_geom)
)
brussels_inout_trajectories["outside"] = brussels_inout_trajectories["trip"].apply(
    lambda tr: tr.minus(brussels_geom)
)
brussels_inout_trajectories.head()
vehicle day seq trip inside outside
0 1 2020-06-01 1 [POINT(496253.84080706234 6601691.244869352)@2... {[POINT(493452.04564533016 6596592.507628961)@... {[POINT(496253.84080706234 6601691.244869352)@...
1 1 2020-06-01 2 [POINT(481241.17182724166 6588272.126511315)@2... {[POINT(481241.17182724166 6588272.126511315)@... {(POINT(493488.118266309 6596572.0201423885)@2...
2 1 2020-06-02 1 [POINT(496253.84080706234 6601691.244869352)@2... {[POINT(493452.04567213746 6596592.507642428)@... {[POINT(496253.84080706234 6601691.244869352)@...
3 1 2020-06-02 2 [POINT(481241.17182724166 6588272.126511315)@2... {[POINT(481241.17182724166 6588272.126511315)@... {(POINT(493488.118266309 6596572.0201423885)@2...
4 1 2020-06-03 1 [POINT(496253.84080706234 6601691.244869352)@2... {[POINT(493452.04567213746 6596592.507642428)@... {[POINT(496253.84080706234 6601691.244869352)@...

Let’s see this split in a plot

_, axes = plt.subplots(1, 2, figsize=(20, 10))

axes[0].set_title("Inside Brussels")
brussels.plot(ax=axes[0], alpha=0.4)
for traj in brussels_inout_trajectories["inside"].dropna():
    if isinstance(traj, TGeomPointSeq):
        TemporalPointSequencePlotter.plot_xy(
            traj, axes=axes[0], show_markers=False, show_grid=False
        )
    else:
        TemporalPointSequenceSetPlotter.plot_xy(
            traj, axes=axes[0], show_markers=False, show_grid=False
        )

axes[1].set_title("Outside Brussels")
brussels.plot(ax=axes[1], alpha=0.4)
for traj in brussels_inout_trajectories["outside"].dropna():
    if isinstance(traj, TGeomPointSeq):
        TemporalPointSequencePlotter.plot_xy(
            traj, axes=axes[1], show_markers=False, show_grid=False
        )
    else:
        TemporalPointSequenceSetPlotter.plot_xy(
            traj, axes=axes[1], show_markers=False, show_grid=False
        )

axes[0].set_xlim(
    min(axes[0].get_xlim()[0], axes[1].get_xlim()[0]),
    max(axes[0].get_xlim()[1], axes[1].get_xlim()[1]),
)
axes[0].set_ylim(
    min(axes[0].get_ylim()[0], axes[1].get_ylim()[0]),
    max(axes[0].get_ylim()[1], axes[1].get_ylim()[1]),
)

_ = plt.suptitle("BerlinMod Trajectories Inside/Outside Brussels", size=25)
../../_images/0a670a27b5e081527f2900d3fc3358a41cd0977978413432fda57622df7680f2.png

Now we will compute the trajectory and distance each trip traverses in each municipality

commune_trajectories = trips.copy()
for _, row in communes.iterrows():
    commune_trajectories[f'{row["name"]}-trajectory'] = commune_trajectories[
        "trip"
    ].apply(lambda tr: tr.at(row["geom"]))
    commune_trajectories[f'{row["name"]}-distance'] = commune_trajectories[
        f'{row["name"]}-trajectory'
    ].apply(lambda tr: tr.length() if tr else 0)
commune_trajectories.head()
vehicle day seq trip Anderlecht-trajectory Anderlecht-distance Auderghem - Oudergem-trajectory Auderghem - Oudergem-distance Berchem-Sainte-Agathe - Sint-Agatha-Berchem-trajectory Berchem-Sainte-Agathe - Sint-Agatha-Berchem-distance ... Uccle - Ukkel-trajectory Uccle - Ukkel-distance Ville de Bruxelles - Stad Brussel-trajectory Ville de Bruxelles - Stad Brussel-distance Watermael-Boitsfort - Watermaal-Bosvoorde-trajectory Watermael-Boitsfort - Watermaal-Bosvoorde-distance Woluwe-Saint-Lambert - Sint-Lambrechts-Woluwe-trajectory Woluwe-Saint-Lambert - Sint-Lambrechts-Woluwe-distance Woluwe-Saint-Pierre - Sint-Pieters-Woluwe-trajectory Woluwe-Saint-Pierre - Sint-Pieters-Woluwe-distance
0 1 2020-06-01 1 [POINT(496253.84080706234 6601691.244869352)@2... None 0.0 None 0.0 None 0 ... None 0.0 {[POINT(489089.14461377583 6594225.227862967)@... 6238.532446 None 0.0 {[POINT(493452.04564533016 6596592.507628961)@... 2743.319101 None 0.0
1 1 2020-06-01 2 [POINT(481241.17182724166 6588272.126511315)@2... None 0.0 None 0.0 None 0 ... None 0.0 {[POINT(484373.02614611824 6591974.073148593)@... 5890.625638 None 0.0 {[POINT(491131.62956994347 6595158.952745219)@... 2753.083663 None 0.0
2 1 2020-06-02 1 [POINT(496253.84080706234 6601691.244869352)@2... None 0.0 None 0.0 None 0 ... None 0.0 {[POINT(489089.14461377583 6594225.227862967)@... 6238.532450 None 0.0 {[POINT(493452.04567213746 6596592.507642428)@... 2743.319131 None 0.0
3 1 2020-06-02 2 [POINT(481241.17182724166 6588272.126511315)@2... None 0.0 None 0.0 None 0 ... None 0.0 {[POINT(484373.02614611824 6591974.073148593)@... 5890.625638 None 0.0 {[POINT(491131.62956994347 6595158.952745219)@... 2753.083663 None 0.0
4 1 2020-06-03 1 [POINT(496253.84080706234 6601691.244869352)@2... None 0.0 None 0.0 None 0 ... None 0.0 {[POINT(489089.14461377583 6594225.227862967)@... 6238.532450 None 0.0 {[POINT(493452.04567213746 6596592.507642428)@... 2743.319131 None 0.0

5 rows × 42 columns

distances = commune_trajectories[
    ["vehicle", "day", "seq", *(f"{commune}-distance" for commune in communes["name"])]
]
distances.head()
vehicle day seq Anderlecht-distance Auderghem - Oudergem-distance Berchem-Sainte-Agathe - Sint-Agatha-Berchem-distance Etterbeek-distance Evere-distance Forest - Vorst-distance Ganshoren-distance ... Koekelberg-distance Molenbeek-Saint-Jean - Sint-Jans-Molenbeek-distance Saint-Gilles - Sint-Gillis-distance Saint-Josse-ten-Noode - Sint-Joost-ten-Node-distance Schaerbeek - Schaarbeek-distance Uccle - Ukkel-distance Ville de Bruxelles - Stad Brussel-distance Watermael-Boitsfort - Watermaal-Bosvoorde-distance Woluwe-Saint-Lambert - Sint-Lambrechts-Woluwe-distance Woluwe-Saint-Pierre - Sint-Pieters-Woluwe-distance
0 1 2020-06-01 1 0.0 0.0 0 0.0 0.0 3208.550776 0.0 ... 0.0 0.0 2082.592245 0.0 2241.087203 0.0 6238.532446 0.0 2743.319101 0.0
1 1 2020-06-01 2 0.0 0.0 0 0.0 0.0 3088.180926 0.0 ... 0.0 0.0 2438.565180 0.0 2254.134586 0.0 5890.625638 0.0 2753.083663 0.0
2 1 2020-06-02 1 0.0 0.0 0 0.0 0.0 3208.550776 0.0 ... 0.0 0.0 2082.592241 0.0 2241.087203 0.0 6238.532450 0.0 2743.319131 0.0
3 1 2020-06-02 2 0.0 0.0 0 0.0 0.0 3088.180930 0.0 ... 0.0 0.0 2438.565176 0.0 2254.134586 0.0 5890.625638 0.0 2753.083663 0.0
4 1 2020-06-03 1 0.0 0.0 0 0.0 0.0 3208.550776 0.0 ... 0.0 0.0 2082.592241 0.0 2241.087203 0.0 6238.532450 0.0 2743.319131 0.0

5 rows × 22 columns

Finally, let’s see the visualization of the trajectories split by commune

_, axes = plt.subplots(4, 5, figsize=(50, 40))

axes[0][0].set_title("Brussels Region")
communes.plot(color=communes["color"], ax=axes[0][0], alpha=0.4)
TemporalPointSequencePlotter.plot_sequences_xy(
    trips["trip"], axes=axes[0][0], show_markers=False, show_grid=False
)

for i, commune in communes.iterrows():
    ax = axes[(i + 1) // 5][(i + 1) % 5]
    ax.set_title(commune["name"])
    # Plot commune
    if isinstance(commune["geom"], shp.Polygon):
        ax.fill(*commune["geom"].exterior.xy, color=commune["color"], alpha=0.4)
    else:
        for g in commune["geom"].geoms:
            ax.fill(*g.exterior.xy, color=commune["color"], alpha=0.4)
    # Plot trajectories
    trajs = commune_trajectories[f'{commune["name"]}-trajectory'].dropna()
    for traj in trajs:
        TemporalPointSequenceSetPlotter.plot_xy(
            traj, axes=ax, show_markers=False, show_grid=False
        )

_ = plt.suptitle("BerlinMod Trajectories by Commune", size=35, y=0.92)
../../_images/008affb65a1a2ce918e5384079f9983dbf0210c35e49e64693dde47bca6a446a.png

Simplifying Trips (MEOS Example)#

In this section, we will simplify the trajectories with two different methods, namely Douglas-Peucker (DP) and Syncronized Euclidean Distance (SED, aka Top-Down Time Ratio simplification), and compare the simplifications.

First, we’ll choose a distance tolerance to use in the simplification

tolerance = 20

Now, we’ll compute both simplifications with the simplify_douglas_peucker method (using the synchronized parameter to chose between DP or SED)

simplifications = trips.copy()
simplifications["dp"] = simplifications["trip"].apply(
    lambda tr: tr.simplify_douglas_peucker(tolerance)
)
simplifications["sed"] = simplifications["trip"].apply(
    lambda tr: tr.simplify_douglas_peucker(tolerance, synchronized=True)
)
simplifications.head()
vehicle day seq trip dp sed
0 1 2020-06-01 1 [POINT(496253.84080706234 6601691.244869352)@2... [POINT(496253.84080706234 6601691.244869352)@2... [POINT(496253.84080706234 6601691.244869352)@2...
1 1 2020-06-01 2 [POINT(481241.17182724166 6588272.126511315)@2... [POINT(481241.17182724166 6588272.126511315)@2... [POINT(481241.17182724166 6588272.126511315)@2...
2 1 2020-06-02 1 [POINT(496253.84080706234 6601691.244869352)@2... [POINT(496253.84080706234 6601691.244869352)@2... [POINT(496253.84080706234 6601691.244869352)@2...
3 1 2020-06-02 2 [POINT(481241.17182724166 6588272.126511315)@2... [POINT(481241.17182724166 6588272.126511315)@2... [POINT(481241.17182724166 6588272.126511315)@2...
4 1 2020-06-03 1 [POINT(496253.84080706234 6601691.244869352)@2... [POINT(496253.84080706234 6601691.244869352)@2... [POINT(496253.84080706234 6601691.244869352)@2...
_, axes = plt.subplots(1, 3, figsize=(30, 10))
for _, trip in simplifications.iterrows():
    trip["trip"].plot(axes=axes[0])
    trip["dp"].plot(axes=axes[1])
    trip["sed"].plot(axes=axes[2])
axes[0].set_title("Original trajectories")
axes[1].set_title("Simplified trajectories (DP)")
axes[2].set_title("Simplified trajectories (SED)")
plt.show()
../../_images/25ee362f79807aef1acd93b926f8fec1d48b0c1c5d37ecfc8ecf0612eaab8122.png

Let’s focus on only one trajectory to see the simplification more clearly

trip_example = simplifications.iloc[20]
fig, ax = plt.subplots(figsize=(16, 8))
trip_example["dp"].plot(label="Douglas-Peucker")
trip_example["sed"].plot(label="Syncronized Euclidean Distance")
trip_example["trip"].plot(label="Original trajectory")
plt.legend()
plt.savefig("./output.svg")
../../_images/c1c1d2e49ae3d2e8dc3b2b92065a813b84e3b5566073eeacbe1955483aeea1ee.png

Now, let’s see the difference in points in each trajectory with their simplifications

distances = simplifications[["vehicle", "day", "seq"]].copy()
distances["trip"] = simplifications["trip"].apply(lambda tr: tr.num_instants())
distances["dp"] = simplifications["dp"].apply(lambda tr: tr.num_instants())
distances["sed"] = simplifications["sed"].apply(lambda tr: tr.num_instants())
distances["dp-delta"] = distances["trip"] - distances["dp"]
distances["sed-delta"] = distances["trip"] - distances["sed"]
distances["dp-%"] = distances["dp"] * 100 / distances["trip"]
distances["sed-%"] = distances["sed"] * 100 / distances["trip"]
distances.head()
vehicle day seq trip dp sed dp-delta sed-delta dp-% sed-%
0 1 2020-06-01 1 2103 48 212 2055 1891 2.282454 10.080837
1 1 2020-06-01 2 2326 65 232 2261 2094 2.794497 9.974205
2 1 2020-06-02 1 2110 48 207 2062 1903 2.274882 9.810427
3 1 2020-06-02 2 2308 65 235 2243 2073 2.816291 10.181976
4 1 2020-06-03 1 2071 48 190 2023 1881 2.317721 9.174312

We can see in the following table that the DP-simplified trips contain only a 3% of the original points in average, and that the SED-simplified ones around a 10%.

distances[["dp-delta", "dp-%", "sed-delta", "sed-%"]].describe()
dp-delta dp-% sed-delta sed-%
count 96.000000 96.000000 96.000000 96.000000
mean 1425.489583 2.686471 1315.750000 9.568915
std 718.930726 1.068704 649.342437 1.514393
min 98.000000 1.731844 91.000000 7.011070
25% 923.500000 1.999928 875.250000 8.475305
50% 1616.000000 2.367245 1471.500000 9.281897
75% 2056.750000 2.859347 1894.000000 10.622931
max 3333.000000 7.547170 2990.000000 14.150943

Temporal Aggregation of Trips (MEOS Example)#

In this section we will aggregate the trips to calculate the extent (the spatiotemporal bounding box) and the temporal count (the evolution on time of the number of vehicles travelling). PyMEOS offers two APIs for aggregating: bulk and streaming.

Bulk aggregation#

If you have all the elements to aggregate in memory, you can use the bulk aggregation to perform it in a single call. To do so, use the aggregate methods of the aggregator classes.

Now, let’s compute the extent and the temporal count using the TemporalPointExtentAggregator and TemporalPeriodCountAggregator respectively

extent_bulk = TemporalPointExtentAggregator.aggregate(trips["trip"])
tcount_bulk = TemporalPeriodCountAggregator.aggregate(trips["trip"])
_, axes = plt.subplots(1, 2, figsize=(20, 10))

extent_bulk.plot_xy(axes=axes[0], label="Spatial Extent")
for _, trip in trips.iterrows():
    trip["trip"].plot(axes=axes[0], show_markers=False)
axes[0].set_title("Spatial Extent")

tcount_bulk.plot(axes=axes[1], label="Temporal Count", color="g")
extent_bulk.to_tstzspan().plot(axes=axes[1], label="Temporal Extent")
axes[1].set_title("Temporal Count + Temporal Extent")

plt.legend()
plt.show()
../../_images/ed4d2b65070b8d8a86bc1998316ecdcc60acc9b7aff8e8f4725f08740d2ec67a.png

Let’s take the extent of only a few hours to see clearly the aggregations

day_trips = (
    trips["trip"]
    .apply(lambda tr: tr.at(TsTzSpan("[2020-06-01 08:00:00, 2020-06-01 12:00:00]")))
    .dropna()
)
day_extent = TemporalPointExtentAggregator.aggregate(day_trips)
day_tcount = TemporalPeriodCountAggregator.aggregate(day_trips)
_, axes = plt.subplots(1, 2, figsize=(20, 10))

day_extent.plot_xy(axes=axes[0], label="Spatial Extent")
for trip in day_trips:
    trip.plot(axes=axes[0], show_markers=False)
axes[0].set_title("Spatial Extent")

day_tcount.plot(axes=axes[1], label="Temporal Count", color="g")
day_extent.to_tstzspan().plot(axes=axes[1], label="Temporal Extent")
axes[1].set_title("Temporal Count + Temporal Extent")

plt.legend()
plt.show()
../../_images/8e46552edb9d1d36c5385dd600b877224e54e33093c998d7c12e0b764365187b.png

Streaming Aggregation#

There will be cases when you don’t have all the elements in memory at once, for example, maybe there are too many and memory is limited, or you are receiving them on real time, and you want to aggregate as they arrive. In those cases, you can use the streaming aggregation to perform the aggregation in steps. To do so, use the start_aggregation method of the aggregator class, and then use the object returned to perform the aggregation using the add and aggregation methods.

extent_aggregator = TemporalPointExtentAggregator.start_aggregation()
count_aggregator = TemporalPeriodCountAggregator.start_aggregation()
for trip in trips["trip"]:
    extent_aggregator.add(trip)
    count_aggregator.add(trip)
extent_stream = extent_aggregator.aggregation()
count_stream = count_aggregator.aggregation()
_, axes = plt.subplots(1, 2, figsize=(20, 10))

extent_stream.plot_xy(axes=axes[0], label="Spatial Extent")
for _, trip in trips.iterrows():
    trip["trip"].plot(axes=axes[0], show_markers=False)
axes[0].set_title("Spatial Extent")

count_stream.plot(axes=axes[1], label="Temporal Count", color="g")
extent_stream.to_tstzspan().plot(axes=axes[1], label="Temporal Extent")
axes[1].set_title("Temporal Count + Temporal Extent")

plt.legend()
plt.show()
../../_images/ed4d2b65070b8d8a86bc1998316ecdcc60acc9b7aff8e8f4725f08740d2ec67a.png

Note that you can keep aggregating after getting the result.

extent_aggregator = TemporalPointExtentAggregator.start_aggregation()
_, ax = plt.subplots(1, 1, figsize=(20, 10))
color = "black"
colors = []
for i, trip in enumerate(trips["trip"], 1):
    extent_aggregator.add(trip)
    if i % 10 == 0:
        plot = extent_aggregator.aggregation().plot_xy(label=f"{i} trips")
        color = plot[0][0].get_color()
        colors += [color] * 10
extent_stream = extent_aggregator.aggregation()
extent_stream.plot_xy(label="Final extent")

for trip, color in zip(trips["trip"], colors):
    trip.plot(show_markers=False, color=color)

plt.legend()
plt.show()
../../_images/cd0b89c6977d894baa6844d7036feed07f880eb70e9c7362ce964d4bb3d97126.png

Tiling Trips (MEOS Example)#

In this section, the trajectories and the speed will be split into tiles, for which we will compute some aggregates.

First, let’s take a look at the tiled trajectories and speeds

tiled_trips = trips["trip"].apply(lambda tr: tr.tile(5e3, remove_empty=True))
speeds = trips["trip"].apply(lambda tr: tr.speed())
tiled_speeds = speeds.apply(
    lambda sp: sp.value_time_split(10, "1 day", 0, "2020-06-01")
)
_, axes = plt.subplots(1, 2, figsize=(20, 10))
for tiled_trip in tiled_trips:
    for traj in tiled_trip:
        traj.plot(axes=axes[0], show_markers=False)
axes[0].set_title("Tiled Trajectories")

for tiled_speed in tiled_speeds:
    for traj in tiled_speed:
        traj.plot(axes=axes[1], show_markers=False)
axes[1].set_title("Tiled Speeds")

plt.show()
../../_images/d38beaa270645871138843281ea26a2bf8ebe7c922c8e31193dbfa63e414ab2c.png

To see the tiles more clearly, let’s tile the extent of all the trips and speeds instead and compute the intersection of each trip with each tile (that is exactly what the tile function of TPointSeq is doing under the hood). This way, we can color them manually to see them clearly.

extent_tiles = extent_bulk.tile(5e3)
speed_extent = TemporalNumberExtentAggregator.aggregate(speeds)
speed_tiles = speed_extent.tile(10, "1 day", 0.0, "2020-06-01")
_, axes = plt.subplots(1, 2, figsize=(20, 10))

for tile, color in zip(extent_tiles, distinctipy.get_colors(len(extent_tiles))):
    tile.plot_xy(axes=axes[0], color="black", draw_filling=False)
    for trip in trips["trip"]:
        trip_in_tile = trip.at(tile)
        if trip_in_tile is not None:
            trip_in_tile.plot(axes=axes[0], color=color, show_markers=False)
axes[0].set_title("Tiled Trajectories")

for tile, color in zip(speed_tiles, distinctipy.get_colors(len(speed_tiles))):
    tile.plot(axes=axes[1], color="black", draw_filling=False)
    for speed in speeds:
        speed_in_tile = speed.at(tile)
        if speed_in_tile is not None:
            speed_in_tile.plot(axes=axes[1], color=color, show_markers=False)
axes[1].set_title("Tiled Speeds")
plt.show()
../../_images/29faaf9eab2d06668d66d1edba66da3f05278d010350d7d4e507760fd6214b95.png

Finally, let’s aggregate the tiles to compute some metrics We will start by computing, for each spatial tile, the number of trajectories that enter it, and the total amount of time spent and the total distance traveled by them in the tile.

extent_tiles = extent_bulk.tile(5e3)
data = []
for tile in extent_tiles:
    intersections = trips["trip"].apply(lambda tr: tr.at(tile)).dropna()
    if len(intersections) == 0:
        continue
    count = len(intersections)
    duration = sum([i.duration() for i in intersections], timedelta())
    distance = sum([i.length() for i in intersections]) / 1000
    data.append([tile, count, duration, distance])

spatial_aggregates = pd.DataFrame(
    data, columns=["tile", "count", "duration", "distance"]
)
spatial_aggregates
tile count duration distance
0 SRID=3857;STBOX X((475000,6580000),(480000,658... 10 0 days 00:21:57.271928 12.715462
1 SRID=3857;STBOX X((480000,6580000),(485000,658... 8 0 days 00:37:50.260034 15.250435
2 SRID=3857;STBOX X((495000,6580000),(500000,658... 8 0 days 00:56:22.219006 19.807692
3 SRID=3857;STBOX X((475000,6585000),(480000,659... 13 0 days 01:42:26.592035 59.420951
4 SRID=3857;STBOX X((480000,6585000),(485000,659... 20 0 days 03:34:19.201837 73.582531
5 SRID=3857;STBOX X((485000,6585000),(490000,659... 12 0 days 02:07:10.468161 52.101740
6 SRID=3857;STBOX X((490000,6585000),(495000,659... 6 0 days 01:22:33.732748 36.422087
7 SRID=3857;STBOX X((495000,6585000),(500000,659... 8 0 days 00:47:50.956624 42.149632
8 SRID=3857;STBOX X((475000,6590000),(480000,659... 8 0 days 01:21:45.559201 54.962866
9 SRID=3857;STBOX X((480000,6590000),(485000,659... 10 0 days 01:39:18.880626 35.092058
10 SRID=3857;STBOX X((485000,6590000),(490000,659... 37 0 days 04:43:56.029545 135.629127
11 SRID=3857;STBOX X((490000,6590000),(495000,659... 21 0 days 01:08:30.026607 27.547398
12 SRID=3857;STBOX X((495000,6590000),(500000,659... 4 0 days 00:27:07.208503 20.924216
13 SRID=3857;STBOX X((470000,6595000),(475000,660... 9 0 days 00:16:51.436501 10.185846
14 SRID=3857;STBOX X((475000,6595000),(480000,660... 30 0 days 01:37:23.497930 38.060402
15 SRID=3857;STBOX X((480000,6595000),(485000,660... 40 0 days 05:02:47.976012 127.956860
16 SRID=3857;STBOX X((485000,6595000),(490000,660... 18 0 days 02:44:27.235479 69.741764
17 SRID=3857;STBOX X((490000,6595000),(495000,660... 28 0 days 03:01:46.069980 113.754468
18 SRID=3857;STBOX X((495000,6595000),(500000,660... 16 0 days 01:56:46.320943 79.532763
19 SRID=3857;STBOX X((470000,6600000),(475000,660... 14 0 days 01:08:43.444506 23.299969
20 SRID=3857;STBOX X((475000,6600000),(480000,660... 11 0 days 01:12:06.437047 55.651246
21 SRID=3857;STBOX X((480000,6600000),(485000,660... 26 0 days 04:36:55.024443 145.456576
22 SRID=3857;STBOX X((485000,6600000),(490000,660... 8 0 days 00:57:20.399942 36.388236
23 SRID=3857;STBOX X((490000,6600000),(495000,660... 12 0 days 01:54:33.371322 58.257905
24 SRID=3857;STBOX X((495000,6600000),(500000,660... 14 0 days 01:36:16.495602 46.346555
25 SRID=3857;STBOX X((480000,6605000),(485000,661... 10 0 days 00:40:21.652772 22.114868
26 SRID=3857;STBOX X((485000,6605000),(490000,661... 10 0 days 01:06:43.042539 33.102930
27 SRID=3857;STBOX X((490000,6605000),(495000,661... 12 0 days 00:41:57.630167 22.865846

For the speed tiles, we will compute the number of times a vehicle is in the speed range at that time, and the amount of time it stays there

speed_tiles = speed_extent.tile(10, "1 day", 0, "2020-06-01")
data = []
for tile in speed_tiles:
    intersections = speeds.apply(lambda tr: tr.at(tile)).dropna()
    if len(intersections) == 0:
        continue
    count = len(intersections)
    duration = sum([i.duration() for i in intersections], timedelta())
    data.append([tile, count, duration])

speed_aggregates = pd.DataFrame(data, columns=["tile", "count", "duration"])
speed_aggregates
tile count duration
0 TBOXFLOAT XT([0, 10),[2020-06-01 00:00:00+02, ... 21 0 days 07:31:24.685901
1 TBOXFLOAT XT([10, 20),[2020-06-01 00:00:00+02,... 21 0 days 03:03:36.408255
2 TBOXFLOAT XT([20, 30),[2020-06-01 00:00:00+02,... 12 0 days 00:31:46.326571
3 TBOXFLOAT XT([30, 40),[2020-06-01 00:00:00+02,... 11 0 days 00:12:42.223983
4 TBOXFLOAT XT([0, 10),[2020-06-02 00:00:00+02, ... 23 0 days 07:39:36.356752
5 TBOXFLOAT XT([10, 20),[2020-06-02 00:00:00+02,... 21 0 days 03:02:29.137009
6 TBOXFLOAT XT([20, 30),[2020-06-02 00:00:00+02,... 12 0 days 00:31:04.548691
7 TBOXFLOAT XT([30, 40),[2020-06-02 00:00:00+02,... 11 0 days 00:14:00.514541
8 TBOXFLOAT XT([0, 10),[2020-06-03 00:00:00+02, ... 24 0 days 08:16:09.948028
9 TBOXFLOAT XT([10, 20),[2020-06-03 00:00:00+02,... 22 0 days 03:03:03.718551
10 TBOXFLOAT XT([20, 30),[2020-06-03 00:00:00+02,... 13 0 days 00:44:43.354712
11 TBOXFLOAT XT([30, 40),[2020-06-03 00:00:00+02,... 12 0 days 00:12:20.122468
12 TBOXFLOAT XT([0, 10),[2020-06-04 00:00:00+02, ... 28 0 days 09:01:32.980751
13 TBOXFLOAT XT([10, 20),[2020-06-04 00:00:00+02,... 26 0 days 03:23:41.826769
14 TBOXFLOAT XT([20, 30),[2020-06-04 00:00:00+02,... 14 0 days 00:38:43.298191
15 TBOXFLOAT XT([30, 40),[2020-06-04 00:00:00+02,... 12 0 days 00:10:49.332312