Currently building...
Simple Location Log
The basic idea is to juxtapose two timestamped inputs: a WhatsApp chat export and a GPX file whose time range covers some of the chat export's messages. The output at the moment is a map with the GPX track and clickable points that represent the WhatsApp messages' locations. Each message's location is derived from its timestamp: either the GPX point with the matching timestamp or, when there is no exact match, an interpolation between the neighbouring GPX points.
The Plan
- Chat Export --> Structured Object
- GPX File --> Structured Object
- Get Location from timestamp x.
- Assign each entry in the Chat Export a location based on the above "algorithm".
- Calculate a distance matrix from the entries now assigned a location.
- Cluster the points with DBSCAN on the computed distance matrix, using multiple epsilon values for visualization.
- Calculate cluster centroids.
- Create an object that contains each cluster's entries.
- Draw a map with GPX track, and cluster centroids.
- Make each cluster on the map clickable; showing its components, sorted temporally.
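The "Get Location from timestamp x" step could be sketched roughly as below. This is only a minimal illustration, not the project's implementation: the `TrackPoint` tuple layout and the `location_at` name are assumptions, the track is assumed to be sorted by time, and latitude/longitude are interpolated linearly, which is a reasonable approximation only for short gaps between GPX points.

```python
from bisect import bisect_left
from datetime import datetime
from typing import List, Optional, Tuple

# Hypothetical track point layout: (timestamp, latitude, longitude).
TrackPoint = Tuple[datetime, float, float]


def location_at(track: List[TrackPoint], when: datetime) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) at `when`, or None if `when` lies outside the track."""
    if not track or when < track[0][0] or when > track[-1][0]:
        return None
    times = [point[0] for point in track]
    # Index of the first track point whose timestamp is >= `when`.
    i = bisect_left(times, when)
    if times[i] == when:
        # Exact timestamp match: no interpolation needed.
        return track[i][1], track[i][2]
    # `when` lies strictly between track[i - 1] and track[i].
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    fraction = (when - t0) / (t1 - t0)
    return lat0 + fraction * (lat1 - lat0), lon0 + fraction * (lon1 - lon0)
```

Messages sent before the track starts or after it ends get `None` here, so they can be skipped or handled separately when locations are assigned.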
Chat Export --> Structured Object
The WhatsApp chat export is formatted as follows:
[$dd/$mm/$yyyy, $h:$mm:$ss ${AM/PM}] $name: $message
[$dd/$mm/$yyyy, $h:$mm:$ss ${AM/PM}] $name: <attached: $attachment.$extension>
The structure therefore holds the date, time, meridiem (AM/PM), name of sender, message, and attachment:
from datetime import date, time
from typing import Optional


class WhatsAppMessage:
    def __init__(self, date: date, time: time, meridiem: str, sender: str, message: str, attachment: Optional[str]):
        self.date = date
        self.time = time
        self.meridiem = meridiem
        self.sender = sender
        self.message = message
        self.attachment = attachment
And the regex pattern to capture these groups is as follows:
pattern = r"\[(\d{2}/\d{2}/\d{4}), (\d{1,2}:\d{2}:\d{2})\s([AP]M)\] ([^:]+):\s(.*(?:<attached: .*?>)?)$"
Now, we need to define a function that will extract the required data using the regular expression.
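One possible shape for that function is sketched below. The names `parse_line`, `PATTERN`, and `ATTACHMENT` are assumptions made for illustration, as is the second, smaller regex used to pull the file name out of the `<attached: ...>` marker. The function returns a plain dict whose keys mirror the `WhatsAppMessage` constructor arguments, so the block stays self-contained. Note that the time is converted to 24-hour form while the meridiem is kept alongside it.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional

# The pattern from above, compiled once.
PATTERN = re.compile(
    r"\[(\d{2}/\d{2}/\d{4}), (\d{1,2}:\d{2}:\d{2})\s([AP]M)\] ([^:]+):\s(.*(?:<attached: .*?>)?)$"
)
# Assumed helper pattern: extracts the file name from "<attached: name.ext>".
ATTACHMENT = re.compile(r"<attached: (.*?)>")


def parse_line(line: str) -> Optional[Dict[str, Any]]:
    """Parse one export line; return None if it does not start a message."""
    match = PATTERN.match(line.strip())
    if match is None:
        return None
    day, clock, meridiem, sender, body = match.groups()
    # Combine the pieces so the 12-hour clock is resolved with AM/PM.
    stamp = datetime.strptime(f"{day} {clock} {meridiem}", "%d/%m/%Y %I:%M:%S %p")
    attached = ATTACHMENT.search(body)
    return {
        "date": stamp.date(),
        "time": stamp.time(),  # 24-hour time
        "meridiem": meridiem,
        "sender": sender,
        "message": body,
        "attachment": attached.group(1) if attached else None,
    }
```

With the class defined earlier, a parsed line could then be turned into an object with `WhatsAppMessage(**fields)`. Lines that return `None` are likely continuations of a multi-line message and would need to be appended to the previous entry.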