Ali Alsaffar

My Tiny (currently unstyled) Corner in the Web...

Currently building...

Simple Location Log

The basic idea behind this is to juxtapose the outputs of two processes: a timestamped WhatsApp chat export, and a GPX file whose time range encompasses some of the chat export's messages/entries. The output at the moment is a map with the GPX track and clickable points that represent the WhatsApp messages' locations. Each message's location is derived from its timestamp: either the GPX point with the matching timestamp or, when there is no exact match, an interpolation between GPX points.

The Plan

  1. Chat Export --> Structured Object
  2. GPX File --> Structured Object
  3. Get Location from timestamp x.
  4. Assign each entry in Chat Export a location based on above "algorithm".
  5. Calculate a distance matrix from the entries now assigned a location.
  6. Cluster the points with DBSCAN on the computed distance matrix, using multiple epsilon values for visualization.
  7. Calculate cluster centroids.
  8. Create an object that contains each cluster's entries.
  9. Draw a map with GPX track, and cluster centroids.
  10. Make each cluster on the map clickable; showing its components, sorted temporally.
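
Step 3 can be sketched as a linear interpolation between the two track points that bracket the timestamp. This is only a minimal sketch under assumptions: the `TrackPoint` tuple shape and the `location_at` name are made up for illustration, the track is assumed to be sorted by time, and interpolating latitude and longitude linearly is assumed to be good enough for closely spaced points (it ignores the antimeridian).

```python
from bisect import bisect_left
from datetime import datetime
from typing import List, Optional, Tuple

# Hypothetical track point shape: (timestamp, latitude, longitude).
TrackPoint = Tuple[datetime, float, float]


def location_at(track: List[TrackPoint], when: datetime) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) at `when`, or None if `when` is outside the track's time range.

    Assumes `track` is sorted by timestamp.
    """
    if not track or when < track[0][0] or when > track[-1][0]:
        return None

    times = [point[0] for point in track]
    i = bisect_left(times, when)  # first point with timestamp >= when

    t1, lat1, lon1 = track[i]
    if t1 == when:
        # Exact timestamp match: no interpolation needed.
        return (lat1, lon1)

    # Otherwise `when` lies strictly between track[i - 1] and track[i].
    t0, lat0, lon0 = track[i - 1]
    frac = (when - t0).total_seconds() / (t1 - t0).total_seconds()
    return (lat0 + frac * (lat1 - lat0), lon0 + frac * (lon1 - lon0))
```

A message sent halfway between two track points would then be placed halfway between them, and a message sent before the track starts or after it ends would get no location at all.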

Chat Export --> Structured Object

The WhatsApp chat export is formatted as follows:


[$dd/$mm/$yyyy, $h:$mm:$ss ${AM/PM}] $name: $message
[$dd/$mm/$yyyy, $h:$mm:$ss ${AM/PM}] $name: ‎<attached: $attachment.$extension>
            

So, the structure will hold the date, time, meridiem (AM/PM), name of sender, message, and attachment, as follows:


from datetime import date, time
from typing import Optional

class WhatsAppMessage:
    def __init__(self, date: date, time: time, meridiem: str, sender: str, message: str, attachment: Optional[str]):
        self.date = date
        self.time = time
        self.meridiem = meridiem
        self.sender = sender
        self.message = message
        self.attachment = attachment
            

And the regex pattern to capture these groups is as follows:


pattern = r"\[(\d{2}/\d{2}/\d{4}), (\d{1,2}:\d{2}:\d{2})\s([AP]M)\] ([^:]+):\s(.*(?:<attached: .*?>)?)$"
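
As a quick sanity check, the pattern can be run against a few made-up lines to see what each group captures. This is a sketch, not the final parser: the sample lines are invented, `attachment_pattern` is a hypothetical second regex for pulling the file name out of the message body, and combining the date, time, and meridiem into one `datetime` is just one possible choice.

```python
import re
from datetime import datetime

pattern = r"\[(\d{2}/\d{2}/\d{4}), (\d{1,2}:\d{2}:\d{2})\s([AP]M)\] ([^:]+):\s(.*(?:<attached: .*?>)?)$"
# Hypothetical helper pattern: extracts the file name from the message body.
attachment_pattern = r"<attached: (.+?)>"

# Invented sample lines; \u200e is the invisible mark that precedes "<attached".
sample_lines = [
    "[05/03/2023, 9:41:07 AM] Ali: Reached the trailhead",
    "[05/03/2023, 9:45:30 AM] Ali: \u200e<attached: 00000012-PHOTO.jpg>",
    "and this is the second line of a longer message",
]

parsed = []
for line in sample_lines:
    match = re.match(pattern, line)
    if match is None:
        # Lines without a leading "[date, time]" header do not match.
        parsed.append(None)
        continue
    day, clock, meridiem, sender, body = match.groups()
    # Assumes day-first dates and a 12-hour clock, as in the format above.
    timestamp = datetime.strptime(f"{day} {clock} {meridiem}", "%d/%m/%Y %I:%M:%S %p")
    attached = re.search(attachment_pattern, body)
    parsed.append((timestamp, sender, body, attached.group(1) if attached else None))

for entry in parsed:
    print(entry)
```

Note that the fifth group captures the whole message body, attachment marker included, so the file name has to be pulled out in a separate step like the one above.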
            

Now, we need to define a function that will extract the required data using the regular expression.