I’ve created a short exercise on Markov Models for my students. Here’s the story:
“You are a valued employee of MegaCorp Inc.
Locked in your basement office, you’ve been gathering information about your coworkers Adalbert, Beatrice and Celis. They’re working in shifts on a single machine and you’ve been monitoring and logging their daily routines. Unfortunately they found out, disabled their webcams, removed your Trojan and cut off your access to switches and routers. Even though you are blind now, it’s still mandatory for you to know who is currently working at any given time, because only Adalbert isn’t vigilant enough to notice when you sneak up to the ground floor to snatch some donuts and cake from their office fridge.
Except… you’ve still got the old logs and you can see the LEDs of their switch blinking. Unfortunately you don’t have physical access, but perhaps you can still figure out whose shift it is?”
So what’s the task?
- We model this problem using Markov Models (obviously)
- Our Markov states are the activities of a given co-worker
- Each activity produces different LED blinking patterns – so each state/activity emits an observation
- Three iterations:
  - Task 1: Markov chain, only modeling states (activities) over time
  - Task 2: Markov chain with observations, modeling states (activities) and observations (blinking LEDs) over time, given we have access to both
  - Task 3: Hidden Markov model, modeling states (activities) and observations (blinking LEDs) over time, given we have access only to the observations
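To make Task 3 a bit more concrete, here is a minimal sketch of how one could decide whose shift it is from the LED observations alone. It is not taken from the notebook: it assumes one HMM per coworker and picks the coworker whose model gives the observed blinking sequence the highest likelihood, computed with the scaled forward algorithm. The function name `forward_log_likelihood`, the toy models (reduced to two activities and two patterns, and to two coworkers) and all probabilities are made up for illustration.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: returns log P(obs | pi, A, B).

    pi: initial state distribution, shape (n_states,)
    A:  transition matrix, A[i, j] = P(next state j | current state i)
    B:  emission matrix,  B[i, k] = P(observation k | state i)
    """
    alpha = pi * B[:, obs[0]]
    scale = alpha.sum()
    log_lik = np.log(scale)
    alpha = alpha / scale          # rescale to avoid numerical underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = alpha / scale
    return log_lik

# Toy models with made-up numbers (2 activities, 2 LED patterns).
models = {
    # "sticky" coworker: stays in an activity, activities blink distinctly
    "Adalbert": (
        np.array([0.5, 0.5]),
        np.array([[0.9, 0.1], [0.1, 0.9]]),
        np.array([[0.9, 0.1], [0.2, 0.8]]),
    ),
    # completely uninformative model for comparison
    "Beatrice": (
        np.array([0.5, 0.5]),
        np.array([[0.5, 0.5], [0.5, 0.5]]),
        np.array([[0.5, 0.5], [0.5, 0.5]]),
    ),
}

observed = [0, 0, 0, 1, 1, 1]  # hypothetical LED pattern sequence
scores = {name: forward_log_likelihood(observed, *params)
          for name, params in models.items()}
best = max(scores, key=scores.get)
print(scores)
print("Most likely on shift:", best)
```

With real data, `pi`, `A` and `B` would be estimated from the logs rather than written by hand, and the models would use all five activities and six patterns.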
Let’s look at the data:
- You own logs of daily activities for each of the coworkers A, B and C
- The logs are CSVs in the format: date|sequence of activities
- Activities: working (0), pause (1), online gaming (2), browsing the internet (3), streaming TV series (4)
- Furthermore you’ve been (independently) gathering data on LED blinking patterns for different activities in the format: activity|pattern
- Pattern: none (0), very low frequency (1), low frequency (2), medium frequency (3), high frequency (4), very high frequency (5)
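As a rough idea of how this data turns into model parameters for Tasks 1 and 2, the sketch below parses a few log lines and estimates a transition matrix (activity to activity) and an emission matrix (activity to LED pattern) by counting. Several things here are assumptions on my part and not part of the exercise files: the comma separator inside the activity sequence, the sample lines, the helper names and the add-one smoothing.

```python
import numpy as np

N_ACTIVITIES = 5  # working, pause, online gaming, browsing, streaming
N_PATTERNS = 6    # none, very low, low, medium, high, very high

def parse_activity_log(lines):
    """Parse 'date|a,b,c,...' lines into lists of activity codes.

    The comma separator inside the sequence is an assumption.
    """
    sequences = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        _date, seq = line.split("|", 1)
        sequences.append([int(a) for a in seq.split(",")])
    return sequences

def estimate_transitions(sequences, n_states=N_ACTIVITIES, smoothing=1.0):
    """Count consecutive activity pairs and normalize each row."""
    counts = np.full((n_states, n_states), smoothing)
    for seq in sequences:
        for prev, nxt in zip(seq[:-1], seq[1:]):
            counts[prev, nxt] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def estimate_emissions(pairs, n_states=N_ACTIVITIES, n_obs=N_PATTERNS,
                       smoothing=1.0):
    """Count (activity, pattern) pairs and normalize each row."""
    counts = np.full((n_states, n_obs), smoothing)
    for activity, pattern in pairs:
        counts[activity, pattern] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# Made-up sample data in the formats described above.
log_lines = ["2020-01-01|0,0,1,0,3", "2020-01-02|0,2,2,1,0"]
led_lines = ["0|1", "0|2", "2|5", "4|5", "1|0"]

sequences = parse_activity_log(log_lines)
pairs = [tuple(int(x) for x in line.split("|")) for line in led_lines]

A = estimate_transitions(sequences)  # shape (5, 5), rows sum to 1
B = estimate_emissions(pairs)        # shape (5, 6), rows sum to 1
print(A.round(2))
print(B.round(2))
```

The smoothing keeps unseen transitions and emissions from getting probability zero, which matters once likelihoods of new sequences are multiplied together.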
You can get the Python/Jupyter notebook here:
And check out what the correct plots should look like here:
You can also try it on Google Colab: