I Explored My Z Shell History. Here’s What I Found

As I was reading another interesting blog post about popular git config options, a curious thought crept into my mind: which git commands do I find myself using the most? This eventually led me on a journey to explore my own terminal usage pattern throughout the day, using metadata such as the Epoch timestamps that are readily available in my .zsh_history.

My Prediction

Let's see... there's git status, git commit, git add, git push, git pull, git checkout... Huh, what else could there be? It seems these are the ones I rely on most frequently in my day-to-day git workflow.

I'm willing to bet that my most frequently used git commands are git commit and git add. If I had to rank the top 5 git commands I use the most, it would probably look something like this:

  1. git status
  2. git add
  3. git commit
  4. git pull
  5. git push

Inside a Command Line Interface (CLI) Tool

Just so we’re on the same page, here’s a brief breakdown of the anatomy of a command (git as an example).

git fetch origin main --depth=5
  • git: command name
  • fetch: subcommand name
  • origin main: arguments (there are 2 in this case)
  • --depth: options/flags
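
The same breakdown can be sketched in a few lines of Python. This is only an illustration: split_command is a hypothetical helper, and treating every later token that starts with a dash as an option is a simplifying assumption that real CLI parsers refine.

```python
import shlex


def split_command(line):
    """Split a shell command into name, subcommand, arguments, and options.

    Assumption: the first token is the command, the second is the subcommand,
    and any later token starting with "-" is an option.
    """
    tokens = shlex.split(line)
    name = tokens[0] if tokens else ""
    subcommand = tokens[1] if len(tokens) > 1 else ""
    rest = tokens[2:]
    options = [t for t in rest if t.startswith("-")]
    arguments = [t for t in rest if not t.startswith("-")]
    return {
        "name": name,
        "subcommand": subcommand,
        "arguments": arguments,
        "options": options,
    }


parts = split_command("git fetch origin main --depth=5")
print(parts)
```

Using shlex.split rather than str.split keeps quoted arguments (such as a commit message) together as one token.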

Verifying My Assumptions

I began pondering: wouldn’t it be fascinating if I could track my local CLI tool usage? Perhaps there are existing tools out there for this purpose. However, considering privacy and security, I wouldn't want to use a third-party hosted solution where sensitive information like passwords or API tokens might accidentally be sent to a server I don’t control.

Suddenly, it struck me – all the commands I’ve run are already recorded in our shell history. What if I could just parse through all the commands that I’ve run from history?

What is in my Zsh history

Before we dive into anything else, let's take a peek inside our shell history first. By exploring its contents, we can get a better idea of what we're working with:

❯ cat ~/.zsh_history | head -n 5
: 1705638216:0;cd wraith
: 1705638264:0;git status
: 1705638987:0;git checkout feat/add-r2-backup
: 1705639215:0;ls
: 1705639390:0;git commit -m 'feat: support sync backup to r2'

It looks like we have some useful information in there. Each line follows Zsh's extended history format: the Epoch timestamp of when the command started, the elapsed time in seconds (the number after the second colon), and the actual command itself.
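
To make the layout of a single entry concrete, here is a minimal sketch that splits one line into its three fields. It assumes the ": <start epoch>:<elapsed seconds>;<command>" layout of Zsh's extended history; parse_entry is a hypothetical helper, and the timestamp is converted in UTC (my choice, so the result does not depend on the machine's timezone).

```python
import re
from datetime import datetime, timezone

# One extended-history line: ": <start epoch>:<elapsed seconds>;<command>"
ENTRY = re.compile(r"^: (\d+):(\d+);(.*)$")


def parse_entry(line):
    """Return (start time, elapsed seconds, command), or None if the line does not match."""
    match = ENTRY.match(line)
    if match is None:
        return None
    epoch, elapsed, command = match.groups()
    # UTC is an assumption made for reproducibility; drop tz= to get local time.
    started = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return started, int(elapsed), command


entry = parse_entry(": 1705638264:0;git status")
print(entry)
```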

Alright, I think we've got a good starting point to work with!

Identifying the Top 10 Most Used Commands

Let’s start with identifying the top 10 most used commands in my terminal:

history | awk '{print $2}' | sort | uniq -c | sort -rn | head -n 10

Explanation

  • history: This command displays your command history. Depending on your setup, plain Zsh may list only the most recent entries; history 1 lists everything.
  • awk '{print $2}': This filters the output of history to only show the second column, which contains the command name (the first column is the entry number).
  • sort: This sorts the list of commands alphabetically so that identical commands end up next to each other.
  • uniq -c: This counts the occurrences of each unique command.
  • sort -rn: This sorts the commands numerically by their count in descending order (most frequent first).
  • head -n 10: This shows the first 10 entries, which are the top 10 most used commands. Try changing this to 20 or something.
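
For comparison, the same counting logic can be sketched in Python with collections.Counter. This is an equivalent under assumptions, not a replacement for the one-liner: top_commands is a hypothetical helper, the sample lines imitate the "number, then command" layout of history output, and lines with fewer than two fields are skipped.

```python
from collections import Counter


def top_commands(history_lines, n=10):
    """Count the second whitespace-separated field of each line, like awk '{print $2}'."""
    counts = Counter()
    for line in history_lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            counts[fields[1]] += 1
    # most_common sorts by count, highest first, like `sort -rn | head -n`
    return counts.most_common(n)


sample = [
    "    1  cd wraith",
    "    2  git status",
    "    3  git checkout feat/add-r2-backup",
    "    4  ls",
    "    5  git commit -m 'feat: support sync backup to r2'",
]
print(top_commands(sample, n=3))
```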

Here, I can’t help but be reminded of how remarkable the Unix philosophy is: use small programs that each do one thing well, which can then be seamlessly combined or "piped" (|) together to achieve more complex functionality!

💡
I personally find the tldr CLI tool to be handy for quickly checking command help pages. It's a friendlier alternative to the traditional man pages, making it easier to grasp commands and their usage.

Output:

❯ history | awk '{print $2}' | sort | uniq -c | sort -rn | head -n 10
   3839 git
    426 npm
    409 cd
    395 docker
    278 rm
    269 clasp
    253 poetry
    249 go
    191 wrangler
    177 make

Looking at the output, it’s clear that git commands dominate my terminal activity.

Visualizing it using Mermaid.js

With these numbers in hand, it's time to visualize the data. When I think about representing this information graphically, a pie chart immediately comes to mind. Let’s draw this using Mermaid:

pie title Top 10 Most Used Commands
    "git" : 3839
    "npm" : 426
    "cd" : 409
    "docker" : 395
    "rm" : 278
    "clasp" : 269
    "poetry" : 253
    "go" : 249
    "wrangler" : 191
    "make" : 177

Not surprisingly, it appears that git commands reign supreme, comprising 59% of the combined usage of my top 10 commands!
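
The percentage can be checked with a little arithmetic. Note that the denominator in this sketch is the total of the ten counts listed in the output above, since those are the only numbers available; the variable names are mine.

```python
# Counts copied from the top-10 output above.
counts = {
    "git": 3839, "npm": 426, "cd": 409, "docker": 395, "rm": 278,
    "clasp": 269, "poetry": 253, "go": 249, "wrangler": 191, "make": 177,
}

total = sum(counts.values())
# Share of each command within the top 10, as a percentage with one decimal.
shares = {name: round(100 * count / total, 1) for name, count in counts.items()}

print(total)
print(shares["git"])
```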

At the same time, it's quite intriguing to see how different tools and operations occupy varying proportions of my command usage. Here, I'm not surprised to see my Linux make command usage made it into the top 10!

Most Used Git Commands

Now, back to the original question. Let's take a closer look at the git commands I use the most.

My first thought was to simply add grep 'git' to our original command and tweak the awk part so that we can home in on the most frequently used git (sub)commands:

history | grep 'git' | awk '{print $2 " " $3}' | sort | uniq -c | sort -rn | head -n 10

Output:

❯ history | grep 'git' | awk '{print $2 " " $3}' | sort | uniq -c | sort -rn | head -n 10
   1923 git commit
    496 git status
    293 git add
    200 git push
    121 git checkout
     96 git branch
     90 git lg
     85 git pull
     79 go mod
     64 go get

Ah, it seems like some unrelated go commands are showing up in the results despite using grep to filter for git-related commands.

Upon careful inspection after running only history | grep 'git', I realized that grep 'git' matches any line containing the string "git" anywhere, including in arguments, flags, and other unrelated strings (a github.com URL passed to go get, for example)!
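
The difference between matching "git" anywhere in the line and matching it as the command name is easy to reproduce. In this sketch the sample commands, including the github.com module paths, are hypothetical stand-ins for what my history might contain.

```python
# Hypothetical history entries; the module paths are made up for illustration.
lines = [
    "git status",
    "go get github.com/example/module",
    "go mod init github.com/example/app",
    "ls",
]

# What grep 'git' does: keep any line that contains the substring "git".
substring_matches = [line for line in lines if "git" in line]

# What we actually want: keep lines whose first word is exactly "git".
first_word_matches = [line for line in lines if line.split()[:1] == ["git"]]

print(substring_matches)
print(first_word_matches)
```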

Instead, we need to filter out only the lines starting with git:

history | awk '$2=="git" {print $2 " " $3}' | sort | uniq -c | sort -rn | head -n 10

Output:

❯ history | awk '$2=="git" {print $2 " " $3}' | sort | uniq -c | sort -rn | head -n 10
   1923 git commit
    496 git status
    293 git add
    200 git push
    121 git checkout
     96 git branch
     90 git lg
     85 git pull
     55 git remote
     51 git st

Yay! This works! Just a side note: git lg and git st are my git aliases in my git configs managed using Chezmoi.

Plotting the top 10 most used git commands

pie title Top 10 Most Used Git Commands
    "git commit" : 1923
    "git status" : 496
    "git add" : 293
    "git push" : 200
    "git checkout" : 121
    "git branch" : 96
    "git lg" : 90
    "git pull" : 85
    "git remote" : 55
    "git st" : 51

Aligned with my initial prediction, it seems like git commit (56% of the top 10) is by far the command I've used the most, followed by git status (15%) and git add (9%). However, I was surprised to see that the number of git add runs was not as close to the number of git commit runs as I had expected.

I also noticed some other git commands like git push, git checkout, and git branch popping up quite frequently. This makes sense!

Overall, it's interesting to see these patterns and get a glimpse into my workflow with git.

Terminal Activity Pattern

While I was working on this, another idea popped into my head. Remember when we noticed the Epoch entry in the Zsh history? What if we could do something with that date?

💬
I think I just discovered another reason to love Zsh even more than Bash: its history file records a bit more metadata, like the Epoch, that I can use!

Now, for this task, I don’t think I am able to rely on a single one-liner command like I did before. Well, if the one-liner gets any longer, it won’t be readable anyway.

I wrote a simple Python script to parse our .zsh_history file from our home directory:

import re
from datetime import datetime
from pathlib import Path


def main():
    command_activities = []

    for log_entry in load_zsh_history():
        if not log_entry or log_entry.isspace():
            continue
        epoch_time, command = extract_activity_details(log_entry)
        if not epoch_time or not command:
            continue

        command_activities.append((epoch_time, command))

    hour_of_the_day = group_activities_by_hour(command_activities)
    graph = create_mermaidjs_graph(hour_of_the_day)
    print(graph)


def load_zsh_history():
    zsh_history_path = Path.home() / ".zsh_history"
    with open(zsh_history_path, encoding="latin-1") as file:
        for line in file:
            yield line.strip()


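def extract_activity_details(log_entry):
    # Each entry looks like ": <epoch>:<elapsed>;<command>". Capture the epoch
    # and the first one or two words of the command (e.g. "git status").
    matches = re.match(r"^:\s(\d+):\d+;(\S+(?:\s\S+)?)", log_entry)
    if matches:
        return matches.groups()
    return None, None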


def group_activities_by_hour(command_activities, specific_command=""):
    hours_of_the_day = range(24)
    grouped_activities = {h: 0 for h in hours_of_the_day}

    for epoch_time, command in command_activities:
        if len(specific_command) != 0 and specific_command not in command:
            continue

        normal_time = datetime.fromtimestamp(int(epoch_time))
        hour_of_the_day = normal_time.hour
        grouped_activities[hour_of_the_day] += 1
    return grouped_activities


def create_mermaidjs_graph(grouped_data):
    values = list(grouped_data.values())
    graph = (
        "xychart-beta\n"
        '    title "Terminal Activity by Hour of the Day"\n'
        '    x-axis "Hour of the day"\n'
        '    y-axis "No. of commands run"\n'
        f"    bar {values}\n"
        f"    line {values}\n"
    )
    return graph


if __name__ == "__main__":
    main()

Basically here’s how the Python script works:

  1. The script goes through my terminal history file
  2. It looks at each line in the file to see when I did stuff and what I did
  3. It then counts how many times I did each thing during different hours of the day
  4. After it has all that data, it prints out the Mermaid.js syntax, which then allows me to make a little graph below to show me when I'm most active in the terminal
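
The grouping step can also be narrowed to a single command, which is what the specific_command parameter in the script hints at. Below is a minimal, self-contained sketch of that idea. It departs from the script in two ways that are my assumptions: it matches on the first word of the command instead of a substring, and it converts timestamps in UTC so the result is reproducible (the script uses local time). count_by_hour is a hypothetical name.

```python
from datetime import datetime, timezone


def count_by_hour(activities, first_word=None, tz=timezone.utc):
    """Count commands per hour of the day, optionally only those starting with first_word."""
    buckets = {hour: 0 for hour in range(24)}
    for epoch_time, command in activities:
        if first_word is not None and command.split()[:1] != [first_word]:
            continue
        # tz=timezone.utc is an assumption; pass tz=None to use local time instead.
        hour = datetime.fromtimestamp(int(epoch_time), tz=tz).hour
        buckets[hour] += 1
    return buckets


# (epoch, command) pairs taken from the history sample shown earlier.
activities = [
    ("1705638216", "cd wraith"),
    ("1705638264", "git status"),
    ("1705638987", "git checkout"),
    ("1705639215", "ls"),
    ("1705639390", "git commit"),
]

git_by_hour = count_by_hour(activities, first_word="git")
print(git_by_hour[4])
```

Passing the resulting dictionary to create_mermaidjs_graph from the script would then chart git activity only.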

Output:

❯ python3 main.py
xychart-beta
    title "Terminal Activity by Hour of the Day"
    x-axis "Hour of the day"
    y-axis "No. of commands run"
    bar [122, 42, 30, 21, 29, 61, 8, 423, 1003, 881, 624, 500, 363, 595, 412, 477, 455, 404, 351, 587, 807, 884, 920, 364]
    line [122, 42, 30, 21, 29, 61, 8, 423, 1003, 881, 624, 500, 363, 595, 412, 477, 455, 404, 351, 587, 807, 884, 920, 364]

Finally, pasting this on mermaid.live allows me to instantly visualize the breakdown of when I’m the most active in the terminal throughout the day:

xychart-beta
    title "Terminal Activity by Hour of the Day"
    x-axis "Hour of the day"
    y-axis "No. of commands run"
    bar [122, 42, 30, 21, 29, 61, 8, 423, 1003, 881, 624, 500, 363, 595, 412, 477, 455, 404, 351, 587, 807, 884, 920, 364]
    line [122, 42, 30, 21, 29, 61, 8, 423, 1003, 881, 624, 500, 363, 595, 412, 477, 455, 404, 351, 587, 807, 884, 920, 364]

From the graph, it appears that my terminal activity usually peaks in the morning (between 8 and 9 am, with 1,003 commands in the 8 am bucket) and again at night around 10 pm (920 commands). It's also quite apparent that my terminal activity tapers off in the afternoon before picking up again in the evening.
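
Rather than eyeballing the chart, the busiest and quietest hours can be read straight from the numbers. This sketch uses the values from the bar line of the script output above; the variable names are mine.

```python
# Commands per hour (index 0 = midnight), copied from the script output above.
hourly_counts = [
    122, 42, 30, 21, 29, 61, 8, 423, 1003, 881, 624, 500,
    363, 595, 412, 477, 455, 404, 351, 587, 807, 884, 920, 364,
]

# Hours ranked from most to least active; keep the top three.
busiest = sorted(range(24), key=lambda hour: hourly_counts[hour], reverse=True)[:3]
# The single least active hour.
quietest = min(range(24), key=lambda hour: hourly_counts[hour])

print(busiest)
print(quietest)
```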

Obviously, I don't often pull an all-nighter.

My Takeaways

I had a lot of fun digging into my terminal usage history and finding out what commands I use the most.

If there’s any takeaway, I’d like to leave you with these two commands:

# Top 10 most used commands:
history | awk '{print $2}' | sort | uniq -c | sort -rn | head -n 10

# Top 10 most used <git> commands:
history | awk '$2=="git" {print $2 " " $3}' | sort | uniq -c | sort -rn | head -n 10

Do tweak the commands/Python script to fit your needs and learn more about your terminal usage just for fun!

What's Next

While playing around with this, I stumbled upon a project called Atuin that replaces your shell history with an SQLite database, providing additional context for your commands. What's even cooler is that it supports syncing between machines (with the option to self-host it). I haven't tried it out yet, but it sounds promising!

I also came across several other projects that piqued my interest but that I have not used yet:

That’s it, thanks for reading!
