A Primer on MTR - The Traceroute on Steroids
If you have ever wondered what path packets take as they traverse networks, chances are that you have used or heard of traceroute. traceroute, or some variant of it, is available on almost all operating systems.
My experience with traceroute began when I was just starting to learn networking and was eager to determine the distance between a server and my machine. I ran it against every server or website that I regularly visited, watching in awe as the green text filled up my terminal window.
Years passed, and as I progressed in my career, I found myself relying on this tool more often than not. Like many other tools, traceroute and its variants kept evolving.
One of these relatively newer tools, and my favorite, is MTR, or “My TraceRoute”. It is often described as the successor to traceroute, combines the functionality of traceroute and ping in a single utility, and comes pre-installed on many Linux distributions. If your Linux machine does not come with it, on Debian or Ubuntu you can install it using:
sudo apt update && sudo apt install mtr
mtr is often referred to as traceroute on steroids, and in this post we will uncover why.
MTR Advantages
MTR is beneficial to use for several reasons:
- Helps troubleshoot connectivity issues by identifying bottlenecks or failures along the route to a particular destination. This can aid in identifying the source of network problems, such as misconfigured routers or oversubscribed network links.
- Measures the quality of a network connection, providing information on packet loss, latency, and jitter. This can be useful for network performance monitoring and optimization (very important).
- It is a very versatile tool and can be used in various scenarios and with various protocols. It can trace IPv4 and IPv6 routes, and it can trace routes using ICMP, TCP, or UDP packets.
I would say the most important takeaway here is the support for tracing with TCP and UDP in addition to ICMP. Why, you ask? In some cases, network devices such as firewalls block ICMP traffic. But most importantly, ICMP packets are almost always handled by the slow path or control plane rather than the fast path or data plane.
This gets a bit tricky, so let’s expand on it by explaining the terms I just threw into the mix.
Control Plane vs Data Plane
The control plane is responsible for managing and maintaining the configuration of the network device and its interactions with other network devices. This includes tasks such as routing protocols, address resolution, and network management. The control plane is often implemented in software and runs on the CPU of the network device. ICMP packets that the device itself has to process or answer are handled by the control plane.
The data plane, on the other hand, is responsible for forwarding and processing data packets as they pass through the network device. This includes tasks such as packet filtering, forwarding, and queuing. The data plane is often implemented in hardware, such as ASICs or network processors, and runs on specialized hardware in the network device.
Slow Path vs Fast Path
Fast path and slow path are terms used to refer to the different processing paths that data packets can take through the data plane.
The fast path is the optimized processing path that is designed to handle the majority of packets with minimal processing overhead. This path is optimized for high performance and is typically implemented in hardware.
The slow path, on the other hand, is the processing path used for packets that cannot be handled by the fast path. This path is typically implemented in software and is used for tasks such as access control, deep packet inspection, and other tasks that require more complex processing. The slow path is less optimized for performance and may introduce additional latency.
So, can we conclude that slow path == control plane and fast path == data plane? Although it is tempting to say yes, it is not quite correct! But why did I make it appear that way earlier? Because these terms are often (and mistakenly) used interchangeably, and I needed an excuse to distinguish between them. Nonetheless, I could just as well have said “the slower path” or “the control plane”, and so on.
The point I’m trying to make is that, because of how network devices handle traffic, TCP and UDP packets usually travel a much faster course than ICMP packets.
Depending on how these network devices are set up or how busy they are at any given moment, they may simply drop ICMP packets, rate limit them, or, using traffic engineering, divert them onto a different path than the one “regular” data traffic takes. You might be wondering why someone would set up a device to behave that way. Keep in mind that a device’s CPU is in charge of handling ICMP traffic, and if a large volume of ICMP packets is directed at that device, such as during a DDoS attack, it may affect the data plane and the important/normal traffic that is handled by the device.
I think by now you can imagine where I am heading. If we see a 40% drop or some added latency on a hop in our path, it does not necessarily mean that there is a problem with that network device. It could simply be because of the way it is configured to handle ICMP packets. This is where tracing with TCP or UDP packets becomes valuable.
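To see how rate limiting alone can produce that kind of loss figure, here is a minimal sketch in Python. It assumes a simple token-bucket limiter on the replies a router's CPU is willing to generate. The function names and the refill parameters are made up for illustration and do not describe any particular vendor's implementation.

```python
def simulate_rate_limited_hop(num_probes: int, refill_every: int, burst: int = 1) -> list[bool]:
    """Return, per probe, whether a hypothetical rate-limited router replied.

    One probe arrives per tick. The router's reply budget is a token bucket
    that holds at most `burst` tokens and gains one token every
    `refill_every` ticks. Transit traffic is not modelled because it never
    needs a reply from the router's CPU.
    """
    tokens = burst
    replies = []
    for tick in range(num_probes):
        # Refill the bucket on schedule, never above its capacity.
        if tick > 0 and tick % refill_every == 0:
            tokens = min(burst, tokens + 1)
        # A reply is only generated while the budget lasts.
        if tokens > 0:
            tokens -= 1
            replies.append(True)
        else:
            replies.append(False)
    return replies


def loss_percent(replies: list[bool]) -> float:
    """Share of probes that went unanswered, similar to a Loss% figure."""
    return 100.0 * replies.count(False) / len(replies)


if __name__ == "__main__":
    # The router answers at most every second probe, so half of the probes
    # look "lost" even though nothing is wrong with forwarding.
    print(loss_percent(simulate_rate_limited_hop(num_probes=10, refill_every=2)))  # 50.0
```

In this toy model the hop reports 50% loss purely because of its reply budget, which is the situation the paragraph above warns about.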
Ok, let’s get to some examples.
MTR Examples
Let’s have a look at a few examples.
Basic Usage
The most basic form; uses ICMP:
mtr example.com
Continues running until you press CTRL-C.
The command above produces output similar to:
My traceroute [v0.95]
dev (10.11.12.13) -> example.com (93.184.216.34) 2023-01-16T22:32:03+0100
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
1. _gateway 0.0% 27 0.2 0.2 0.1 0.3 0.0
2. 192.168.143.254 0.0% 35 0.4 0.4 0.2 0.5 0.1
3. 10.224.217.254 0.0% 27 0.4 0.4 0.3 0.4 0.0
4. 10.224.216.164 0.0% 27 0.4 0.4 0.4 0.5 0.0
5. 10.224.225.206 0.0% 27 0.4 0.4 0.3 0.5 0.0
6. 10.17.151.112 0.0% 27 0.7 0.8 0.6 1.1 0.1
7. 10.73.0.152 0.0% 27 0.4 0.4 0.3 0.5 0.0
8. 10.95.33.8 0.0% 27 58.4 14.1 1.4 88.5 27.3
9. lon-thw-sbb1-nc5.uk.eu 0.0% 27 3.6 3.7 3.4 4.0 0.1
10. nyc-ny1-sbb1-8k.nj.us 42.3% 27 77.4 77.5 76.3 78.6 0.7
11. 10.200.3.133 0.0% 26 81.5 81.6 80.1 83.1 0.7
12. core1.lga.edgecastcdn.net 0.0% 26 98.9 80.1 77.5 98.9 6.0
13. ae-66.core1.nyb.edgecastcdn.net 0.0% 26 80.5 83.6 80.4 119.3 8.6
14. 93.184.216.34 0.0% 26 75.7 75.8 75.7 75.9 0.0
We can interpret the output as follows:
- The first column shows the hops we traversed to reach example.com.
- The second column, Loss%, shows the percentage of packets that were lost at that hop. In this case, every hop except the 10th shows 0% loss, which means that all packets sent to those hops were answered.
- The third column, Snt, shows the number of packets sent to that hop.
- The fourth column, Last, shows the round-trip time (RTT) of the last packet sent to that hop, in milliseconds.
- The fifth column, Avg, shows the average RTT of all packets sent to that hop.
- The sixth column, Best, shows the lowest RTT of all packets sent to that hop.
- The seventh column, Wrst, shows the highest (worst) RTT of all packets sent to that hop.
- The eighth column, StDev, shows the standard deviation of the RTT for all packets sent to that hop.
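As a rough illustration of how these columns can be read programmatically, the Python sketch below parses a few rows copied from the output above and flags hops whose loss does not carry through to later hops. The helper names (parse_hops, loss_not_carried_forward) are hypothetical, and the rule itself is only a rule of thumb, not something mtr does for you.

```python
from typing import NamedTuple

# Rows copied from the example output above (hops 9, 10, 11 and 14).
SAMPLE = """\
9. lon-thw-sbb1-nc5.uk.eu 0.0% 27 3.6 3.7 3.4 4.0 0.1
10. nyc-ny1-sbb1-8k.nj.us 42.3% 27 77.4 77.5 76.3 78.6 0.7
11. 10.200.3.133 0.0% 26 81.5 81.6 80.1 83.1 0.7
14. 93.184.216.34 0.0% 26 75.7 75.8 75.7 75.9 0.0
"""


class Hop(NamedTuple):
    number: int
    host: str
    loss_pct: float
    sent: int
    avg_ms: float


def parse_hops(report: str) -> list[Hop]:
    """Parse rows of the form: N. host Loss% Snt Last Avg Best Wrst StDev."""
    hops = []
    for line in report.splitlines():
        parts = line.split()
        # Skip headers and any line that does not have the nine expected fields.
        if len(parts) != 9 or not parts[0].rstrip(".").isdigit() or not parts[2].endswith("%"):
            continue
        hops.append(
            Hop(
                number=int(parts[0].rstrip(".")),
                host=parts[1],
                loss_pct=float(parts[2].rstrip("%")),
                sent=int(parts[3]),
                avg_ms=float(parts[5]),
            )
        )
    return hops


def loss_not_carried_forward(hops: list[Hop]) -> list[int]:
    """Return hop numbers whose loss is higher than that of some later hop."""
    flagged = []
    for i, hop in enumerate(hops[:-1]):
        later_min = min(h.loss_pct for h in hops[i + 1:])
        if hop.loss_pct > 0 and later_min < hop.loss_pct:
            flagged.append(hop.number)
    return flagged


if __name__ == "__main__":
    print(loss_not_carried_forward(parse_hops(SAMPLE)))  # [10]
```

Hop 10 is flagged because the hops behind it report no loss at all, which suggests the loss is in how that device answers probes rather than in how it forwards traffic.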
Trace Using TCP Packets
Now, let’s see what changes if we use TCP by passing the -T flag and specifying the port with the -P flag:
mtr -T -P 443 example.com
Outputs:
My traceroute [v0.95]
dev (10.11.12.13) -> example.com (93.184.216.34) 2023-01-16T23:07:46+0100
Keys: Help Display mode Restart statistics Order of fields quit
Host Loss% Snt Last Avg Best Wrst StDev
1. _gateway 0.0% 35 0.3 0.3 0.2 0.5 0.1
2. 192.168.143.254 0.0% 35 0.4 0.4 0.2 0.5 0.1
4. 10.224.217.254 0.0% 35 0.5 0.5 0.4 0.6 0.1
5. 10.224.216.166 0.0% 35 0.5 0.5 0.4 0.7 0.1
10.224.216.164
6. 10.224.138.62 0.0% 35 0.5 0.5 0.4 0.8 0.1
10.224.225.212
7. 10.17.151.114 0.0% 35 1.0 116.4 0.7 3043. 536.6
10.17.146.2
8. 10.73.0.230 0.0% 35 0.5 0.5 0.4 0.6 0.1
10.73.0.98
9. 10.95.33.8 0.0% 34 1.7 37.7 1.1 1011. 173.6
10.95.33.10
10. lon-thw-sbb1-nc5.uk.eu 0.0% 34 3.7 3.8 3.6 4.6 0.2
be102.lon-drch-sbb1-nc5.uk.eu
11. nyc-ny1-sbb2-8k.nj.us 0.0% 34 1084. 1437. 71.5 7352. 2184.
nyc-ny1-sbb1-8k.nj.us
12. 10.200.3.131 0.0% 34 74.5 76.5 70.0 82.5 3.1
10.200.3.137
13. de-cix.nyc1.edgecast.com 0.0% 34 87.6 75.2 70.7 88.8 5.2
core1.lga.edgecastcdn.net
14. ae-65.core1.nyb.edgecastcdn.net 0.0% 34 73.8 78.7 72.6 111.1 7.5
ae-71.core1.nyb.edgecastcdn.net
15. 93.184.216.34 2.9% 34 70.5 72.0 70.3 76.1 1.8
A few interesting things happened here! First, the nyc-ny1-sbb1-8k.nj.us hop that had a lot of packet loss now shows no loss at all. Second, we see more than one host for some hop numbers. This is because some sort of load balancing (e.g. ECMP) is taking place.
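A toy model can make the load-balancing effect more concrete. The Python sketch below assumes a router that hashes a flow's 5-tuple to choose between equal-cost next hops, and that successive TCP probes use different source ports. Both are simplifying assumptions, and pick_next_hop and the CRC32 hash are stand-ins rather than what any real router runs. The two addresses are the ones seen at hop 5 above.

```python
import zlib


def pick_next_hop(src_ip, dst_ip, protocol, src_port, dst_port, next_hops):
    """Pick one of several equal-cost next hops from a hash of the 5-tuple."""
    flow = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    return next_hops[zlib.crc32(flow) % len(next_hops)]


def hosts_seen(src_ports, next_hops):
    """Collect the next hops chosen for probes that differ only in source port."""
    return {
        pick_next_hop("10.11.12.13", "93.184.216.34", "tcp", port, 443, next_hops)
        for port in src_ports
    }


if __name__ == "__main__":
    hop5 = ["10.224.216.166", "10.224.216.164"]
    # A single flow always hashes to the same next hop ...
    print(hosts_seen([40000], hop5))
    # ... while probes from many source ports may be spread across both.
    print(sorted(hosts_seen(range(40000, 40064), hop5)))
```

The point of the sketch is only that the choice depends on the flow, so a trace made of many slightly different flows can reveal several hosts at the same hop number.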
Good. Let’s explore some more options.
Trace Using UDP Packets
The steps required to use UDP are quite similar to the TCP example. Just swap the -T flag for a lowercase -u:
mtr -u -P 53 1.0.0.1
This can be helpful for tracing the path to DNS or HTTP/3 servers.
More Advanced Usage
Here are some examples that I mostly use myself:
- mtr -T -P 443 -w -n -c 10 example.com
  -T: Tells MTR to use TCP packets instead of the default ICMP packets for the trace.
  -P 443: Specifies that MTR should use TCP port 443 for the trace.
  -w: Tells MTR to produce a wide report, which does not truncate long hostnames.
  -n: Tells MTR to display IP addresses instead of hostnames in the output.
  -c 10: Tells MTR to send 10 probes to each hop (10 cycles) and then exit. A higher count can give a more accurate representation of the network conditions.
- mtr -T -P 443 -w -n -c 10 -z example.com
  Very similar to the first command, but displays the Autonomous System (AS) number alongside each hop by providing the -z flag, e.g.:
  <OUTPUT OMITTED FOR BREVITY>
  13. AS15133 152.195.68.141 0.0% 10 72.7 76.2 72.7 83.5 3.1
  152.195.68.131 AS15133 152.195.68.131
  152.195.69.131 AS15133 152.195.69.131
  152.195.69.139 AS15133 152.195.69.139
  14. AS15133 93.184.216.34 0.0% 10 73.3 71.5 70.4 74.1 1.3
- mtr -T -P 443 -j -n example.com | jq '.report.hubs[].host'
  The -j flag instructs MTR to output its data in JSON format, and with a tool like jq we can extract the data we need. In this case, I have extracted the hosts of all the hops.
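If you would rather post-process the JSON in a script than in jq, the following Python sketch shows one way to do it. Only the report.hubs[].host path is taken from the command above. The other field names (count, Loss%, Avg) and the hand-written SAMPLE document are assumptions about what the JSON looks like, so check them against the output of your own mtr version.

```python
import json
import subprocess

# Hand-written sample, not real mtr output; field names other than "host" are assumed.
SAMPLE = """
{"report": {"hubs": [
  {"count": 1, "host": "10.95.33.8", "Loss%": 0.0, "Avg": 37.7},
  {"count": 2, "host": "93.184.216.34", "Loss%": 2.9, "Avg": 72.0}
]}}
"""


def run_mtr_json(target: str, port: int = 443) -> str:
    """Run the TCP trace from the example above and return its raw JSON output."""
    cmd = ["mtr", "-T", "-P", str(port), "-j", "-n", target]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def hosts(mtr_json: str) -> list[str]:
    """Equivalent of jq '.report.hubs[].host'."""
    return [hub["host"] for hub in json.loads(mtr_json)["report"]["hubs"]]


def hubs_with_loss(mtr_json: str, threshold: float = 0.0) -> list[tuple[str, float]]:
    """Return (host, loss) pairs for hubs whose loss exceeds the threshold."""
    hubs = json.loads(mtr_json)["report"]["hubs"]
    return [(h["host"], h["Loss%"]) for h in hubs if h.get("Loss%", 0.0) > threshold]


if __name__ == "__main__":
    print(hosts(SAMPLE))
    print(hubs_with_loss(SAMPLE))
```

To use it on a live trace, pass the result of run_mtr_json("example.com") instead of SAMPLE, which requires mtr to be installed on the machine running the script.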
Conclusion
In this post, we learned about the mtr command and took the opportunity to also talk about the control plane, data plane, slow path, and fast path, and how they influence our analysis of the data we gather.