There's no 10 ways about it, Subnets are Genius
The power of binary in action!
I would hazard a guess that the vast majority of people who have ever had to enter a "subnet mask" into their network settings have no idea what it means, much less how amazingly powerful and beautifully simple a solution it is to the problem it solves. In this post, we'll peel back the mystery, talk a little binary and basic computer operations, and hopefully do it in a way that even a non-technical person could understand.
What is a Subnet?
So what is a Subnet and what does it do?
I'm going to use the word "computer" a lot, but really that's a stand-in for any network device. We're also just talking IPv4 in our examples here for simplicity, but the same rules apply to IPv6, just with 128-bit numbers.
In its simplest form, if two computers are on a network and Computer A wants to talk to Computer B, it just hops on the wire (or wireless, sure...) and says “Hey B, I’ve got some data for you!”. That’s great if both computers are on the same local network, where they can hear each other talk. But what happens if Computer A wants to talk to Computer C, but C is on a different network?
Computer A needs to know that it can’t just shout into its local network expecting C to answer. Instead, it needs to send its request to someone who knows how to reach C: a “gateway” between Computer A’s network and whatever network Computer C lives on.
But how does Computer A know that Computer C is not on its network? How does it know Computer B is? You as a human can look at two IP addresses and say "well, they both start with 192.168.10., so they must be in the same network, right?" But how does the computer make that determination? What if your network is divided in such a way that 192.168.10.10 and 192.168.10.200 are actually on different networks? This is the problem that the subnet mask solves.
Let’s say you have the following two networks:
LAN 1
Hostname | IP Address |
---|---|
gateway-a | 192.168.10.1 |
alpha | 192.168.10.10 |
beta | 192.168.10.11 |
LAN 2
Hostname | IP Address |
---|---|
gateway-b | 192.168.20.1 |
gamma | 192.168.20.10 |
delta | 192.168.20.11 |
For the sake of simplicity, let’s assume gateway-a and gateway-b are connected in some way so they know how to talk to both LAN 1 and LAN 2 and pass along traffic (aka “routing”). They could, for example, be the same computer with two network interfaces, one for each network.
Now when alpha wants to talk to beta, you as a human can see that the 192.168.10. part of both of their addresses is the same, so clearly both hosts should be on the same network. When talking to gamma, the third octet is different (192.168.10. vs. 192.168.20.), so it must be a different network. Or is it? What decides which parts of the IP address have to match for two hosts to be on the same network?
So let’s dive into that subnet mask setting. You’ve all seen it and you’ve all entered the same thing every time… 255.255.255.0. Maybe you’ve been told to enter something other than 255.255.255.0 and you were overcome with a feeling of panic, unsure why your world was suddenly turned upside down! Maybe you saw it referred to as a “/24” and you just went along with it, hoping no one would catch on that you had no idea what the difference was between a /24, a /23 and a /💩.
In order to explain how a subnet mask lets a computer tell whether two IP addresses are in the same network, we need a little background on what IP addresses are. We’re just going to talk about IPv4 here; IPv6 is the same idea, just with bigger numbers.
The IP address is how a computer identifies itself on a network. Sure, we may have given our computers names like “alpha” and “beta”, and that’s great for us humans, but computers deal with numbers, so they turn that name into a number (ex. 192.168.10.10). That’s done through something called DNS (the Domain Name System), but that’s another post…
So computers like to deal with numbers, not with names, but why the “dotted quad” like 192.168.10.10? And how come each “quad” can only be a number from 0 to 255? Again, it’s for the convenience of us poor humans who aren’t as good with numbers. It’s easier for us to remember 4 small numbers in a row than it is for us to remember a single large number with as many as 10 digits, and it’s a lot less prone to errors if we have to type it or read it out to another person.
To a computer, the IP address (again IPv4 only here) is just a single, 32-bit number. Extremely efficient for a computer to store (esp. a 32-bit computer) and do calculations with. Computers were created to do math on numbers and we identify each computer by a single number. Great!
A Brief Binary Background
So let’s keep diving down the rabbit hole and make sure we’re up on what a 32-bit number is, since this stuff is simple enough that you don’t need a computer science degree to understand it. Everything computers do internally is done with numbers (like we just said), but more specifically computers really operate with only two numbers: the 0 and the 1, or “binary”. Computers operate using electricity, and just like a light switch has one of two states (on or off), the most basic element a computer can store is a “bit”. The computer stores the state of that bit by either giving it a small electric charge or not: “on” or “off”… “1” or “0”.
A single bit that can count all the way up to 1 isn’t very exciting, but if we string together a few more bits, we can start working with some real numbers (get it?)
So just like when you count up to 9 you add a digit to the left and reset the 9 back down to 0 to get “10”, you do the same with binary when counting up, though you can only count up to “1”. To better show this if you haven’t seen counting in binary before, here’s how you’d count up to “4”:
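As a rough sketch of that counting, here is a tiny Python snippet that prints 0 through 4 next to their binary form. The function name `count_in_binary` and the 3-digit width are just choices made for this illustration.

```python
def count_in_binary(limit, width=3):
    """Return (decimal, binary string) pairs for every number from 0 to limit."""
    # format(n, "03b") writes n in base 2, padded with leading zeros to 3 digits
    return [(n, format(n, f"0{width}b")) for n in range(limit + 1)]


for decimal, binary in count_in_binary(4):
    print(f"{decimal} -> {binary}")

# 0 -> 000
# 1 -> 001
# 2 -> 010
# 3 -> 011
# 4 -> 100
```

Notice how going from 1 to 2 (001 to 010) and from 3 to 4 (011 to 100) "rolls over" to the left, the same way 9 rolls over to 10 in decimal.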
String 8 bits together and you get what computer nerds call a “byte”, but all it is is a number that can be represented by a combination of 8 1’s and 0’s.
To demonstrate that this arrangement of 1’s and 0’s actually does represent a number you’re familiar with, here’s how you make the conversion. Each position represents the next power of 2, so if you take all the positions that have a 1 in them, and add up their powers of two, you’ll get the decimal number you’re familiar with.
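To make the powers-of-two idea concrete, here is a minimal sketch in Python. The helper name `bits_to_decimal` is made up for this example; Python's built-in `int("00001010", 2)` does the same job, but spelling it out shows the arithmetic.

```python
def bits_to_decimal(bits):
    """Add up the power of two for every position that holds a 1."""
    total = 0
    # Walk the bits right to left: the rightmost position is worth 2**0,
    # the next one 2**1, then 2**2, and so on.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(bits_to_decimal("00001010"))  # 8 + 2 = 10
print(bits_to_decimal("11000000"))  # 128 + 64 = 192
print(bits_to_decimal("11111111"))  # every position set = 255
```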
So now that we know a little about what binary is, let’s look at how a computer stores an IP address, using 192.168.10.10 as our example. Remember we said that in our “dotted quad”, each number can be from 0 to 255. 255 just happens to be the largest number we can represent in one byte (8 bits), when all the positions have 1’s in them. If we put four 8-bit numbers together (4 x 8 = 32) we get a 32-bit number.
We’ll put the “.” between each byte to make it easy for us humans to read, but again to the computer it’s just one big number.
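Here is a small Python sketch of that idea, showing the same address both as four dotted bytes and as one big number. The helper names `ip_to_binary` and `ip_to_int` are invented for this post, and the code assumes a well-formed IPv4 address.

```python
def ip_to_binary(address):
    """Show each octet of a dotted-quad address as 8 binary digits."""
    return ".".join(format(int(octet), "08b") for octet in address.split("."))


def ip_to_int(address):
    """Combine the four octets into the single 32-bit number the computer sees."""
    value = 0
    for octet in address.split("."):
        # Each new octet pushes what we have so far one byte (x 256) to the left
        value = value * 256 + int(octet)
    return value


print(ip_to_binary("192.168.10.10"))  # 11000000.10101000.00001010.00001010
print(ip_to_int("192.168.10.10"))     # 3232238090
```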
So now we know how an IP address is stored inside the computer, we need to know about one more thing before we can take what we’ve learned and apply it to our subnet mask and that’s binary operations.
You’re familiar with basic math operations: addition, subtraction, multiplication, division, and computers can do all of those really well. But there’s a couple extra operations computers do, especially when dealing with binary numbers, that you might not be as familiar with. Two of those are AND and OR, sometimes represented with the symbols “&” and “|” (that’s a vertical pipe, not a capital I).
AND gives a True result if both the left and right sides evaluate to True. OR gives a True result if either the left or right side (or both!) evaluate to True.
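A quick way to see this is to try it on single bits and then on a whole byte. This is only an illustrative Python sketch; `bitwise_demo` is a made-up helper, and the two sample bytes are arbitrary.

```python
# Single bits first: every combination of 0 and 1
for left in (0, 1):
    for right in (0, 1):
        print(f"{left} & {right} = {left & right}    {left} | {right} = {left | right}")


def bitwise_demo(a, b):
    """Apply AND and OR to two bytes, one bit position at a time."""
    return format(a & b, "08b"), format(a | b, "08b")


and_result, or_result = bitwise_demo(0b11001010, 0b11110000)
print(and_result)  # 11000000 (a 1 only where BOTH inputs have a 1)
print(or_result)   # 11111010 (a 1 where EITHER input has a 1)
```

When applied to multi-bit numbers, the operation is done column by column, which is exactly what the subnet mask relies on.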
There's a slew of other binary operations, things like XOR, NOT and shift operators, but that's...another post...
The Subnet Mask
Now we should have all the tools to understand the subnet mask and how it works. Remember, the problem it solves is making it easy to determine whether another host is in the same network as you. To do this, the computer does a binary AND of its own IP address with the subnet mask, and another binary AND of the other host's IP address with the same subnet mask. If the resulting values are the same, the hosts are in the same network. If the results are different, the networks are different and the message must be directed to the default gateway instead.
Let’s see it in action by comparing our hosts alpha and beta:

    192 . 168 . 10 . 10  (alpha)
    11000000 . 10101000 . 00001010 . 00001010

    255 . 255 . 255 . 0  (netmask)
    11111111 . 11111111 . 11111111 . 00000000

      11000000 . 10101000 . 00001010 . 00001010
    & 11111111 . 11111111 . 11111111 . 00000000
    ============================================
      11000000 . 10101000 . 00001010 . 00000000 = 3232238080

    192 . 168 . 10 . 11  (beta)
    11000000 . 10101000 . 00001010 . 00001011

    255 . 255 . 255 . 0  (netmask)
    11111111 . 11111111 . 11111111 . 00000000

      11000000 . 10101000 . 00001010 . 00001011
    & 11111111 . 11111111 . 11111111 . 00000000
    ============================================
      11000000 . 10101000 . 00001010 . 00000000 = 3232238080

As you can see, after you perform an AND (&) of each host's address with the netmask, the result is the same for both. Let’s try it with the host gamma:

    192 . 168 . 20 . 10  (gamma)
    11000000 . 10101000 . 00010100 . 00001010

    255 . 255 . 255 . 0  (netmask)
    11111111 . 11111111 . 11111111 . 00000000

      11000000 . 10101000 . 00010100 . 00001010
    & 11111111 . 11111111 . 11111111 . 00000000
    ============================================
      11000000 . 10101000 . 00010100 . 00000000 = 3232240640

    3232238080 != 3232240640

See how we get a different result? That lets us know the two addresses are not in the same network.
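The whole comparison can be sketched in a few lines of Python. This is an illustration only, not how any particular operating system implements it; the helpers `ip_to_int`, `network_of` and `same_network` are names invented for this post.

```python
def ip_to_int(address):
    """Turn a dotted-quad string into the single 32-bit number it represents."""
    value = 0
    for octet in address.split("."):
        value = value * 256 + int(octet)
    return value


def network_of(address, netmask):
    """AND the address with the netmask, keeping only the network part."""
    return ip_to_int(address) & ip_to_int(netmask)


def same_network(source, destination, netmask):
    """True when both addresses give the same result after the AND."""
    return network_of(source, netmask) == network_of(destination, netmask)


NETMASK = "255.255.255.0"
print(network_of("192.168.10.10", NETMASK))                     # 3232238080
print(same_network("192.168.10.10", "192.168.10.11", NETMASK))  # True  (alpha, beta)
print(same_network("192.168.10.10", "192.168.20.10", NETMASK))  # False (alpha, gamma)

# With a different mask, even 192.168.10.10 and 192.168.10.200 are split apart
print(same_network("192.168.10.10", "192.168.10.200", "255.255.255.128"))  # False
```

The last line answers the question from earlier: whether two addresses that "look" alike share a network depends entirely on the mask.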
Subnet Formats (CIDR Notation)
So another real quick aside on how a netmask is formatted. I don't want to dive into how networks are divided (we'll leave that for another post), but we have enough knowledge now to explain a different format that you may have seen: an IP address, followed by a "/" and another number. This is called CIDR notation and it's a shorthand for describing the netmask. You can describe the address and netmask for the alpha host we've been using in this example as 192.168.10.10/24.
That /XX number indicates how many 1's are used in the binary representation of our netmask (starting from the left or "most significant" digits). So a /24 has 24 1's before filling out the remainder of the 32-bit number with 0's. A /23 has 23, a /16 has 16, etc. Here's a little bit of what that looks like:

    /32  11111111.11111111.11111111.11111111  255.255.255.255
    /31  11111111.11111111.11111111.11111110  255.255.255.254
    /30  11111111.11111111.11111111.11111100  255.255.255.252
    /29  11111111.11111111.11111111.11111000  255.255.255.248
    /28  11111111.11111111.11111111.11110000  255.255.255.240
    /27  11111111.11111111.11111111.11100000  255.255.255.224
    /26  11111111.11111111.11111111.11000000  255.255.255.192
    /25  11111111.11111111.11111111.10000000  255.255.255.128
    /24  11111111.11111111.11111111.00000000  255.255.255.0
    /23  11111111.11111111.11111110.00000000  255.255.254.0
    /22  11111111.11111111.11111100.00000000  255.255.252.0
    /21  11111111.11111111.11111000.00000000  255.255.248.0
    /20  11111111.11111111.11110000.00000000  255.255.240.0
    /19  11111111.11111111.11100000.00000000  255.255.224.0
    /18  11111111.11111111.11000000.00000000  255.255.192.0
    /17  11111111.11111111.10000000.00000000  255.255.128.0
    /16  11111111.11111111.00000000.00000000  255.255.0.0

Again, not digging into how networks are divided here, but as you may now guess, the 1's and 0's in a netmask are not random: the 1's are always grouped together starting from the most significant (leftmost) position, as shown above, and a netmask is defined entirely by how many of those leading bits are 1's.
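Since a netmask is fully described by its count of leading 1's, the dotted form can be computed from the /XX number. Below is a hedged Python sketch: `prefix_to_netmask` is a helper invented for this post, and it uses shift operators, which we haven't covered, to slide the 1's into place. The final line cross-checks one value against Python's standard `ipaddress` module.

```python
import ipaddress


def prefix_to_netmask(prefix_length):
    """Build a dotted-quad netmask from a CIDR prefix length (0 to 32)."""
    if not 0 <= prefix_length <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    # Start with 32 ones, shift zeros in from the right, and trim back to 32 bits
    mask = (0xFFFFFFFF << (32 - prefix_length)) & 0xFFFFFFFF
    # Peel the 32-bit number apart into four bytes, leftmost first
    octets = [(mask >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


for prefix in (32, 31, 24, 23, 16):
    print(f"/{prefix}  {prefix_to_netmask(prefix)}")

# Cross-check against the standard library's own answer for a /24
print(prefix_to_netmask(24) == str(ipaddress.ip_network("192.168.10.0/24").netmask))  # True
```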
Why is This Important / So Genius?
The subnet mask lets the computer make this important routing decision and it lets it do it in a single operation that for a computer is very, very fast.

    if (source_ip & netmask) == (dest_ip & netmask) then
        # local network
    else
        # routed network

And this speed is critical to making the Internet fast. I took a capture of the network traffic on my computer for one minute while I surfed the web, checked email and watched a video on YouTube. Over 24,000 packets were transmitted during that 60-second period. For every one of those packets, that very decision of where to send the packet needed to be made. And it wasn't just on my computer: every computer and network device in between me and the destination had to make the same kind of calculation. In this example, that was 400 times a second, and even though I was watching a YouTube video I was hardly stressing my gigabit internet connection.
When you step back and think about how many billions of packets must be transmitted globally every second (and I'm not even sure billions is a large enough scale), each one having that same calculation applied at every single hop, hopefully you can see why I find a little bit of elegant beauty in the simplicity of it all. Something most people just enter into their settings without any idea what it actually does.