I need the ability to generate random MAC addresses, so I wrote up a little function that does that:
>>> random_mac()
'7C:93:B7:AF:BA:AE'
>>> random_mac()
'D8:D8:A0:D4:A5:3F'
>>> random_mac(unicast=False, universal=True)
'55:47:C6:EE:C6:2B'
>>> random_mac(unicast=True, universal=False)
'FE:A1:4B:98:76:B6'
I decided to allow the user to pick if it's unicast/multicast and universally/locally administered; even though I'll only need unicast/universal. This caused me headaches though because I'm still not great at dealing with bits. The LSB of the first octet indicates uni/multicast, and the second LSB of the octet indicates universal/local, so these bits can't be random.
After playing around with a few failed ideas (generating all random bits, then "fixing" the two bits later), I finally decided to generate a random number between 0 and 63 inclusive, left shift it twice, than add the two bits on after. It works, but it's ugly and looks suboptimal.
It's not a lot of code, but I'd like a few things reviewed:
- Is there a better approach? It feels hacky generating it as two pieces then adding them together. I tried explicitly setting the bits, but the code to decide between
|
, and&
and~
got messier than what I have now, so I went with this way. - The number constants are bugging me too. The numbers kind of sit on a border of self-explanatory and magic, so I decided to name them to be safe.
LAST_SIX_BITS_VALUE
feels off though. - Is treating a boolean value as a number during bitwise operation idiomatic? Is it clear as I have it now?
- Attaching the first octet to the rest is suboptimal as well. Speed isn't a huge concern, but I'm curious if there's a cleaner way that I'm missing.
from random import randint, randrange
N_MAC_OCTETS = 6
OCTET_VALUE = 256
LAST_SIX_BITS_VALUE = 63
def random_mac(unicast: bool = True, universal: bool = True) -> str:
least_two_bits = (not unicast) + ((not universal) << 1)
first_octet = least_two_bits + (randint(0, LAST_SIX_BITS_VALUE) << 2)
octets = [first_octet] + [randrange(OCTET_VALUE) for _ in range(N_MAC_OCTETS - 1)]
return ":".join(f"{octet:02X}" for octet in octets)
Examples of the bits for the first octet for different inputs:
def display(mac):
print(mac, f"{int(mac.split(':')[0], 16):08b}")
# Unicast, Universal
>>> display(random_mac(True, True))
04:27:DE:9A:1B:D7 00000100 # Ends with 0,0
# Unicast, Local
>>> display(random_mac(True, False))
72:FB:49:43:D5:F2 01110010 # 1,0
# Multicast, Universal
>>> display(random_mac(False, True))
7D:BF:03:4E:E5:2A 01111101 # 0,1
# Multicast, Local
>>> display(random_mac(False, False))
2F:73:52:12:8C:50 00101111 # 1,1
5 Answers 5
Negating an argument is somewhat counterintuitive. Consider passing them as
multicast
andlocal
instead.I would seriously consider defining
UNIVERSAL = 0x01 MULTICAST = 0x02
and pass them as a single argument, is in
random_mac(UNIVERSAL | MULTICAST)
Using both
randint
andrandrange
feels odd. I would stick withrandrange
.First octet needs a special treatment anyway. That said, consider
def random_mac(special_bits = 0): octets = [randrange(OCTET_VALUE) for _ in range(N_MAC_OCTETS)] octets[0] = fix_octet(octet[0], special_bits) return ":".join(f"{octet:02X}" for octet in octets)
with
def fix_octet(octet, special_bits): return (octet & ~0x03) | special_bits
-
\$\begingroup\$ @Carcigenicate:
fix_octet
just clears the low 2 bits, then ORs in what should be there. Looks very comprehensible to me, and exactly what I would have suggested if there wasn't already an answer doing that. (Generating all 6 octets the same way and modifying 1 is also what I was going to suggest). Getting the caller to pass ORed bitflags is great; it simplifies the implementation and makes the call-site more self-documenting. (And is a common idiom in C for system calls likemmap(..., MAP_PRIVATE | MAP_ANONYMOUS | ...)
, or for ORing together permission bit flags, or lots of other things.) \$\endgroup\$Peter Cordes– Peter Cordes2020年10月12日 02:42:52 +00:00Commented Oct 12, 2020 at 2:42 -
\$\begingroup\$ @PeterCordes I may be misunderstanding something, but I was just saying that this currently seems to give an incorrect result. I think it's because the flags need to be
MULTICAST = 0x01; LOCAL = 0x02
instead of what they currently are. \$\endgroup\$Carcigenicate– Carcigenicate2020年10月12日 03:12:41 +00:00Commented Oct 12, 2020 at 3:12 -
\$\begingroup\$ @Carcigenicate: I was replying to your first comment. You second comment didn't point out what the problem was or give an example, and I didn't notice the bit-flags definitions were backwards. But yes,
0x1
is the LSB,0x2
is the second bit, so that matches your question (and wikipedia). \$\endgroup\$Peter Cordes– Peter Cordes2020年10月12日 03:14:54 +00:00Commented Oct 12, 2020 at 3:14 -
\$\begingroup\$ @PeterCordes OK, now this makes sense. Thank you. \$\endgroup\$Carcigenicate– Carcigenicate2020年10月12日 03:17:05 +00:00Commented Oct 12, 2020 at 3:17
-
1\$\begingroup\$ After changing the flags to
MULTICAST = 0x01; LOCAL = 0x02
, this does indeed give the correct output. Thank you for the suggestions. \$\endgroup\$Carcigenicate– Carcigenicate2020年10月12日 16:02:23 +00:00Commented Oct 12, 2020 at 16:02
Some observations on the API
Naming
The IEEE strongly discourages use of the name MAC or MAC-48. These names should only be used as an obsolete label for EUI-48.
It is also imprecise, since not all MAC addresses are EUI-48 addresses. For example, FireWire MAC addresses are EUI-64.
So, your function should probably be named random_eui48
instead.
Keyword-only arguments
Having two boolean parameters can lead to confusion. I would make them keyword-only arguments so that the caller is always forced to name them:
def random_eui48(*, unicast: bool = True, universal: bool = True) -> str:
Defaults
I agree with the choice of making Unicast the default. It is probably what users will usually need more often. However, I disagree with making universally administered addresses the default. In fact, I find it highly dubious to randomly generate UAAs at all. At most, you should randomly generate addresses within an OUI you own.
So, I would very much prefer to make LAAs the default.
Choice of parameters
I would choose the parameters such that they are "off-by-default" (False
) and ca be "turned on" by the caller:
def random_eui48(*, multicast: bool = False, universal: bool = False) -> str:
API extension: supply OUI
It really only makes sense to generate a UAA within an OUI you own. Therefore, your API should provide for passing in a OUI to generate an addresses for. Make sure you take care of both the MAC-S and MAC-L registries!
Implementation
An EUI-48 is a 48 bit number. I find it strange to treat it as a conglomerate of 5 8 bit and one 6 bit number.
-
2\$\begingroup\$ Thank you. Regarding "EUI-48", is "MAC" simply not used in industry, or should it only be avoided when you're referring to a physical address of a certain length? I've actually never heard of "EUI-48" before; beyond the mention on the Wiki page. And as for why I'm randomizing the OUI as well: I'm writing a malicious DHCP client and need random MACs so that I can continue requesting new addresses while keeping the previous addresses on lease. \$\endgroup\$Carcigenicate– Carcigenicate2020年10月12日 13:53:52 +00:00Commented Oct 12, 2020 at 13:53
-
3\$\begingroup\$ Oh, MAC is being used all over the place, it's just that the IEEE strongly advises against it. If you really are talking about MAC addresses, I think it makes sense to use the term "MAC", but be aware that, depending on the protocol, MAC addresses need not be 48 bit. And conversely, EUIs are used for addresses other than MAC addresses. E.g. the Clock ID in the Precision Type Protocol is an EUI, albeit an EUI-64. \$\endgroup\$Jörg W Mittag– Jörg W Mittag2020年10月13日 05:44:13 +00:00Commented Oct 13, 2020 at 5:44
-
\$\begingroup\$ I'm not sure the IEEE disapproves of the use of "MAC" as a general term. Rather, my understanding is that the term "MAC-48" is deprecated in favour of "EUI-48". "MAC address" usually refers to how the identifier is used, where as "[MAC/EUI]-48" refers to the numbering scheme. \$\endgroup\$Tom Wright– Tom Wright2020年10月13日 16:21:30 +00:00Commented Oct 13, 2020 at 16:21
random.randrange()
takes start, stop, and step arguments just like range()
. To select the first octet, start is based on the unicast and universal flags, end is 256, and step is 4 (four possible combinations of unicast and universal).
N_MAC_OCTETS = 6
OCTET_VALUE = 256
LAST_SIX_BITS_VALUE = 63
def random_mac(unicast: bool = True, universal: bool = True) -> str:
first_octet = randrange(3 ^ (universal*2 + unicast), OCTET_VALUE, 4)
octets = [first_octet] + [randrange(OCTET_VALUE) for _ in range(N_MAC_OCTETS - 1)]
return ":".join(f"{octet:02X}" for octet in octets)
or better:
UNICAST = 0
MULTICASE = 1
UNIVERSAL = 0
LOCAL = 2
def random_mac(flags: int = UNICAST | UNIVERSAL) -> str:
first_octet = randrange(flags, OCTET_VALUE, 4)
octets = [first_octet] + [randrange(OCTET_VALUE) for _ in range(N_MAC_OCTETS - 1)]
return ":".join(f"{octet:02X}" for octet in octets)
Called like:
random_mac(LOCAL | MULTICAST)
-
1\$\begingroup\$ Using
universal = True
for unicast is the opposite of the bit value. That is, whenunicast
it true the bit value is false. Similarly foruniversal
. This smells like a source of bugs. I'd suggest changing the parameters tomulticast
andlocal
so they match the bit values or, as vnp suggested, define some constants that can be or-ed. \$\endgroup\$RootTwo– RootTwo2020年10月11日 20:59:00 +00:00Commented Oct 11, 2020 at 20:59
While still using @vnp's fix_octet()
function, an alternate approach might be
def random_mac(special_bits = 0):
return ':'.join('%02x'%randint(0,255) if i != 0 else '%02x'%fix_octet(randint(0,255),special_bits) for i in range(6))
(not unicast) + ((not universal) << 1)
- When manipulating bits or bitfields, use
|
, not+
. Even though the result will be the same here, the semantic is different. - You read left-to-right. So handle the bits left to right.
- I do agree with the fact that negations quickly mess with your head.
I'd rather write:
(local << 1) | multicast
Going one step further, I'd replace:
least_two_bits = (not unicast) + ((not universal) << 1)
first_octet = least_two_bits + (randint(0, LAST_SIX_BITS_VALUE) << 2)
With
first_octet = (randint(0, LAST_SIX_BITS_VALUE) << 2) | (local << 1) | multicast
You could define LAST_SIX_BITS_VALUE
as ((1 << 6)-1)
to make it more explicit that its value comes from the need for 6 bits. One step further would be to define
FIRST_OCTET_RANDOM_BITS_NUMBER = 6
FIRST_OCTET_RANDOM_BITS_MAX_VALUE = (1 << FIRST_OCTET_RANDOM_BITS_NUMBER) - 1
I agree that mixing randint
(where the top value is inclusive) and randrange
(where it isn't) is confusing.
randmac
does it. randmac Python 3 \$\endgroup\$