Take your "pair" version to its logical conclusion:

using Mnemonic = std::string;
enum AddressMode { /* ... */ };
using Size = int;
using InstructionDescription = std::tuple<Mnemonic, AddressMode, Size>;
using Opcode = uint8_t;
std::map<InstructionDescription, Opcode> OpcodeTable = {
    {{"ADC", Indirect_X, 2}, 0x61},
    {{"ADC", Stack, 2}, 0x63},
    {{"ADC", Direct, 2}, 0x65},
    {{"ADC", Indirect_Long, 2}, 0x67},
    {{"ADC", Immediate, 2}, 0x69},
    {{"ADC", Direct, 3}, 0x6D},
    {{"ADC", Direct, 4}, 0x6F},
    {{"ADC", Indirect_Y, 2}, 0x71},
    {{"ADC", Indirect, 2}, 0x72},
    {{"ADC", Stack_Y, 2}, 0x73},
    // ...
};

There will only be 256 rows in this table, so it's totally manageable.
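
A lookup is then a single find on the composite key. As a minimal illustration (the helper name and the error handling below are mine, not part of the table code above):

#include <stdexcept>

// Illustrative helper; assumes the type aliases and OpcodeTable above are in scope.
// std::map::find returns end() for an unknown (mnemonic, mode, size) combination.
Opcode lookup_opcode(const Mnemonic& mnemonic, AddressMode mode, Size size)
{
    auto it = OpcodeTable.find({mnemonic, mode, size});
    if (it == OpcodeTable.end()) {
        throw std::runtime_error("unknown instruction: " + mnemonic);
    }
    return it->second;
}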

Or, do what real assembler-writers do: read the instruction set specification to learn how the bits of the opcode correspond to the meaning of the instruction, and then turn that into a short mathematical formula instead of a huge lookup table.

For example:

enum AddressMode : uint8_t {
    Indirect_X = 0x01,
    Stack = 0x03,
    Direct = 0x05,
    Indirect_Long = 0x07,
    Immediate = 0x09,
    Indirect_Y = 0x11,
    Indirect = 0x12,
    Stack_Y = 0x13,
    // ...
};

Opcode get_opcode(Mnemonic instr, AddressMode mode, Size size)
{
    if (instr == "ADC") {
        unsigned size_mask = (size == 2) ? 0 : (size == 3) ? 8 : 10;
        return 0x60 + int(mode) + size_mask;
    } else if // ...
}
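
For instance, get_opcode("ADC", Immediate, 2) works out to 0x60 + 0x09 + 0 = 0x69, which matches the {"ADC", Immediate, 2} row in the table above.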

> I'd think having to search all the instructions every time (up to 3 times per instruction based on optimizations) instead of an O(1) lookup seems like it may increase the time taken to assemble the code which may be over 100k lines in some tests, but I don't have any data to back that up.

Lookups in a std::map are not O(1); they're O(log n) and involve a ton of pointer traversals. Doing the same O(log n) operations on a sorted array (which is going to be contiguous in memory, therefore cache-friendly) will be faster — probably even faster than using some actually "O(1)" (cough handwave cough) yet still cache-unfriendly container such as std::unordered_map.
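
To make the sorted-array idea concrete, here's a minimal sketch under some assumptions of mine: the kOpcodeTable and find_opcode names are invented, the row set is truncated, and AddressMode is flattened to the underlying integer values from the enum above.

#include <algorithm>
#include <cstdint>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

using Mnemonic = std::string;
using Size = int;
using Opcode = uint8_t;
using InstructionDescription = std::tuple<Mnemonic, int /* AddressMode */, Size>;

// Same rows as the map version, but stored in one contiguous, sorted vector.
const std::vector<std::pair<InstructionDescription, Opcode>> kOpcodeTable = [] {
    std::vector<std::pair<InstructionDescription, Opcode>> t = {
        {{"ADC", 0x01, 2}, 0x61},  // Indirect_X
        {{"ADC", 0x03, 2}, 0x63},  // Stack
        {{"ADC", 0x05, 2}, 0x65},  // Direct
        {{"ADC", 0x09, 2}, 0x69},  // Immediate
        // ...
    };
    std::sort(t.begin(), t.end());  // binary search requires sorted data
    return t;
}();

// O(log n) lookup via std::lower_bound over contiguous memory,
// rather than chasing tree nodes. Returns false if the key isn't present.
bool find_opcode(const InstructionDescription& key, Opcode& out)
{
    auto it = std::lower_bound(
        kOpcodeTable.begin(), kOpcodeTable.end(), key,
        [](const auto& row, const InstructionDescription& k) { return row.first < k; });
    if (it != kOpcodeTable.end() && it->first == key) {
        out = it->second;
        return true;
    }
    return false;
}

With three-letter mnemonics, the small-string optimization on typical standard-library implementations keeps the key characters inline in the row itself, so the table really does stay contiguous.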

Anyway, my second code example above is what you'll want to do if you really need the speed (which I'm sure you don't). Simple arithmetic operations will always be lightning-fast compared to any kind of container lookup. But at some point, probably long before any of this starts mattering, the real performance bottleneck will be in how fast you can read those 100K lines of assembly code off the filesystem and how fast you can write the machine code back out. The code that turns mnemonics into opcodes isn't the bottleneck; I guarantee it.
