The Intel 8254x series is comprised of: 82546GB/EB, 82545GM/EM, 82544GC/EI, 82541(PI/GI/EI), 82541ER, 82547GI/EI, and 82540EP/EM Gigabit Ethernet Controllers.
[[Image:Intel-82540EM.jpg|right|frame|Intel 82540EM-based card]]
== Overview ==
Intel 8254x-based cards come in 32-/64-bit, 33/66 MHz PCI and PCI-X flavors.
The Intel 82547GI(EI) connects to the motherboard via a Communications Streaming Architecture (CSA) port instead of a PCI/PCI-X bus.
The 82541xx and 82540EP/EM controllers do not support the PCI-X bus.
The Intel 8254x series heavily utilizes task offloading. Each controller has an "offloading engine" for tasks such as TCP/UDP/IP checksum calculations, packet filtering, and packet segmentation.
* Jumbo packets are supported.
* Wake on LAN (WoL) is supported.
* A four-wire serial EEPROM interface as well as a generic EEPROM "read" interface is implemented within the configuration registers.
* D0 and D3 power states are supported through ACPI.
==Programming==
===Detection===
Section 5.2 of the 8254x Software Developer's Manual lists the Vendor and Device IDs of the various devices in the 8254x series. These are used to detect devices on the PCI bus by reading the PCI Configuration Space registers.
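As a rough illustration of the detection step, the sketch below matches a vendor/device ID pair (already read from PCI configuration space by your own PCI code) against a table. The function name <code>i8254x_match</code> and the table are assumptions made for this example. Only the 82540EM ID (0x100E, the model QEMU's default e1000 NIC reports) is filled in; the remaining IDs should be taken from Section 5.2 of the manual.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_INTEL 0x8086

/* Deliberately incomplete: extend with the device IDs from Section 5.2. */
static const uint16_t i8254x_device_ids[] = {
    0x100E, /* 82540EM */
};

/* Returns true when the vendor/device ID pair belongs to a controller
 * listed in the table above. */
bool i8254x_match(uint16_t vendor_id, uint16_t device_id)
{
    if (vendor_id != PCI_VENDOR_INTEL)
        return false;

    size_t count = sizeof(i8254x_device_ids) / sizeof(i8254x_device_ids[0]);
    for (size_t i = 0; i < count; i++) {
        if (i8254x_device_ids[i] == device_id)
            return true;
    }
    return false;
}
```

A PCI enumeration loop would call <code>i8254x_match()</code> for every function it finds and hand matching devices to the driver.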
When using MMIO, reading and writing registers is straightforward.
<source lang="c" line="1">
// Address at which the memory BAR is mapped (placeholder)
uint64_t ioaddr = BAR_GOES_HERE;

// "register" is a reserved word in C, so the parameter is called "reg"
void write_register(uint16_t reg, uint32_t value){
    // volatile keeps the compiler from caching or dropping the access
    *(volatile uint32_t *)(ioaddr + reg) = value;
}

uint32_t read_register(uint16_t reg){
    return *(volatile uint32_t *)(ioaddr + reg);
}
</source>
IOADDR holds the register offset that the IODATA window operates on. The basic operation is therefore to write the desired register offset to the IOADDR window and then read or write the value through the IODATA window.
<source lang="c" line="1">
uint16_t ioaddr = IO_BAR_GOES_HERE; // base of the I/O BAR (placeholder)

// "register" is a reserved word in C, so the parameter is called "reg"
void write_register(uint16_t reg, uint32_t value){
    outl(ioaddr + 0x00, reg);   // select the register through the IOADDR window
    outl(ioaddr + 0x04, value); // write the value through the IODATA window; it ends up in the selected register
}

uint32_t read_register(uint16_t reg){
    outl(ioaddr + 0x00, reg);   // select the register through the IOADDR window
    return inl(ioaddr + 0x04);  // read the value through the IODATA window
}
</source>
==Device Registers==
The 8254x cards expose a large number of registers. A complete list of the registers and their offsets is given in Table 13-2 (page 219) of the [https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#G15.1220466 Intel 8254x Family of Gigabit Ethernet Controllers Software Developer's Manual].
Here are the most important ones:
{| class="wikitable"
!Abbreviation
!Name
!Manual Page
|-
|CTRL
|Device Control
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=238 224]
|-
|STATUS
|Device Status
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=243 229]
|-
|EECD
|EEPROM/Flash Control/Data
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=246 232]
|-
|EERD
|EEPROM Read (not applicable to the 82544GC/EI)
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=250 236]
|-
|ICR
|Interrupt Cause Read
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=306 292]
|-
|IMS
|Interrupt Mask Set / Read
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=311 297]
|-
|RCTL
|Receive Control
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=314 300]
|-
|RDBAL
|Receive Descriptor Base Low
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=320 306]
|-
|RDBAH
|Receive Descriptor Base High
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=320 306]
|-
|RDLEN
|Receive Descriptor Length
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=321 307]
|-
|RDH
|Receive Descriptor Head
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=321 307]
|-
|RDT
|Receive Descriptor Tail
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=322 308]
|-
|TCTL
|Transmit Control
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=324 310]
|-
|TDBAL
|Transmit Descriptor Base Low
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=329 315]
|-
|TDBAH
|Transmit Descriptor Base High
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=330 316]
|-
|TDLEN
|Transmit Descriptor Length
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=330 316]
|-
|TDH
|Transmit Descriptor Head
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=331 317]
|-
|TDT
|Transmit Descriptor Tail
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=332 318]
|-
|RAL(n)
|Receive Address Low (n)
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=343 329]
|-
|RAH(n)
|Receive Address High (n)
|[https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=343 329]
|}
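For convenience, the registers listed above could be collected in a header along the following lines. This is only a sketch: the <code>REG_*</code> names are assumptions chosen to resemble the identifiers used in the examples on this page, and the offsets are quoted from memory of the manual's register summary, so double-check them against Table 13-2 before relying on them.

```c
#include <stdint.h>

/* Register offsets in bytes from the start of the register space. */
enum {
    REG_CTRL   = 0x0000, /* Device Control */
    REG_STATUS = 0x0008, /* Device Status */
    REG_EECD   = 0x0010, /* EEPROM/Flash Control/Data */
    REG_EERD   = 0x0014, /* EEPROM Read */
    REG_ICR    = 0x00C0, /* Interrupt Cause Read */
    REG_IMS    = 0x00D0, /* Interrupt Mask Set / Read */
    REG_RCTL   = 0x0100, /* Receive Control */
    REG_TCTL   = 0x0400, /* Transmit Control */
    REG_RDBAL  = 0x2800, /* Receive Descriptor Base Low */
    REG_RDBAH  = 0x2804, /* Receive Descriptor Base High */
    REG_RDLEN  = 0x2808, /* Receive Descriptor Length */
    REG_RDH    = 0x2810, /* Receive Descriptor Head */
    REG_RDT    = 0x2818, /* Receive Descriptor Tail */
    REG_TDBAL  = 0x3800, /* Transmit Descriptor Base Low */
    REG_TDBAH  = 0x3804, /* Transmit Descriptor Base High */
    REG_TDLEN  = 0x3808, /* Transmit Descriptor Length */
    REG_TDH    = 0x3810, /* Transmit Descriptor Head */
    REG_TDT    = 0x3818, /* Transmit Descriptor Tail */
    REG_RAL0   = 0x5400, /* Receive Address Low, entry 0 */
    REG_RAH0   = 0x5404  /* Receive Address High, entry 0 */
};

/* The receive address registers form an array of RAL/RAH pairs,
 * 8 bytes per entry. */
static inline uint16_t reg_ral(unsigned n) { return (uint16_t)(REG_RAL0 + 8u * n); }
static inline uint16_t reg_rah(unsigned n) { return (uint16_t)(REG_RAH0 + 8u * n); }
```

With these in place, <code>write_register(reg_ral(0), ...)</code> addresses the first receive address entry.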
{| class="wikitable"
|+The Device Control Register (CTRL)
!Bit(s)
!Field
!Description
|-
|0
|FD
|Full-Duplex
|-
|3
|LRST
|Link Reset
|-
|5
|ASDE
|Auto-Speed Detection Enable
|-
|6
|SLU
|Set Link Up
|-
|7
|ILOS
|Invert Loss-of-Signal
|-
|9:8
|SPEED
|Speed selection
|-
|11
|FRCSPD
|Force Speed
|-
|12
|FRCDPLX
|Force Duplex
|-
|18
|SDP0_DATA
|SDP0 Data Value
|-
|19
|SDP1_DATA
|SDP1 Data Value
|-
|20
|ADVD3WUC
|D3Cold Wakeup Capability Advertisement Enable
|-
|21
|EN_PHY_PWR_MGMT
|PHY Power Management Enable
|-
|22
|SDP0_IODIR
|SDP0 Pin Directionality
|-
|23
|SDP1_IODIR
|SDP1 Pin Directionality
|-
|26
|RST
|Device Reset
|-
|27
|RFCE
|Receive Flow Control Enable
|-
|28
|TFCE
|Transmit Flow Control Enable
|-
|30
|VME
|VLAN Mode Enable
|}
{| class="wikitable"
|+Status Register Bit Description
!Bit(s)
!Field
!Description
|-
|0
|FD
|Link Full Duplex configuration indication.
|-
|1
|LU
|Link Up indication.
|-
|3:2
|Function ID
|Provides software a mechanism to determine the Ethernet controller function number (LAN identifier) for this MAC. Read as: [0b,0b] LAN A, [0b,1b] LAN B.<br />'''Note:''' These settings are only applicable to the '''82546GB/EB'''.
|-
|4
|TXOFF
|Transmission Paused.
|-
|5
|TBIMODE
|TBI Mode/internal SerDes indication.<br />'''Note:''' For the '''82544GC/EI''', reflects the status of the TBI_MODE input pin.
|-
|7:6
|SPEED
|Link Speed Setting. Speed indication is mapped as follows:<br />00b = 10 Mb/s<br />01b = 100 Mb/s<br />10b = 1000 Mb/s<br />11b = 1000 Mb/s<br />These bits are not valid in TBI mode/internal SerDes.
|-
|9:8
|ASDV
|Auto Speed Detection Value.
|-
|11
|PCI66
|PCI Bus speed indication. (When set, indicates that the PCI bus is running at 66 MHz.)
|-
|12
|BUS64<sup>1</sup>
|PCI Bus Width indication. (When set, indicates that the Ethernet controller is on a 64-bit bus.)
|-
|13
|PCIX_MODE<sup>1</sup>
|PCI-X Mode indication. (When set, indicates that the Ethernet controller is operating in PCI-X mode.)
|-
|15:14
|PCIXSPD<sup>1</sup>
|PCI-X Bus Speed indication.<br />00b = 50-66 MHz<br />01b = 66-100 MHz<br />10b = 100-133 MHz<br />11b = Reserved
|}
<sup>1</sup>. Not applicable to the '''82540EP/EM''', '''82541xx''', or '''82547GI/EI'''.
{| class="wikitable"
|+Transmit Control Register (TCTL)
!Bit(s)
!Field
!Description
|-
|1
|EN
|Transmit Enable
|-
|3
|PSP
|Pad Short Packets
|-
|11:4
|CT
|Collision Threshold
|-
|21:12
|COLD
|Collision Distance
|-
|22
|SWXOFF
|Software XOFF Transmission
|-
|24
|RTLC
|Re-transmit on Late Collision
|-
|25
|NRTU
|No Re-transmit on underrun ('''82544GC/EI''' only)
|}
{| class="wikitable"
|+Receive Control Register (RCTL)
!Bit(s)
!Field
!Description
|-
|1
|EN
|Receiver Enable
|-
|2
|SBP
|Store Bad Packets
|-
|3
|UPE
|Unicast Promiscuous Enabled
|-
|4
|MPE
|Multicast Promiscuous Enabled
|-
|5
|LPE
|Long Packet Reception Enable
|-
|7:6
|LBM
|Loopback Mode
|-
|9:8
|RDMTS
|Receive Descriptor Minimum Threshold Size
|-
|13:12
|MO
|Multicast Offset
|-
|15
|BAM
|Broadcast Accept
|-
|17:16
|BSIZE
|Receive Buffer Size
|-
|18
|VFE
|VLAN Filter Enable
|-
|19
|CFIEN
|Canonical Form Indicator Enable
|-
|20
|CFI
|Canonical Form Indicator bit value
|-
|22
|DPF
|Discard Pause Frames
|-
|23
|PMCF
|Pass MAC Control Frames
|-
|25
|BSEX
|Buffer Size Extension
|-
|26
|SECRC
|Strip Ethernet CRC from incoming packet
|}
When BSEX is set, the buffer size selected by BSIZE is multiplied by 16.
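{| class="wikitable"
|+Receive Buffer Size Configuration
!RCTL.BSEX
!RCTL.BSIZE
!Size (Bytes)
|-
|0
|00b
|2048
|-
|0
|01b
|1024
|-
|0
|10b
|512
|-
|0
|11b
|256
|-
|1
|00b
|Reserved
|-
|1
|01b
|16384
|-
|1
|10b
|8192
|-
|1
|11b
|4096
|}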
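The interaction between BSIZE and BSEX can be expressed in a few lines. The helper below is a hypothetical sketch (the name <code>rctl_buffer_size</code> is not from the manual) and assumes the encoding described in Intel's manual: BSIZE selects 2048, 1024, 512 or 256 bytes, BSEX multiplies that by 16, and BSEX = 1 with BSIZE = 00b is reserved.

```c
#include <stdint.h>

/* Returns the receive buffer size in bytes selected by the RCTL.BSIZE
 * (2-bit) and RCTL.BSEX (1-bit) fields, or 0 for the reserved
 * combination BSEX = 1, BSIZE = 00b. */
uint32_t rctl_buffer_size(uint8_t bsize, int bsex)
{
    bsize &= 0x3;
    uint32_t base = 2048u >> bsize; /* 00b: 2048, 01b: 1024, 10b: 512, 11b: 256 */

    if (!bsex)
        return base;
    if (bsize == 0)
        return 0;                   /* reserved */
    return base * 16u;              /* 01b: 16384, 10b: 8192, 11b: 4096 */
}
```

For instance, BSIZE = 11b with BSEX set yields 4096-byte buffers, which is the configuration used in the receive ring example later on this page.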
{| class="wikitable"
|+Interrupt Mask Set / Read (IMS)
!Bit(s)
!Field
!Description
|-
|0
|TXDW
|Sets mask for Transmit Descriptor Written Back.
|-
|1
|TXQE
|Sets mask for Transmit Queue Empty.
|-
|2
|LSC
|Sets mask for Link Status Change.
|-
|3
|RXSEQ
|Sets mask for Receive Sequence Error. This is a reserved bit for the '''82541xx''' and '''82547GI/EI'''. Set to 0b.
|-
|4
|RXDMT0
|Sets mask for Receive Descriptor Minimum Threshold hit.
|-
|6
|RXO
|Sets mask for Receiver FIFO Overrun.
|-
|7
|RXT0
|Sets mask for Receiver Timer Interrupt.
|-
|9
|MDAC
|Sets mask for MDI/O Access Complete Interrupt.
|-
|10
|RXCFG
|Sets mask for Receiving /C/ ordered sets. This is a reserved bit for the '''82541xx''' and '''82547GI/EI'''. Set to 0b.
|-
|12
|PHYINT
|Sets mask for PHY Interrupt (not applicable to the '''82544GC/EI'''). This is a reserved bit for the '''82541xx''' and '''82547GI/EI'''. Set to 0b.
|-
|12:11
|GPI
|Sets mask for General Purpose Interrupts '''(82544GC/EI only).'''
|-
|14:13
|GPI
|Sets mask for General Purpose Interrupts.
|-
|15
|TXD_LOW
|Sets the mask for Transmit Descriptor Low Threshold hit (not applicable to the '''82544GC/EI''').
|-
|16
|SRPD
|Sets mask for Small Receive Packet Detection (not applicable to the '''82544GC/EI''').
|}
To enable an interrupt, simply write '1' to the corresponding bit.
==Descriptor Format==
Both receive and transmit descriptors are 16 bytes in size. There are three types of transmit descriptor. The original one is referred to as the "legacy transmit descriptor". The second one, the "TCP/IP data descriptor", is a replacement for the legacy descriptor that offers access to newer offloading capabilities. The third type is fundamentally different in that it does not point to packet data: it merely contains control information that is loaded into registers of the controller and affects the processing of future packets. For simplicity, we will only use the legacy transmit descriptor. If you want to learn more about the other descriptor types, [https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf#page=49 have a look at the specification].
{| class="wikitable" style="text-align: center;"
|+Legacy Transmit Descriptor Format
!Offset
!63:48
!47:40
!39:36
!35:32
!31:24
!23:16
!15:0
|-
!0
| colspan="7" |Buffer Address
|-
!8
|Special
|CSS
|Reserved
|STA
|CMD
|CSO
|Length
|}
{| class="wikitable"
|+Legacy Transmit Descriptor Field Description
!Field
!Description
|-
|Buffer Address
|The address of the buffer. Descriptors with a null address transfer no data.
|-
|Length
|Length is per segment. The maximum length allowed is 16288 bytes.
|-
|CSO
|Checksum Offset. Indicates where, relative to the start of the packet, to insert a TCP checksum if this is enabled in the CMD field.
|-
|CMD
|Command Field
|-
|STA
|Status Field
|-
|CSS
|Checksum Start Field. It is an offset relative to the start of the buffer and indicates where to start computing the checksum.
|-
|Special
|Special Field
|}
{| class="wikitable" style="text-align: center;"
|+Transmit Descriptor Command Field Format
!7
!6
!5
!4
!3
!2
!1
!0
|-
|IDE
|VLE
|DEXT
|RPS/RSV
|RS
|IC
|IFCS
|EOP
|}
{| class="wikitable"
|+Transmit Descriptor Command Field Description
!Field
!Description
|-
|IDE (bit 7)
|Interrupt Delay Enable
|-
|VLE (bit 6)
|VLAN Packet Enable
|-
|DEXT (bit 5)
|Extension. (Set to 0b to indicate legacy mode.)
|-
|RPS/RSV (bit 4)
|Report Packet Sent. '''82544GC/EI only. Otherwise reserved!'''
|-
|RS (bit 3)
|Report Status. (When set, the controller sets the STA.DD (Descriptor Done) bit once the packet has been transmitted and can fire an interrupt.)
|-
|IC (bit 2)
|Insert Checksum. (When set, the controller inserts a checksum based on the values of the CSO and CSS fields.)
|-
|IFCS (bit 1)
|Controls the insertion of the FCS/CRC field in normal Ethernet packets. IFCS is only valid when EOP is set.
|-
|EOP (bit 0)
|End Of Packet. Indicates the last descriptor making up the packet. One or many descriptors can be used to form a packet.
|}
======Transmit Descriptor Status Format======
{| class="wikitable" style="text-align: center;"
!3
!2
!1
!0
|-
|TU/RSV
|LC
|EC
|DD
|}
{| class="wikitable"
|+Transmit Descriptor Status Field Description
!Field
!Description
|-
|TU/RSV (bit 3)
|Transmit Underrun. Indicates that a transmit underrun error has occurred. '''82544GC/EI only. Otherwise reserved!'''
|-
|LC (bit 2)
|Late Collision. Indicates that a late collision occurred while working in half-duplex mode. It has no meaning in full-duplex mode.
|-
|EC (bit 1)
|Excess Collisions. Indicates that the packet has experienced more than the maximum number of collisions as defined by the TCTL.CT control field.
|-
|DD (bit 0)
|Descriptor Done. Indicates that the descriptor is finished.
|}
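The command and status bits described above translate directly into bit masks. The following is a minimal sketch: <code>TX_CMD_EOP</code> and <code>TX_CMD_IFCS</code> follow the names used in the transmit example on this page, while the other macro names and the two helper functions are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits of the legacy transmit descriptor command (CMD) field */
#define TX_CMD_EOP  (1u << 0) /* End Of Packet */
#define TX_CMD_IFCS (1u << 1) /* Insert FCS/CRC (only valid with EOP) */
#define TX_CMD_IC   (1u << 2) /* Insert Checksum */
#define TX_CMD_RS   (1u << 3) /* Report Status */
#define TX_CMD_DEXT (1u << 5) /* Extension, 0 for legacy descriptors */
#define TX_CMD_VLE  (1u << 6) /* VLAN Packet Enable */
#define TX_CMD_IDE  (1u << 7) /* Interrupt Delay Enable */

/* Bits of the legacy transmit descriptor status (STA) field */
#define TX_STA_DD (1u << 0) /* Descriptor Done */
#define TX_STA_EC (1u << 1) /* Excess Collisions */
#define TX_STA_LC (1u << 2) /* Late Collision */

/* Builds a command byte that always requests a status write-back (RS)
 * and, on the last descriptor of a packet, sets EOP and IFCS. */
uint8_t tx_command(bool last_descriptor)
{
    uint8_t cmd = TX_CMD_RS;
    if (last_descriptor)
        cmd |= TX_CMD_EOP | TX_CMD_IFCS;
    return cmd;
}

/* True once the controller has finished with the descriptor. */
bool tx_descriptor_done(uint8_t status)
{
    return (status & TX_STA_DD) != 0;
}
```

Setting RS on every descriptor is one possible design choice; it lets the driver poll <code>tx_descriptor_done()</code> before reusing a buffer, at the cost of more write-backs.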
{| class="wikitable" style="text-align: center;"
|+Receive Descriptor Format
!Offset
!63:48
!47:40
!39:32
!31:16
!15:0
|-
!0
| colspan="5" |Buffer Address
|-
!8
|Special'''*'''
|Errors
|Status
|Packet Checksum'''*'''
|Length
|}
'''*82544GC/EI only. Otherwise reserved!'''
{| class="wikitable" style="text-align: center;"
|+Receive Descriptor Status Field
!7
!6
!5
!4
!3
!2
!1
!0
|-
|PIF
|IPCS
|TCPCS
|RSV
|VP
|IXSM
|EOP
|DD
|}
{| class="wikitable"
|+Receive Descriptor Status Bits
!Field
!Description
|-
|PIF (bit 7)
|Passed in-exact filter. If set, software must examine this packet to determine whether to accept it or not. If PIF is clear, the packet is known to be for this station.
|-
|IPCS (bit 6)
|IP Checksum Calculated on Packet. (0 = IP checksum not performed, 1 = IP checksum performed)
|-
|TCPCS (bit 5)
|TCP Checksum Calculated on Packet. (0 = TCP/UDP checksum not performed, 1 = TCP/UDP checksum performed)
|-
|RSV (bit 4)
|Reserved
|-
|VP (bit 3)
|Packet is 802.1Q (matched VET).
|-
|IXSM (bit 2)
|Ignore Checksum Indication. (When set, the checksum indication results should be ignored.)
|-
|EOP (bit 1)
|End Of Packet. (Indicates that this is the last descriptor for an incoming packet.)
|-
|DD (bit 0)
|Descriptor Done. (Indicates whether the controller is done with the descriptor.)
|}
{| class="wikitable" style="text-align: center;"
|+Receive Descriptor Errors Field
!7
!6
!5
!4
!3
!2
!1
!0
|-
|RXE
|IPE
|TCPE
|CXE<sup>a</sup>
|RSV
|SEQ<sup>b</sup>
|SE<sup>b</sup>
|CE
|}
<sup>a</sup>. '''82544GC/EI''' only, otherwise reserved!

<sup>b</sup>. '''82541xx, 82547GI/EI,''' and '''82540EP/EM''' only, otherwise reserved.
{| class="wikitable"
|+Receive Descriptor Error Bits
!Field
!Description
|-
|RXE (bit 7)
|RX Data Error
|-
|IPE (bit 6)
|IP Checksum Error
|-
|TCPE (bit 5)
|TCP/UDP Checksum Error
|-
|CXE (bit 4)
|Carrier Extension Error
|-
|RSV (bit 3)
|Reserved
|-
|SEQ (bit 2)
|Sequence Error
|-
|SE (bit 1)
|Symbol Error
|-
|CE (bit 0)
|CRC Error or Alignment Error
|}
The Receive Descriptor Special field is only populated for 802.1Q packets. For all other packets its contents are set to 0.
{| class="wikitable" style="text-align: center;"
|+Receive Descriptor Special Field
!15:13
!12
!11:0
|-
|PRI
|CFI
|VLAN
|}
{| class="wikitable"
|+Receive Descriptor Special Field Description
!Field
!Description
|-
|VLAN
|VLAN Identifier
|-
|CFI
|Canonical Form Indicator
|-
|PRI
|User Priority
|}
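To illustrate how a driver might interpret these fields, the sketch below checks the status and errors bytes of a receive descriptor and unpacks the special field. All names are hypothetical, and the special field layout is assumed to follow the usual 802.1Q tag layout (VLAN identifier in bits 11:0, CFI in bit 12, priority in bits 15:13).

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_STA_DD  (1u << 0) /* Descriptor Done */
#define RX_STA_EOP (1u << 1) /* End Of Packet */

/* Decoded contents of the receive descriptor special field. */
typedef struct {
    uint16_t vlan_id;  /* bits 11:0 */
    bool     cfi;      /* bit 12 */
    uint8_t  priority; /* bits 15:13 */
} rx_vlan_tag_t;

/* A descriptor holds a complete, error-free frame when DD and EOP are
 * set and no bit of the errors field is set. */
bool rx_frame_ok(uint8_t status, uint8_t errors)
{
    return (status & RX_STA_DD) && (status & RX_STA_EOP) && errors == 0;
}

/* Splits the 16-bit special field into its three parts. */
rx_vlan_tag_t rx_decode_special(uint16_t special)
{
    rx_vlan_tag_t tag;
    tag.vlan_id  = special & 0x0FFF;
    tag.cfi      = ((special >> 12) & 1u) != 0;
    tag.priority = (uint8_t)(special >> 13);
    return tag;
}
```

Treating any set error bit as fatal is the simplest policy; a driver that enables Store Bad Packets would want to inspect the individual bits instead.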
==EEPROM Reading==
There are a few variants of the card with many differences, most notably the method to access the EEPROM and the Flash memory of the card. Here we will only describe methods applicable to cards that use the EEPROM method.
After that the EERD.START bit must be cleared.
<syntaxhighlight lang="c" line="1">
static uint16_t eeprom_read(uint8_t addr) {
	uint32_t tmp;
	uint16_t data;
	/* ... */
	return data;
}
</syntaxhighlight>
When all data has been read, the kernel should unlock the EEPROM again so that the hardware can access it.
==Initialization==
After power-up the 8254x is in an undefined state, so it needs to be reset. The first thing to do is to enable bus mastering, memory and I/O accesses in the [[PCI]] command register.
Then reset the NIC by setting the CTRL.RST bit (bit 26, self-clearing) in the Device Control register of the card.
After the card has been reset, set the CTRL.ASDE and CTRL.SLU bits (Auto-Speed Detection (ASDE) requires the Set Link Up (SLU) bit to be set as well) and write the MAC address you want the device to use to the RAL0 and RAH0 registers, together with the Address Valid (AV) bit, bit 31 of RAH0. To get the device's own MAC address, read the first three 16-bit words of the EEPROM.
The entire procedure looks something like this:<syntaxhighlight lang="c" line="1">
uint8_t MAC_ADDRESS[6];

void reset_nic(){
    uint32_t device_control = read_register(I8254_REG_CTRL);
    device_control |= I8254_CTRL_RESET; // Set the reset bit
    write_register(I8254_REG_CTRL, device_control);
    // Busy-wait until the self-clearing reset bit reads back as 0.
    // "pause" is used instead of "hlt", which would stall until the next interrupt.
    while(read_register(I8254_REG_CTRL) & I8254_CTRL_RESET) __asm__ volatile ("pause");

    device_control = read_register(I8254_REG_CTRL);
    device_control |= I8254_CTRL_ASDE | I8254_CTRL_SLU; // Enable Auto Speed Detection and set the link up
    write_register(I8254_REG_CTRL, device_control);

    // Read the MAC address from the first three EEPROM words
    uint16_t b0 = eeprom_read(0);
    uint16_t b1 = eeprom_read(1);
    uint16_t b2 = eeprom_read(2);
    MAC_ADDRESS[0] = b0 & 0xFF;
    MAC_ADDRESS[1] = b0 >> 8;
    MAC_ADDRESS[2] = b1 & 0xFF;
    MAC_ADDRESS[3] = b1 >> 8;
    MAC_ADDRESS[4] = b2 & 0xFF;
    MAC_ADDRESS[5] = b2 >> 8;

    // Write the MAC address to RAL/RAH 0.
    uint32_t writeL = ((uint32_t)b1 << 16) | b0;
    uint32_t writeH = b2 | (1u << 31); // bit 31 is Address Valid (AV)
    write_register(E1000_REG_RAL0, writeL);
    write_register(E1000_REG_RAH0, writeH);
}
</syntaxhighlight>
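The PCI command register step mentioned at the start of this section is not part of <code>reset_nic()</code>. A minimal sketch of the value to write is shown below; the macro and function names are assumptions for this example, and reading or writing configuration space is left to your own PCI code. The bit positions are those of the standard PCI command register at configuration offset 0x04.

```c
#include <stdint.h>

#define PCI_COMMAND_OFFSET     0x04      /* offset in PCI configuration space */
#define PCI_COMMAND_IO_SPACE   (1u << 0) /* respond to I/O accesses */
#define PCI_COMMAND_MEM_SPACE  (1u << 1) /* respond to memory accesses */
#define PCI_COMMAND_BUS_MASTER (1u << 2) /* allow the device to perform DMA */

/* Takes the current command register value and returns the value to
 * write back, with I/O, memory and bus mastering enabled. */
uint16_t pci_command_for_nic(uint16_t current)
{
    return (uint16_t)(current | PCI_COMMAND_IO_SPACE
                              | PCI_COMMAND_MEM_SPACE
                              | PCI_COMMAND_BUS_MASTER);
}
```

Or-ing into the current value, rather than writing a constant, preserves any other command bits the firmware has already set.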
==Ring setup==
====Theory of operation:====
The next step is to set up the rings. Without them, you will not be able to send or receive packets. Luckily the ring system is fairly simple: it consists of the TDH/RDH and TDT/RDT registers (Transmit/Receive Descriptor Head/Tail) and, of course, the ring buffers.
====Transmit Ring====
In the image below, you can see the structure of the transmit ring. The shaded boxes represent descriptors that have been transmitted but not yet reclaimed. (If you dynamically allocate the descriptor buffers, reclaiming simply involves freeing those buffers.)
[[File:Transmit_Ring_Structure.png|border]]
Anything between the head and the tail is owned by the controller and makes up the transmit queue (the descriptors that have been queued for transmission). At reset, both TDT and TDH are set to 0. If TDT = TDH, the queue is empty and there is nothing to transmit.
====Receive Ring====
The image below depicts the structure of the receive ring. The shaded boxes represent descriptors that hold incoming packets but have not yet been processed by the driver. You can detect which descriptors have had incoming data written to them by checking whether the status field is non-zero.
[[File:Receive_Ring_Structure.png|border]]
Any descriptors between RDH and RDT are owned by the hardware and should not be modified!
After the reset, the head should point to the first descriptor and the tail to the last descriptor of the ring (since all descriptors are available for use).
RDH points to the descriptor into which the controller will write the next received packet. It is incremented automatically.
RDT points to one descriptor past the last descriptor available to the hardware. This register should still point to a valid descriptor (it should lie within Base and Base + Size).
The TDLEN/RDLEN registers contain the size of the ring in bytes.
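The head/tail bookkeeping described above boils down to modular arithmetic on descriptor indices. The helpers below are a hypothetical sketch (none of these names come from the manual) that assumes head and tail are descriptor indices, as held in the TDH/TDT and RDH/RDT registers.

```c
#include <stdint.h>

/* Number of descriptors currently owned by the hardware: those from
 * head up to, but not including, tail. Zero means the queue is empty. */
uint32_t ring_hw_owned(uint32_t head, uint32_t tail, uint32_t ring_size)
{
    return (tail + ring_size - head) % ring_size;
}

/* Index that follows the given one, wrapping at the end of the ring. */
uint32_t ring_next(uint32_t index, uint32_t ring_size)
{
    return (index + 1) % ring_size;
}

/* Transmit descriptors the driver may still queue. One slot is kept
 * unused so that a full ring never looks like an empty one
 * (head == tail). */
uint32_t tx_ring_free_slots(uint32_t head, uint32_t tail, uint32_t ring_size)
{
    return ring_size - 1 - ring_hw_owned(head, tail, ring_size);
}
```

With the receive ring set up as described (RDH = 0, RDT = number of descriptors - 1), <code>ring_hw_owned()</code> reports all but one descriptor as available to the controller.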
===Setup:===
======Transmit Ring======
*First, allocate a region for the descriptor ring.
*Next, either allocate a static buffer for each descriptor or allocate the buffers dynamically when you transmit a packet (the example code uses the first option).
*Set '''TDH''' and '''TDT''' to 0, '''TDBAL''' to the lower 32 bits of the ring's physical address, '''TDBAH''' to the upper 32 bits and '''TDLEN''' to the total length of the ring in bytes (number of descriptors * 16).
*Set your preferred bits in the '''TCTL''' register.
<syntaxhighlight lang="c" line="1">
// Assumes 1:1 memory mapping for simplicity
#define NUM_OF_TX_DESCRIPTORS 8
#define SIZE_OF_TX_DESCRIPTOR_BUFFER 4096

// Legacy transmit descriptor (16 bytes)
typedef struct {
    uint64_t buffer_address;
    uint16_t length;
    uint8_t  cso;
    uint8_t  command;
    uint8_t  status;
    uint8_t  css;
    uint16_t special;
} __attribute__((packed)) transmit_descriptor_t;

transmit_descriptor_t* transmit_ring; // global, it is needed again when transmitting

void setup_transmit_ring(){
    size_t transmit_ring_size = NUM_OF_TX_DESCRIPTORS * sizeof(transmit_descriptor_t);
    transmit_ring = your_favorite_physical_allocator(transmit_ring_size);

    for (int i = 0; i < NUM_OF_TX_DESCRIPTORS; i++){
        transmit_descriptor_t* descriptor = transmit_ring + i;
        descriptor->buffer_address = (uint64_t)your_favorite_physical_allocator(SIZE_OF_TX_DESCRIPTOR_BUFFER);
    }

    write_register(REG_TDBAL, ((uint64_t)transmit_ring) & 0xFFFFFFFF);
    write_register(REG_TDBAH, ((uint64_t)transmit_ring) >> 32);
    write_register(REG_TDLEN, transmit_ring_size);
    write_register(REG_TDH, 0);
    write_register(REG_TDT, 0);

    // Set the Enable (EN) and Pad Short Packets (PSP) bits
    uint32_t tctl = E1000_TCTL_EN | E1000_TCTL_PSP;
    write_register(REG_TCTL, tctl);
}
</syntaxhighlight>
======Receive Ring======
*First, allocate a region for the descriptor ring.
*After that, loop through each descriptor, allocate a buffer of the selected size (set in the '''Receive Control Register''') and store its physical address in the descriptor's address field.
*Set '''RDH''' to 0 (the first descriptor), '''RDT''' to the last descriptor (number of descriptors - 1), '''RDBAL''' to the lower 32 bits of the ring's physical address, '''RDBAH''' to the upper 32 bits and '''RDLEN''' to the total length of the ring in bytes (number of descriptors * 16).
*Set your preferred bits in the '''RCTL''' register (you must set the '''EN''' bit to enable the DMA engine; '''LPE''' and '''BAM''' are recommended).
<syntaxhighlight lang="c" line="1">
// Assumes 1:1 page mapping for simplicity
#define NUM_OF_RX_DESCRIPTORS 32
#define SIZE_OF_RX_DESCRIPTOR_BUFFER 4096

// Receive descriptor (16 bytes)
typedef struct {
    uint64_t buffer_address;
    uint16_t length;
    uint16_t checksum;
    uint8_t  status;
    uint8_t  errors;
    uint16_t special;
} __attribute__((packed)) receive_descriptor_t;

receive_descriptor_t* receive_ring; // global, it is needed again when receiving

void setup_receive_ring(){
    size_t receive_ring_size = NUM_OF_RX_DESCRIPTORS * sizeof(receive_descriptor_t); // 16 bytes each
    receive_ring = your_favorite_physical_allocator(receive_ring_size);

    for (int i = 0; i < NUM_OF_RX_DESCRIPTORS; i++){
        receive_descriptor_t* descriptor = receive_ring + i;
        descriptor->buffer_address = (uint64_t)your_favorite_physical_allocator(SIZE_OF_RX_DESCRIPTOR_BUFFER);
        descriptor->status = 0; // a non-zero status later means the controller wrote a packet
    }

    write_register(REG_RDBAL, ((uint64_t)receive_ring) & 0xFFFFFFFF); // Base Address Low
    write_register(REG_RDBAH, ((uint64_t)receive_ring) >> 32); // Base Address High
    write_register(REG_RDLEN, receive_ring_size); // Ring Size
    write_register(REG_RDH, 0); // Set it to the first descriptor
    write_register(REG_RDT, NUM_OF_RX_DESCRIPTORS - 1); // Set it to the last descriptor

    // Set the Enable, Long Packet Reception, Broadcast Accept Mode and Size Extension bits
    // Also set the buffer size. This configuration (BSIZE = 11b and BSEX = 1) means 4096 (4kB) buffers
    uint32_t rctl = RCTL_EN | RCTL_LPE | RCTL_BAM | RCTL_BSEX | (0x3u << RCTL_BSIZE);
    write_register(REG_RCTL, rctl);
}
</syntaxhighlight>
==Interrupt Handling==
If you want to receive packets, you need a way of knowing when to read them. That is where interrupts come into play.
To enable interrupts, simply set the corresponding bits in the '''Interrupt Mask Set/Read (IMS)''' register. Recommended interrupts are: '''RXT0''' (to be notified about incoming packets), '''RXO''' (to be notified about overruns) and '''LSC''' (to be notified about link status changes, e.g. when the user plugs in or unplugs the Ethernet cable; in such cases, you should redo the '''DHCP''' handshake to connect to that network).<syntaxhighlight lang="c" line="1">
void enable_interrupts(){
    uint32_t ims = E1000_IMS_RXT | E1000_IMS_RXO | E1000_IMS_LSC;
    write_register(REG_IMS, ims);
}
</syntaxhighlight>To find out what caused an interrupt, read the '''Interrupt Cause Read (ICR)''' register. The '''ICR''' register is self-clearing, meaning that it is cleared when you read it.
(追記) A simple interrupt handler may look something like this:<syntaxhighlight lang="c" line="1"> (追記ここまで)
(追記) void _handle_interrupt(){ (追記ここまで)
(追記) uint32_t cause = read_register(REG_ICR); // Cleared upon read (追記ここまで)
(追記) if (cause & E1000_IMS_RXT) { // Packets received (追記ここまで)
(追記) receive_packets(); // Call the function responsible for receiving (追記ここまで)
(追記) // packets and sending them to the network stack (追記ここまで)
(追記) } (追記ここまで)
(追記) if (cause & E1000_IMS_LSC){ // link status change (追記ここまで)
(追記) // Read the status register and check the LU bit to get the link status (追記ここまで)
(追記) if (read_register(E1000_REG_STATUS) & STATUS_LU) { (追記ここまで)
(追記) kprintf("Link change detected: Link up!\n"); (追記ここまで)
(追記) } else { (追記ここまで)
(追記) kprintf("Link change detected: Link down!\n"); (追記ここまで)
(追記) } (追記ここまで)
(追記) } (追記ここまで)
(追記) } (追記ここまで)
(追記) </syntaxhighlight> (追記ここまで)
(追記) ==Packet Transmission== (追記ここまで)
(追記) To transmit a packet, load the data into a free descriptor (or split it across several descriptors if it does not fit in one) and set the '''EOP''' bit on the last descriptor. (追記ここまで)
(追記) This example uses preallocated buffers, but you could use dynamically allocated ones. Just remember to free them after the packet has been transmitted.<syntaxhighlight lang="c" line="1"> (追記ここまで)
(追記) void send_data(void* data, uint32_t size, bool EOP){ (追記ここまで)
(追記) uint32_t tail = read_register(REG_TDT); (追記ここまで)
(追記) transmit_descriptor_t* tx = transmit_ring + tail; // Get the descriptor the tail is pointing at (next available descriptor) (追記ここまで)
(追記) memcpy((void*)tx->buffer_address, data, size); // Copy the data to the previously allocated buffer (identity mapping assumed) (追記ここまで)
(追記) tx->length = size; // Set the length of the descriptor (追記ここまで)
(追記) tx->command = EOP ? (TX_CMD_EOP | TX_CMD_IFCS) : 0; // On the last descriptor set EOP and IFCS; otherwise clear stale bits (追記ここまで)
(追記) tail = (tail + 1) % NUM_OF_TX_DESCRIPTORS; (追記ここまで)
(追記) write_register(REG_TDT, tail); // Write the incremented tail to queue the descriptor (追記ここまで)
(追記) } (追記ここまで)
(追記) size_t send(void* data, size_t length){ (追記ここまで)
(追記) size_t sent = 0; (追記ここまで)
(追記) // split the data into chunks and send them (追記ここまで)
(追記) while (sent < length){ (追記ここまで)
(追記) size_t to_send = min(length - sent, SIZE_OF_TX_DESCRIPTOR_BUFFER); (追記ここまで)
(追記) send_data((uint8_t*)data + sent, to_send, to_send == (length - sent)); (追記ここまで)
(追記) sent += to_send; (追記ここまで)
(追記) } (追記ここまで)
(追記) return sent; (追記ここまで)
(追記) } (追記ここまで)
(追記) </syntaxhighlight> (追記ここまで)
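The splitting step can be looked at in isolation. The following is a minimal sketch, not part of the driver above: `tx_chunk_t` and `plan_chunks` are hypothetical names, and the function only computes how a payload would be divided across descriptor buffers of a given size, marking the final piece as end of packet.

```c
#include <stdbool.h>
#include <stddef.h>

/* One planned descriptor: where the piece starts, how long it is,
 * and whether it is the last piece of the packet (EOP). */
typedef struct {
    size_t offset;
    size_t size;
    bool   eop;
} tx_chunk_t;

/* Split `length` bytes into pieces of at most `buf_size` bytes.
 * Writes up to `max` entries to `out` and returns how many were written. */
static size_t plan_chunks(size_t length, size_t buf_size, tx_chunk_t *out, size_t max) {
    size_t n = 0, sent = 0;
    while (sent < length && n < max) {
        size_t sz = length - sent;
        if (sz > buf_size) sz = buf_size;
        out[n].offset = sent;
        out[n].size = sz;
        out[n].eop = (sent + sz == length);
        sent += sz;
        n++;
    }
    return n;
}
```

With 4096-byte buffers, a 9000-byte payload would occupy three descriptors, and only the third would carry EOP.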
(追記) ==Packet Reception== (追記ここまで)
(追記) To receive packets after an interrupt, loop from the first descriptor the driver has not yet processed to the last filled one. To do that, keep track of the last descriptor the driver read; this also ensures that multi-descriptor packets are reassembled in the correct order.<syntaxhighlight lang="c" line="1"> (追記ここまで)
(追記) uint8_t rx_next = 0; (追記ここまで)
(追記) void receive_packets(){ (追記ここまで)
(追記) uint32_t idx = rx_next; (追記ここまで)
(追記) void* buffer = nullptr; // use this to store the buffer. (追記ここまで)
(追記) size_t buffer_len = 0; (追記ここまで)
(削除) When all (削除ここまで)data is (削除) finally read (削除ここまで)the kernel (削除) should unlock (削除ここまで)the (削除) EEPROM (削除ここまで)to (削除) let hardware access it (削除ここまで).
(追記) while (receive_ring[idx].status & RX_STATUS_DD) { (追記ここまで)
(追記) // This descriptor has been filled (追記ここまで)
(追記) bool eop = receive_ring[idx].status & RX_STATUS_EOP; (追記ここまで)
(追記) uint16_t len = receive_ring[idx].length; (追記ここまで)
(追記) void* (追記ここまで)data (追記) = receive_ring[idx].buffer_address; (追記ここまで)
(追記) // Handle multiple-descriptor packets (追記ここまで)
(追記) if (buffer == nullptr){ // This (追記ここまで)is the (追記) first descriptor of the packet (追記ここまで)
(追記) buffer = malloc(len); // use your (追記ここまで)kernel(追記) 's heap allocator (追記ここまで)
(追記) buffer_len = len; (追記ここまで)
(追記) memcpy(buffer, data, len); (追記ここまで)
(追記) } else { (追記ここまで)
(追記) // It's (追記ここまで)the (追記) next part of the packet, add it (追記ここまで)to (追記) the packet (追記ここまで)
(追記) void* new_buffer = malloc(buffer_len + len); // allocate a bigger buffer (追記ここまで)
(追記) memcpy(new_buffer, buffer, buffer_len); // copy the previous data (追記ここまで)
(追記) free(buffer); // free the old buffer (追記ここまで)
(追記) // copy the new data (追記ここまで)
(追記) memcpy((void*)((uint64_t)new_buffer + buffer_len), data, len); (追記ここまで)
(追記) // Set the new buffer into the variables (追記ここまで)
(追記) buffer_len += len; (追記ここまで)
(追記) buffer = new_buffer; (追記ここまで)
(追記) } (追記ここまで)
(追記) // Set status to 0 (To give ownership back to the controller) (追記ここまで)
(追記) receive_ring[idx] (追記ここまで).(追記) status = 0; (追記ここまで)
=(削除) = Obtaining the MAC address == (削除ここまで)
(追記) idx (追記ここまで)= (追記) (idx + 1) % NUM_OF_RX_DESCRIPTORS; (追記ここまで)
(削除) Obtaining the MAC is quite trivial and only requires reading the first 3-words of the EEPROM. (削除ここまで)
(削除) <source lang (削除ここまで)=(削除) "c"> (削除ここまで)
(追記) if (eop) { (追記ここまで)
(削除) uint8_t dev_info.mac_addr[6] (削除ここまで);
(追記) // This is the last descriptor of the packet (追記ここまで)
(追記) // Forward the packet to your network stack (追記ここまで)
(追記) stack_receive_packet(buffer, buffer_len); (追記ここまで)
(追記) buffer (追記ここまで)= (追記) nullptr; (追記ここまで)
(追記) buffer_len = 0 (追記ここまで);
(追記) } (追記ここまで)
(追記) } (追記ここまで)
/(削除) * Assumes a little-endian architecture * (削除ここまで)/
(追記) (追記ここまで)// (追記) Give the controller more free descriptors by updating RDT (追記ここまで)
(削除) i8254x_lock_eeprom (削除ここまで)();
(追記) uint32_t tail = (追記ここまで)((追記) idx == 0 (追記ここまで)) (追記) ? NUM_OF_RX_DESCRIPTORS - 1 : idx - 1 (追記ここまで);
(削除) * (削除ここまで)((削除) (uint16_t *)&dev_info.mac_addr[0]) = i8254x_read_eeprom(0x00 (削除ここまで));
(追記) write_register (追記ここまで)((追記) REG_RDT, tail (追記ここまで));
(削除) *((uint16_t *)&dev_info.mac_addr[2]) = i8254x_read_eeprom(0x01); (削除ここまで)
(削除) *((uint16_t *)&dev_info.mac_addr[4]) (削除ここまで)= (削除) i8254x_read_eeprom(0x02) (削除ここまで);
(追記) rx_next (追記ここまで)= (追記) idx (追記ここまで);
(削除) i8254x_unlock_eeprom(); (削除ここまで)
(追記) } (追記ここまで)
</(削除) source (削除ここまで)>
</(追記) syntaxhighlight (追記ここまで)>
== Emulation ==
==Emulation==
* '''VirtualBox''' (3.1 is all I can personally confirm) supports rather dodgy implementations of an Intel PRO/1000 MT Server (82545EM), Intel PRO/1000 MT Desktop (82540EM), and Intel PRO/1000 T Server (82543GC).
*'''VirtualBox''' (3.1 is all I can personally confirm) supports rather dodgy implementations of an Intel PRO/1000 MT Server (82545EM), Intel PRO/1000 MT Desktop (82540EM), and Intel PRO/1000 T Server (82543GC).
*** The EERD register is unimplemented (you *must* use the 4-wire access method if you want to read from the EEPROM). [01000101 - I had a patch committed to fix this. It will soon be mainstream]
***The EERD register is unimplemented (you *must* use the 4-wire access method if you want to read from the EEPROM). [01000101 - I had a patch committed to fix this. It will soon be mainstream]
* '''VMWare Virtual Server 2''' emulates/virtualizes an 82545EM-based card rather well.
*'''VMWare Virtual Server 2''' emulates/virtualizes an 82545EM-based card rather well.
* '''QEMU''' (since 0.10.0) supports an 82540EM-based card and it seems to work OK. It is the default network card since 0.11.0.
*'''QEMU''' (since 0.10.0) supports an 82540EM-based card and it seems to work OK. It is the default network card since 0.11.0.
*** QEMU does not properly handle the software reset operation (CTRL.RST) in builds prior to June 2009.
***QEMU does not properly handle the software reset operation (CTRL.RST) in builds prior to June 2009.
*** QEMU (version 4.2.1 tested) doesn't seem to support flash memory, instead shifting the IO Register Base Address up.
***QEMU (version 4.2.1 tested) doesn't seem to support flash memory, instead shifting the IO Register Base Address up.
* IIRC (needs confirmation) '''Microsoft's Hyper-V''' supports an 8254x-series card.
*IIRC (needs confirmation) '''Microsoft's Hyper-V''' supports an 8254x-series card.
== Documentation ==
==Documentation==
* [http://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf Intel 8254x Family of Gigabit Ethernet Controllers Software Developer's Manual]
*[http://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf Intel 8254x Family of Gigabit Ethernet Controllers Software Developer's Manual]
* The [http://www.intel.com/content/dam/www/public/us/en/documents/manuals/pcie-gbe-controllers-open-source-manual.pdf PCIe GbE Controllers Open Source Software Developer’s Manual] may also be of interest. In Linux, the PCIe cards are handled by a separate driver (e1000e) but they appear to be mostly if not entirely compatible with the 8254x series.
*The [http://www.intel.com/content/dam/www/public/us/en/documents/manuals/pcie-gbe-controllers-open-source-manual.pdf PCIe GbE Controllers Open Source Software Developer’s Manual] may also be of interest. In Linux, the PCIe cards are handled by a separate driver (e1000e) but they appear to be mostly if not entirely compatible with the 8254x series.
== Example driver ==
==Example driver==
* [https://joscor.com/blog/intel-8254x-ethernet-controller-example-driver/ 01000101's Intel 8254x-series example driver]
*[https://joscor.com/blog/intel-8254x-ethernet-controller-example-driver/ 01000101's Intel 8254x-series example driver]
* [https://github.com/torokernel/torokernel/blob/7d6df4c40fa4cc85febd5fd5799404592ffdff53/rtl/drivers/E1000.pas Example of a driver for e1000 in Freepascal]
*[https://github.com/torokernel/torokernel/blob/7d6df4c40fa4cc85febd5fd5799404592ffdff53/rtl/drivers/E1000.pas Example of a driver for e1000 in Freepascal]
[[Category:Network Hardware]]
[[Category:Network Hardware]]
[[Category:Standards]]
[[Category:Standards]]
Revision as of 16:46, 10 November 2025
The Intel 8254x series is comprised of: 82546GB/EB, 82545GM/EM, 82544GC/EI, 82541(PI/GI/EI), 82541ER, 82547GI/EI, and 82540EP/EM Gigabit Ethernet Controllers.
Overview
Intel 8254x-based cards come in 32-/64-bit, 33/66 MHz PCI and PCI-X flavors.
The Intel 82547GI(EI) connects to the motherboard via a Communications Streaming Architecture (CSA) port instead of a PCI/PCI-X bus.
The 82541xx and 82540EP/EM controllers do not support the PCI-X bus.
They are all high-performance, Gigabit-capable controllers and range from 1 to 4 ethernet/fiber ports per controller.
The Intel 8254x series heavily utilizes task offloading. Each controller has an "offloading engine" for tasks such as TCP/UDP/IP checksum calculations, packet filtering, and packet segmentation.
- Jumbo packets are supported.
- Wake on LAN (WoL) is supported.
- A four wire serial EEPROM interface as well as a generic EEPROM "read" interface is implemented within the configuration registers.
- D0 and D3 power states are supported through ACPI.
Programming
Detection
Section 5.2 in the 8254x Software Developer's Manual lists the Vendor and Device IDs of the various devices in the 8254x series. These are used to detect devices on the PCI bus by looking in the PCI Configuration Space registers.
The device also exposes PCI Base Address Registers (BARs). BAR0 will be either a 64-bit or a 32-bit MMIO address (check bits 2:1: 00b means 32-bit, 10b means 64-bit) that points to the device's base register space. BAR0 should always be used to interface with the device via MMIO, as the BAR number is the same for every device in the series.
There is also a BAR that contains an I/O base address; it can be detected by looking at each BAR and testing bit 0, which is set for I/O space BARs. Documentation states this will be in either BAR2 or BAR4, but emulators may move it.
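As a rough illustration of the BAR checks described above, the sketch below classifies a raw BAR value and extracts its base address. The names `bar_kind` and `bar_base` are hypothetical, and the code assumes the standard PCI BAR encoding (bit 0 selects I/O space, bits 2:1 select the memory type); it does not touch real hardware.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { BAR_IO, BAR_MEM32, BAR_MEM64, BAR_UNKNOWN } bar_kind_t;

/* Classify a raw BAR value read from PCI configuration space. */
static bar_kind_t bar_kind(uint32_t bar) {
    if (bar & 0x1u) return BAR_IO;      /* bit 0 set: I/O space */
    switch ((bar >> 1) & 0x3u) {        /* bits 2:1: memory type */
    case 0x0: return BAR_MEM32;
    case 0x2: return BAR_MEM64;
    default:  return BAR_UNKNOWN;
    }
}

/* Strip the flag bits; for a 64-bit BAR, bar_hi is the following BAR. */
static uint64_t bar_base(uint32_t bar_lo, uint32_t bar_hi) {
    switch (bar_kind(bar_lo)) {
    case BAR_IO:    return bar_lo & ~0x3u;
    case BAR_MEM32: return bar_lo & ~0xFu;
    case BAR_MEM64: return ((uint64_t)bar_hi << 32) | (bar_lo & ~0xFu);
    default:        assert(!"unsupported BAR type"); return 0;
    }
}
```

A driver would typically call `bar_base` on BAR0 for MMIO and then scan the remaining BARs with `bar_kind` to find the I/O window.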
When using MMIO, reading from and writing to registers is straightforward.
uint64_t ioaddr = BAR_GOES_HERE;

// "register" is a reserved word in C, so the parameter is named "reg"
void write_register(uint16_t reg, uint32_t value) {
    *(volatile uint32_t *)(ioaddr + reg) = value;
}

uint32_t read_register(uint16_t reg) {
    return *(volatile uint32_t *)(ioaddr + reg);
}
When using I/O ports, register access is a little more complicated, as the I/O address space of the 8254x is only 8 bytes wide.
The register at offset 0x00 is the "IOADDR" window. The register at offset 0x04 is the "IODATA" window.
IOADDR holds the offset of the register that the IODATA window operates on. So, the basic operation is to write the register offset to IOADDR and then read or write the register through IODATA.
uint16_t ioaddr = IO_BAR_GOES_HERE;

void write_register(uint16_t reg, uint32_t value) {
    outl(ioaddr + 0x00, reg);   // set the IOADDR window
    outl(ioaddr + 0x04, value); // write the value to the IODATA window; it ends up in the register selected by IOADDR
}

uint32_t read_register(uint16_t reg) {
    outl(ioaddr + 0x00, reg);   // set the IOADDR window
    return inl(ioaddr + 0x04);  // read the value through the IODATA window
}
Device Registers
The 8254x cards have a handful of registers. A complete list of the registers and their offsets is given in Table 13-2 (page 219) of the Intel 8254x Family of Gigabit Ethernet Controllers Software Developer's Manual.
Here are the most important ones:
| Category | Offset | Abbreviation | Name | R/W | Manual Page |
| General | 00000h | CTRL | Device Control | R/W | 224 |
| General | 00008h | STATUS | Device Status | R | 229 |
| General | 00010h | EECD | EEPROM/Flash Control/Data | R/W | 232 |
| General | 00014h | EERD | EEPROM Read (not applicable to the 82544GC/EI) | R/W | 236 |
| Interrupt | 000C0h | ICR | Interrupt Cause Read | R/W | 292 |
| Interrupt | 000D0h | IMS | Interrupt Mask Set / Read | R/W | 297 |
| Receive | 00100h | RCTL | Receive Control | R/W | 300 |
| Receive | 02800h | RDBAL | Receive Descriptor Base Low | R/W | 306 |
| Receive | 02804h | RDBAH | Receive Descriptor Base High | R/W | 306 |
| Receive | 02808h | RDLEN | Receive Descriptor Length | R/W | 307 |
| Receive | 02810h | RDH | Receive Descriptor Head | R/W | 307 |
| Receive | 02818h | RDT | Receive Descriptor Tail | R/W | 308 |
| Transmit | 00400h | TCTL | Transmit Control | R/W | 310 |
| Transmit | 03800h | TDBAL | Transmit Descriptor Base Low | R/W | 315 |
| Transmit | 03804h | TDBAH | Transmit Descriptor Base High | R/W | 316 |
| Transmit | 03808h | TDLEN | Transmit Descriptor Length | R/W | 316 |
| Transmit | 03810h | TDH | Transmit Descriptor Head | R/W | 317 |
| Transmit | 03818h | TDT | Transmit Descriptor Tail | R/W | 318 |
| Receive | 05400h-05478h | RAL(8*n) | Receive Address Low (n) | R/W | 329 |
| Receive | 05404h-0547Ch | RAH(8*n) | Receive Address High (n) | R/W | 329 |
The Device Control Register (CTRL)
| Field | Bit(s) | Name | Field | Bit(s) | Name |
| FD | 0 | Full-Duplex | SDP1_DATA | 19 | SDP1 Data Value |
| RSV | 2:1 | Reserved | ADVD3WUC | 20 | D3Cold Wakeup Capability Advertisement Enable |
| LRST | 3 | Link Reset | EN_PHY_PWR_MGMT | 21 | PHY Power Management Enable |
| RSV | 4 | Reserved | SDP0_IODIR | 22 | SDP0 Pin Directionality |
| ASDE | 5 | Auto-Speed Detection Enable | SDP1_IODIR | 23 | SDP1 Pin Directionality |
| SLU | 6 | Set Link Up | RSV | 25:24 | Reserved |
| ILOS | 7 | Invert Loss-of-Signal | RST | 26 | Device Reset |
| SPEED | 9:8 | Speed selection | RFCE | 27 | Receive Flow Control Enable |
| RSV | 10 | Reserved | TFCE | 28 | Transmit Flow Control Enable |
| FRCSPD | 11 | Force Speed | RSV | 29 | Reserved |
| FRCDPLX | 12 | Force Duplex | VME | 30 | VLAN Mode Enable |
| RSV | 17:13 | Reserved | PHY_RST | 31 | PHY Reset |
| SDP0_DATA | 18 | SDP0 Data Value |
Status Register Bit Description
| Field | Bit(s) | Name |
| FD | 0 | Link Full Duplex configuration indication. |
| LU | 1 | Link Up indication. |
| Function ID | 3:2 | Provides software a mechanism to determine the Ethernet controller function number (LAN identifier) for this MAC. Read as: [0b,0b] LAN A, [0b,1b] LAN B. Note: These settings are only applicable to the 82546GB/EB. |
| TXOFF | 4 | Transmission Paused. |
| TBIMODE | 5 | TBI Mode/internal SerDes indication. Note: For the 82544GC/EI, reflects the status of the TBI_MODE input pin. |
| SPEED | 7:6 | Link Speed Setting. Speed indication is mapped as follows: 00b = 10 Mb/s, 01b = 100 Mb/s, 10b = 1000 Mb/s, 11b = 1000 Mb/s. These bits are not valid in TBI mode/internal SerDes. |
| ASDV | 9:8 | Auto Speed Detection Value. |
| RSV | 10 | Reserved |
| PCI66 | 11 | PCI bus speed indication. (When set, indicates that the PCI bus is running at 66 MHz.) |
| BUS64 (1) | 12 | PCI bus width indication. (When set, indicates that the Ethernet controller is on a 64-bit bus.) |
| PCIX_MODE (1) | 13 | PCI-X mode indication. (When set, indicates that the Ethernet controller is operating in PCI-X mode.) |
| PCIXSPD (1) | 15:14 | PCI-X bus speed indication. 00b = 50-66 MHz, 01b = 66-100 MHz, 10b = 100-133 MHz, 11b = Reserved. |
| RSV | 31:16 | Reserved |
1. Not applicable to the 82540EP/EM, 82541xx, or 82547GI/EI.
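To make the bit layout above concrete, here is a small sketch that decodes link state from a raw STATUS value. The macro and function names are hypothetical; the bit positions are taken from the table above.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_FD          (1u << 0)  /* full duplex */
#define STATUS_LU          (1u << 1)  /* link up */
#define STATUS_SPEED_SHIFT 6          /* SPEED occupies bits 7:6 */
#define STATUS_SPEED_MASK  0x3u

static bool link_is_up(uint32_t status)       { return (status & STATUS_LU) != 0; }
static bool link_full_duplex(uint32_t status) { return (status & STATUS_FD) != 0; }

/* Returns the link speed in Mb/s (not valid in TBI mode/internal SerDes). */
static unsigned link_speed_mbps(uint32_t status) {
    switch ((status >> STATUS_SPEED_SHIFT) & STATUS_SPEED_MASK) {
    case 0:  return 10;
    case 1:  return 100;
    default: return 1000;  /* 10b and 11b both mean 1000 Mb/s */
    }
}
```

In a driver, the argument would be the value returned by reading the STATUS register at offset 00008h.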
Transmit Control Register (TCTL)
| Field | Bit(s) | Name |
| RSV | 0 | Reserved |
| EN | 1 | Transmit Enable |
| RSV | 2 | Reserved |
| PSP | 3 | Pad Short Packets |
| CT | 11:4 | Collision Threshold |
| COLD | 21:12 | Collision Distance |
| SWXOFF | 22 | Software XOFF Transmission |
| RSV | 23 | Reserved |
| RTLC | 24 | Re-transmit on Late Collision |
| NRTU | 25 | No Re-transmit on Underrun (82544GC/EI only) |
| RSV | 31:26 | Reserved |
Receive Control Register (RCTL)
| Field | Bit(s) | Name | Field | Bit(s) | Name |
| RSV | 0 | Reserved | BSIZE | 17:16 | Receive Buffer Size |
| EN | 1 | Receiver Enable | VFE | 18 | VLAN Filter Enable |
| SBP | 2 | Store Bad Packets | CFIEN | 19 | Canonical Form Indicator Enable |
| UPE | 3 | Unicast Promiscuous Enabled | CFI | 20 | Canonical Form Indicator bit value |
| MPE | 4 | Multicast Promiscuous Enabled | RSV | 21 | Reserved |
| LPE | 5 | Long Packet Reception Enable | DPF | 22 | Discard Pause Frames |
| LBM | 7:6 | Loopback Mode | PMCF | 23 | Pass MAC Control Frames |
| RDMTS | 9:8 | Receive Descriptor Minimum Threshold Size | RSV | 24 | Reserved |
| RSV | 11:10 | Reserved | BSEX | 25 | Buffer Size Extension |
| MO | 13:12 | Multicast Offset | SECRC | 26 | Strip Ethernet CRC from incoming packet |
| RSV | 14 | Reserved | RSV | 31:27 | Reserved |
| BAM | 15 | Broadcast Accept Mode |
When BSEX is set, the buffer size selected by BSIZE is multiplied by 16.
Receive Buffer Size Configuration
| Size (Bytes) | BSIZE | BSEX |
| 16384 | 01b | 1 |
| 8192 | 10b | 1 |
| 4096 | 11b | 1 |
| 2048 | 00b | 0 |
| 1024 | 01b | 0 |
| 512 | 10b | 0 |
| 256 | 11b | 0 |
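The table above can be turned into a small lookup. The following sketch maps a buffer size in bytes to the BSIZE and BSEX bits of RCTL; `rctl_buffer_bits` and the macro names are hypothetical, and sizes that do not appear in the table are rejected rather than rounded.

```c
#include <stdbool.h>
#include <stdint.h>

#define RCTL_BSIZE_SHIFT 16          /* BSIZE occupies bits 17:16 */
#define RCTL_BSEX        (1u << 25)  /* Buffer Size Extension */

/* Stores the RCTL bits for a supported buffer size and returns true;
 * returns false (leaving *bits untouched) for any other size. */
static bool rctl_buffer_bits(uint32_t size, uint32_t *bits) {
    uint32_t bsize;
    bool bsex;
    switch (size) {
    case 2048:  bsize = 0; bsex = false; break;
    case 1024:  bsize = 1; bsex = false; break;
    case 512:   bsize = 2; bsex = false; break;
    case 256:   bsize = 3; bsex = false; break;
    case 16384: bsize = 1; bsex = true;  break;
    case 8192:  bsize = 2; bsex = true;  break;
    case 4096:  bsize = 3; bsex = true;  break;
    default:    return false;
    }
    *bits = (bsize << RCTL_BSIZE_SHIFT) | (bsex ? RCTL_BSEX : 0u);
    return true;
}
```

The resulting bits would be ORed with the other RCTL flags (EN, BAM and so on) before writing the register.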
Interrupt Mask Set / Read (IMS)
| Field | Bit(s) | Description |
| TXDW | 0 | Sets mask for Transmit Descriptor Written Back. |
| TXQE | 1 | Sets mask for Transmit Queue Empty. |
| LSC | 2 | Sets mask for Link Status Change. |
| RXSEQ | 3 | Sets mask for Receive Sequence Error. This is a reserved bit for the 82541xx and 82547GI/EI. Set to 0b. |
| RXDMT0 | 4 | Sets mask for Receive Descriptor Minimum Threshold hit. |
| RSV | 5 | Reserved |
| RXO | 6 | Sets mask for Receiver FIFO Overrun. |
| RXT0 | 7 | Sets mask for Receiver Timer Interrupt. |
| RSV | 8 | Reserved |
| MDAC | 9 | Sets mask for MDI/O Access Complete Interrupt. |
| RXCFG | 10 | Sets mask for Receiving /C/ ordered sets. This is a reserved bit for the 82541xx and 82547GI/EI. Set to 0b. |
| RSV | 11 | Reserved |
| PHYINT | 12 | Sets mask for PHY Interrupt (not applicable to the 82544GC/EI). This is a reserved bit for the 82541xx and 82547GI/EI. Set to 0b. |
| GPI | 14:11 | Sets mask for General Purpose Interrupts (82544GC/EI only). |
| GPI | 14:13 | Sets mask for General Purpose Interrupts. |
| TXD_LOW | 15 | Sets the mask for Transmit Descriptor Low Threshold hit (not applicable to the 82544GC/EI). |
| SRPD | 16 | Sets mask for Small Receive Packet Detection (not applicable to the 82544GC/EI). |
| RSV | 31:17 | Reserved |
To enable an interrupt, simply write '1' to the corresponding bit.
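For illustration, the bit positions from the table can be written as constants and combined into a mask. This is only a sketch: the macro names and `receive_side_interrupt_mask` are hypothetical, and the choice of LSC, RXO and RXT0 is just one reasonable starting set.

```c
#include <stdint.h>

/* Bit positions taken from the IMS table above. */
#define IMS_TXDW   (1u << 0)
#define IMS_TXQE   (1u << 1)
#define IMS_LSC    (1u << 2)
#define IMS_RXDMT0 (1u << 4)
#define IMS_RXO    (1u << 6)
#define IMS_RXT0   (1u << 7)

/* Mask that enables link status change, receiver overrun and
 * receiver timer interrupts. */
static uint32_t receive_side_interrupt_mask(void) {
    return IMS_LSC | IMS_RXO | IMS_RXT0;
}
```

The returned value would be written to the IMS register at offset 000D0h.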
Descriptor Format
Both receive and transmit descriptors are 16 bytes in size. There are three types of transmit descriptor. The original is referred to as the "Legacy Transmit Descriptor". The second is referred to as the "TCP/IP Data Descriptor" and is a replacement for the legacy descriptor that offers access to new offloading capabilities. The third type is fundamentally different, as it does not point to packet data: it merely contains control information, which is loaded into registers of the controller and affects the processing of future packets. For simplicity, we will only use the legacy transmit descriptor. If you want to learn more about the other types of descriptors, have a look at the specification.
Legacy Transmit Descriptor Format
| 63:48 | 47:40 | 39:36 | 35:32 | 31:24 | 23:16 | 15:0 |
| Buffer Address (bits 63:0, first quadword) |
| Special | CSS | RSV | STA | CMD | CSO | Length |
Legacy Transmit Descriptor Field Description
| Name | Description |
| Buffer Address | The address of the buffer. Descriptors with a null address transfer no data. |
| Length | Length is per segment. The maximum length allowed is 16288 bytes. |
| CSO | Checksum Offset. Indicates where, relative to the start of the packet, to insert a TCP checksum if it is enabled in the CMD field. |
| CMD | Command Field |
| STA | Status Field |
| RSV | Reserved |
| CSS | Checksum Start Field. It is an offset relative to the start of the buffer and indicates where to start computing the checksum. |
| Special | Special Field |
Transmit Descriptor Command Field Format
| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| IDE | VLE | DEXT | RPS | RS | IC | IFCS | EOP |
Transmit Descriptor Command Field Description
| Name | Description |
| IDE (bit 7) | Interrupt Delay Enable |
| VLE (bit 6) | VLAN Packet Enable |
| DEXT (bit 5) | Extension. (Set to 0b to indicate legacy mode.) |
| RPS/RSV (bit 4) | Report Packet Sent. 82544GC/EI only, otherwise reserved. |
| RS (bit 3) | Report Status. (When set, the controller will fire an interrupt when the packet gets transmitted and bit STA.DD (Descriptor Done) will be set.) |
| IC (bit 2) | Insert Checksum. (When set, the controller will insert a checksum based on the values of the CSO and CSS fields.) |
| IFCS (bit 1) | Controls the insertion of the FCS/CRC field in normal Ethernet packets. IFCS is only valid when EOP is set. |
| EOP (bit 0) | End Of Packet. Indicates the last descriptor making up the packet. One or many descriptors can be used to form a packet. |
Transmit Descriptor Status Format
Transmit Descriptor Status Field Description
| Name | Description |
| TU/RSV (bit 3) | Transmit Underrun. Indicates that a transmit underrun error has occurred. 82544GC/EI only, otherwise reserved. |
| LC (bit 2) | Late Collision. Indicates that a late collision occurred while working in half-duplex mode. It has no meaning in full-duplex mode. |
| EC (bit 1) | Excess Collisions. Indicates that the packet has experienced more than the maximum number of collisions as defined by the TCTL.CT control field. |
| DD (bit 0) | Descriptor Done. Indicates that the descriptor is finished. |
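The legacy transmit descriptor described in the tables above could be expressed as a C structure along the following lines. This is a sketch under the assumption of a little-endian host; the type, field and helper names are hypothetical, and setting RS on every descriptor is a design choice made here so that completion can be polled through DD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t buffer_address; /* physical address of the data buffer */
    uint16_t length;         /* bytes in this segment */
    uint8_t  cso;            /* checksum offset */
    uint8_t  command;        /* CMD field */
    uint8_t  status;         /* STA in the low 4 bits, upper 4 bits reserved */
    uint8_t  css;            /* checksum start */
    uint16_t special;
} transmit_descriptor_t;

static_assert(sizeof(transmit_descriptor_t) == 16, "descriptor must be 16 bytes");
static_assert(offsetof(transmit_descriptor_t, command) == 11, "CMD is bits 31:24 of the second quadword");

#define TX_CMD_EOP  (1u << 0)
#define TX_CMD_IFCS (1u << 1)
#define TX_CMD_RS   (1u << 3)
#define TX_STA_DD   (1u << 0)

/* Prepare a descriptor for one segment; `last` marks the end of the packet. */
static void tx_fill(transmit_descriptor_t *d, uint64_t phys, uint16_t len, bool last) {
    d->buffer_address = phys;
    d->length = len;
    d->cso = 0;
    d->css = 0;
    d->special = 0;
    d->status = 0; /* hardware sets DD when it is done */
    d->command = (uint8_t)(TX_CMD_RS | (last ? (TX_CMD_EOP | TX_CMD_IFCS) : 0u));
}

/* True once the controller has written back Descriptor Done. */
static bool tx_done(const transmit_descriptor_t *d) {
    return (d->status & TX_STA_DD) != 0;
}
```

A driver that allocates buffers dynamically could poll `tx_done` before freeing the buffer attached to a descriptor.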
Receive Descriptor Format
| 63:48 | 47:40 | 39:32 | 31:16 | 15:0 |
| Buffer Address (bits 63:0, first quadword) |
| Special* | Errors | Status | Packet Checksum* | Length |
*82544GC/EI only, otherwise reserved.
Receive Descriptor Status Field
| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| PIF | IPCS | TCPCS | RSV | VP | IXSM | EOP | DD |
Receive Descriptor Status Bits
| Name | Description |
| PIF (bit 7) | Passed inexact filter. If set, software must examine this packet to determine whether to accept it or not. If PIF is clear, the packet is known to be for this station. |
| IPCS (bit 6) | IP Checksum Calculated on Packet. (0 = the hardware did not perform the IP checksum, 1 = the hardware performed the IP checksum) |
| TCPCS (bit 5) | TCP Checksum Calculated on Packet. (0 = the hardware did not perform the TCP/UDP checksum, 1 = the hardware performed the TCP/UDP checksum) |
| RSV (bit 4) | Reserved |
| VP (bit 3) | Packet is 802.1Q (matched VET). |
| IXSM (bit 2) | Ignore Checksum Indication. (When set, the checksum indication results should be ignored.) |
| EOP (bit 1) | End Of Packet. (Indicates that this is the last descriptor for an incoming packet.) |
| DD (bit 0) | Descriptor Done. (Indicates whether the controller is done with the descriptor.) |
Receive Descriptor Errors Field
| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| RXE | IPE | TCPE | CXE (a) | RSV | SEQ (b) | SE (b) | CE |
(a) 82544GC/EI only, otherwise reserved.
(b) 82541xx, 82547GI/EI, and 82540EP/EM only, otherwise reserved.
Receive Descriptor Error bits
| Name | Description |
| RXE (bit 7) | RX Data Error |
| IPE (bit 6) | IP Checksum Error |
| TCPE (bit 5) | TCP/UDP Checksum Error |
| CXE (bit 4) | Carrier Extension Error |
| RSV (bit 3) | Reserved |
| SEQ (bit 2) | Sequence Error |
| SE (bit 1) | Symbol Error |
| CE (bit 0) | CRC Error or Alignment Error |
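A receive descriptor matching the layout above might be declared as follows, together with helpers that test the status and error bits. This is a sketch assuming a little-endian host; all names are hypothetical, and treating only CE, SE, SEQ, CXE and RXE as fatal (while leaving checksum errors to the network stack) is an assumption made for this example, not a requirement of the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t buffer_address; /* physical address of the receive buffer */
    uint16_t length;         /* bytes written by the controller */
    uint16_t checksum;       /* packet checksum */
    uint8_t  status;
    uint8_t  errors;
    uint16_t special;
} receive_descriptor_t;

static_assert(sizeof(receive_descriptor_t) == 16, "descriptor must be 16 bytes");

#define RX_STATUS_DD  (1u << 0)
#define RX_STATUS_EOP (1u << 1)

#define RX_ERR_CE  (1u << 0)
#define RX_ERR_SE  (1u << 1)
#define RX_ERR_SEQ (1u << 2)
#define RX_ERR_CXE (1u << 4)
#define RX_ERR_RXE (1u << 7)
#define RX_ERR_FRAME (RX_ERR_CE | RX_ERR_SE | RX_ERR_SEQ | RX_ERR_CXE | RX_ERR_RXE)

/* The controller has finished writing this descriptor. */
static bool rx_ready(const receive_descriptor_t *d)   { return (d->status & RX_STATUS_DD) != 0; }
/* This descriptor holds the final part of a packet. */
static bool rx_is_last(const receive_descriptor_t *d) { return (d->status & RX_STATUS_EOP) != 0; }
/* No frame-level error was reported (checksum errors are ignored here). */
static bool rx_frame_ok(const receive_descriptor_t *d) { return (d->errors & RX_ERR_FRAME) == 0; }
```

A receive loop could skip, rather than forward, any packet for which `rx_frame_ok` returns false.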
The Receive Descriptor Special field is only populated for 802.1Q packets. For all other packets its contents are set to 0.
Receive Descriptor Special Field
| 15:13 | 12 | 11:0 |
| PRI | CFI | VLAN |
Receive Descriptor Special Field Description
| Name | Description |
| VLAN | VLAN Identifier |
| CFI | Canonical Form Indicator |
| PRI | User Priority |
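Extracting the three sub-fields is a matter of shifting and masking. The helper names below are hypothetical; the bit ranges come from the table above.

```c
#include <stdint.h>

/* VLAN identifier: bits 11:0 */
static uint16_t special_vlan(uint16_t special) { return special & 0x0FFFu; }
/* Canonical Form Indicator: bit 12 */
static uint8_t special_cfi(uint16_t special)   { return (uint8_t)((special >> 12) & 0x1u); }
/* User priority: bits 15:13 */
static uint8_t special_pri(uint16_t special)   { return (uint8_t)((special >> 13) & 0x7u); }
```

For example, a Special value of 0xB064 decodes to priority 5, CFI 1 and VLAN 100.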
EEPROM Reading
There are a few variants of the card with many differences, most notably in the method used to access the EEPROM and the flash memory of the card. Here we only describe methods applicable to cards that use the EEPROM.
The EEPROM must first be enabled in order to read the MAC address of the NIC. This is done by setting the EECD.SK (0x01), EECD.CS (0x02) and EECD.DI (0x04) bits of the EECD (0x00010) register, which allows software to perform reads from the EEPROM.
The EEPROM has a "lock-unlock" mechanism to prevent software-hardware collisions, and it must be locked before reading.
To lock the EEPROM, set the EECD.REQ (0x40) bit in the EECD register, then wait until the EECD.GNT (0x80) bit becomes set.
Unlocking only requires clearing EECD.REQ.
To read a word, the kernel first masks the address to 12 bits (82541xx and 82547GI/EI cards only) or 8 bits (all other cards), then shifts it left by 2 (82541xx and 82547GI/EI cards only) or by 8 (all other cards). The result is ORed with the EERD.START (0x01) bit and written to the EERD (0x00014) register.
The kernel should then wait until the EEPROM read operation has finished by polling EERD until the EERD.DONE bit becomes set. The word is then obtained by reading the EERD register, shifting it right by 16 bits and truncating it to 16 bits.
After that, the EERD.START bit must be cleared.
static uint16_t eeprom_read(uint16_t addr) {
    uint32_t tmp;
    uint16_t data;

    if ((le32_to_cpu(mmio_read_dword(dev_info.mmio.addr, I8254X_EECD)) & I8254X_EECD_EE_PRES) == 0) {
        kpanic("EEPROM present bit is not set for i8254x\n");
    }

    /* Tell the EEPROM to start reading */
    if (dev_info.version == I82547GI_EI
        || dev_info.version == I82541EI_A0
        || dev_info.version == I82541EI_B0
        || dev_info.version == I82541ER_C0
        || dev_info.version == I82541GI_B1
        || dev_info.version == I82541PI_C0) {
        /* Specification says that only 82541x devices and the
         * 82547GI/EI do 2-bit shift */
        tmp = ((uint32_t)addr & 0xfff) << 2;
    } else {
        tmp = ((uint32_t)addr & 0xff) << 8;
    }
    tmp |= I8254X_EERD_START;
    mmio_write_dword(dev_info.mmio.addr, I8254X_EERD, cpu_to_le32(tmp));

    /* Wait until the read is finished - the DONE bit is set on completion */
    timeout((le32_to_cpu(mmio_read_dword(dev_info.mmio.addr, I8254X_EERD)) & I8254X_EERD_DONE) == 0, 100);

    /* Obtain the data */
    data = (uint16_t)(le32_to_cpu(mmio_read_dword(dev_info.mmio.addr, I8254X_EERD)) >> 16);

    /* Tell EEPROM to stop reading */
    tmp = le32_to_cpu(mmio_read_dword(dev_info.mmio.addr, I8254X_EERD));
    tmp &= ~(uint32_t)I8254X_EERD_START;
    mmio_write_dword(dev_info.mmio.addr, I8254X_EERD, cpu_to_le32(tmp));

    return data;
}
When all data is finally read the kernel should unlock the EEPROM to let hardware access it.
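The address encoding used when starting a read can be isolated into pure helpers, which makes the two variants easier to compare. This is a sketch only: `eerd_command` and `eerd_data` are hypothetical names, and the flag that selects the 82541xx/82547GI/EI layout is assumed to come from the driver's own device detection.

```c
#include <stdbool.h>
#include <stdint.h>

#define EERD_START (1u << 0)

/* Build the value written to EERD to start reading the word at `addr`.
 * 82541xx and 82547GI/EI: 12-bit address shifted left by 2.
 * All other cards: 8-bit address shifted left by 8. */
static uint32_t eerd_command(uint16_t addr, bool is_82541_or_82547) {
    uint32_t field = is_82541_or_82547
        ? ((uint32_t)(addr & 0xFFFu) << 2)
        : ((uint32_t)(addr & 0xFFu) << 8);
    return field | EERD_START;
}

/* The word read from the EEPROM is returned in bits 31:16 of EERD. */
static uint16_t eerd_data(uint32_t eerd) {
    return (uint16_t)(eerd >> 16);
}
```

Reading word 2 would therefore write 0x201 to EERD on most cards and 0x9 on the 82541xx and 82547GI/EI.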
Initialization
The 8254x starts out in an undefined state and therefore needs to be reset. The first thing to do is to enable bus mastering, memory accesses and I/O accesses in the PCI command register.
Then reset the NIC by setting the CTRL.RST bit (bit 26, self-clearing) in the Device Control register of the card.
After the card has been reset, set the CTRL.ASDE and CTRL.SLU bits (to enable Auto-Speed Detection (ASDE), you must also set the SLU (Set Link Up) bit) and write the MAC address you want the device to use to the RAL0 and RAH0 registers. To get the device's MAC address, read the first three 16-bit words of the EEPROM.
The entire procedure looks something like this:
uint8_t MAC_ADDRESS[6];

void reset_nic() {
    uint32_t device_control = read_register(I8254_REG_CTRL);
    device_control |= I8254_CTRL_RESET; // Set the reset bit
    write_register(I8254_REG_CTRL, device_control);
    while (read_register(I8254_REG_CTRL) & I8254_CTRL_RESET) __asm__ volatile("pause"); // spin until the bit self-clears

    device_control = read_register(I8254_REG_CTRL);
    device_control |= I8254_CTRL_ASDE | I8254_CTRL_SLU; // Enable Auto Speed Detection.
    write_register(I8254_REG_CTRL, device_control);

    // Read the MAC address from the EEPROM
    uint16_t b0 = eeprom_read(0);
    uint16_t b1 = eeprom_read(1);
    uint16_t b2 = eeprom_read(2);
    MAC_ADDRESS[0] = b0 & 0xFF;
    MAC_ADDRESS[1] = b0 >> 8;
    MAC_ADDRESS[2] = b1 & 0xFF;
    MAC_ADDRESS[3] = b1 >> 8;
    MAC_ADDRESS[4] = b2 & 0xFF;
    MAC_ADDRESS[5] = b2 >> 8;

    // Write the MAC address to RAL/RAH 0.
    uint32_t writeL = ((uint32_t)b1 << 16) | b0;
    uint32_t writeH = b2 | (1u << 31); // bit 31 of RAH is Address Valid (AV)
    write_register(E1000_REG_RAL0, writeL);
    write_register(E1000_REG_RAH0, writeH);
}
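The packing of the six MAC bytes into the two receive address registers can also be written as a standalone helper. This is a sketch: `mac_to_ra` and `RAH_AV` are hypothetical names, and setting the Address Valid bit (bit 31 of RAH) is an assumption about how the filter entry should be marked as in use.

```c
#include <stdint.h>

#define RAH_AV (1u << 31) /* Address Valid */

/* Pack a MAC address (in transmission order) into RAL/RAH values. */
static void mac_to_ra(const uint8_t mac[6], uint32_t *ral, uint32_t *rah) {
    *ral = (uint32_t)mac[0]
         | ((uint32_t)mac[1] << 8)
         | ((uint32_t)mac[2] << 16)
         | ((uint32_t)mac[3] << 24);
    *rah = (uint32_t)mac[4]
         | ((uint32_t)mac[5] << 8)
         | RAH_AV;
}
```

For the address 52:54:00:12:34:56 this yields RAL = 0x12005452 and RAH = 0x80005634.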
Ring setup
Theory of operation:
The next step is to set up the rings. Without them, you will not be able to send or receive packets. Luckily, the ring system is fairly simple: it consists of the TDH/RDH and TDT/RDT registers (Transmit/Receive Descriptor Head/Tail) and, of course, the ring buffers themselves.
Transmit Ring
In the image below, you can see the structure of the transmit ring. The shaded boxes represent descriptors that have been transmitted but not yet reclaimed. (If you dynamically allocate the descriptor buffers, reclaiming simply involves freeing those buffers.)
Transmit Ring Structure.png
Anything between the head and the tail is owned by the controller and constitutes the transmit queue (the descriptors that have been queued for transmission). At reset, both TDT and TDH are set to 0. (If TDT = TDH, the queue is empty and there is nothing to transmit.)
Receive Ring
The image below depicts the structure of the receive ring. The shaded boxes represent descriptors that have stored incoming packets but have not yet been recognized by the driver. You can detect which descriptors have incoming data written to them by checking whether the status field is non-zero.
Receive Ring Structure.png
Any descriptors between RDH and RDT are owned by the hardware and should not be modified!
After the reset, the head should point to the first descriptor and the tail to the last descriptor of the ring (Since all descriptors are available for use).
RDH points to the descriptor into which the controller will write the next received packet. It is incremented automatically by the hardware.
The RDT points to one descriptor after the last available descriptor. This register should still point to a valid descriptor (should be within Base and Base + Size).
The TDLEN/RDLEN registers contain the size in bytes of the ring.
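The head/tail bookkeeping boils down to modular arithmetic on descriptor indices. The sketch below shows it for the transmit ring; the function names are hypothetical, and keeping one slot unused (so that head = tail always means "empty") is a common convention assumed here rather than something taken from the text above.

```c
#include <stdint.h>

/* Index of the descriptor after `idx`, wrapping at the end of the ring. */
static uint32_t ring_next(uint32_t idx, uint32_t count) {
    return (idx + 1) % count;
}

/* Descriptors queued for the controller: from head up to, but not
 * including, tail. */
static uint32_t tx_pending(uint32_t head, uint32_t tail, uint32_t count) {
    return (tail + count - head) % count;
}

/* Descriptors the driver may still fill. One slot is left unused so
 * that a full ring is never mistaken for an empty one. */
static uint32_t tx_free(uint32_t head, uint32_t tail, uint32_t count) {
    return count - 1 - tx_pending(head, tail, count);
}
```

With 8 descriptors, head = 6 and tail = 2, four descriptors are pending and three are free.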
Setup:
Transmit Ring
- First, allocate a region for the descriptor ring.
- Next, either allocate a static buffer for each descriptor up front, or allocate buffers dynamically when you transmit a packet (this example code uses the first option).
- Set TDH and TDT to 0, TDBAL to the lower 32 bits of the ring's physical address, TDBAH to the upper 32 bits and TDLEN to the total length of the ring buffer (number of descriptors * 16).
- Set your preferred bits in the TCTL register.
// Assumes 1:1 memory mapping for simplicity
#define NUM_OF_TX_DESCRIPTORS 8
#define SIZE_OF_TX_DESCRIPTOR_BUFFER 4096

struct transmit_descriptor_t; // 16 bytes; fields as in the Legacy Transmit Descriptor Format table
transmit_descriptor_t* transmit_ring; // global, so that the transmit path can reach it later

void setup_transmit_ring() {
    size_t transmit_ring_size = NUM_OF_TX_DESCRIPTORS * 16;
    transmit_ring = your_favorite_physical_allocator(transmit_ring_size);
    memset(transmit_ring, 0, transmit_ring_size); // start with clean command/status fields
    for (int i = 0; i < NUM_OF_TX_DESCRIPTORS; i++) {
        transmit_descriptor_t* descriptor = transmit_ring + i;
        descriptor->buffer_address = your_favorite_physical_allocator(SIZE_OF_TX_DESCRIPTOR_BUFFER);
    }

    write_register(REG_TDBAL, ((uint64_t)transmit_ring) & 0xFFFFFFFF); // Base Address Low
    write_register(REG_TDBAH, ((uint64_t)transmit_ring) >> 32);        // Base Address High
    write_register(REG_TDLEN, transmit_ring_size);                     // Ring size in bytes
    write_register(REG_TDH, 0);
    write_register(REG_TDT, 0);

    // Set the Enable (EN) and Pad Short Packets (PSP) bits
    uint32_t tctl = E1000_TCTL_EN | E1000_TCTL_PSP;
    write_register(REG_TCTL, tctl);
}
Receive Ring
- First, allocate a region for the descriptor ring.
- After that, loop through each descriptor, allocate a buffer of the selected size (set in the Receive Control Register) and store its physical address in the descriptor's address field.
- Set RDH to 0 (the first descriptor), RDT to the last descriptor (number of descriptors - 1), RDBAL to the lower 32 bits of the ring's physical address, RDBAH to the upper 32 bits and RDLEN to the total length of the ring buffer (number of descriptors * 16).
- Set your preferred bits in the RCTL register (you must set the EN bit to enable the DMA engine; LPE and BAM are recommended).
// Assumes 1:1 page mapping for simplicity
#define NUM_OF_RX_DESCRIPTORS 32
#define SIZE_OF_RX_DESCRIPTOR_BUFFER 4096

struct receive_descriptor_t; // Legacy receive descriptor (16 bytes), defined elsewhere
receive_descriptor_t *receive_ring; // Global, because receive_packets() uses it as well

void setup_receive_ring() {
    size_t receive_ring_size = NUM_OF_RX_DESCRIPTORS * 16; // you can substitute 16 with sizeof(receive_descriptor_t)
    receive_ring = your_favorite_physical_allocator(receive_ring_size);

    // Give every descriptor its own receive buffer
    for (int i = 0; i < NUM_OF_RX_DESCRIPTORS; i++) {
        receive_descriptor_t *descriptor = receive_ring + i;
        descriptor->buffer_address = your_favorite_physical_allocator(SIZE_OF_RX_DESCRIPTOR_BUFFER);
    }

    write_register(REG_RDBAL, ((uint64_t)receive_ring) & 0xFFFFFFFF); // Base Address Low
    write_register(REG_RDBAH, ((uint64_t)receive_ring) >> 32);        // Base Address High
    write_register(REG_RDLEN, receive_ring_size);                     // Ring size in bytes
    write_register(REG_RDH, 0);                                       // Set it to the first descriptor
    write_register(REG_RDT, NUM_OF_RX_DESCRIPTORS - 1);               // Set it to the last descriptor

    // Set the Enable, Long Packet Reception, Broadcast Accept Mode and Size Extension bits
    // Also set the buffer size. This configuration (BSIZE = 0b11 and BSEX = 1) means 4096 (4kB) buffers
    uint32_t rctl = RCTL_EN | RCTL_LPE | RCTL_BAM | RCTL_BSEX | (0b11 << RCTL_BSIZE);
    write_register(REG_RCTL, rctl);
}
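The buffer size encoding in RCTL is easy to get wrong, because BSEX changes the meaning of the BSIZE field. The following is a minimal sketch of a lookup helper, assuming the bit positions from the 8254x register layout (BSIZE in bits 17:16, BSEX in bit 25). The names `rctl_buffer_size_bits`, `rctl_with_buffer_size` and the `k...` constants are made up for this example and are not part of the driver code above.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kRctlBsizeShift = 16;             // BSIZE occupies bits 17:16
constexpr uint32_t kRctlBsex       = 1u << 25;       // Buffer Size Extension
constexpr uint32_t kUnsupported    = 0xFFFFFFFFu;    // sentinel for invalid sizes

// Returns the BSIZE/BSEX bits for a receive buffer size in bytes,
// or kUnsupported if the controller has no encoding for that size.
inline uint32_t rctl_buffer_size_bits(uint32_t buffer_size) {
    switch (buffer_size) {
        case 2048:  return 0u << kRctlBsizeShift;
        case 1024:  return 1u << kRctlBsizeShift;
        case 512:   return 2u << kRctlBsizeShift;
        case 256:   return 3u << kRctlBsizeShift;
        case 16384: return kRctlBsex | (1u << kRctlBsizeShift);
        case 8192:  return kRctlBsex | (2u << kRctlBsizeShift);
        case 4096:  return kRctlBsex | (3u << kRctlBsizeShift);
        default:    return kUnsupported;
    }
}

// Replaces the buffer size fields of an existing RCTL value.
inline uint32_t rctl_with_buffer_size(uint32_t rctl, uint32_t buffer_size) {
    uint32_t bits = rctl_buffer_size_bits(buffer_size);
    assert(bits != kUnsupported);
    rctl &= ~(kRctlBsex | (3u << kRctlBsizeShift)); // clear the old size
    return rctl | bits;
}
```

Whatever size you pick here must match the size of the buffers you actually allocated for the descriptors, since the controller does not know how large they really are.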
Interrupt Handling
If you want to receive packets, you need a way of knowing when to read them. That's where interrupts come into play.
To enable interrupts, set the corresponding bits in the Interrupt Mask Set/Read (IMS) register. Recommended interrupts are RXT0 (signals incoming packets), RXO (signals receiver overruns) and LSC (signals link status changes, e.g. when the user plugs in or unplugs the Ethernet cable; in that case you should redo the DHCP handshake to connect to the network).
void enable_interrupts() {
    uint32_t ims = E1000_IMS_RXT | E1000_IMS_RXO | E1000_IMS_LSC;
    write_register(REG_IMS, ims);
}
To find out why an interrupt was raised, read the Interrupt Cause Read (ICR) register. The ICR register is self-clearing, meaning it is cleared when you read it, so read it once and keep the value.
A simple interrupt handler may look something like this:
void _handle_interrupt() {
    uint32_t cause = read_register(REG_ICR); // Cleared upon read

    if (cause & E1000_IMS_RXT) { // Packets received
        receive_packets();       // Call the function responsible for receiving
                                 // packets and sending them to the network stack
    }

    if (cause & E1000_IMS_LSC) { // Link status change
        // Read the status register and check the LU bit to get the link status
        if (read_register(REG_STATUS) & STATUS_LU) {
            kprintf("Link change detected: Link up!\n");
        } else {
            kprintf("Link change detected: Link down!\n");
        }
    }
}
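While bringing up a driver it helps to log which cause bits were actually set. Below is a small sketch that turns an ICR value into a readable string. The bit positions are taken from the 8254x ICR layout; the constant names and `describe_interrupt_cause` are made up for this example, and only a subset of the cause bits is listed.

```cpp
#include <cstdint>
#include <string>

// A subset of the ICR cause bits.
constexpr uint32_t kIcrTxdw   = 1u << 0; // Transmit Descriptor Written Back
constexpr uint32_t kIcrTxqe   = 1u << 1; // Transmit Queue Empty
constexpr uint32_t kIcrLsc    = 1u << 2; // Link Status Change
constexpr uint32_t kIcrRxdmt0 = 1u << 4; // Receive Descriptor Minimum Threshold
constexpr uint32_t kIcrRxo    = 1u << 6; // Receiver Overrun
constexpr uint32_t kIcrRxt0   = 1u << 7; // Receiver Timer Interrupt

// Builds a space-separated list of the known cause bits set in `cause`.
// Pass in the value you already read from ICR; do not read ICR a second
// time, because the first read cleared it.
inline std::string describe_interrupt_cause(uint32_t cause) {
    std::string out;
    auto add = [&](uint32_t bit, const char *name) {
        if (cause & bit) {
            if (!out.empty()) out += ' ';
            out += name;
        }
    };
    add(kIcrTxdw, "TXDW");
    add(kIcrTxqe, "TXQE");
    add(kIcrLsc, "LSC");
    add(kIcrRxdmt0, "RXDMT0");
    add(kIcrRxo, "RXO");
    add(kIcrRxt0, "RXT0");
    return out;
}
```

In a kernel without `std::string`, the same idea works with a fixed-size character buffer or with one log call per bit.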
Packet Transmission
To transmit a packet, load the data into a free descriptor (or split it across several descriptors if it doesn't fit in one), set the EOP bit on the last descriptor and advance TDT past it.
This example uses preallocated buffers, but you could use dynamically allocated ones instead. Just remember to free each buffer after its packet has been transmitted.
void send_data(void *data, uint32_t size, bool EOP) {
    uint32_t tail = read_register(REG_TDT);
    transmit_descriptor_t *tx = transmit_ring + tail; // Get the descriptor the tail is pointing at (next available descriptor)

    memcpy(tx->buffer_address, data, size); // Copy the data to the previously allocated buffer
    tx->length = size;                      // Set the length of the descriptor

    // Only the last descriptor of a packet gets EOP (and IFCS).
    // Assign instead of OR-ing, so flags left over from an earlier packet are cleared.
    tx->command = EOP ? (TX_CMD_EOP | TX_CMD_IFCS) : 0;

    tail = (tail + 1) % NUM_OF_TX_DESCRIPTORS;
    write_register(REG_TDT, tail); // Increment and write the tail
}

size_t send(void *data, size_t length) {
    size_t sent = 0;
    // Split the data into chunks and send them
    while (sent < length) {
        size_t to_send = min(length - sent, (size_t)SIZE_OF_TX_DESCRIPTOR_BUFFER);
        send_data((uint8_t *)data + sent, to_send, to_send == (length - sent));
        sent += to_send;
    }
    return sent;
}
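The transmit code above never checks whether the descriptor at the tail is really free. One common way to do that (an assumption of this sketch, not something the code above does) is to set the Report Status (RS) command bit on every descriptor, so that the controller writes the Descriptor Done (DD) status bit back once it has sent the data. The struct below is a hypothetical mirror of the 16-byte legacy transmit descriptor, and all names are made up for this example.

```cpp
#include <cstdint>

// Hypothetical mirror of the legacy transmit descriptor.
struct TxDescriptor {
    uint64_t buffer_address;
    uint16_t length;
    uint8_t  checksum_offset;
    uint8_t  command;
    uint8_t  status;
    uint8_t  checksum_start;
    uint16_t special;
};
static_assert(sizeof(TxDescriptor) == 16, "legacy transmit descriptors are 16 bytes");

constexpr uint8_t kTxCmdEop   = 1u << 0; // End Of Packet
constexpr uint8_t kTxCmdIfcs  = 1u << 1; // Insert FCS
constexpr uint8_t kTxCmdRs    = 1u << 3; // Report Status
constexpr uint8_t kTxStatusDd = 1u << 0; // Descriptor Done

// Call once per descriptor during ring setup, so unused descriptors read as free.
inline void tx_descriptor_mark_free(TxDescriptor &d) {
    d.command = 0;
    d.status = kTxStatusDd;
}

// A descriptor may be reused once the controller has written DD back.
inline bool tx_descriptor_is_free(const TxDescriptor &d) {
    return (d.status & kTxStatusDd) != 0;
}

// Fill a descriptor and hand it to the controller (the caller still advances TDT).
inline void tx_descriptor_fill(TxDescriptor &d, uint64_t phys, uint16_t length, bool eop) {
    d.buffer_address = phys;
    d.length = length;
    uint8_t command = kTxCmdRs;                   // always ask for a DD write-back
    if (eop) command |= kTxCmdEop | kTxCmdIfcs;   // last descriptor of the packet
    d.command = command;
    d.status = 0;                                 // clear DD: owned by the controller now
}
```

A driver using this scheme would test `tx_descriptor_is_free()` on the tail descriptor before copying data into it, and either wait or report a full ring otherwise. In real driver code the descriptor memory should be accessed through `volatile` pointers, since the controller modifies it behind the compiler's back.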
Packet Reception
To receive packets after an interrupt, loop from the first descriptor the driver has not yet read to the last one the controller has filled. To do that, keep track of the last descriptor the driver read; this also ensures packets are reconstructed in the correct order.
uint32_t rx_next = 0; // Index of the next descriptor the driver has not read yet

void receive_packets() {
    uint32_t idx = rx_next;
    // Static, so a packet whose descriptors arrive across two interrupts is not lost
    static void *buffer = nullptr; // use this to store the buffer.
    static size_t buffer_len = 0;

    while (receive_ring[idx].status & RX_STATUS_DD) {
        // This descriptor has been filled
        bool eop = receive_ring[idx].status & RX_STATUS_EOP;
        uint16_t len = receive_ring[idx].length;
        void *data = receive_ring[idx].buffer_address;

        // Handle multiple-descriptor packets
        if (buffer == nullptr) { // This is the first descriptor of the packet
            buffer = malloc(len); // use your kernel's heap allocator
            buffer_len = len;
            memcpy(buffer, data, len);
        } else {
            // It's the next part of the packet, add it to the packet
            void *new_buffer = malloc(buffer_len + len); // allocate a bigger buffer
            memcpy(new_buffer, buffer, buffer_len);      // copy the previous data
            free(buffer);                                // free the old buffer
            // copy the new data
            memcpy((uint8_t *)new_buffer + buffer_len, data, len);
            // Set the new buffer into the variables
            buffer_len += len;
            buffer = new_buffer;
        }

        // Set status to 0 (to give ownership back to the controller)
        receive_ring[idx].status = 0;
        idx = (idx + 1) % NUM_OF_RX_DESCRIPTORS;

        if (eop) {
            // This is the last descriptor of the packet
            // Forward the packet to your network stack
            stack_receive_packet(buffer, buffer_len);
            buffer = nullptr;
            buffer_len = 0;
        }
    }

    // Give the controller more free descriptors by updating RDT
    uint32_t tail = (idx == 0) ? NUM_OF_RX_DESCRIPTORS - 1 : idx - 1;
    write_register(REG_RDT, tail);
    rx_next = idx;
}
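The reassembly logic can be tried out on the host before running it against real hardware. The following sketch simulates a receive ring in ordinary memory and drains it the same way. It swaps the `malloc`/`memcpy` growth used above for a `std::vector`, which is a simplification for illustration and may not be available in your kernel. `RxDescriptor`, `RxRing` and `rx_drain` are hypothetical names, and the buffers live in a parallel vector instead of being referenced through `buffer_address`.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical mirror of the 16-byte legacy receive descriptor.
struct RxDescriptor {
    uint64_t buffer_address;
    uint16_t length;
    uint16_t checksum;
    uint8_t  status;
    uint8_t  errors;
    uint16_t special;
};
static_assert(sizeof(RxDescriptor) == 16, "legacy receive descriptors are 16 bytes");

constexpr uint8_t kRxStatusDd  = 1u << 0; // Descriptor Done
constexpr uint8_t kRxStatusEop = 1u << 1; // End Of Packet

struct RxRing {
    std::vector<RxDescriptor> descriptors;
    std::vector<std::vector<uint8_t>> buffers; // buffers[i] backs descriptors[i]
    uint32_t next = 0;                         // first descriptor not yet read
    std::vector<uint8_t> partial;              // packet currently being reassembled
};

// Consumes every descriptor marked DD and returns the completed packets.
// A packet that is still missing its EOP descriptor stays in ring.partial.
inline std::vector<std::vector<uint8_t>> rx_drain(RxRing &ring) {
    std::vector<std::vector<uint8_t>> packets;
    const uint32_t count = static_cast<uint32_t>(ring.descriptors.size());
    while (ring.descriptors[ring.next].status & kRxStatusDd) {
        RxDescriptor &d = ring.descriptors[ring.next];
        const std::vector<uint8_t> &buf = ring.buffers[ring.next];
        ring.partial.insert(ring.partial.end(), buf.begin(), buf.begin() + d.length);
        const bool eop = (d.status & kRxStatusEop) != 0;
        d.status = 0; // hand the descriptor back
        ring.next = (ring.next + 1) % count;
        if (eop) {
            packets.push_back(std::move(ring.partial));
            ring.partial.clear();
        }
    }
    return packets;
}
```

After draining, a real driver would write the index just before `ring.next` to RDT, exactly as the code above does.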
Emulation
- VirtualBox (3.1 is all I can personally confirm) supports rather dodgy implementations of an Intel PRO/1000 MT Server (82545EM), Intel PRO/1000 MT Desktop (82540EM), and Intel PRO/1000 T Server (82543GC).
  - Bugs:
    - The EERD register is unimplemented (you *must* use the 4-wire access method if you want to read from the EEPROM). [01000101 - I had a patch committed to fix this. It will soon be mainstream]
- VMware Virtual Server 2 emulates/virtualizes an 82545EM-based card rather well.
- QEMU (since 0.10.0) supports an 82540EM-based card and it seems to work OK. It is the default network card since 0.11.0.
  - Bugs:
    - QEMU does not properly handle the software reset operation (CTRL.RST) in builds prior to June 2009.
    - QEMU (version 4.2.1 tested) doesn't seem to support flash memory, instead shifting the IO Register Base Address up.
- IIRC (needs confirmation) Microsoft's Hyper-V supports an 8254x-series card.
Documentation
Example driver