112 questions
2
votes
0
answers
81
views
What is the structure and behavior of the Line-Fill Buffer in modern multicore microarchitectures?
I have a few questions regarding the structure and operation of the LFB. I am specifically interested in x86 processors or, if more specialization is needed, 64-bit Intel processors post-Haswell ...
Advice
0
votes
2
replies
96
views
Determine the microarchitecture level used in an Executable and Linkable Format (ELF) binary on x86
I'm having some trouble with prebuilt development tools (compiler, linker, ...) on my very old workstation. Because the CPU in my old system only supports the microarchitecture level x86-64-v1, it ...
1
vote
1
answer
112
views
How are MMIO requests routed in CPU microarchitecture — cache-bypass on same path or a separate bus/port?
Short background: MMIO regions are typically mapped as uncacheable / device memory, so the CPU must not treat device registers like normal cacheable DRAM. I’m asking about the microarchitecture routing and ...
0
votes
0
answers
112
views
How are x64 instructions decoded and what is the format of the generated uOps?
I'm looking at computer microarchitecture and trying to understand how CPUs work, in the hope of perhaps designing my own CPU out of logic gates. I understand that the complex nature of x86-64 instructions having ...
0
votes
1
answer
99
views
How does the accessed bit microcode assist work?
When a memory access occurs and the accessed bit in the PT is 0, it triggers a microcode assist that walks the PT and sets the accessed bit in each level.
In order for the assist's code to write the ...
3
votes
0
answers
152
views
How much performance penalty is created by split loads in AVX code?
Motivation
I'm working on improving the performance of a numerical simulation engine. This mainly involves re-organizing the memory layouts and access patterns in several ways, re-writing the inner ...
4
votes
2
answers
201
views
What is causing the store latency in this program?
Given the following C program (MSVC does not optimize away the "work" for me; for other compilers you may need to add an asm statement):
#include <inttypes.h>
#include <stdlib.h>
...
1
vote
1
answer
857
views
How are instructions fetched into modern CPUs (2023)?
I am learning rocketchip these days, and I have noticed that the IFU (Instruction Fetch Unit) fetches instructions from ibuf instead of main memory. But I have not seen any code about how instructions are ...
6
votes
1
answer
306
views
Are any instructions affected by IA32_UARCH_MISC_CTL[DOITM] in existing CPUs?
In the document titled "Data Operand Independent Timing Instruction Set Architecture (ISA) Guidance", Intel is introducing a new IA32_UARCH_MISC_CTL MSR where toggling bit 0 enables the "Data ...
-1
votes
1
answer
208
views
Verilator does not seem to recognize casez statement, any idea of how to solve this?
I'm trying to code a RISC-V decoder in SystemVerilog; here's the code:
case(opcode)
7'b0110011: assign r_type = 1'b1;
7'b0010011: assign i_type = 1'b1;
7'b0000011: ...
3
votes
0
answers
179
views
Intel Alder Lake performance degradation after spin wait
I'm tuning my program for low latency.
I have a tight calculation function, calc(), which uses SIMD floating-point instructions heavily.
I tested the performance of calc() using the perf command. ...
1
vote
1
answer
170
views
Is a port blocked while data is being fetched from cache or memory in the CPU microarchitecture?
There are two identical memory read ports (port 2 and port 3) and one write port (port 4) on Intel Skylake cores. Assuming there are two load instructions issued to port 2 and port 3 in parallel:
When both ...
1
vote
1
answer
194
views
Are assembly code and machine code part of the architecture?
Are assembly code and machine code specified by the architecture?
I know that how you implement the architecture is up to you (the microarchitecture can implement the architecture), but I don't ...
-1
votes
1
answer
730
views
How does the program read 32 bits from memory in a single clock cycle?
So, I have this assignment where I need to design a 32-bit RISC 5-stage pipeline. It must support at least 32 (32-bit) instructions and 32 (32-bit) data values. The memory should be read in 1 clock ...
1
vote
1
answer
92
views
Does storing false bool values cost less electrical energy?
Going to sleep tonight, I have been wondering: if a bool, in C++ for example, is set to false, that means that all of its (8 or 16) bits are set to 0 (or so it seems).
A zero bit, as far as I know, means no ...