I have some processes and also some virtual machines potentially using AVX-512 and/or AMX, depending on the CPU architecture. Can I verify this usage with the pcm tools, and if so, how?
Thanks,
Gianluca
Hi Gianluca,
Sorry for the delayed response. First, you need to check whether the VM exposes perfmon counters:
https://github.com/intel/pcm/blob/master/doc/FAQ.md#q3
Then there are specific events that can be collected with the pcm-raw utility:
AMX: EXE.AMX_BUSY
512B events: search the event DB https://github.com/intel/perfmon/blob/main/SPR/events/sapphirerapids_core.json
Hope this helps,
Roman
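For anyone who wants to script the event DB search, here is a minimal Python sketch. It assumes the JSON file has been downloaded locally and that the events sit in a list under an "Events" key, each with "EventName" and "BriefDescription" fields; that layout is an assumption, so check it against the actual file. The helper name find_events and the sample data are made up for illustration.

```python
import json


def find_events(db_text, needle):
    """Return (EventName, BriefDescription) pairs whose name contains `needle`."""
    data = json.loads(db_text)
    # Assumption: events are stored under an "Events" key; fall back to a bare list.
    events = data.get("Events", []) if isinstance(data, dict) else data
    return [
        (e.get("EventName", ""), e.get("BriefDescription", ""))
        for e in events
        if needle in e.get("EventName", "")
    ]


# Tiny made-up sample in the assumed layout; a real run would read the downloaded file.
SAMPLE = json.dumps({"Events": [
    {"EventName": "FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE", "BriefDescription": "sample"},
    {"EventName": "EXE.AMX_BUSY", "BriefDescription": "sample"},
]})

if __name__ == "__main__":
    for name, desc in find_events(SAMPLE, "512B"):
        print(name, "-", desc)
```

With the real file, replace SAMPLE with the contents of the local copy, for example open("sapphirerapids_core.json").read().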
-
Thanks Roman for the reply, for the event name and for the event DB link! My current hardware is a 5th Gen Intel Xeon Scalable Processor (EMR). Running pcm-raw on the host against the PID of the qemu-system-x86_64 process that represents the VM worked fine. I'm testing a demo inside the VM with Stable Diffusion, based on a Hugging Face model using OpenVINO acceleration.
With the command "pcm-raw 1 -pid pid_nr -e EXE.AMX_BUSY" I'm now able to verify that when the VM runs with "-cpu host" the values are non-zero for the 64 cores/threads that correspond to the vCPUs configured on the VM. If I run the same VM with "-cpu host,-amx-bf16,-amx-tile,-amx-int8", the event stays at zero instead (and image generation takes 16 seconds versus 6 seconds when AMX is used).
Are the AVX-512 events these three: FP_ARITH_INST_RET, FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE, FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE? Can I distinguish between SSE and AVX usage?
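The command quoted above can be extended to several events at once. Below is a small Python sketch that only builds the argument list. It mirrors the flags used in this post (interval, -pid, -e); passing more than one -e together with -pid is an assumption to verify on your pcm version. The helper name pcm_raw_command and the PID 12345 are placeholders.

```python
import shlex


def pcm_raw_command(pid, events, interval=1):
    """Build a pcm-raw argument list in the form used in this thread."""
    if not events:
        raise ValueError("at least one event name is required")
    cmd = ["pcm-raw", str(interval), "-pid", str(pid)]
    for event in events:
        # One -e flag per event name.
        cmd += ["-e", event]
    return cmd


if __name__ == "__main__":
    events = [
        "EXE.AMX_BUSY",
        "FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE",
        "FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE",
    ]
    # 12345 is a placeholder for the PID of the qemu-system-x86_64 process.
    print(shlex.join(pcm_raw_command(12345, events)))
```

The list can be handed to subprocess.run on the host; pcm tools normally need root privileges.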
-
Yes.
SSE/AVX: AFAIK, and as you have likely already found out, it is only possible to distinguish between the different operation widths, using the 128B and 256B events.
A different, sampling-based tool (https://github.com/aayasin/perf-tools) can report the instruction mix: "./do.py profile -pm 100"
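To make the width-based view concrete, here is a rough Python sketch that groups FP_ARITH_INST_RETIRED counts by operation width and reports each width's share. The counts are made up, the helper name share_by_width is hypothetical, and the sketch assumes the counts have already been collected into a dict. Note that these events cover floating-point arithmetic only, and a 128B count cannot tell SSE apart from 128-bit AVX.

```python
import re
from collections import defaultdict

# Matches the width token in names such as FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE.
_WIDTH = re.compile(r"^FP_ARITH_INST_RETIRED\.(128B|256B|512B|SCALAR)_")


def share_by_width(counts):
    """Map each width token to its fraction of the matched event counts."""
    totals = defaultdict(int)
    for name, value in counts.items():
        match = _WIDTH.match(name)
        if match:
            totals[match.group(1)] += value
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {width: value / grand_total for width, value in totals.items()}


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    example = {
        "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE": 100,
        "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE": 300,
        "FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE": 600,
        "EXE.AMX_BUSY": 42,  # ignored: not an FP_ARITH_INST_RETIRED event
    }
    for width, share in sorted(share_by_width(example).items()):
        print(f"{width}: {share:.0%}")
```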