Can I monitor AMX and AVX-512 usage with pcm? #763

Answered by rdementi
gcecchi asked this question in Q&A
I have some processes and also some virtual machines potentially using AVX-512 and/or AMX, depending on the CPU architecture. Can I verify their usage with the pcm tools? How?
Thanks,
Gianluca

Hi Gianluca,

Sorry for the delayed response. First, you need to check whether the VM exposes perfmon counters:
https://github.com/intel/pcm/blob/master/doc/FAQ.md#q3

Then there are specific events that can be collected with the pcm-raw utility:
EXE.AMX_BUSY
512B events: search the event DB https://github.com/intel/perfmon/blob/main/SPR/events/sapphirerapids_core.json
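
As a rough illustration of "search the event DB", here is a minimal Python sketch that filters event names by substring. It assumes the JSON file is either a bare list of event objects or an object with an "Events" list, and that each entry has an "EventName" field; the inline sample is a tiny stand-in for the real sapphirerapids_core.json, and the helper name find_events is hypothetical.

```python
import json


def find_events(event_db, needle):
    """Return sorted event names containing `needle`."""
    # Assumption: the file is {"Events": [...]} or a bare list of events.
    events = event_db["Events"] if isinstance(event_db, dict) else event_db
    return sorted(
        e["EventName"] for e in events if needle in e.get("EventName", "")
    )


# Tiny inline stand-in for the downloaded event file (not the real content).
sample_db = json.loads("""
{"Events": [
  {"EventName": "EXE.AMX_BUSY"},
  {"EventName": "FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE"},
  {"EventName": "FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE"},
  {"EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE"}
]}
""")

matches = find_events(sample_db, "512B")
for name in matches:
    print(name)
```

With a local copy of the real file, the same helper could be called on the result of json.load() for that file.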

Hope this helps,

Roman

Answer selected by gcecchi
Thanks, Roman, for the reply, the event name, and the event DB link! My current hardware is a 5th Gen Intel Scalable Processor (EMR). It worked fine to run pcm-raw on the host against the pid of the qemu-system-x86_64 process representing the VM. I'm testing a demo inside the VM with Stable Diffusion, based on a Hugging Face model using OpenVINO acceleration.

With the command "pcm-raw 1 -pid pid_nr -e EXE.AMX_BUSY" I'm now able to verify that when the VM runs with "-cpu host" the values are non-zero for the 64 cores/threads corresponding to the vCPUs configured for the VM. If I run the same VM with "-cpu host,-amx-bf16,-amx-tile,-amx-int8", the event instead remains at zero (and the image generation takes 16 seconds, versus 6 seconds when AMX is used).

Are these three the AVX-512 events: FP_ARITH_INST_RET, FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE, FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE? Can I distinguish between SSE and AVX usage?
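
For anyone wanting to watch AMX and the 512B events together, the pcm-raw command quoted above could be assembled like this. This is only a sketch: it assumes pcm-raw accepts one "-e" option per event, the pid 12345 is a placeholder, and build_pcm_raw_command is a hypothetical helper. The command is only built and printed here, not executed.

```python
def build_pcm_raw_command(pid, events, interval_s=1):
    """Build an argv list in the shape 'pcm-raw 1 -pid <pid> -e <event> ...'."""
    cmd = ["pcm-raw", str(interval_s), "-pid", str(pid)]
    for name in events:
        # Assumption: pcm-raw takes one -e option per event.
        cmd += ["-e", name]
    return cmd


cmd = build_pcm_raw_command(
    12345,  # placeholder pid of the qemu-system-x86_64 process
    ["EXE.AMX_BUSY", "FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE"],
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run() on a host where pcm-raw is installed and the caller has the required privileges.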

Yes.

SSE/AVX: AFAIK, and as you have likely already found out, it is only possible to distinguish between the different widths of operations, using the 128B and 256B events.
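
To make the "by width only" point concrete, here is a small sketch that sums per-event counts by the width token in the event name. The counts are made-up placeholders, not measurements, and totals_by_width is a hypothetical helper; the grouping tells you the operand width, not whether the instructions were SSE or AVX encoded.

```python
import re
from collections import defaultdict


def totals_by_width(counts):
    """Sum event counts per width token (e.g. '128B') found in the event name."""
    totals = defaultdict(int)
    for name, value in counts.items():
        match = re.search(r"\.(\d+B)_PACKED_", name)
        key = match.group(1) if match else "other"
        totals[key] += value
    return dict(totals)


# Placeholder numbers for illustration only.
sample_counts = {
    "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE": 10,
    "FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE": 5,
    "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE": 20,
    "FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE": 40,
    "EXE.AMX_BUSY": 7,
}

totals = totals_by_width(sample_counts)
print(totals)
```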

A different, sampling-based tool (https://github.com/aayasin/perf-tools) can report the instruction mix: "./do.py profile -pm 100"
