quepas/mPAPI

Deprecated: Please use multi-language pep-talk

mPAPI

Simple MATLAB/Octave API for PAPI (Performance Application Programming Interface).

Properties

  • Hardware counters are measured for the parent and child threads (e.g. when using parallelized functions like sum). Unfortunately, there is no way to tell which thread contributed which counts.
  • Each function in the MEX-file is locked (once loaded, it cannot be unloaded with the clear function in the MATLAB/Octave environment).

Installation

  1. Install PAPI >=5.5.1
  2. Build the mPAPI functions (mPAPI_register, mPAPI_tic, mPAPI_toc, mPAPI_groupEvents, mPAPI_enumNativeEvents, mPAPI_enumPresetEvents) with a MEX-compatible compiler. The repository contains two bash scripts for building, build.sh and build_all.sh:
mex -I/usr/local/include mPAPI_register.c -L/usr/local/lib/ -lpapi -output mPAPI_register
mex -I/usr/local/include mPAPI_tic.c -L/usr/local/lib/ -lpapi -output mPAPI_tic
mex -I/usr/local/include mPAPI_toc.c -L/usr/local/lib/ -lpapi -output mPAPI_toc
mex -I/usr/local/include mPAPI_groupEvents.c -L/usr/local/lib/ -lpapi -output mPAPI_groupEvents
mex -I/usr/local/include mPAPI_enumNativeEvents.c -L/usr/local/lib/ -lpapi -output mPAPI_enumNativeEvents
mex -I/usr/local/include mPAPI_enumPresetEvents.c -L/usr/local/lib/ -lpapi -output mPAPI_enumPresetEvents

Here, the directory /usr/local/include contains the papi.h header and the directory /usr/local/lib/ contains the libpapi.so shared library.
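The six mex calls above differ only in the function name, so they can also be generated in a loop from within MATLAB/Octave. The following is a minimal sketch, not the repository's build script: it assumes the include and library paths shown above and the functional form of mex, the helper mex_args is hypothetical, and any source file not found in the current directory is skipped.

```matlab
% Paths copied from the commands above; adjust to your PAPI installation.
inc = '/usr/local/include';
lib = '/usr/local/lib/';

% Hypothetical helper: argument list for building one mPAPI function.
mex_args = @(name) {['-I' inc], [name '.c'], ['-L' lib], '-lpapi', '-output', name};

names = {'mPAPI_register', 'mPAPI_tic', 'mPAPI_toc', ...
         'mPAPI_groupEvents', 'mPAPI_enumNativeEvents', 'mPAPI_enumPresetEvents'};

for k = 1:numel(names)
    src = [names{k} '.c'];
    if exist(src, 'file') == 2          % 2 means the file exists
        args = mex_args(names{k});
        mex(args{:});                   % same flags as the commands above
    else
        fprintf('Skipping %s (source not found)\n', src);
    end
end
```

Run it from the directory that contains the mPAPI_*.c sources.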

Usage

Counting

  1. Register hardware performance monitoring counters (PMC) using preset or native events:
  • For the current thread/process:
    ev = mPAPI_register({'FP_ARITH:SCALAR_SINGLE', 'L1D:REPLACEMENT', 'PAPI_L2_ICA'})
  • In multiplex mode for the current thread:
    ev = mPAPI_register({'FP_ARITH:SCALAR_SINGLE', 'L1D:REPLACEMENT', 'PAPI_L2_ICA'}, true)
  • For a specific thread/process by PID:
    ev = mPAPI_register({'PAPI_TOT_INS'}, 1234)
  2. Start counters for the specific event-set(s):
mPAPI_tic(ev)
  3. Read counter measurements:
  • For the specific event-set:
    >> mPAPI_toc(ev)
    ans = [0, 1559, 4032]
  • For many event-sets:
    >> mPAPI_toc([ev1, ev2])
    ans = [0, 1559, 4032;
     0, 1450, 3999]
  4. Enumerate all available native or preset PAPI events:
>> mPAPI_enumNativeEvents()
ans = {'ix86arch::UNHALTED_CORE_CYCLES', 'ix86arch::INSTRUCTION_RETIRED', ...}
>> mPAPI_enumPresetEvents()
ans = {'PAPI_L1_DCM', 'PAPI_L1_ICM', ...}
  5. Divide events into compatible groups (that can be measured simultaneously):
>> mPAPI_groupEvents({'PAPI_L1_DCM', 'PAPI_L1_ICM', ...})
ans = {{'PAPI_L1_DCM', 'PAPI_L1_ICM', ...},
 ...
 }
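The register, tic and toc steps above can be combined into a single measurement. The following is a minimal sketch, assuming the mPAPI MEX-files are built and on the path, and assuming mPAPI_toc returns one value per registered event in registration order (as the example output above suggests). The helper format_counts and the summed workload are hypothetical; the mPAPI calls are skipped when the MEX-file is not available.

```matlab
% Hypothetical helper: pair each event name with its measured value.
format_counts = @(names, counts) ...
    strjoin(arrayfun(@(k) sprintf('%s=%d', names{k}, counts(k)), ...
                     1:numel(names), 'UniformOutput', false), ', ');

% Preset events; whether they are available depends on the CPU.
events = {'PAPI_TOT_INS', 'PAPI_L1_DCM'};

if exist('mPAPI_register') == 3          % 3 means a MEX-file is on the path
    ev = mPAPI_register(events);         % register the event-set
    mPAPI_tic(ev);                       % start counting
    s = sum(rand(1, 1e6));               % workload under measurement (illustrative)
    counts = mPAPI_toc(ev);              % read the counters
    disp(format_counts(events, counts));
end
```

Because counts from child threads are merged into the result, a parallelized call such as sum reports the combined total.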

Performance traces

  1. Register the sampling event and the sampling interval (set through the overflow threshold):
ev = mPAPI_trace_register('PAPI_TOT_INS', 1000000, {'PAPI_BR_INS', 'PAPI_L1_DCM'}, 'kernel.trace')

The first argument is the performance event used as the time base: the program's performance is sampled each time this event has occurred the number of times given by the second argument, which is the sampling interval. In this example a sample is taken every 1,000,000 instructions, since PAPI_TOT_INS counts completed instructions. The third argument is a cell array of performance events to measure. The last argument is the location of the trace output file.

  2. Start the sub-trace, i.e. a performance trace for a given test:
mPAPI_trace_tic(ev, 'R2015b:1:1:sdaxpy:loop:341:1')

The second argument is a header. To convert the trace to CSV with the trace2csv script, the header must follow the convention env:threads:process:benchmark:version:N:in_process. The fields are:

  • env: execution environment, e.g. R2015b
  • threads: number of threads
  • process: number of the test execution across different environment instances
  • benchmark and version: the kernel and the version used (marked in the code with the %! pragma version)
  • N: input data size
  • in_process: number of the test repetition within the same instance of the execution environment

  3. Perform the test.
  4. Finish the sub-trace:
mPAPI_trace_toc(ev)
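The trace steps above, including the header convention, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper make_header and the axpy-style stand-in kernel are hypothetical, the field values mirror the example header used above, and the mPAPI calls are skipped when the MEX-file is not on the path.

```matlab
% Hypothetical helper: build a header in the trace2csv convention
% env:threads:process:benchmark:version:N:in_process
make_header = @(env, threads, process, benchmark, version, N, in_process) ...
    sprintf('%s:%d:%d:%s:%s:%d:%d', env, threads, process, ...
            benchmark, version, N, in_process);

N = 341;                                         % input data size
header = make_header('R2015b', 1, 1, 'sdaxpy', 'loop', N, 1);

if exist('mPAPI_trace_register') == 3            % 3 means a MEX-file is on the path
    ev = mPAPI_trace_register('PAPI_TOT_INS', 1000000, ...
                              {'PAPI_BR_INS', 'PAPI_L1_DCM'}, 'kernel.trace');
    x = rand(N, 1); y = rand(N, 1); a = 2.5;     % illustrative input data
    mPAPI_trace_tic(ev, header);                 % start the sub-trace
    y = a * x + y;                               % stand-in for the benchmarked kernel
    mPAPI_trace_toc(ev);                         % finish the sub-trace
end
```

Building the header from its fields keeps the colon-separated layout consistent across tests, which matters when the trace is later parsed field by field.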

Problems

To use an older version of GCC (newer versions might not be supported by MATLAB's MEX compiler), run mex as follows:

mex GXX='/usr/bin/gcc-X.X' ... % R2013a/R2015b/R2018b

Comments

  • The number of hardware counters available on the system defines the upper limit of counters you can register using mPAPI_register function.
  • Not all hardware counters can be mixed and used simultaneously (except when in multiplex mode).
