Guys,

I've released 3.11.22. Other than bugfixes, its main extension is that the max complex blocking factor is now tuned independently of the real. The performance difference on most machines is minuscule, but it could be key on machines that lack an L3.

Cheers,
Clint

ATLAS 3.11.22 released 11/26/13, highlights of changes from 3.11.21:
   * Changed it so complex block-major gemm installed for non-default installs
   * Changed it so ARM block-major gemm kernels default to HARDFP ABI
   * Added NB tuning for complex access-major gemm
   * Uglied up atlas_install to avoid gcc's unalterable BS warnings
   * Updated archdefs for Corei364AVXMAC
   * Plugged several one-time mem leaks in lanbsrch
   * Added basic config support for cross-compilation
   * Updated complex cmat2blk to correct prototype & type def for complex
   * Rakib wrote cmat2blk complex
   * Changed emit_uamm to handle multiple installs
   * Boatload of TI_C99_BM accelerator patches from Tony Castaldo

Guys,

I have released 3.11.21, which fixes some widespread K-cleanup bugs. I have also gotten the new access-major gemm working with archdefs, though I have added only archdefs for one machine. Right now, even using archdefs does a bunch of unnecessary timings; some of these can be eliminated later once I have finalized the ammm tuning process.

Cheers,
Clint

ATLAS 3.11.21 released 11/16/13, highlights of changes from 3.11.20:
   * Made it so AMMM result files are included in archdefs
   * Added ammm archdefs for Corei264AVX
   * Fixed error in ammm (all precisions) for K-cleanup

--
**********************************************************************
** R. Clint Whaley, PhD * Assoc Prof, LSU * www.csc.lsu.edu/~whaley **
**********************************************************************

OK, here are the numbers for 3.5 GHz Haswell. This machine is just ridiculous. A kernel I wrote for the AMD gets > 90% peak, and the peak is eye-poppingly high. The block major does so poorly because I don't have an fma3 kernel for it.

Cheers,
Clint

ARCH = Corei364AVXMAC
ARCHDEFS = -DATL_OS_Linux -DATL_ARCH_Corei3 -DATL_CPUMHZ=3500 -DATL_AVXMAC -DATL_AVX -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_USE64BITS -DATL_GAS_x8664

drteeth>./xdmmtst_amm2 -N 2000 8000 2000 ; ./xzmmtst_amm2 -N 2000 8000 2000 ; ./xsmmtst_amm2 -N 2000 8000 2000 ; ./xcmmtst_amm2 -N 2000 8000 2000

TEST  TA  TB     M     N     K  alpha   beta    Time     Mflop  SpUp  PASS
====  ==  ==   ===   ===   ===  =====  =====  ======     =====  ====  ====
   1   N   N  2000  2000  2000    1.0    1.0    0.63   25378.8  1.00  ---
   1   N   N  2000  2000  2000    1.0    1.0    0.33   48748.4  1.92  YES
   2   N   N  4000  4000  4000    1.0    1.0    4.96   25813.8  1.00  ---
   2   N   N  4000  4000  4000    1.0    1.0    2.56   50022.4  1.94  YES
   3   N   N  6000  6000  6000    1.0    1.0   16.66   25924.6  1.00  ---
   3   N   N  6000  6000  6000    1.0    1.0    8.68   49771.5  1.92  YES
   4   N   N  8000  8000  8000    1.0    1.0   39.44   25964.5  1.00  ---
   4   N   N  8000  8000  8000    1.0    1.0   20.26   50533.5  1.95  YES

NTEST=4, NUMBER PASSED=4, NUMBER FAILURES=0
99.096u 0.612s 1:39.90 99.7% 0+0k 0+0io 0pf+0w

TEST  TA  TB     M     N     K  ralph  ialph  rbeta  ibeta    Time     Mflop  SpUp  PASS
====  ==  ==   ===   ===   ===  =====  =====  =====  =====  ======     =====  ====  ====
   1   N   N  2000  2000  2000    1.0    0.0    1.0    0.0    2.56   25012.4  1.00  ---
   1   N   N  2000  2000  2000    1.0    0.0    1.0    0.0    1.28   50087.6  2.00  YES
   2   N   N  4000  4000  4000    1.0    0.0    1.0    0.0   23.01   22255.7  1.00  ---
   2   N   N  4000  4000  4000    1.0    0.0    1.0    0.0   10.22   50086.2  2.25  YES
   3   N   N  6000  6000  6000    1.0    0.0    1.0    0.0   79.11   21842.5  1.00  ---
   3   N   N  6000  6000  6000    1.0    0.0    1.0    0.0   34.33   50332.4  2.30  YES
   4   N   N  8000  8000  8000    1.0    0.0    1.0    0.0  189.00   21671.9  1.00  ---
   4   N   N  8000  8000  8000    1.0    0.0    1.0    0.0   81.78   50088.6  2.31  YES

NTEST=4, NUMBER PASSED=4, NUMBER FAILURES=0
429.152u 2.068s 7:12.07 99.8% 0+0k 0+0io 0pf+0w

TEST  TA  TB     M     N     K  alpha   beta    Time     Mflop  SpUp  PASS
====  ==  ==   ===   ===   ===  =====  =====  ======     =====  ====  ====
   1   N   N  2000  2000  2000    1.0    1.0    0.33   48341.8  1.00  ---
   1   N   N  2000  2000  2000    1.0    1.0    0.16   97504.0  2.02  YES
   2   N   N  4000  4000  4000    1.0    1.0    2.59   49328.7  1.00  ---
   2   N   N  4000  4000  4000    1.0    1.0    1.26  101739.0  2.06  YES
   3   N   N  6000  6000  6000    1.0    1.0    8.75   49384.9  1.00  ---
   3   N   N  6000  6000  6000    1.0    1.0    4.21  102530.8  2.08  YES
   4   N   N  8000  8000  8000    1.0    1.0   20.65   49579.9  1.00  ---
   4   N   N  8000  8000  8000    1.0    1.0    9.90  103476.6  2.09  YES

NTEST=4, NUMBER PASSED=4, NUMBER FAILURES=0

TEST  TA  TB     M     N     K  ralph  ialph  rbeta  ibeta    Time     Mflop  SpUp  PASS
====  ==  ==   ===   ===   ===  =====  =====  =====  =====  ======     =====  ====  ====
   1   N   N  2000  2000  2000    1.0    0.0    1.0    0.0    1.33   48084.7  1.00  ---
   1   N   N  2000  2000  2000    1.0    0.0    1.0    0.0    0.64  100767.1  2.10  YES
   2   N   N  4000  4000  4000    1.0    0.0    1.0    0.0   10.74   47658.0  1.00  ---
   2   N   N  4000  4000  4000    1.0    0.0    1.0    0.0    4.97  103009.8  2.16  YES
   3   N   N  6000  6000  6000    1.0    0.0    1.0    0.0   40.99   42160.7  1.00  ---
   3   N   N  6000  6000  6000    1.0    0.0    1.0    0.0   16.86  102480.4  2.43  YES
   4   N   N  8000  8000  8000    1.0    0.0    1.0    0.0   97.79   41887.5  1.00  ---
   4   N   N  8000  8000  8000    1.0    0.0    1.0    0.0   39.11  104731.5  2.50  YES

NTEST=4, NUMBER PASSED=4, NUMBER FAILURES=0

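For readers who want to sanity-check the Mflop column and the "> 90% peak" remark, here is a minimal sketch. It assumes the conventional flop counts of 2*M*N*K for real GEMM and 8*M*N*K for complex GEMM, and a peak of 16 double-precision flops per cycle per core on Haswell; neither figure is stated in the message, and the function name is made up for illustration. Because the printed times are rounded to two decimals, the results agree with the table only approximately.

```python
def gemm_mflops(m, n, k, seconds, complex_arith=False):
    """Return the MFLOP rate of one GEMM call.

    Assumes 2*M*N*K flops for real and 8*M*N*K for complex arithmetic
    (the usual convention; an assumption, not taken from the tester source).
    """
    flops = (8 if complex_arith else 2) * m * n * k
    return flops / seconds / 1e6


if __name__ == "__main__":
    # Double-precision access-major run, test 4 from the table above.
    rate = gemm_mflops(8000, 8000, 8000, 20.26)
    # Assumed peak: 3500 MHz * 16 DP flops/cycle (hypothetical Haswell figure).
    assumed_peak = 3500 * 16
    print(f"{rate:.1f} Mflop, {100 * rate / assumed_peak:.1f}% of assumed peak")
```

The SpUp column is simply the ratio of the two Mflop values in each pair of rows, with the first row of each pair as the baseline.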
Guys,

I have released 3.11.20. I plugged a memory leak in threaded QR, and modernized lanbsrch to work with the new framework. After this, I could rerun the archdefs for the lapack header files, and all this together seemed to fix the threaded QR performance problems I was seeing.

Unfortunately, to get the improved performance you need to redo the archdefs for lapack if you aren't on a Corei264AVX (my desktop arch), which is a bit of a pain. The quickest approach is to untar your archdef tarfile in ATLAS/CONFIG/ARCHS, delete the files in the lapack/gcc directory, and then remake the tar. It's probably not worth messing with for most folks, but if QR is very important it might help.

Cheers,
Clint

ATLAS 3.11.20 released 11/02/13, highlights of changes from 3.11.19:
   * Fixed possible memory leak in threaded QR
   * Updated lanbsrch to work with ammm
   * Updated Corei264AVX lapack header archdefs to work with ammm

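The untar, delete, re-tar procedure described above can be sketched roughly as follows. This is an illustration only: instead of literally unpacking to disk, it copies the archive member by member and skips regular files whose parent directory is lapack/gcc. The function name, the output compression (bzip2), and the assumption that the directory appears in the archive as `.../lapack/gcc/` are all hypothetical; check your own tarfile's layout first.

```python
import posixpath
import tarfile


def strip_lapack_archdefs(src_tar, dst_tar, subdir="lapack/gcc"):
    """Copy src_tar to dst_tar, dropping regular files directly under subdir.

    Returns the list of member names that were left out. The directory
    entries themselves are kept so the layout of the archive is unchanged.
    """
    removed = []
    with tarfile.open(src_tar, "r:*") as tin, tarfile.open(dst_tar, "w:bz2") as tout:
        for member in tin.getmembers():
            parent = posixpath.dirname(member.name)
            in_subdir = parent == subdir or parent.endswith("/" + subdir)
            if member.isfile() and in_subdir:
                removed.append(member.name)  # skip stale lapack header archdefs
                continue
            # Regular files need their data copied; other entries do not.
            fileobj = tin.extractfile(member) if member.isfile() else None
            tout.addfile(member, fileobj)
    return removed
```

Writing to a separate output file keeps the original tarfile intact until you have checked the result; renaming the new file over the old one is left to the reader.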
Guys,

I've released 3.11.19. The main work is in reducing the amount of workspace the new framework allocates. I had first removed the dependencies on the block-major stuff in the parallel BLAS, which left them working in a simplified way. Then, I noticed that I had a parallel performance regression in QR, which is probably related to not re-tuning NB for the new framework. I started a big tuning job, and had to hard reset the machine due to swapping making it impossible to type. This was my big clue I needed to reduce workspace being used in the new GEMM.

I have still not ensured that the parallel stuff is as fast as it should be, will return to that later.

Cheers,
Clint

ATLAS 3.11.19 released 11/01/13, highlights of changes from 3.11.18:
   * Removed block-major GEMM dep from all threading code
   * Performed recursion for K > 3000 in order to put a ceiling on workspace in ammm
   * Added ammm MNK loop order to save workspace for non-square GEMM

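To illustrate the idea behind recursing on K to cap workspace, here is a toy sketch. It is not ATLAS code: the function names, the halving strategy, and the naive inner kernel are assumptions made for illustration. The point is only that when K exceeds a threshold, the product is split into partial products over shorter K ranges that accumulate into the same C, so any copied panels of A and B never span more than `max_k` of the K dimension at once.

```python
def gemm_acc(A, B, C, k0, k1):
    """C += A[:, k0:k1] * B[k0:k1, :]; naive stand-in for a tuned kernel."""
    for i in range(len(C)):
        for j in range(len(C[0])):
            C[i][j] += sum(A[i][k] * B[k][j] for k in range(k0, k1))


def gemm_k_recursive(A, B, C, k0, k1, max_k=3000, chunks=None):
    """Accumulate A*B into C, halving the K range until it is <= max_k.

    chunks, if given, records the (k0, k1) ranges actually multiplied,
    which makes the splitting visible for small examples.
    """
    if k1 - k0 > max_k:
        mid = k0 + (k1 - k0) // 2
        gemm_k_recursive(A, B, C, k0, mid, max_k, chunks)
        gemm_k_recursive(A, B, C, mid, k1, max_k, chunks)
    else:
        if chunks is not None:
            chunks.append((k0, k1))
        gemm_acc(A, B, C, k0, k1)


if __name__ == "__main__":
    A = [[1, 2, 3, 4, 5]]                 # 1 x 5
    B = [[1], [1], [1], [1], [1]]         # 5 x 1
    C = [[0]]
    seen = []
    gemm_k_recursive(A, B, C, 0, 5, max_k=2, chunks=seen)
    print(C, seen)
```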
On Fri, Nov 1, 2013 at 8:00 PM, <mic...@th...> wrote:

> Here is my cpuinfo:
>
> cat /proc/cpuinfo
>
> processor : 0
> vendor_id : AuthenticAMD
> cpu family : 15
> model : 65
> model name : Dual-Core AMD Opteron(tm) Processor 8220
> stepping : 3
> cpu MHz : 1000.000

So, here is the problem: the 8220 Opteron should run at 2.8 GHz, not at 1 GHz. That suggests that throttling is in effect.

Dmitri.
--

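The check Dmitri does by eye can be sketched as a small script. This is not how ATLAS's configure detects throttling; it only compares the "cpu MHz" fields in /proc/cpuinfo text against a nominal clock that you supply yourself (2800 MHz for the 8220, per the message above). The function names and the 5% tolerance are assumptions.

```python
import re


def cpu_mhz_values(cpuinfo_text):
    """Return every 'cpu MHz' value found in /proc/cpuinfo-style text."""
    pattern = r"^cpu MHz\s*:\s*([0-9.]+)"
    return [float(v) for v in re.findall(pattern, cpuinfo_text, flags=re.M)]


def looks_throttled(cpuinfo_text, nominal_mhz, tolerance=0.95):
    """True if any core reports a clock below tolerance * nominal_mhz."""
    return any(mhz < tolerance * nominal_mhz
               for mhz in cpu_mhz_values(cpuinfo_text))


if __name__ == "__main__":
    sample = (
        "model name : Dual-Core AMD Opteron(tm) Processor 8220\n"
        "cpu MHz : 1000.000\n"
    )
    print(looks_throttled(sample, nominal_mhz=2800))
```

On a live machine the text would come from reading /proc/cpuinfo; note that the reported value is a snapshot, so an idle core under a frequency governor can read low even when it would clock up under load.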
Here is my cpuinfo:
cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8220
stepping : 3
cpu MHz : 1000.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 2009.31
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8220
stepping : 3
cpu MHz : 1000.000
cache size : 1024 KB
physical id : 1
siblings : 2
core id : 0
cpu cores : 2
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 2009.31
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
processor : 2
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8220
stepping : 3
cpu MHz : 1000.000
cache size : 1024 KB
physical id : 2
siblings : 2
core id : 0
cpu cores : 2
apicid : 4
initial apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 2009.31
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
processor : 3
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8220
stepping : 3
cpu MHz : 1000.000
cache size : 1024 KB
physical id : 3
siblings : 2
core id : 0
cpu cores : 2
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 2009.31
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
processor : 4
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8220
stepping : 3
cpu MHz : 1000.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 2009.31
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
processor : 5
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8220
stepping : 3
cpu MHz : 1000.000
cache size : 1024 KB
physical id : 1
siblings : 2
core id : 1
cpu cores : 2
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 2009.31
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
processor : 6
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8220
stepping : 3
cpu MHz : 1000.000
cache size : 1024 KB
physical id : 2
siblings : 2
core id : 1
cpu cores : 2
apicid : 5
initial apicid : 5
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 2009.31
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
processor : 7
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8220
stepping : 3
cpu MHz : 1000.000
cache size : 1024 KB
physical id : 3
siblings : 2
core id : 1
cpu cores : 2
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 2009.31
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
Thanks,
Mike
From: "Dmitri A. Sergatskov" <das...@gm...>
Reply-To: "List for developer discussion, NOT SUPPORT." <mat...@li...>
Date: Wednesday, October 30, 2013 4:47 PM
To: "List for developer discussion, NOT SUPPORT." <mat...@li...>
Subject: Re: [atlas-devel] Atlas 3.1.10 not even trying to build on Suse
Please post some hardware info...
At least results of "cat /proc/cpuinfo"
("cpupower frequency-info" would be nice too)
Most likely you have a M/B that enables throttling no matter what.
(I have one of those.)
Dmitri.
--
On Wed, Oct 30, 2013 at 3:15 PM, <mic...@th...> wrote:
CPU throttling on SUSE Linux Enterprise Server 11
We have this on the server:
cat /proc/acpi/processor/*/info | grep throttling
throttling control: no
throttling control: no
throttling control: no
throttling control: no
throttling control: no
throttling control: no
throttling control: no
throttling control: no
But I get from Configure on Atlas 3.10.0:
CPU Throttling apparently enabled!
Aborting...
Not sure how to resolve. Any ideas appreciated.
Thanks,
Mike
_______________________________________________
Math-atlas-devel mailing list
Mat...@li...
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel