Bench++

  a000010.h    class duration
  a000012.cpp  implementation of class duration for UNIX systems
               using times(3), Windows NT using GetProcessTimes(),
               Windows or Windows/95 using clock(), and Macintosh
               using Microseconds().

               Note: Please send in other implementations

  a000021.h    class break_optimization :  optimization control
  a000022.cpp  implementation of class break_optimization

  a000031.h    class iteration :           Iteration control
  a000032.cpp  implementation of class iteration

  a000041.h    procedure piwg_io declaration : output

  a000042.cpp  implementation of procedure piwg_io

  a000047.h    class piwg_timer_generic
  a000048.cpp  implementation of class piwg_timer_generic

  a000095.cpp A program to read a log file from a run and
              produce a compact listing in "bench.tbl".

  a000096.cpp A program to read multiple "bench.tbl" files and
              produce a performance comparison summary.

  a000097.cpp A program to read a "bench.tbl" file and produce
              various interesting performance ratios, based on
              the rules in the "RATIOS" file.

  a000099.cpp  This is a skeleton procedure that can be copied and edited
                 to construct more tests. DO NOT COMPILE THIS. It is for 
                 editing to make more tests.

  a000100.cpp  This is a top level procedure that calls all the other
                 executable timing tests. It is useful if there is no "make"
                 facility available.

  p000000.h    Header file for the "p" tests

  p000000.cpp  Separate compilation of the procedures for the "p" tests
               to prevent inlining.

  s000005.h    Header file for the Stepanov "s" tests.

  FIRST OF EXECUTION TESTS.

  a000090.cpp  Measure clock resolution by second differences
  a000091.cpp  Dhrystone
  a000092.cpp  Whetstone
  a000094a..k  Hennessy benchmarks

                 This group of tests measures the performance of some
                 real (useful) C++ code, including a tracker algorithm,
                 an Orbit calculation, a Kalman filter, and a Centroid
                 algorithm.  Here is where other small useful benchmarks
                 should be added.  Please send ideas to 
                 "joseph.orost@att.com".
  b000002b.cpp Tracker: float
  b000003b.cpp Tracker: double
  b000004b.cpp Tracker: float & int
  b000010.cpp  Orbit
  b000011.cpp  Kalman
  b000013.cpp  Centroid

                 This group of tests measures dynamic allocation related
                 timing.
  d000001.cpp  malloc & free: 1000 ints
  d000002.cpp  malloc & init & free: 1000 ints
  d000003.cpp  new & delete: 1000 ints
  d000004.cpp  new & init & delete: 1000 ints
  d000005.cpp  alloca: 1000 ints        (optional test)
  d000006.cpp  alloca & init: 1000 ints (optional test)

                 This group of tests measures exception related timing.
                 For historical reasons, e000005 is missing.
  e000001.cpp  Local exception caught
  e000002.cpp  Class method exception caught
  e000003.cpp  Procedure exception caught: 3-deep
  e000004.cpp  Procedure exception caught: 4-deep
  e000006.cpp  Declared Procedure exception caught: 4-deep
  e000007.cpp  Procedure exception caught: 4-deep re-thrown at each level
  e000008.cpp  Procedure exception 4-deep: Implemented using setjmp/longjmp

                 This group of tests measures coding style related timing.
  f000001.cpp  Boolean assignment
  f000002.cpp  Boolean if
  f000003.cpp  2-way if/else
  f000004.cpp  2-way switch
  f000005.cpp  10-way if/else
  f000006.cpp  10-way switch
  f000007.cpp  10-way sparse switch
  f000008.cpp  10-way virtual function call

                 This group of tests measures I/O related timing.
  g000001.cpp  iostream.getline: 20 char buffer
  g000002.cpp  iostream.>> : 20 chars in loop
  g000003.cpp  iostream.<< : 20 char buffer
  g000004.cpp  iostream.<< : 20 chars in loop
  g000005.cpp  istrstream.>> : int
  g000006.cpp  istrstream.>> : float
  g000007.cpp  fstream.open/fstream.close

                 This group of tests measures machine level features.
  h000001.cpp  packed bit arrays
  h000002.cpp  unpacked bit arrays
  h000003.cpp  packed bit ops in loop
  h000004.cpp  unpacked bit ops in loop
  h000005.cpp  int conversion
  h000006.cpp  10-float conversion
  h000007.cpp  bit-fields
  h000008.cpp  bit-fields and packed bit array
  h000009.cpp  pack and unpack class objects

                 This group of tests measures loop overhead related timing.
  l000001.cpp  "for" loop
  l000002.cpp  "while" loop
  l000003.cpp  inf. loop w/break
  l000004.cpp  5-iteration loop

                 This group of tests measures optimizer performance.
  o000001a.cpp Constant Propagation (including math functions)
  o000001b.cpp " Hand Optimized
  o000002a.cpp Local Common Sub-expression (including math functions)
  o000002b.cpp " Hand Optimized
  o000003a.cpp Global Common Sub-expression
  o000003b.cpp " Hand Optimized
  o000004a.cpp Unnecessary Copy
  o000004b.cpp " Hand Optimized
  o000005a.cpp Code Motion (including math functions)
  o000005b.cpp " Hand Optimized
  o000006a.cpp Induction Variable
  o000006b.cpp " Hand Optimized
  o000007a.cpp Reduction in Strength (including math functions)
  o000007b.cpp " Hand Optimized
  o000008a.cpp Dead Code
  o000008b.cpp " Hand Optimized
  o000009a.cpp Loop Jamming
  o000009b.cpp " Hand Optimized
  o000010a.cpp Redundant Code
  o000010b.cpp " Hand Optimized
  o000011a.cpp Unreachable Code
  o000011b.cpp " Hand Optimized
  o000012a.cpp String Ops
  o000012b.cpp " Hand Optimized

                 This group of tests measures procedure call related timing.
                 There is no test p000009, nor 14 thru 19.
  p000001.cpp  Procedure Call: No Args
  p000002.cpp  Procedure Call: No Args: Catches Exceptions
  p000003.cpp  Static Class Method Call: No Args: Catches Exceptions
  p000004.cpp  Inline Procedure Call: No Args
  p000005.cpp  Static Class Method Call: 1-int Arg: Catches Exceptions
  p000006.cpp  Static Class Method Call: 1-int *Arg: Catches Exceptions
  p000007.cpp  Static Class Method Call: 1-int &Arg: Catches Exceptions
  p000008.cpp  Procedure Call: No Parameters: Called thru pointer, Catches Exceptions
  p000010.cpp  Procedure Call: 10-int Args: Catches Exceptions
  p000011.cpp  Procedure Call: 20-int Args: Catches Exceptions
  p000012.cpp  Procedure Call: 10-(3-int) Args: Catches Exceptions
  p000013.cpp  Procedure Call: 20-(3-int) Args: Catches Exceptions
  p000020.cpp  Class Method Call: 1-"this" Arg: Catches Exceptions
  p000021.cpp  Virtual Class Method Call: 1-"this" Arg: Catches Exceptions
  p000022.cpp  Virtual Const Class Method Call: 1-"this" Arg: Catches Exceptions
  p000023.cpp  Same as p000022: called in loop to see if lookup is optimized

                 This group of tests measures object oriented style vs.
                 C style.
  s000001a.cpp Max: C++ Style
  s000001b.cpp Max: C Style
  s000002a.cpp Matrix: C++ Style
  s000002b.cpp Matrix: C Style
  s000003a.cpp Iterator: C++ Style
  s000003b.cpp Iterator: C Style
  s000004a.cpp Complex: C++ Style
  s000004b.cpp Complex: C Style
  s000005a.cpp Stepanov: C++ Style Abstraction Level 12
  s000005b.cpp Stepanov: C++ Style Abstraction Level 11
  s000005c.cpp Stepanov: C++ Style Abstraction Level 10
  s000005d.cpp Stepanov: C++ Style Abstraction Level 9
  s000005e.cpp Stepanov: C++ Style Abstraction Level 8
  s000005f.cpp Stepanov: C++ Style Abstraction Level 7
  s000005g.cpp Stepanov: C++ Style Abstraction Level 6
  s000005h.cpp Stepanov: C++ Style Abstraction Level 5
  s000005i.cpp Stepanov: C++ Style Abstraction Level 4
  s000005j.cpp Stepanov: C++ Style Abstraction Level 3
  s000005k.cpp Stepanov: C++ Style Abstraction Level 2
  s000005l.cpp Stepanov: C++ Style Abstraction Level 1
  s000005m.cpp Stepanov: C++ Style Abstraction Level 0
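
The timing support listed above (class duration, and a000090.cpp for clock resolution) depends on how fine-grained the underlying clock is. The source of a000090.cpp is not shown here, so the following is only a rough sketch of a simpler approach, not the Bench++ method: it polls std::chrono::steady_clock (an assumption; the original uses times(3), GetProcessTimes(), clock() or Microseconds()) and reports the smallest nonzero step seen between consecutive readings. The function name smallest_tick_ns is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Smallest nonzero step observed between consecutive clock readings,
// in nanoseconds, taken over `samples` observed steps.
// The result includes the cost of calling now(), so it is an upper
// bound on the true tick size rather than an exact figure.
std::int64_t smallest_tick_ns(int samples) {
    assert(samples > 0);
    using clock = std::chrono::steady_clock;
    std::int64_t smallest = 0;
    clock::time_point previous = clock::now();
    for (int i = 0; i < samples; ++i) {
        clock::time_point current = clock::now();
        while (current == previous) {  // spin until the clock advances
            current = clock::now();
        }
        const std::int64_t step =
            std::chrono::duration_cast<std::chrono::nanoseconds>(
                current - previous).count();
        if (smallest == 0 || step < smallest) {
            smallest = step;
        }
        previous = current;
    }
    return smallest;
}
```

A test whose total run time is not many multiples of this value cannot be timed reliably, which is presumably why the suite has a separate iteration-control class.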

P4 Xeon 1.7GHz vs UP1100 21264 700MHz (RedHat's gcc-2.96)

          RELATIVE TIMES ..........
TEST NAME         
                  
          -O2 -O2 
--------- --- --- 
A000091   1.00 1.47 
A000092   1.00 2.73 
A000094a  1.00 1.90 
A000094b  1.00 1.79 
A000094c  1.00 1.39 

A000094d  1.00 1.44 
A000094e  1.00 1.27 
A000094f  1.00 2.24 
A000094g  1.00 1.24 
A000094h  1.00 1.28 

A000094i  1.00 1.62 
A000094j  1.00 1.40 
A000094k  1.00 1.17 
B000002b  1.00 7.20 
B000003b  1.00 5.23 

B000004b  1.00 9.06 
B000010   1.00 1.08 
B000011   1.00 2.36 
B000013   1.00 1.60 
D000001   1.00 1.04 

D000002   1.00 2.19 
D000003   1.00 1.04 
D000004   1.00 2.16 
D000005   1.00 2.46 
D000006   1.00 2.37 

E000001   1.00 1.80 
E000002   1.00 1.63 
E000003   1.00 2.76 
E000004   1.00 3.20 
E000007   1.00 3.90 

E000008   1.00 2.86 
F000001   1.00 inf 
F000002   1.00 3.33 
F000003   1.00 --   
F000004   1.00 4.76 

F000005   1.00 2.05 
F000006   1.00 2.77 
F000007   1.00 8.29 
F000008   1.00 0.92 
G000001   1.00 1.02 

G000002   1.00 1.65 
G000003   1.00 10.26 
G000004   1.00 3.11 
G000005   1.00 1.05 
G000006   1.00 2.94 

G000007   1.00 0.94 
H000001   1.00 1.13 
H000002   1.00 1.55 
H000003   1.00 1.37 
H000004   1.00 2.88 

H000005   1.00 --   
H000006   1.00 0.49 
H000007   1.00 0.26 
H000008   1.00 8.58 
H000009   1.00 0.95 

L000001   1.00 2.10 
L000002   1.00 2.07 
L000003   1.00 1.80 
L000004   1.00 3.80 
O000001a  1.00 3.24 

O000001b  1.00 3.32 
O000002a  1.00 6.22 
O000002b  1.00 5.13 
O000003a  1.00 1.67 
O000003b  1.00 1.50 

O000004a  1.00 0.47 
O000004b  1.00 0.40 
O000005a  1.00 20.60 
O000005b  1.00 5.87 
O000006a  1.00 3.28 

O000006b  1.00 3.13 
O000007a  1.00 1.20 
O000007b  1.00 1.70 
O000008a  1.00 0.72 
O000008b  1.00 0.41 

O000009a  1.00 1.35 
O000009b  1.00 1.61 
O000010a  1.00 0.75 
O000010b  1.00 1.43 
O000011a  1.00 0.14 

O000011b  1.00 0.25 
O000012a  1.00 0.25 
O000012b  1.00 0.61 
P000001   1.00 0.64 
P000002   1.00 1.57 

P000003   1.00 1.78 
P000004   1.00 inf 
P000005   1.00 2.66 
P000006   1.00 1.99 
P000007   1.00 2.06 

P000008   1.00 2.76 
P000010   1.00 1.34 
P000011   1.00 3.23 
P000012   1.00 0.24 
P000013   1.00 0.23 

P000020   1.00 1.70 
P000021   1.00 34.29 
P000022   1.00 14.16 
P000023   1.00 1.00 
S000001a  1.00 1.32 

S000001b  1.00 4.43 
S000002a  1.00 2.65 
S000002b  1.00 2.70 
S000003a  1.00 --   
S000003b  1.00 --   

S000004a  1.00 --   
S000004b  1.00 --   
S000005a  1.00 1.22 
S000005b  1.00 1.21 
S000005c  1.00 1.71 

S000005d  1.00 1.19 
S000005e  1.00 1.49 
S000005f  1.00 1.20 
S000005g  1.00 1.49 
S000005h  1.00 1.47 

S000005i  1.00 1.54 
S000005j  1.00 1.48 
S000005k  1.00 1.82 
S000005l  1.00 1.48 
S000005m  1.00 1.91 

P4 Xeon 1.7GHz vs UP1100 21264 700MHz (RedHat's gcc-2.96 vs Compaq's cxx)

          RELATIVE TIMES ..........
TEST NAME         
                  
          -O2 -O2 
--------- --- --- 
A000091   1.00 1.32 
A000092   1.00 0.82 
A000094a  1.00 2.76 
A000094b  1.00 2.32 
A000094c  1.00 1.92 

A000094d  1.00 0.82 
A000094e  1.00 1.05 
A000094f  1.00 2.13 
A000094g  1.00 1.17 
A000094h  1.00 1.12 

A000094i  1.00 2.57 
A000094j  1.00 0.43 
A000094k  1.00 0.78 
B000002b  1.00 1.50 
B000003b  1.00 1.24 

B000004b  1.00 1.86 
B000010   1.00 0.20 
B000011   1.00 0.85 
B000013   1.00 0.81 
D000001   1.00 1.09 

D000002   1.00 1.42 
D000003   1.00 1.15 
D000004   1.00 1.43 
D000005   1.00 0.78 
D000006   1.00 1.51 

E000001   1.00 0.26 
E000002   1.00 0.15 
E000003   1.00 0.15 
E000004   1.00 0.14 
E000007   1.00 0.08 

E000008   1.00 2.04 
F000001   1.00 inf 
F000002   1.00 2.27 
F000003   1.00 --   
F000004   1.00 22.95 

F000005   1.00 2.93 
F000006   1.00 --   
F000007   1.00 2.53 
F000008   1.00 --   
G000001   1.00 0.95 

G000002   1.00 0.60 
G000003   1.00 6.88 
G000004   1.00 2.45 
G000005   1.00 1.39 
G000006   1.00 3.75 

G000007   1.00 0.80 
H000001   1.00 0.75 
H000002   1.00 0.98 
H000003   1.00 0.72 
H000004   1.00 0.85 

H000005   1.00 --   
H000006   1.00 1.93 
H000007   1.00 0.13 
H000008   1.00 2.36 
H000009   1.00 0.41 

L000001   1.00 --   
L000002   1.00 --   
L000003   1.00 --   
L000004   1.00 0.92 
O000001a  1.00 2.89 

O000001b  1.00 3.06 
O000002a  1.00 0.53 
O000002b  1.00 0.47 
O000003a  1.00 0.76 
O000003b  1.00 0.75 

O000004a  1.00 --   
O000004b  1.00 --   
O000005a  1.00 1.45 
O000005b  1.00 1.66 
O000006a  1.00 2.86 

O000006b  1.00 0.88 
O000007a  1.00 0.70 
O000007b  1.00 0.89 
O000008a  1.00 --   
O000008b  1.00 --   

O000009a  1.00 0.79 
O000009b  1.00 0.91 
O000010a  1.00 0.31 
O000010b  1.00 --   
O000011a  1.00 --   

O000011b  1.00 --   
O000012a  1.00 0.24 
O000012b  1.00 0.47 
P000001   1.00 0.05 
P000002   1.00 77.00 

P000003   1.00 77.87 
P000004   1.00 inf 
P000005   1.00 96.87 
P000006   1.00 65.01 
P000007   1.00 64.51 

P000008   1.00 47.39 
P000010   1.00 34.26 
P000011   1.00 54.41 
P000012   1.00 2.12 
P000013   1.00 1.17 

P000020   1.00 65.23 
P000021   1.00 250.48 
P000022   1.00 289.11 
P000023   1.00 54.53 
S000001a  1.00 2.72 

S000001b  1.00 3.74 
S000002a  1.00 1.53 
S000002b  1.00 1.73 
S000003a  1.00 1.74 
S000003b  1.00 1.80 

S000004a  1.00 0.07 
S000004b  1.00 1.22 
S000005a  1.00 1.22 
S000005b  1.00 1.25 
S000005c  1.00 1.66 

S000005d  1.00 0.85 
S000005e  1.00 1.49 
S000005f  1.00 1.20 
S000005g  1.00 1.49 
S000005h  1.00 1.49 

S000005i  1.00 1.55 
S000005j  1.00 1.49 
S000005k  1.00 1.82 
S000005l  1.00 1.49 
S000005m  1.00 1.91 


stepanov bench

To: ecell3-devel@bioinformatics.org, ecell-devel@eg.e-cell.org
Subject: [Ecell3-devel] C++ compiler benchmark
From: Kouichi Takahashi <shafi@e-cell.org>
Message-Id: <200102161516.AAA11071@mail.sfc.keio.ac.jp>
Date: Sat, 17 Feb 2001 00:19:47 +0900
Organization: E-CELL Project

hi there ecell developers,


I ran a simple C++ benchmark to answer the following
two questions:

- Should ecell3 continue to support and be tested on Linux/alpha?
- If so, should ecell3 support cxx, Compaq's C++ compiler?


Although alpha is known to beat IA32 outright on simple numerics
(like LAPACK in Fortran), its performance in C++ was uncertain.
Unlike Fortran, C++ is characterized by intensive pointer operations
and language-level abstractions.

I used KAI's version of the Stepanov Benchmark (Version 1.2), which is
available at ftp://ftp.kai.com/pub/benchmarks/, to measure
that kind of performance.

I ran the bench on these three machines:

- dual alpha 21264A 677MHz, 2GB RAM (baobab.e-cell.org)
  egcs-1.1.2 and cxx-6.3.9.5
- Pentium-III 850MHz, 768MB RAM    (oak.e-cell.org)
  gcc-2.96 (RH7)
- Celeron 550MHz                    (potos.e-cell.org)
  gcc-2.95.2

(baobab is a master node of E-CELL Project's primary beowulf.
 oak is our secondary computation server.)

compilers:

- egcs-1.1.2, gcc-2.95.2 and 2.96 (-O3)
- cxx 6.3.9.5 (-fast -lcpml -O3)


The benchmark consists of 13 variations of a simple for loop.
Test 0 is a plain Fortran-like loop and tests 1-12 are STL-style ones.
(test 1: light abstraction <-> test 12: heavy abstraction)
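
To make the idea of "abstraction levels" concrete, here is a small illustrative sketch. It is not taken from KAI's benchmark source. The names sum_raw, Double, accumulate_range and sum_abstract are hypothetical. It contrasts a plain loop over raw doubles with the same summation expressed through a wrapper type and a generic iterator-based algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Test-0 style: a plain indexed loop over raw doubles.
double sum_raw(const double* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    double result = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        result += data[i];
    }
    return result;
}

// A thin wrapper around double (hypothetical), adding one layer of
// abstraction that a good optimizer should be able to remove.
struct Double {
    double value;
    Double(double v = 0.0) : value(v) {}
};

inline Double operator+(const Double& a, const Double& b) {
    return Double(a.value + b.value);
}

// Generic STL-style accumulation over an iterator range.
template <class Iterator, class T>
T accumulate_range(Iterator first, Iterator last, T result) {
    while (first != last) {
        result = result + *first;
        ++first;
    }
    return result;
}

// Abstracted variant: the same arithmetic, expressed through the
// wrapper type and the generic algorithm.
double sum_abstract(const std::vector<Double>& data) {
    return accumulate_range(data.begin(), data.end(), Double()).value;
}
```

Both functions perform the same additions. A compiler with no abstraction penalty would generate equally fast code for both, which is what a ratio of 1.00 against test 0 means in the tables.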


Summary of results:


- alpha + cxx is incredibly fast.
  (3.5x faster than alpha + egcs-1.1.2)

- g++ does poorly on alpha

- gcc 2.95 is fairly good at STL 
  (abstraction penalty is 0.83, which means that STL loops run faster
   than the raw for loop.)
  
- RH7's gcc 2.96 is disappointing at STL (abstraction penalty: 1.48)
  (I suspect this is a temporary regression and will be corrected in
   gcc-3.0, since they are using the bench for code quality control.)  
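
The "abstraction penalty" figures quoted here match a geometric mean of the 13 "ratio with test0" values (test 0 included) in the per-compiler tables of this message. That is an inference from the printed numbers, not something taken from the benchmark's source. A minimal sketch under that assumption, with hypothetical function names:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Geometric mean of each test's time relative to test 0.
// times[0] is the baseline (test 0); times[i] is test i.
double abstraction_penalty(const std::vector<double>& times) {
    assert(!times.empty() && times[0] > 0.0);
    double log_sum = 0.0;
    for (std::size_t i = 0; i < times.size(); ++i) {
        log_sum += std::log(times[i] / times[0]);
    }
    return std::exp(log_sum / static_cast<double>(times.size()));
}

// Per-test times in seconds, as printed in the
// "Pentium III 850MHz + gcc-2.96(RH7)" table of this message.
std::vector<double> pentium3_gcc296_times() {
    const double t[] = {0.17, 0.20, 0.18, 0.26, 0.26, 0.26, 0.26,
                        0.26, 0.26, 0.26, 0.26, 0.35, 0.36};
    return std::vector<double>(t, t + sizeof(t) / sizeof(t[0]));
}
```

Feeding in the rounded Pentium III times gives a value close to the reported 1.48. An arithmetic mean of the same ratios would give about 1.51, which is why the geometric mean is assumed.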


Conclusions:

- platform: support alpha, definitely with cxx 

- compiler: cxx or gcc-2.95
            (wait for gcc-3.0 and run the bench again)


And don't forget, this is just a benchmark of C++ pointer abstraction
using simple for loops.  Alpha will undoubtedly beat IA32 in more
numerically intensive functions like Substance::transit() and
Reactor::react() in ecell.




alpha + cxx:
---------------------------------------------------
test      absolute   additions      ratio with
number    time       per second     test0

 0        0.09sec    556.84M         1.00
 1        0.09sec    582.15M         0.96
 2        0.09sec    582.15M         0.96
 3        0.09sec    582.15M         0.96
 4        0.09sec    582.15M         0.96
 5        0.09sec    582.15M         0.96
 6        0.09sec    582.15M         0.96
 7        0.09sec    582.15M         0.96
 8        0.09sec    582.15M         0.96
 9        0.09sec    582.15M         0.96
10        0.09sec    582.15M         0.96
11        0.09sec    582.15M         0.96
12        0.09sec    582.15M         0.96
mean:     0.09sec    580.17M         0.96

Total absolute time: 1.12 sec

Abstraction Penalty: 0.96
---------------------------------------------------



alpha + egcs-1.1.2
---------------------------------------------------
test      absolute   additions      ratio with
number    time       per second     test0

 0        0.30sec    165.79M         1.00
 1        0.30sec    166.33M         1.00
 2        0.30sec    165.79M         1.00
 3        0.30sec    166.33M         1.00
 4        0.30sec    165.79M         1.00
 5        0.30sec    166.33M         1.00
 6        0.30sec    165.79M         1.00
 7        0.30sec    166.33M         1.00
 8        0.30sec    165.79M         1.00
 9        0.30sec    166.33M         1.00
10        0.30sec    166.33M         1.00
11        0.30sec    166.33M         1.00
12        0.30sec    166.33M         1.00
mean:     0.30sec    166.12M         1.00

Total absolute time: 3.91 sec

Abstraction Penalty: 1.00
---------------------------------------------------


Pentium III 850MHz + gcc-2.96(RH7)
---------------------------------------------------
test      absolute   additions      ratio with
number    time       per second     test0

 0        0.17sec    294.12M         1.00
 1        0.20sec    250.00M         1.18
 2        0.18sec    277.78M         1.06
 3        0.26sec    192.31M         1.53
 4        0.26sec    192.31M         1.53
 5        0.26sec    192.31M         1.53
 6        0.26sec    192.31M         1.53
 7        0.26sec    192.31M         1.53
 8        0.26sec    192.31M         1.53
 9        0.26sec    192.31M         1.53
10        0.26sec    192.31M         1.53
11        0.35sec    142.86M         2.06
12        0.36sec    138.89M         2.12
mean:     0.25sec    198.81M         1.48

Total absolute time: 3.34 sec

Abstraction Penalty: 1.48
---------------------------------------------------


Celeron 550MHz + gcc-2.95.2
---------------------------------------------------
test      absolute   additions      ratio with
number    time       per second     test0

 0        0.37sec    135.14M         1.00
 1        0.31sec    161.29M         0.84
 2        0.30sec    166.67M         0.81
 3        0.30sec    166.67M         0.81
 4        0.30sec    166.67M         0.81
 5        0.30sec    166.67M         0.81
 6        0.30sec    166.67M         0.81
 7        0.30sec    166.67M         0.81
 8        0.30sec    166.67M         0.81
 9        0.30sec    166.67M         0.81
10        0.30sec    166.67M         0.81
11        0.30sec    166.67M         0.81
12        0.30sec    166.67M         0.81
mean:     0.31sec    163.59M         0.83

Total absolute time: 3.98 sec

Abstraction Penalty: 0.83
---------------------------------------------------




-----
Kouichi Takahashi                         E-CELL Project,
email: shafi@sfc.keio.ac.jp               Institute for Advanced Biosciences
       shafi@e-cell.org                   Keio Univ. SFC
