Zimmer, Michael ;
Krämer, Walter
Fast (Parallel) Dense Linear Interval Systems Solvers in CXSC Using Error Free Transformations and BLAS
Abstract
The traditional solver for linear interval systems available in CXSC [6,1]
is mathematically based on the Krawczyk[12] operator and modifications
introduced by Rump[17]. The Krawczyk operator is composed of
matrix/vector operations. These operations are realized in CXSC
with higest accuracy (only one final rounding) using a so called long
accumulator (dotprecision variable). CXSC dotprecision variables allow the
error free computation of sums of floating point numbers as well as the
error free computation of scalar products of floating point vectors. Thus,
from a mathematical point of view these operations are perfect. Because
actual hardware does not support these perfect scalar products all
operations have to be realized by software. This fact leads to a tremendous
time penalty (note: it has been shown that with modest additional hardware
costs perfect scalar products can be made as fast as simple floatingpoint
loops).
To speed up the CXSC scalar product softwareoperations we adapt the so
called DotK algorithm as published in [14]. Error free transformations[14,3,4,10]
are used as basic building blocks to develop summation and scalar product
algorithms simulating a Kfold precision. Compared to the perfect CXSC operations
these operations are fast. They are more accurate than simple floatingpoint
loops (but of course no longer perfect in the mathematical sense). The fast
operations are available in CXSC via the new data types DotK, IDotK, CDotk
and CIDotK. These new data types are composed in such a way that traditional
CXSC code using dotprecision variables can be adapted with minimal effort. It is
possible to switch (at runtime!) from perfect computations to fast operations using
Kfold precision (K equal 0 means traditional dotprecision computations) and it is
possible to hold intermediate results with corresponding error bounds for further
summations or scalar product updates. The details are described in [19].
Additionaly, based on similar algorithms used in Intlab[16], BLAS and LAPACK
libraries [2] are used in the O(nÃ‚Â³) parts of the linear system solver. For
matrixmatrix products, manipulation of the rounding mode of the processor is used
to compute enclosures of the correct result.
Comparing the traditional solver with the new version shows that the class of
problems which are solvable with the new version is smaller than the class of
problems which can be solved using the solver based on perfect operations. But it
seems that for real world problems also the new solver is appropriate. Using the
new solver based on BLAS and simulating a quadrupel precision (i.e. k==2) the
speedup comes close to 200(!). The new solver is nearly as fast as the corresponding
IntLab[16] solver verifylss. Solving a real linaer system of dimension 1000 on a
Pentium 4 with 3.2GHz takes about 2.8 seconds. In all cases tested the accuracy of
our new solver was better and in some cases significantly better than the accuracy
of the corresponding IntLab results. The new solver also allows solving larger
(dense) problems than its IntLab counterpart. We also show some examples where IntLab
falls down whereas our new solver still works.
A parallel version of this solver, based on ScaLAPACK, is also available. Unlike
the previous parallel solver in CXSC[5], this new solver does not depend on a
rootnode, which makes it possible to compute a verified solution even of very large
linear systems.
In the talk we will discuss the new data types in more detail, we will emphasize our
modifications to the DotK algorithm taken from the literature [14,15], we will show
time measurements and we will present results concerning the accuracy of the computed
enclosures. Our results will also be compared to corresponding results computed with
the IntLab package. We also will comment on hardware features and compiler options
which can/should be used to get reliable results on different platforms efficiently.
References:
[1] Downloads:
CXSC library: http://www.math.uniwuppertal.de/~xsc/xsc/cxsc.html
Solvers: http://www.math.uniwuppertal.de/~xsc/xsc/cxsc_software.html
[2] L.S. Blackford, J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux,
L. Kaufman, A. Lumsdaine, A. Petitet, R. Pozo, K. Remington, R. C. Whaley, An Updated Set
of Basic Linear Algebra Subprograms (BLAS), ACM Trans. Math. Soft., 282 (2002), pp. 135151.
[3] Bohlender, G.; Walter, W.; Kornerup, P.; Matula,
D.W.; Kornerup, P.; Matula, D.W.:
Semantics for Exact Floating Point Operations.
Proceedings, 10th IEEE Symposium on Computer Arithmetic,
2628 June 1991, IEEE, 1991.
[4] Dekker, T.J.: A floatingpoint technique for extending
the available precision. Numer. Math., 18:224, 1971.
[5] Grimmer, M.: Selbstverifizierende Mathematische Softwarewerkzeuge im
HighPerformance Computing. Konzeption, Entwicklung und Analyse am Beispiel
der parallelen verifizierten Loesung linearer Fredholmscher Integralgleichungen
zweiter Art. Logos Verlag, 2007.
[6] Hofschuster, W.; Kraemer, W.:
CXSC 2.0: A C++ Library for Extended Scientific Computing.
Numerical Software with Result Verification,
Lecture Notes in Computer Science, Volume 2991/2004,
SpringerVerlag, Heidelberg, pp. 15  35, 2004.
[7] Kersten, Tim: Verifizierende rechnerinvariante Numerikmodule, Dissertation,
University of Karlsruhe, 1998
[8] Klatte, Kulisch, Wiethoff, Lawo, Rauch:
"CXSC  A C++ Class Library for Extended Scientific Computing",
SpringerVerlag, Heidelberg, 1993.
Due to the C++ standardization (1998) and dramatic changes
in C++ compilers over the last years this documentation describes
no longer the actual CXSC environment. Please refer to more accurate
documentation (e.g.[1]) available from the web site of our
research group: http...
[9] Kirchner, R., Kulisch, U.:
Hardware Support for Interval Arithmetic.
Reliable Computing, Volume 12, Number 3,
June 2006 , pp. 225237(13).
[10] Knuth, D.E.: The Art of Computer Programming: Seminumerical Algorithms.
Addison Wesley, 1969, vol. 2.
[11] Kulisch, U.: Computer Arithmetic and Validity  Theory,
Implementation. To appear.
[12] Krawczyk, R.: NewtonAlgorithmen zur Bestimmung von Nullstellen mit Fehlerschranken,
Computing, 4:187201, 1969.
[13] Lerch, M.; Tischler, G.; Wolff von Gudenberg, J.; Hofschuster, W;
Kraemer, W.:
filib++, a Fast Interval Library Supporting Containment Computations.
ACM TOMS, volume 32, number 2, pp. 299324, 2006.
[14] Ogita, T., Rump, S.M., Oishi, S.: Accurate sum and
dot product. SIAM Journal on Scientific Computing,
26:6, 2005.
[15] Oishi, S., Tanabe, K., Ogita, T., Rump, S.M., Yamanaka, N.:
A Parallel Algorithm of Accurate Dot Product.
Submitted for publication, 2007.
[16] Rump, S.M.: Intlab  Interval Laboratory. Developments in Reliable
Computing, pp. 77104, 1999.
[17] Rump, S.M.: Kleine Fehlerschranken bei Matrixproblemen, Dissertation,
University of Karlsruhe, 1980
[18] Stroustrup, Bjarne: The C++Programming Language, 3rd Edition, AddisonWesley, 2000.
[19] Zimmer, Michael: Laufzeiteffiziente, parallele Loeser fuer
lineare Intervallgleichungssysteme in CXSC, Master thesis,
University of Wuppertal, 2007.
AMS subject classification: 65H10, 1504, 65G99, 65G10, 6504
BibTeX  Entry
@InProceedings{zimmer_et_al:DSP:2008:1436,
author = {Michael Zimmer and Walter Kr{\"a}mer},
title = {Fast (Parallel) Dense Linear Interval Systems Solvers in CXSC Using Error Free Transformations and BLAS},
booktitle = {Numerical Validation in Current Hardware Architectures},
year = {2008},
editor = {Annie Cuyt and Walter Kr{\"a}mer and Wolfram Luther and Peter Markstein},
number = {08021},
series = {Dagstuhl Seminar Proceedings},
ISSN = {18624405},
publisher = {Internationales Begegnungs und Forschungszentrum f{\"u}r Informatik (IBFI), Schloss Dagstuhl, Germany},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2008/1436},
annote = {Keywords: Errorfree transformations, Kfold accuracy, accurate dot product, CXSC, high accuracy, dense linear systems, verified computation.}
}
2008
Keywords: 

Errorfree transformations, Kfold accuracy, accurate dot product, CXSC, high accuracy, dense linear systems, verified computation. 
Seminar: 

08021  Numerical Validation in Current Hardware Architectures

Related Scholarly Article: 


Issue date: 

2008 
Date of publication: 

2008 