Fast (Parallel) Dense Linear Interval Systems Solvers in C-XSC Using Error Free Transformations and BLAS

Zimmer, Michael; Krämer, Walter

doi:10.4230/DagSemProc.08021.11

Abstract

The traditional solver for linear interval systems available in C-XSC [6,1]
is mathematically based on the Krawczyk operator [12] and modifications
introduced by Rump [17]. The Krawczyk operator is composed of
matrix/vector operations. These operations are realized in C-XSC
with highest accuracy (only one final rounding) using a so-called long
accumulator (dotprecision variable). C-XSC dotprecision variables allow the
error-free computation of sums of floating-point numbers as well as the
error-free computation of scalar products of floating-point vectors. Thus,
from a mathematical point of view, these operations are perfect. Because
current hardware does not support these perfect scalar products, all
operations have to be realized in software. This leads to a tremendous
time penalty (note: it has been shown that with modest additional hardware
costs perfect scalar products can be made as fast as simple floating-point
loops).
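
For orientation, the verification condition behind a Krawczyk-type solver is usually stated in the literature in roughly the following form. The notation (A, b, R, x~, X) is assumed here for illustration and is not taken from this abstract; the statement is given for a point matrix A, and for an interval matrix it is understood to hold for every matrix contained in it.

```latex
% Assumed notation (not from the abstract):
%   A, b        matrix and right-hand side of the system A x = b
%   R           approximate inverse of A, computed in floating point
%   \tilde{x}   approximate floating-point solution
%   X           interval vector intended to enclose the error of \tilde{x}
\[
  K(X) \;=\; R\,(b - A\tilde{x}) \;+\; (I - RA)\,X
\]
\[
  K(X) \subset \operatorname{int}(X)
  \;\Longrightarrow\;
  A \text{ and } R \text{ are nonsingular and }
  A^{-1} b \in \tilde{x} + K(X)
\]
```

The residual b - A x~ and the matrix I - RA are the places where accurate scalar products matter most, and I - RA is the matrix-matrix product that dominates the run time.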

To speed up the C-XSC scalar product software operations, we adapt the
so-called DotK algorithm as published in [14]. Error-free transformations [14,3,4,10]
are used as basic building blocks to develop summation and scalar product
algorithms simulating K-fold precision. Compared to the perfect C-XSC operations,
these operations are fast. They are more accurate than simple floating-point
loops (but of course no longer perfect in the mathematical sense). The fast
operations are available in C-XSC via the new data types DotK, IDotK, CDotK
and CIDotK. These new data types are designed in such a way that traditional
C-XSC code using dotprecision variables can be adapted with minimal effort. It is
possible to switch (at runtime!) from perfect computations to fast operations using
K-fold precision (K equal to 0 means traditional dotprecision computations), and it is
possible to hold intermediate results with corresponding error bounds for further
summations or scalar product updates. The details are described in [19].
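
As an illustration of the building blocks mentioned above, the following is a minimal sketch of the K = 2 case (Dot2) along the lines of [14], written in plain C++. The function names two_sum, two_prod and dot2 are chosen here for illustration and are not the C-XSC interface. The sketch uses std::fma for the error-free product, which is an assumption about the platform; where no fused multiply-add is available, Dekker's splitting [4] is the usual alternative. It returns only an approximation and does not compute the error bound that a verified solver additionally needs.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// TwoSum (Knuth): returns (s, e) with s = fl(a + b) and s + e == a + b exactly.
inline std::pair<double, double> two_sum(double a, double b) {
    double s = a + b;
    double z = s - a;
    double e = (a - (s - z)) + (b - z);
    return std::make_pair(s, e);
}

// TwoProduct via fused multiply-add: returns (p, e) with p = fl(a * b)
// and p + e == a * b exactly (barring overflow and underflow).
inline std::pair<double, double> two_prod(double a, double b) {
    double p = a * b;
    double e = std::fma(a, b, -p);
    return std::make_pair(p, e);
}

// Dot2: scalar product evaluated as if in twice the working precision,
// then rounded to double. Assumes x and y have the same length.
double dot2(const std::vector<double>& x, const std::vector<double>& y) {
    double p = 0.0;  // running floating-point sum of the products
    double s = 0.0;  // accumulated rounding errors
    for (std::size_t i = 0; i < x.size(); ++i) {
        std::pair<double, double> prod = two_prod(x[i], y[i]);
        std::pair<double, double> sum = two_sum(p, prod.first);
        p = sum.first;
        s += sum.second + prod.second;
    }
    return p + s;
}
```

For x = (1e16, 1, -1e16) and y = (1, 1, 1), an ordinary floating-point loop loses the middle term to cancellation, whereas dot2 recovers it from the accumulated error terms.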

Additionally, following algorithms similar to those used in IntLab [16], BLAS and
LAPACK libraries [2] are used in the O(n³) parts of the linear system solver. For
matrix-matrix products, the rounding mode of the processor is switched in order
to compute enclosures of the correct result.
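
The rounding-mode technique can be sketched as follows for two point matrices: the product is computed once with rounding toward minus infinity and once with rounding toward plus infinity, so the exact product lies entrywise between the two results. This is a sketch under stated assumptions, not the solver's code: the names ProductEnclosure, matmul and enclose_product are hypothetical, and a plain triple loop stands in for the BLAS call. It only works with a BLAS if that library honors the current rounding mode.

```cpp
#include <cfenv>
#include <cstddef>
#include <vector>

// Entrywise lower and upper bounds of an n x n product, stored row-major.
struct ProductEnclosure {
    std::vector<double> lo;
    std::vector<double> hi;
};

// Plain triple loop standing in for a BLAS dgemm call (assumption).
// The volatile temporaries keep the compiler from folding or moving the
// arithmetic across the rounding-mode switches.
static void matmul(const std::vector<double>& a, const std::vector<double>& b,
                   std::vector<double>& c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            volatile double acc = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                volatile double prod = a[i * n + k] * b[k * n + j];
                acc = acc + prod;
            }
            c[i * n + j] = acc;
        }
    }
}

// Returns lo and hi with lo <= A*B <= hi entrywise.
ProductEnclosure enclose_product(const std::vector<double>& a,
                                 const std::vector<double>& b, std::size_t n) {
    ProductEnclosure r;
    r.lo.assign(n * n, 0.0);
    r.hi.assign(n * n, 0.0);
    const int saved = std::fegetround();
    std::fesetround(FE_DOWNWARD);  // all operations round toward -infinity
    matmul(a, b, r.lo, n);
    std::fesetround(FE_UPWARD);    // all operations round toward +infinity
    matmul(a, b, r.hi, n);
    std::fesetround(saved);        // restore the caller's rounding mode
    return r;
}
```

Whether the compiler preserves this behavior under optimization depends on the platform. Options such as GCC's -frounding-math exist for that purpose, which is one reason compiler settings matter for reliable results.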

Comparing the traditional solver with the new version shows that the class of
problems which are solvable with the new version is smaller than the class of
problems which can be solved using the solver based on perfect operations. But it
seems that the new solver is also appropriate for real-world problems. Using the
new solver based on BLAS and simulating quadruple precision (i.e. K = 2), the
speedup comes close to a factor of 200(!). The new solver is nearly as fast as the
corresponding IntLab [16] solver verifylss. Solving a real linear system of dimension
1000 on a 3.2 GHz Pentium 4 takes about 2.8 seconds. In all cases tested, the accuracy
of our new solver was better, and in some cases significantly better, than the accuracy
of the corresponding IntLab results. The new solver also allows solving larger
(dense) problems than its IntLab counterpart. We also show some examples where IntLab
fails whereas our new solver still works.

A parallel version of this solver, based on ScaLAPACK, is also available. Unlike
the previous parallel solver in C-XSC[5], this new solver does not depend on a
root-node, which makes it possible to compute a verified solution even of very large
linear systems.

In the talk we will discuss the new data types in more detail, emphasize our
modifications to the DotK algorithm taken from the literature [14,15], show
time measurements, and present results concerning the accuracy of the computed
enclosures. Our results will also be compared to corresponding results computed with
the IntLab package. We will also comment on hardware features and compiler options
which can/should be used to get reliable results on different platforms efficiently.

References:

[1] Downloads:
C-XSC library: http://www.math.uni-wuppertal.de/~xsc/xsc/cxsc.html
Solvers: http://www.math.uni-wuppertal.de/~xsc/xsc/cxsc_software.html

[2] L.S. Blackford, J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux,
L. Kaufman, A. Lumsdaine, A. Petitet, R. Pozo, K. Remington, R. C. Whaley, An Updated Set
of Basic Linear Algebra Subprograms (BLAS), ACM Trans. Math. Soft., 28-2 (2002), pp. 135--151.

[3] Bohlender, G.; Walter, W.; Kornerup, P.; Matula, D.W.:
Semantics for Exact Floating Point Operations.
Proceedings, 10th IEEE Symposium on Computer Arithmetic,
26-28 June 1991, IEEE, 1991.

[4] Dekker, T.J.: A floating-point technique for extending
the available precision. Numer. Math., 18:224, 1971.

[5] Grimmer, M.: Selbstverifizierende Mathematische Softwarewerkzeuge im
High-Performance Computing. Konzeption, Entwicklung und Analyse am Beispiel
der parallelen verifizierten Loesung linearer Fredholmscher Integralgleichungen
zweiter Art. Logos Verlag, 2007.

[6] Hofschuster, W.; Kraemer, W.:
C-XSC 2.0: A C++ Library for Extended Scientific Computing.
Numerical Software with Result Verification,
Lecture Notes in Computer Science, Volume 2991/2004,
Springer-Verlag, Heidelberg, pp. 15 - 35, 2004.

[7] Kersten, Tim: Verifizierende rechnerinvariante Numerikmodule, Dissertation,
University of Karlsruhe, 1998

[8] Klatte, Kulisch, Wiethoff, Lawo, Rauch:
"C-XSC - A C++ Class Library for Extended Scientific Computing",
Springer-Verlag, Heidelberg, 1993.
Due to the C++ standardization (1998) and dramatic changes
in C++ compilers over the last years, this documentation no longer
describes the current C-XSC environment. Please refer to more accurate
documentation (e.g. [1]) available from the web site of our
research group: http...

[9] Kirchner, R., Kulisch, U.:
Hardware Support for Interval Arithmetic.
Reliable Computing, Volume 12, Number 3,
June 2006, pp. 225-237.

[10] Knuth, D.E.: The Art of Computer Programming: Seminumerical Algorithms.
Addison Wesley, 1969, vol. 2.

[11] Kulisch, U.: Computer Arithmetic and Validity - Theory,
Implementation. To appear.

[12] Krawczyk, R.: Newton-Algorithmen zur Bestimmung von Nullstellen mit Fehlerschranken,
Computing, 4:187-201, 1969.

[13] Lerch, M.; Tischler, G.; Wolff von Gudenberg, J.; Hofschuster, W.;
Kraemer, W.:
filib++, a Fast Interval Library Supporting Containment Computations.
ACM TOMS, volume 32, number 2, pp. 299-324, 2006.

[14] Ogita, T., Rump, S.M., Oishi, S.: Accurate sum and
dot product. SIAM Journal on Scientific Computing,
26:6, 2005.

[15] Oishi, S., Tanabe, K., Ogita, T., Rump, S.M., Yamanaka, N.:
A Parallel Algorithm of Accurate Dot Product.
Submitted for publication, 2007.

[16] Rump, S.M.: Intlab - Interval Laboratory. Developments in Reliable
Computing, pp. 77-104, 1999.

[17] Rump, S.M.: Kleine Fehlerschranken bei Matrixproblemen, Dissertation,
University of Karlsruhe, 1980

[18] Stroustrup, Bjarne: The C++-Programming Language, 3rd Edition, Addison-Wesley, 2000.

[19] Zimmer, Michael: Laufzeiteffiziente, parallele Loeser fuer
lineare Intervallgleichungssysteme in C-XSC, Master thesis,
University of Wuppertal, 2007.

AMS subject classification: 65H10, 15-04, 65G99, 65G10, 65-04
