
Parallel Scientific Computing


1st edition

By: Frédéric Magoulès, François-Xavier Roux, Guillaume Houzeaux

139,99 €

Publisher: Wiley
Format: EPUB
Published: 14.12.2015
ISBN/EAN: 9781118761717
Language: English
Number of pages: 372

DRM-protected eBook; to read it you will need, for example, Adobe Digital Editions and an Adobe ID.

Description

<p>Scientific computing has become an indispensable tool in numerous fields, such as physics, mechanics, biology, finance and industry. Thanks to efficient algorithms adapted to current computers, it enables us to simulate, without the help of models or experiments, the deflection of beams in bending, the sound level in a theater or a fluid flowing around an aircraft wing.</p> <p>This book presents scientific computing techniques applied to parallel computing for the numerical simulation of large-scale problems arising from systems modeled by partial differential equations. Computing concepts are introduced through examples. Implementation and programming techniques stemming from the finite element method are presented for direct solvers, iterative solvers and domain decomposition methods, along with an introduction to MPI and OpenMP.</p>
<p>Preface xi</p> <p>Introduction xv</p> <p><b>Chapter 1. Computer Architectures </b><b>1</b></p> <p>1.1. Different types of parallelism 1</p> <p>1.1.1. Overlap, concurrency and parallelism 1</p> <p>1.1.2. Temporal and spatial parallelism for arithmetic logic units 4</p> <p>1.1.3. Parallelism and memory 6</p> <p>1.2. Memory architecture 7</p> <p>1.2.1. Interleaved multi-bank memory 7</p> <p>1.2.2. Memory hierarchy 8</p> <p>1.2.3. Distributed memory 13</p> <p>1.3. Hybrid architecture 14</p> <p>1.3.1. Graphics-type accelerators 14</p> <p>1.3.2. Hybrid computers 16</p> <p><b>Chapter 2. Parallelization and Programming Models </b><b>17</b></p> <p>2.1. Parallelization 17</p> <p>2.2. Performance criteria 19</p> <p>2.2.1. Degree of parallelism 19</p> <p>2.2.2. Load balancing 21</p> <p>2.2.3. Granularity 21</p> <p>2.2.4. Scalability 22</p> <p>2.3. Data parallelism 25</p> <p>2.3.1. Loop tasks 25</p> <p>2.3.2. Dependencies 26</p> <p>2.3.3. Examples of dependence 27</p> <p>2.3.4. Reduction operations 30</p> <p>2.3.5. Nested loops 31</p> <p>2.3.6. OpenMP 34</p> <p>2.4. Vectorization: a case study 37</p> <p>2.4.1. Vector computers and vectorization 37</p> <p>2.4.2. Dependence 38</p> <p>2.4.3. Reduction operations 39</p> <p>2.4.4. Pipeline operations 41</p> <p>2.5. Message-passing 43</p> <p>2.5.1. Message-passing programming 43</p> <p>2.5.2. Parallel environment management 44</p> <p>2.5.3. Point-to-point communications 45</p> <p>2.5.4. Collective communications 46</p> <p>2.6. Performance analysis 49</p> <p><b>Chapter 3. Parallel Algorithm Concepts </b><b>53</b></p> <p>3.1. Parallel algorithms for recurrences 54</p> <p>3.1.1. The principles of reduction methods 54</p> <p>3.1.2. Overhead and stability of reduction methods 55</p> <p>3.1.3. Cyclic reduction 57</p> <p>3.2. Data locality and distribution: product of matrices 58</p> <p>3.2.1. Row and column algorithms 58</p> <p>3.2.2. Block algorithms 60</p> <p>3.2.3. Distributed algorithms 64</p> <p>3.2.4. 
Implementation 66</p> <p><b>Chapter 4. Basics of Numerical Matrix Analysis </b><b>71</b></p> <p>4.1. Review of basic notions of linear algebra 71</p> <p>4.1.1. Vector spaces, scalar products and orthogonal projection 71</p> <p>4.1.2. Linear applications and matrices 74</p> <p>4.2. Properties of matrices 79</p> <p>4.2.1. Matrices, eigenvalues and eigenvectors 79</p> <p>4.2.2. Norms of a matrix 80</p> <p>4.2.3. Basis change 83</p> <p>4.2.4. Conditioning of a matrix 85</p> <p><b>Chapter 5. Sparse Matrices </b><b>93</b></p> <p>5.1. Origins of sparse matrices 93</p> <p>5.2. Parallel formation of sparse matrices: shared memory 98</p> <p>5.3. Parallel formation by block of sparse matrices: distributed memory 99</p> <p>5.3.1. Parallelization by sets of vertices 99</p> <p>5.3.2. Parallelization by sets of elements 101</p> <p>5.3.3. Comparison: sets of vertices and elements 101</p> <p><b>Chapter 6. Solving Linear Systems </b><b>105</b></p> <p>6.1. Direct methods 105</p> <p>6.2. Iterative methods 106</p> <p><b>Chapter 7. LU Methods for Solving Linear Systems </b><b>109</b></p> <p>7.1. Principle of LU decomposition 109</p> <p>7.2. Gauss factorization 113</p> <p>7.3. Gauss–Jordan factorization 115</p> <p>7.3.1. Row pivoting 118</p> <p>7.4. Crout and Cholesky factorizations for symmetric matrices 121</p> <p><b>Chapter 8. Parallelization of LU Methods for Dense Matrices </b><b>125</b></p> <p>8.1. Block factorization 125</p> <p>8.2. Implementation of block factorization in a message-passing environment 130</p> <p>8.3. Parallelization of forward and backward substitutions 135</p> <p><b>Chapter 9. LU Methods for Sparse Matrices </b><b>139</b></p> <p>9.1. Structure of factorized matrices 139</p> <p>9.2. Symbolic factorization and renumbering 142</p> <p>9.3. Elimination trees 147</p> <p>9.4. Elimination trees and dependencies 152</p> <p>9.5. Nested dissections 153</p> <p>9.6. Forward and backward substitutions 159</p> <p><b>Chapter 10. 
Basics of Krylov Subspaces </b><b>161</b></p> <p>10.1. Krylov subspaces 161</p> <p>10.2. Construction of the Arnoldi basis 164</p> <p><b>Chapter 11. Methods with Complete Orthogonalization for Symmetric Positive Definite Matrices </b><b>167</b></p> <p>11.1. Construction of the Lanczos basis for symmetric matrices 167</p> <p>11.2. The Lanczos method 168</p> <p>11.3. The conjugate gradient method 173</p> <p>11.4. Comparison with the gradient method 177</p> <p>11.5. Principle of preconditioning for symmetric positive definite matrices 180</p> <p><b>Chapter 12. Exact Orthogonalization Methods for Arbitrary Matrices </b><b>185</b></p> <p>12.1. The GMRES method 185</p> <p>12.2. The case of symmetric matrices: the MINRES method 193</p> <p>12.3. The ORTHODIR method 196</p> <p>12.4. Principle of preconditioning for non-symmetric matrices 198</p> <p><b>Chapter 13. Biorthogonalization Methods for Non-symmetric Matrices </b><b>201</b></p> <p>13.1. Lanczos biorthogonal basis for non-symmetric matrices 201</p> <p>13.2. The non-symmetric Lanczos method 206</p> <p>13.3. The biconjugate gradient method: BiCG 207</p> <p>13.4. The quasi-minimal residual method: QMR 211</p> <p>13.5. The BiCGSTAB 217</p> <p><b>Chapter 14. Parallelization of Krylov Methods </b><b>225</b></p> <p>14.1. Parallelization of dense matrix-vector product 225</p> <p>14.2. Parallelization of sparse matrix-vector product based on node sets 227</p> <p>14.3. Parallelization of sparse matrix-vector product based on element sets 229</p> <p>14.3.1. Review of the principles of domain decomposition 229</p> <p>14.3.2. Matrix-vector product 231</p> <p>14.3.3. Interface exchanges 233</p> <p>14.3.4. Asynchronous matrix-vector product with non-blocking communications 236</p> <p>14.3.5. Comparison: parallelization based on node and element sets 236</p> <p>14.4. Parallelization of the scalar product 238</p> <p>14.4.1. By weight 239</p> <p>14.4.2. By distributivity 239</p> <p>14.4.3. By ownership 240</p> <p>14.5. 
Summary of the parallelization of Krylov methods 241</p> <p><b>Chapter 15. Parallel Preconditioning Methods </b><b>243</b></p> <p>15.1. Diagonal 243</p> <p>15.2. Incomplete factorization methods 245</p> <p>15.2.1. Principle 245</p> <p>15.2.2. Parallelization 248</p> <p>15.3. Schur complement method 250</p> <p>15.3.1. Optimal local preconditioning 250</p> <p>15.3.2. Principle of the Schur complement method 251</p> <p>15.3.3. Properties of the Schur complement method 254</p> <p>15.4. Algebraic multigrid 257</p> <p>15.4.1. Preconditioning using projection 257</p> <p>15.4.2. Algebraic construction of a coarse grid 258</p> <p>15.4.3. Algebraic multigrid methods 261</p> <p>15.5. The Schwarz additive method of preconditioning 263</p> <p>15.5.1. Principle of the overlap 263</p> <p>15.5.2. Multiplicative versus additive Schwarz methods 265</p> <p>15.5.3. Additive Schwarz preconditioning 268</p> <p>15.5.4. Restricted additive Schwarz: parallel implementation 269</p> <p>15.6. Preconditioners based on the physics 275</p> <p>15.6.1. Gauss–Seidel method 275</p> <p>15.6.2. Linelet method 276</p> <p><b>Appendices </b><b>279</b></p> <p>Appendix 1 281</p> <p>Appendix 2 301</p> <p>Appendix 3 323</p> <p>Bibliography 339</p> <p>Index 343</p>
<p><strong>Frédéric Magoulès</strong> is Professor at LISA / MAS, École Centrale Paris, France.</p> <p><strong>François-Xavier Roux</strong> is Professor at University Pierre &amp; Marie Curie - Paris 6, France.</p>

You might also be interested in these products:

Bandwidth Efficient Coding
by: John B. Anderson
EPUB ebook
114,99 €
Digital Communications with Emphasis on Data Modems
by: Richard W. Middlestead
PDF ebook
171,99 €
Bandwidth Efficient Coding
by: John B. Anderson
PDF ebook
114,99 €