Load balancing and communication optimization for parallel sparse matrix-vector multiplication 并行稀疏矩阵与向量乘的负载平衡和通信优化
The application component layer encapsulates the parallel algorithms for matrix-vector operations as class components, offering users high-level interfaces; 其中应用组件层把矩阵向量运算的并行算法封装成类组件,为用户提供了高层次的接口;
A Load-Balancing Algorithm for Sparse Matrix-Vector Multiplication on Parallel Computers 并行计算稀疏矩阵乘以向量的负载平衡算法
The basic two-dimensional transient equations for semiconductor devices are discretized and linearized, their matrix-vector form is obtained, and methods for solving it are discussed. 离散和线性化了二维瞬态半导体基本方程,并得到了其矩阵-矢量方程形式,同时还讨论了其求解的方法。
In this thesis, the conjugate gradient method is parallelized by parallelizing its matrix-vector multiplication. 文中通过对共轭梯度法中矩阵矢量乘部分的并行化来实现其并行。
Since the main cost per iteration of the GMRES method is the matrix-vector multiplication, constructing a fast matrix-vector multiplication is the key to the whole fast method. 广义极小剩余法的每一步迭代的主要运算量来自矩阵-向量相乘,因此构造矩阵-向量快速乘法是整个快速方法的关键。
Hayes in 1986, a vectorized algorithm is developed for computing the product of the global stiffness matrix and a vector in finite element structural analysis without actually forming the global stiffness matrix. Hayes1986年的工作为基础,给出了一个有限元结构分析中,当不形成总体刚度阵而计算它与向量乘积的向量化算法。
The multilevel fast multipole method is used to accelerate the matrix-vector product when the linear system is solved by an iterative method. 分层快速多极算法被用来加速用迭代法求解线性方程组时的矩阵向量乘积的运算。
To reduce the computational complexity of the matrix-vector product and the number of iterations in the matrix inversion process, a new wavelet-like preconditioning operator based on the lifting scheme is used. 一种新的基于提升法的类小波变换预处理算子的使用,降低了矩阵向量积的计算复杂度和求逆过程中的迭代次数。
This paper parallelizes the CG iterative algorithm by parallelizing the matrix-vector multiplication within CG; 本文通过对CG中矩阵矢量乘的并行化来实现CG迭代算法的并行;
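The idea in the sentence above can be illustrated with a minimal sketch, not taken from any of the cited papers: a conjugate gradient solver that touches the matrix only through a `matvec` callback, so a parallel matrix-vector product can be substituted without changing the iteration. The names `conjugate_gradient` and `dense_matvec` are hypothetical, and the matrix is assumed to be symmetric positive definite.

```python
def dense_matvec(A, v):
    """Serial reference product A @ v; a parallel version could replace it."""
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only matvec(v) = A v."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small
            break
        Ap = matvec(p)               # the only place the matrix is used
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(lambda v: dense_matvec(A, v), b)
```

Because the remaining work per iteration is inner products and vector updates, the matrix-vector product is typically the step worth parallelizing first.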
With this method, the inner products, vector updates, and matrix-vector multiplications can be carried out in parallel, and the communication time is overlapped with the vector updates, which improves computational efficiency while preserving generalization ability. 该算法能有效使内积运算、向量数据更新、矩阵向量实现并行计算,并且数据之间的通信时间能和向量更新时间重叠,从而提高了计算效率,并能保证泛化能力。
To further speed up the solution of electromagnetic scattering from three-dimensional electrically large objects by the multilevel fast multipole algorithm (MLFMA), a local multilevel fast multipole algorithm (LMLFMA) based on local interactions is proposed to evaluate the matrix-vector multiplication. 为了进一步加速多层快速多极子算法求解电大尺寸目标电磁散射,提出了一种基于局部耦合技术计算矩阵矢量相乘的多层快速多极子方法。
The paper first summarizes the similarities and differences in principle between two parallel algorithms for matrix-vector multiplication, namely partitioning by rows and partitioning by columns. 文中首先总结按行划分和按列划分的并行矩阵向量乘法在原理上的异同。
Analysis of Parallel Matrix-Vector Multiplication Based on Row and Column Partitioning 按行及按列划分的并行矩阵向量乘法的分析
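The contrast between the two partitionings can be sketched as follows. This is an illustrative example under stated assumptions, not the algorithm from the cited paper: the function names are hypothetical, the matrix is a dense list of rows, and the "workers" are simulated by a sequential loop so that only the data each one would need, and how results are combined, is visible.

```python
def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) pairs of near-equal size."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for k in range(parts):
        stop = start + base + (1 if k < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def matvec_by_rows(A, x, parts):
    """Row partitioning: each worker owns a block of rows and needs all of x.

    Every worker produces a disjoint slice of y, so the results are concatenated.
    """
    y = []
    for start, stop in split_ranges(len(A), parts):
        y.extend(sum(a * xj for a, xj in zip(A[i], x)) for i in range(start, stop))
    return y


def matvec_by_columns(A, x, parts):
    """Column partitioning: each worker owns a block of columns and its slice of x.

    Every worker produces a full-length partial vector, so the results are summed.
    """
    m = len(A)
    y = [0.0] * m
    for start, stop in split_ranges(len(x), parts):
        partial = [sum(A[i][j] * x[j] for j in range(start, stop)) for i in range(m)]
        y = [yi + pi for yi, pi in zip(y, partial)]
    return y


A = [[1, 2, 3], [4, 5, 6]]
x = [1, 0, -1]
y_rows = matvec_by_rows(A, x, parts=2)
y_cols = matvec_by_columns(A, x, parts=2)
```

In a distributed setting the row version would require every worker to hold the whole input vector, while the column version would require a sum reduction over the partial output vectors; the arithmetic is otherwise the same.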
Optical matrix-vector multiplication using multichannel shadow-casting correlator 光学多通投影相关矩阵-矢量乘法器
The FMM, originally developed for free-space problems, is extended to layered microstrip structures with the aid of the DCIM, which accelerates the matrix-vector products in the iterative solution of the MoM equations and reduces the memory requirement. 将快速多极方法FMM结合离散复镜像技术(DCIM)扩展用于分析分层微带结构,加速用迭代法求解矩量方程中的矩阵-矢量积运算,减少内存需求。
By using a spatial area-encoding technique and a shadow-casting system, the fuzzy matrix-vector max-min operation required by the FAM model is realized optically. 利用空间区域编码技术和阴影投射系统,模糊联想存贮器所需的矩阵-矢量最大-最小合成运算可得到光学实现。
The conjugate gradient (CG) method is then used to solve the sparsified matrix-vector equation, yielding the approximate current on the target surface. 接着采用共轭梯度法(CG)求解稀疏化后的矩阵向量方程,得到目标表面的近似电流。
An efficient algorithm built on FFT-based Toeplitz matrix-vector multiplication is used to compute the Hilbert transform. 同时一种基于快速傅里叶变换的矩阵向量乘法的算法可以有效地计算Hilbert变换。
They require no more storage, and in some cases even less, and are well suited to parallel computing, both for constructing the preconditioner and for matrix-vector products. 块常数矩阵不要求较多的存储,在有些情况下,甚至更少;而且无论是预条件的构造过程或是矩阵与向量相乘都可以并行实现。
Therefore, a stream-based asynchronous parallel scheme is used to speed up the impedance matrix filling and the matrix-vector multiplication. Numerical examples demonstrate the accuracy and performance of the proposed algorithm. 为此,利用基于流的阻抗矩阵填充和矩阵矢量相乘的异步并行方式快速实现时域矩量法的仿真计算,并通过数值实例验证了程序的准确性和高效性。
The main computations in the above algorithms are matrix-vector multiplications, vector inner products, and vector outer products, all of which are highly parallelizable; the number of iterations required equals the dimension of the covariance matrix. 这两种权矢量递推算法中的主要计算为矩阵矢量积、矢量内积、矢量外积,均可并行实现,迭代计算所需次数等于协方差矩阵的维数。
When a Krylov subspace method is used as the outer iteration, a relaxation strategy can be applied to the inner iteration, so that the matrix-vector product is computed inexactly. 当外迭代用Krylov子空间方法,内迭代可以用松弛策略,非精确地求解。
In the Dickson polynomial basis, finite field multiplication is converted into a special Toeplitz matrix-vector multiplication, which can be computed by a block recursive approach. 在Dickson多项式基表示下,有限域乘法转化为特殊的Toeplitz矩阵-向量乘法,Toeplitz矩阵-向量乘法可以利用分块递归计算法计算。