93 lines
9.5 KiB
HTML
93 lines
9.5 KiB
HTML
<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><meta name="generator" content="rustdoc"><meta name="description" content="General matrix multiplication for f32, f64, and complex matrices. Operates on matrices with general layout (they can use arbitrary row and column stride)."><title>matrixmultiply - Rust</title><script>if(window.location.protocol!=="file:")document.head.insertAdjacentHTML("beforeend","SourceSerif4-Regular-6b053e98.ttf.woff2,FiraSans-Regular-0fe48ade.woff2,FiraSans-Medium-e1aa3f0a.woff2,SourceCodePro-Regular-8badfe75.ttf.woff2,SourceCodePro-Semibold-aa29a496.ttf.woff2".split(",").map(f=>`<link rel="preload" as="font" type="font/woff2" crossorigin href="../static.files/${f}">`).join(""))</script><link rel="stylesheet" href="../static.files/normalize-9960930a.css"><link rel="stylesheet" href="../static.files/rustdoc-42caa33d.css"><meta name="rustdoc-vars" data-root-path="../" data-static-root-path="../static.files/" data-current-crate="matrixmultiply" data-themes="" data-resource-suffix="" data-rustdoc-version="1.84.0 (9fc6b4312 2025-01-07)" data-channel="1.84.0" data-search-js="search-92e6798f.js" data-settings-js="settings-0f613d39.js" ><script src="../static.files/storage-59e33391.js"></script><script defer src="../crates.js"></script><script defer src="../static.files/main-5f194d8c.js"></script><noscript><link rel="stylesheet" href="../static.files/noscript-893ab5e7.css"></noscript><link rel="alternate icon" type="image/png" href="../static.files/favicon-32x32-6580c154.png"><link rel="icon" type="image/svg+xml" href="../static.files/favicon-044be391.svg"></head><body class="rustdoc mod crate"><!--[if lte IE 11]><div class="warning">This old browser is unsupported and will most likely display funky things.</div><![endif]--><nav class="mobile-topbar"><button class="sidebar-menu-toggle" title="show sidebar"></button></nav><nav class="sidebar"><div class="sidebar-crate"><h2><a href="../matrixmultiply/index.html">matrixmultiply</a><span class="version">0.3.9</span></h2></div><div class="sidebar-elems"><ul class="block"><li><a id="all-types" href="all.html">All Items</a></li></ul><section id="rustdoc-toc"><h3><a href="#">Sections</a></h3><ul class="block top-toc"><li><a href="#matrix-representation" title="Matrix Representation">Matrix Representation</a></li><li><a href="#portability-and-performance" title="Portability and Performance">Portability and Performance</a></li><li><a href="#features" title="Features">Features</a><ul><li><a href="#std" title="`std`"><code>std</code></a></li><li><a href="#threading" title="`threading`"><code>threading</code></a></li><li><a href="#cgemm" title="`cgemm`"><code>cgemm</code></a></li><li><a href="#constconf" title="`constconf`"><code>constconf</code></a></li></ul></li><li><a href="#other-notes" title="Other Notes">Other Notes</a></li><li><a href="#rust-version" title="Rust Version">Rust Version</a></li></ul><h3><a href="#functions">Crate Items</a></h3><ul class="block"><li><a href="#functions" title="Functions">Functions</a></li></ul></section><div id="rustdoc-modnav"></div></div></nav><div class="sidebar-resizer"></div><main><div class="width-limiter"><rustdoc-search></rustdoc-search><section id="main-content" class="content"><div class="main-heading"><h1>Crate <span>matrixmultiply</span><button id="copy-path" title="Copy item path to clipboard">Copy item path</button></h1><rustdoc-toolbar></rustdoc-toolbar><span class="sub-heading"><a class="src" href="../src/matrixmultiply/lib.rs.html#8-188">Source</a> </span></div><details class="toggle top-doc" open><summary class="hideme"><span>Expand description</span></summary><div class="docblock"><p>General matrix multiplication for f32, f64, and complex matrices. Operates on
|
||
matrices with general layout (they can use arbitrary row and column stride).</p>
|
||
<p>This crate uses the same macro/microkernel approach to matrix multiplication as
|
||
the <a href="https://github.com/flame/blis">BLIS</a> project.</p>
|
||
<p>We presently provide a few good microkernels, portable and for x86-64 and
|
||
AArch64 NEON, and only one operation: the general matrix-matrix
|
||
multiplication (“gemm”).</p>
|
||
<h3 id="matrix-representation"><a class="doc-anchor" href="#matrix-representation">§</a>Matrix Representation</h3>
|
||
<p><strong>matrixmultiply</strong> supports matrices with general stride, so a matrix
|
||
is passed using a pointer and four integers:</p>
|
||
<ul>
|
||
<li><code>a: *const f32</code>, pointer to the first element in the matrix</li>
|
||
<li><code>m: usize</code>, number of rows</li>
|
||
<li><code>k: usize</code>, number of columns</li>
|
||
<li><code>rsa: isize</code>, row stride</li>
|
||
<li><code>csa: isize</code>, column stride</li>
|
||
</ul>
|
||
<p>In this example, A is a m by k matrix. <code>a</code> is a pointer to the element at
|
||
index <em>0, 0</em>.</p>
|
||
<p>The <em>row stride</em> is the pointer offset (in number of elements) to the
|
||
element on the next row. It’s the distance from element <em>i, j</em> to <em>i + 1,
|
||
j</em>.</p>
|
||
<p>The <em>column stride</em> is the pointer offset (in number of elements) to the
|
||
element in the next column. It’s the distance from element <em>i, j</em> to <em>i,
|
||
j + 1</em>.</p>
|
||
<p>For example for a contiguous matrix, row major strides are <em>rsa=k,
|
||
csa=1</em> and column major strides are <em>rsa=1, csa=m</em>.</p>
|
||
<p>Strides can be negative or even zero, but for a mutable matrix elements
|
||
may not alias each other.</p>
|
||
<h3 id="portability-and-performance"><a class="doc-anchor" href="#portability-and-performance">§</a>Portability and Performance</h3>
|
||
<ul>
|
||
<li>
|
||
<p>The default kernels are written in portable Rust and available
|
||
on all targets. These may depend on autovectorization to perform well.</p>
|
||
</li>
|
||
<li>
|
||
<p><em>x86</em> and <em>x86-64</em> features can be detected at runtime by default or
|
||
compile time (if enabled), and the following kernel variants are
|
||
implemented:</p>
|
||
<ul>
|
||
<li><code>fma</code></li>
|
||
<li><code>avx</code></li>
|
||
<li><code>sse2</code></li>
|
||
</ul>
|
||
</li>
|
||
<li>
|
||
<p><em>aarch64</em> features can be detected at runtime by default or compile time
|
||
(if enabled), and the following kernel variants are implemented:</p>
|
||
<ul>
|
||
<li><code>neon</code></li>
|
||
</ul>
|
||
</li>
|
||
</ul>
|
||
<h3 id="features"><a class="doc-anchor" href="#features">§</a>Features</h3><h4 id="std"><a class="doc-anchor" href="#std">§</a><code>std</code></h4>
|
||
<p><code>std</code> is enabled by default.</p>
|
||
<p>This crate can be used without the standard library (<code>#![no_std]</code>) by
|
||
disabling the default <code>std</code> feature. To do so, use this in your
|
||
<code>Cargo.toml</code>:</p>
|
||
<div class="example-wrap"><pre class="language-toml"><code>matrixmultiply = { version = "0.3", default-features = false }</code></pre></div>
|
||
<p>Runtime CPU feature detection is available <strong>only</strong> when <code>std</code> is enabled.
|
||
Without the <code>std</code> feature, the crate uses special CPU features only if they
|
||
are enabled at compile time. (To enable CPU features at compile time, pass
|
||
the relevant
|
||
<a href="https://doc.rust-lang.org/rustc/codegen-options/index.html#target-cpu"><code>target-cpu</code></a>
|
||
or
|
||
<a href="https://doc.rust-lang.org/rustc/codegen-options/index.html#target-feature"><code>target-feature</code></a>
|
||
option to <code>rustc</code>.)</p>
|
||
<h4 id="threading"><a class="doc-anchor" href="#threading">§</a><code>threading</code></h4>
|
||
<p><code>threading</code> is an optional crate feature</p>
|
||
<p>Threading enables multithreading for the operations. The environment variable
|
||
<code>MATMUL_NUM_THREADS</code> decides how many threads are used at maximum. At the moment 1-4 are
|
||
supported and the default is the number of physical cpus (as detected by <code>num_cpus</code>).</p>
|
||
<h4 id="cgemm"><a class="doc-anchor" href="#cgemm">§</a><code>cgemm</code></h4>
|
||
<p><code>cgemm</code> is an optional crate feature.</p>
|
||
<p>It enables the <code>cgemm</code> and <code>zgemm</code> methods for complex matrix multiplication.
|
||
This is an <strong>experimental feature</strong> and not yet as performant as the float kernels on x86.</p>
|
||
<p>The complex representation we use is <code>[f64; 2]</code>.</p>
|
||
<h4 id="constconf"><a class="doc-anchor" href="#constconf">§</a><code>constconf</code></h4>
|
||
<p><code>constconf</code> is an optional feature. When enabled, cache-sensitive parameters of
|
||
the gemm implementations can be tweaked <em>at compile time</em> by defining the following variables:</p>
|
||
<ul>
|
||
<li><code>MATMUL_SGEMM_MC</code>
|
||
(And so on, for S, D, C, ZGEMM and with NC, KC or MC).</li>
|
||
</ul>
|
||
<h3 id="other-notes"><a class="doc-anchor" href="#other-notes">§</a>Other Notes</h3>
|
||
<p>The functions in this crate are thread safe, as long as the destination
|
||
matrix is distinct.</p>
|
||
<h3 id="rust-version"><a class="doc-anchor" href="#rust-version">§</a>Rust Version</h3>
|
||
<p>This version requires Rust 1.41.1 or later; the crate follows a carefully
|
||
considered upgrade policy, where updating the minimum Rust version is not a breaking
|
||
change.</p>
|
||
<p>Some features are enabled with later versions: from Rust 1.61 AArch64 NEON support.</p>
|
||
</div></details><h2 id="functions" class="section-header">Functions<a href="#functions" class="anchor">§</a></h2><ul class="item-table"><li><div class="item-name"><a class="fn" href="fn.dgemm.html" title="fn matrixmultiply::dgemm">dgemm</a><sup title="unsafe function">⚠</sup></div><div class="desc docblock-short">General matrix multiplication (f64)</div></li><li><div class="item-name"><a class="fn" href="fn.sgemm.html" title="fn matrixmultiply::sgemm">sgemm</a><sup title="unsafe function">⚠</sup></div><div class="desc docblock-short">General matrix multiplication (f32)</div></li></ul></section></div></main></body></html> |