BLAS compatibility library #7
Open: ChrisPattison wants to merge 74 commits into main from blas
Changes from 41 commits
Commits
7cbf4de BLAS library WIP (ChrisPattison)
2d72035 Empty matrix is a noop (ChrisPattison)
08461cc put mpf_t in one place (ChrisPattison)
b531adc PackFloat ToMpfr and ToGmp now const (ChrisPattison)
a59dd65 mpf_t/mpfr_t wrapper type (ChrisPattison)
d11fc10 unsigned long for precision (ChrisPattison)
0745071 overload of ApfpInterfaceType constructor with precision specified (ChrisPattison)
1da49df ApfpInterfaceWrapper move semantics (ChrisPattison)
dd0db45 Fix memory leaks in BLAS library (ChrisPattison)
6ecea7a Merge commit 'b3c3232369122bda9a551eb6777cc90d7721124f' into blas (ChrisPattison)
2dd8b21 Matrix Addition dummy (ChrisPattison)
e4165cb mpf_t |-> mpf_ptr in PackedFloat (ChrisPattison)
7da55be const ToGmp (ChrisPattison)
aeb9ce1 Hostlib takes mpf_ptr. Host transpose/add syrk (ChrisPattison)
4c499ae Merge commit 'cd2be5046e33205e11bf54814d96263f8e66efed' into blas (ChrisPattison)
bfcacd9 MPFR BLAS interface (ChrisPattison)
b50c80e Add unsigned long init and mul to wrapper header (ChrisPattison)
4947c85 Generate takes mpfr_ptr (ChrisPattison)
d96de48 BLAS syrk unit test (ChrisPattison)
4eb4881 Blas unit tests in separate executable (ChrisPattison)
c49e272 Search for kernel in current working directory (ChrisPattison)
bdd9f35 Throw an exception if we can't find the kernel (ChrisPattison)
adaba04 Guard against calling unitialized library (ChrisPattison)
13aa0e9 Add mechanism to get ApfpBlas error strings (ChrisPattison)
e2c32d8 Guard error code for ApfpInit in UnitTests (ChrisPattison)
41f75a8 More sophisticated kernel search routine (ChrisPattison)
4ab5d11 Setup/teardown test case (ChrisPattison)
3c37a15 Fix buffer size check on TransferToHost (ChrisPattison)
98f4721 CopyTransposeFromMatrix destination LDA (ChrisPattison)
ee113ba Blas unit tests pass (ChrisPattison)
1d53578 Move interface type <gmp/mpfr> to Config.h (ChrisPattison)
c6a86a7 install kernels to lib (ChrisPattison)
4c22885 Compile under GMP interface type (ChrisPattison)
2aa28f5 Fix closeness check in BlasUnitTest for a=b=0 (ChrisPattison)
47022fd Use generators for SYRK test case (ChrisPattison)
8c8e2c2 Add config.h to install dirs (ChrisPattison)
b1449e1 Support 'T' argument in syrk (ChrisPattison)
572a4ca Check upper/lower Syrk mode (ChrisPattison)
102c818 Fix MPFR wrapper argument order (ChrisPattison)
d22a4ee Merge branch 'main' into blas (ChrisPattison)
b223108 Remove mystery character in CMakeLists.txt (ChrisPattison)
71a4cf8 Fix LD_LIBRARY_PATH search for FPGA kernel (ChrisPattison)
14274fc Marginally more helpful error handling (ChrisPattison)
0b435be GMP allows aliasing inputs (ChrisPattison)
1f43348 Install hw emu kernel (ChrisPattison)
9856042 Do SYRK addition on the FPGA (ChrisPattison)
1e8cffe Move Apfp lib into namespace (ChrisPattison)
c1fdfa9 Make the interface type wrapping nicer (ChrisPattison)
0677b71 Rename ErrorDescription (ChrisPattison)
03859ee Enum class Uplo/Trans (ChrisPattison)
b1a768d Formatting because I keep forgetting (ChrisPattison)
78a4594 apfpHostlib naming convention (ChrisPattison)
119185c Switch kernel to column major ordering (ChrisPattison)
8338d02 Remove extremely large volume simulation test cases (ChrisPattison)
1f8cd57 Merge branch 'main' into blas (ChrisPattison)
137d11d ApfpIsInitialized |-> IsInitialized (ChrisPattison)
1cb5c90 Merge branch 'main' into col_major (ChrisPattison)
98434fc Scale back directory search for kernel (ChrisPattison)
4e23918 Missing function renames (ChrisPattison)
7f1fa90 Add cwd to kernel search path (ChrisPattison)
6fd095e Set INTERFACE_TYPE to SEMANTICS (ChrisPattison)
cacf283 Merge branch 'main' into blas (ChrisPattison)
9d1173c BlasError is scoped enum (ChrisPattison)
9dbcb97 Merge branch 'col_major' into blas (ChrisPattison)
d0ab102 class Apfp -> Context (ChrisPattison)
155d0b8 Use RNDZ MPFR rounding mode everywhere (ChrisPattison)
55c9fe8 Throw KernelNotFoundException if APFP_KERNEL misset (ChrisPattison)
4e5c34d Add comment about memory layout (ChrisPattison)
8b22b71 More descriptive SYRK unit tests (ChrisPattison)
ad5d63f Add GEMM (ChrisPattison)
40a61a6 Missing syrk test case (ChrisPattison)
66ecc7e Fix M and N in GEMM (ChrisPattison)
f9c7c3c GEMM unit tests (ChrisPattison)
1685fd2 Go fast and break things - just not the unit tests! (ChrisPattison)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,136 @@ | ||
| #include "Config.h" | ||
| #include <catch.hpp> | ||
| #include <iostream> | ||
| #include <limits> | ||
| | ||
| // #include "ArithmeticOperations.h" | ||
| // #include "Karatsuba.h" | ||
| // #include "PackedFloat.h" | ||
| #include "Random.h" | ||
| | ||
| #include "ApfpBlas.h" | ||
| | ||
| void ApfpSetup() { | ||
| #ifdef APFP_GMP_INTERFACE_TYPE | ||
| mpf_set_default_prec(kMantissaBits); | ||
| #else | ||
| mpfr_set_default_prec(kMantissaBits); | ||
| #endif | ||
| auto apfp_error_code = ApfpInit(kMantissaBits); | ||
| REQUIRE(apfp_error_code == ApfpBlasError::success); | ||
| } | ||
| | ||
| void ApfpTeardown() { | ||
| ApfpFinalize(); | ||
| } | ||
| | ||
| bool IsZero(ApfpInterfaceTypeConstPtr a) { | ||
| #ifdef APFP_GMP_INTERFACE_TYPE | ||
| return mpf_sgn(a) == 0; | ||
| #else | ||
| return mpfr_sgn(a) == 0; | ||
| #endif | ||
| } | ||
| | ||
| bool IsClose(ApfpInterfaceTypeConstPtr a, ApfpInterfaceTypeConstPtr b) { | ||
| // Avoids divide by zero if a = b = 0 | ||
| if(IsZero(a) && IsZero(b)) { | ||
| return true; | ||
| } | ||
| | ||
| ApfpInterfaceWrapper diff, sum, ratio; | ||
| #ifdef APFP_GMP_INTERFACE_TYPE | ||
| mpf_sub(diff.get(), a, b); | ||
| mpf_add(sum.get(), a, b); | ||
| mpf_div(ratio.get(), diff.get(), sum.get()); | ||
| long exp; | ||
| mpf_get_d_2exp(&exp, ratio.get()); | ||
| #else | ||
| auto rounding_mode = mpfr_get_default_rounding_mode(); | ||
| mpfr_sub(diff.get(), a, b, rounding_mode); | ||
| mpfr_add(sum.get(), a, b, rounding_mode); | ||
| mpfr_div(ratio.get(), diff.get(), sum.get(), rounding_mode); | ||
| auto exp = mpfr_get_exp(ratio.get()); | ||
| #endif | ||
| // Require the numbers to match to the first 90% decimal places | ||
| return exp < -((kMantissaBits*3 * 9)/10); | ||
| } | ||
| | ||
| TEST_CASE("Init_Teardown") { | ||
| ApfpSetup(); | ||
| ApfpTeardown(); | ||
| } | ||
| | ||
| TEST_CASE("SYRK") { | ||
| ApfpSetup(); | ||
| | ||
| auto rng = RandomNumberGenerator(); | ||
| | ||
| unsigned long N = GENERATE(0, 1, 2, 8, 15, 16, 31, 32, 33); | ||
| unsigned long K = GENERATE(0, 1, 2, 8, 15, 16, 31, 32, 33); | ||
| char mode = GENERATE('N', 'T'); | ||
| char uplo_mode = GENERATE('U', 'L'); | ||
| // Test SYRK | ||
| // In 'N' mode, we perform AA^T + C | ||
| // A is NxK (A : R^K -> R^N) | ||
| // C is NxN | ||
| // Matrices are stored column major because BLAS | ||
| { | ||
| std::vector<ApfpInterfaceWrapper> a_matrix; | ||
| a_matrix.resize(N*K); | ||
| for(auto& v : a_matrix) { | ||
| rng.Generate(v.get()); | ||
| } | ||
| | ||
| std::vector<ApfpInterfaceWrapper> c_matrix; | ||
| c_matrix.resize(N*N); | ||
| for(auto& v : c_matrix) { | ||
| rng.Generate(v.get()); | ||
| } | ||
| | ||
| std::vector<ApfpInterfaceWrapper> ref_result; | ||
| ref_result.resize(N*N); | ||
| | ||
| // Compute reference result | ||
| ApfpInterfaceWrapper prod_temp, sum_temp; | ||
| for(unsigned long j = 0; j < N; ++j) { | ||
| // all rows: the reference result fills the full NxN matrix | ||
| for(unsigned long i = 0; i < N; ++i) { | ||
| auto r_idx = i + j*N; | ||
| SetApfpInterfaceType(ref_result.at(r_idx).get(), c_matrix.at(r_idx).get()); | ||
| | ||
| for(unsigned long k = 0; k < K; ++k) { | ||
| // A is NxK if N, KxN if T | ||
| if (mode == 'N') { | ||
| // (A A^T)_ij = sum_k A(i,k) A(j,k) | ||
| MulApfpInterfaceType(prod_temp.get(), a_matrix.at(i + k*N).get(), a_matrix.at(j + k*N).get()); | ||
| } else { | ||
| // (A^T A)_ij = sum_k A(k,i) A(k,j) | ||
| MulApfpInterfaceType(prod_temp.get(), a_matrix.at(k + i*K).get(), a_matrix.at(k + j*K).get()); | ||
| } | ||
| AddApfpInterfaceType(sum_temp.get(), prod_temp.get(), ref_result.at(r_idx).get()); | ||
| SetApfpInterfaceType(ref_result.at(r_idx).get(), sum_temp.get()); | ||
| } | ||
| } | ||
| } | ||
| | ||
| // Use APFP BLAS library | ||
| auto error_code = ApfpSyrk(uplo_mode, mode, N, K, | ||
| [&](unsigned long i) { return a_matrix.at(i).get(); }, mode == 'N' ? N : K, | ||
| [&](unsigned long i) { return c_matrix.at(i).get(); }, N); | ||
| REQUIRE(error_code == ApfpBlasError::success); | ||
| | ||
| // Check all entries are sufficiently close | ||
| ApfpInterfaceWrapper diff; | ||
| for(unsigned long j = 0; j < N; ++j) { | ||
| // off-diagonal entries only (i < j); uplo_mode selects which mirrored index is compared | ||
| for(unsigned long i = 0; i < j; ++i) { | ||
| auto ref_value = uplo_mode == 'L' ? ref_result.at(i + j*N).get() : ref_result.at(j + i*N).get(); | ||
| auto test_value = uplo_mode == 'L' ? c_matrix.at(i + j*N).get() : c_matrix.at(j + i*N).get(); | ||
| REQUIRE(IsClose(ref_value, test_value)); | ||
| } | ||
| } | ||
| } | ||
| | ||
| ApfpTeardown(); | ||
| } | ||
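
For readers without the APFP headers at hand, the reference computation and the closeness check in the test above can be sketched in plain `double` arithmetic. This is only an illustration under stated assumptions: `IsCloseDouble`, `ReferenceSyrk`, and the `matching_bits` parameter are hypothetical names that do not appear in this PR, and `std::frexp` stands in for the exponent extraction that the test does with `mpf_get_d_2exp` / `mpfr_get_exp`. Unlike the test's `IsClose`, the sketch also short-circuits on `a == b` and on `a + b == 0`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical double-precision analogue of the test's IsClose: compares the
// binary exponent of the relative difference (a - b) / (a + b) to a threshold.
bool IsCloseDouble(double a, double b, int matching_bits) {
    if (a == b) return true;        // equal values, including a = b = 0
    const double sum = a + b;
    if (sum == 0.0) return false;   // a = -b != 0: avoid dividing by zero
    int exp = 0;
    std::frexp((a - b) / sum, &exp);  // ratio = m * 2^exp with 0.5 <= |m| < 1
    return exp < -matching_bits;
}

// Hypothetical reference SYRK on column-major storage, updating the full matrix:
//   trans == 'N': C += A * A^T, A is n x k with leading dimension n
//   otherwise   : C += A^T * A, A is k x n with leading dimension k
void ReferenceSyrk(char trans, std::size_t n, std::size_t k,
                   const std::vector<double>& a, std::vector<double>& c) {
    assert(a.size() >= n * k && c.size() >= n * n);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            double acc = c[i + j * n];  // element (i, j) of C
            for (std::size_t l = 0; l < k; ++l) {
                acc += (trans == 'N') ? a[i + l * n] * a[j + l * n]   // A(i,l) * A(j,l)
                                      : a[l + i * k] * a[l + j * k];  // A(l,i) * A(l,j)
            }
            c[i + j * n] = acc;
        }
    }
}
```

The indexing mirrors the test: in 'N' mode element A(i,l) lives at `i + l*n`, in 'T' mode A(l,i) lives at `l + i*k`, and C(i,j) lives at `i + j*n`.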