How to diagonalize a matrix of 4,722,366,482,869,645,213,696 elements? Part 1
Wednesday Nov 22, 2017 23:22, last edition on Thursday Nov 23, 2017 00:20
Introduction
I am trying to diagonalize a very large matrix (4,722,366,482,869,645,213,696 elements) in C++ and it leads to some issues. I noticed that I solved most of the issues that I encounter by myself by just pretending to explain the problem to someone. So I decided that for now on, I will share these notes. I will not explain everything, just the technical difficulties. I will give references for the rest, especially my PhD thesis.
Mind your type
Today, I will present you one of the first technical issue that I faced. It is linked to the type of variable.
A simple example is the results of 2N.
In my code, I need to compute 236.
The most straight forward way to achieve is by using bitwise operations.
In fact, 2N can be written
1 << N
, for N∈
.
Indeed, the operator <<
move to the right all the bits by the number on the right hand.
Few examples:
1 << 0
.
1 is not changed, so
1 << 0
= 12 = 110 ≠ 20.
1 << 1
moved the 1 to the right, so
1 << 1
= 102 = 210 = 21
1 << 2
moved the 1 two times to the right, so
1 << 2
= 1002 = 410 = 22.
Here the issue.
For C++, 1
is a int
, which means that it is at least 16 bits, but the actual number of bits is not specified and is implementation dependent.
Let us assume that sizeof(int)
is equal to 4.
It is the number of bytes,
so it means that int
is 32 bits (sizeof(int)
×8).
××××××××××××××××××××××××××××××××
Only the first 31 bits (sizeof(int)
×8-1) represent a positive number,
which means that one can only move a bit up to 30 times (sizeof(int)
×8-1-1) to the right and still get a positive number.
1 << 30
= 010000000000000000000000000000002
= 107374182410
= 230
1 << 31
= 100000000000000000000000000000002
= -214748364810
≠ 231
1 << 32
= 000000000000000000000000000000002
= 010
≠ 232
If you try to compile a code with the last case, you will have a warning
main.cpp: In function ‘int
main()’:
main.cpp:10:17: warning: left shift count >= width of type [-Wshift-count-overflow]
auto
val = 1<<32;
^
As you can see auto
does not become
long long int
but will be
int
,
i.e. the return type of 1 << 32
.
What can we do?
The case 1 << 31
is pretty easy.
We just need to explicitly declare
1 << 31
to be unsigned int
.
In this case, the last bit is not for the sign, so
(unsigned int)1 << 31
= 2147483648
= 231
What about 1 << 32
?
Can I write (unsigned long long int)1 << 32
?
According to C++ standard,
unsigned long long int
is at least 64 bits, so one can think that
(unsigned long long int)1 << 32
will do the job.
Let us see what happens.
It is first 1 << 32
that is computed, which is an int,
so it returns 0
, which is cast into an unsigned long long int.
The result is then 0, so, no, it is not a workaround.
Notice that in this case,
the compiler will not give you a warning,
which is a bug, I think.
The solution is to actually to recast 1
into a type that stores enough bits for the bitwise operation and then apply the operator.
Let us do it step by step
1
is an int
, so its binary representation is
000000000000000000000000000000012
Now, we add enough bits with the cast ((unsigned long long int) 1)
. It has the following binary representation
00000000000000000000000000000000000000000000000000000000000000012
((unsigned long long int) 1) << 32
moves the first 1 32 times
00000000000000000000000000000001000000000000000000000000000000002 = 429496729610 = 232.
We now have a safe 2N for N∈[1, 63]. It is enough for 236, which was the first thing that I had difficulties with.
Notice the largest number in C++ is 264-1 = 18446744073709551615, which is smaller than the number of elements in my matrix, 4722366482869645213696 = 272. We will see that it will be another problem to tackle.
What's next?
The main problem that I am facing is the memory consumption. I am using Lanczos algorithm do solve the eigen-problem. Right now, I am rewriting my code to use a very nice property of this method, which is the fact that the matrix can be defined implicitly.
Edit: The source code that I used for the tests. To compile
% g++ -std=gnu++14 main.cpp
main.cpp
1 #include <string> // std::string
2 #include <iostream> // ostream (std::ostream, std::endl)
3 // std::cout
4 #include <typeinfo> // typeid
5 #include <bitset> // std::bitset
6 #include <array> // std::array
7 #include <memory> // std::shared_ptr
8
9 // Reviewed
10 const std::string exec(const auto * cmd)
+ - 11 +-- 14 lines: {
11 {
| 12 std::array<char, 128> buffer;
| 13 std::string result;
| 14 std::shared_ptr< FILE > pipe(popen(cmd, "r"), pclose);
| 15 if(!pipe) throw std::runtime_error("popen() failed!");
| 16 while(!feof(pipe.get()))
|+ |- 17 +--- 6 lines: {
17 {
|| 18 if(fgets(buffer.data(), 128, pipe.get()) != nullptr)
||+ ||- 19 +---- 3 lines: {
19 {
||| 20 result += buffer.data();
||| 21 }
|| 22 }
| 23 return result;
| 24 }
25
26 // Reviewed
27 // No auto for std::ostream
28 constexpr void display_type(
29 const auto & name,
30 const auto & var,
31 std::ostream & s = std::cout
32 )
+ - 33 +-- 8 lines: {
33 {
| 34 // No auto
| 35 std::string command = "c++filt -t ";
| 36 command += typeid(var).name();
| 37 s<<"Type of \""<<name<<"\""<<std::endl;
| 38 auto stype = exec(command.data());
| 39 s<<stype<<std::endl;
| 40 }
41
42 int main()
+ - 43 +-- 31 lines: {
43 {
| 44 std::cout<<"* 1<<32"<<std::endl;
| 45 auto val1 = 1<<32;
| 46 display_type("1<<32", val1);
| 47 std::cout<<"1<<32 in base 2";
| 48 std::<<std::endl;
| 49 unsigned char * binary1 = reinterpret_cast< decltype(binary1) >(&val1);
| 50 for(auto i = (signed) sizeof(val1) - 1; i >= 0; --i)
|+ |- 51 +--- 3 lines: {
51 {
|| 52 std::cout<<std::bitset< 8 >(binary1[i]);
|| 53 }
| 54 std::cout<<std::endl;
| 55 std::cout<<"1<<32 in base 10";
| 56 std::cout<<std::endl<<val1<<std::endl;
| 57
| 58 std::cout<<std::endl;
| 59 std::cout<<"* ((unsigned long long int) 1)<<32"<<std::endl;
| 60 auto val2 = ((unsigned long long int) 1)<<32;
| 61 display_type("((unsigned long long int) 1)<<32", val2);
| 62 std::cout<<"((unsigned long long int) 1)<<32 in base 2";
| 63 std::cout<<std::endl;
| 64 unsigned char * binary2 = reinterpret_cast< decltype(binary2) >(&val2);
| 65 for(auto i = (signed) sizeof(val2) - 1; i >= 0; --i)
|+ |- 66 +--- 3 lines: {
66 {
|| 67 std::cout<<std::bitset< 8 >(binary2[i]);
|| 68 }
| 69 std::cout<<std::endl;
| 70 std::cout<<"((unsigned long long int) 1)<<32 in base 10";
| 71 std::cout<<std::endl<<val2<<std::endl;
| 72 return 0;
| 73 }