Dr Clèm's Blog

Tags: Coding C++ Optimization GCC C++11 C++14 Physics

How to diagonalize a matrix of 4,722,366,482,869,645,213,696 elements? Part 1

Wednesday Nov 22, 2017 23:22, last edition on Thursday Nov 23, 2017 00:20
Introduction

I am trying to diagonalize a very large matrix (4,722,366,482,869,645,213,696 elements) in C++ and it leads to some issues. I noticed that I solved most of the issues that I encounter by myself by just pretending to explain the problem to someone. So I decided that for now on, I will share these notes. I will not explain everything, just the technical difficulties. I will give references for the rest, especially my PhD thesis.

Mind your type
Today, I will present you one of the first technical issue that I faced. It is linked to the type of variable.

A simple example is the results of 2N. In my code, I need to compute 236. The most straight forward way to achieve is by using bitwise operations. In fact, 2N can be written 1 << N, for N∈*. Indeed, the operator << move to the right all the bits by the number on the right hand. Few examples:
1 << 0. 1 is not changed, so 1 << 0 = 12 = 110 ≠ 20.
1 << 1 moved the 1 to the right, so 1 << 1 = 102 = 210 = 21
1 << 2 moved the 1 two times to the right, so 1 << 2 = 1002 = 410 = 22.
Here the issue. For C++, 1 is a int, which means that it is at least 16 bits, but the actual number of bits is not specified and is implementation dependent. Let us assume that sizeof(int) is equal to 4. It is the number of bytes, so it means that int is 32 bits (sizeof(int)×8).
××××××××××××××××××××××××××××××××
Only the first 31 bits (sizeof(int)×8-1) represent a positive number, which means that one can only move a bit up to 30 times (sizeof(int)×8-1-1) to the right and still get a positive number.
1 << 30 = 010000000000000000000000000000002 = 107374182410 = 230
1 << 31 = 100000000000000000000000000000002 = -214748364810 ≠ 231
1 << 32 = 000000000000000000000000000000002 = 010 ≠ 232

If you try to compile a code with the last case, you will have a warning

main.cpp: In function ‘int main()’:
main.cpp:10:17: warning: left shift count >= width of type [-Wshift-count-overflow]
  auto val = 1<<32;
                 ^

As you can see auto does not become long long int but will be int, i.e. the return type of 1 << 32.

What can we do?

The case 1 << 31 is pretty easy. We just need to explicitly declare 1 << 31 to be unsigned int. In this case, the last bit is not for the sign, so (unsigned int)1 << 31 = 2147483648 = 231

What about 1 << 32? Can I write (unsigned long long int)1 << 32? According to C++ standard, unsigned long long int is at least 64 bits, so one can think that (unsigned long long int)1 << 32 will do the job. Let us see what happens. It is first 1 << 32 that is computed, which is an int, so it returns 0, which is cast into an unsigned long long int. The result is then 0, so, no, it is not a workaround.
Notice that in this case, the compiler will not give you a warning, which is a bug, I think.

The solution is to actually to recast 1 into a type that stores enough bits for the bitwise operation and then apply the operator. Let us do it step by step
1 is an int, so its binary representation is
000000000000000000000000000000012
Now, we add enough bits with the cast ((unsigned long long int) 1). It has the following binary representation
00000000000000000000000000000000000000000000000000000000000000012
((unsigned long long int) 1) << 32 moves the first 1 32 times
00000000000000000000000000000001000000000000000000000000000000002 = 429496729610 = 232.
Output of 1 << 32 and ((unsigned long long int) 1) << 32

We now have a safe 2N for N∈[1, 63]. It is enough for 236, which was the first thing that I had difficulties with.

Notice the largest number in C++ is 264-1 = 18446744073709551615, which is smaller than the number of elements in my matrix, 4722366482869645213696 = 272. We will see that it will be another problem to tackle.

What's next?

The main problem that I am facing is the memory consumption. I am using Lanczos algorithm do solve the eigen-problem. Right now, I am rewriting my code to use a very nice property of this method, which is the fact that the matrix can be defined implicitly.

Edit: The source code that I used for the tests. To compile

% g++ -std=gnu++14 main.cpp

main.cpp
     1 #include <string>   // std::string
     2 #include <iostream> // ostream (std::ostream, std::endl)
     3                     // std::cout
     4 #include <typeinfo> // typeid
     5 #include <bitset>   // std::bitset
     6 #include <array>    // std::array
     7 #include <memory>   // std::shared_ptr
     8 
     9 // Reviewed
    10 const std::string exec(const auto * cmd)
+   -   11 +-- 14 lines: {
11 {
|   12  std::array<char, 128> buffer;
|   13  std::string result;
|   14  std::shared_ptr< FILE > pipe(popen(cmd, "r"), pclose);
|   15  if(!pipe) throw std::runtime_error("popen() failed!");
|   16  while(!feof(pipe.get()))
|+  |-  17 +---  6 lines: {
17  {
||  18   if(fgets(buffer.data(), 128, pipe.get()) != nullptr)
||+ ||- 19 +----  3 lines: {
19   {
||| 20    result += buffer.data();
||| 21   }
||  22  }
|   23  return result;
|   24 }
    25 
    26 // Reviewed
    27 // No auto for std::ostream
    28 constexpr void display_type(
    29  const auto & name,
    30  const auto & var,
    31  std::ostream & s = std::cout
    32 )
+   -   33 +--  8 lines: {
33 {
|   34  // No auto
|   35  std::string command = "c++filt -t ";
|   36  command += typeid(var).name();
|   37  s<<"Type of \""<<name<<"\""<<std::endl;
|   38  auto stype = exec(command.data());
|   39  s<<stype<<std::endl;
|   40 }
    41 
    42 int main()
+   -   43 +-- 31 lines: {
43 {
|   44  std::cout<<"* 1<<32"<<std::endl;
|   45  auto val1 = 1<<32;
|   46  display_type("1<<32", val1);
|   47  std::cout<<"1<<32 in base 2";
|   48  std::<<std::endl;
|   49  unsigned char * binary1 = reinterpret_cast< decltype(binary1) >(&val1);
|   50  for(auto i = (signed) sizeof(val1) - 1; i >= 0; --i)
|+  |-  51 +---  3 lines: {
51  {
||  52   std::cout<<std::bitset< 8 >(binary1[i]);
||  53  }
|   54  std::cout<<std::endl;
|   55  std::cout<<"1<<32 in base 10";
|   56  std::cout<<std::endl<<val1<<std::endl;
|   57 
|   58  std::cout<<std::endl;
|   59  std::cout<<"* ((unsigned long long int) 1)<<32"<<std::endl;
|   60  auto val2 = ((unsigned long long int) 1)<<32;
|   61  display_type("((unsigned long long int) 1)<<32", val2);
|   62  std::cout<<"((unsigned long long int) 1)<<32 in base 2";
|   63  std::cout<<std::endl;
|   64  unsigned char * binary2 = reinterpret_cast< decltype(binary2) >(&val2);
|   65  for(auto i = (signed) sizeof(val2) - 1; i >= 0; --i)
|+  |-  66 +---  3 lines: {
66  {
||  67   std::cout<<std::bitset< 8 >(binary2[i]);
||  68  }
|   69  std::cout<<std::endl;
|   70  std::cout<<"((unsigned long long int) 1)<<32 in base 10";
|   71  std::cout<<std::endl<<val2<<std::endl;
|   72  return 0;
|   73 }
Mastodon Follow me Mastodon Share
Comments
There is no comment yet.
Post a comment

* required field.

Your comment

*

About you

*

*

Dr Clément Février

Bonjour, Je suis Clément Février, docteur en physique théorique de l’université de Grenoble Alpes, ingénieur Recherche et Développement dans le domaine de l’imagerie médicale et de la chirurgie mini-invasive chez Surgivisio et soutien du mouvement La France Insoumise.


Dites les velotaffeur, vous avez entendu parler d'un casque qui a un feu avant, un feu arrière et des clignotants sur les cotés / avant arrière ? (Enfin un simple bandeau led de chaque coté se gère hein)
Genre un truc pour que les voitures voient qu'on tourne à droite ou gauche sans qu'on ai besoin de tendre le bras ?

Parce que bon, j'ai testé ce soir, passer sur les marquages au sol + les rails du tramway avec un bras levé sous la pluie, niveau stabilité on repassera ^^'

J'arrive pas à trouver :/ et les solutions existantes de clignotants sont seulement pour l'arrière

Si le coeur vous en dit vous pouvez partager ;)

#velotaff #boostappréciés :)

The only thing that really matter is, at the end of the day, to choose the correct method automatically without conditional statement at runtime. I really wonder is a workaround is possible.
I already tried to add "std::tuple< d1*, d2 > b_tuple" to class b, using some forward declaration, but, of course, std::get< i >(b_tuple) cannot compile since it will be known at runtime only.

But I allow complete refactoring code, even the b and d1 & d2. Templates, new classes and sub classes and even are fine (I already put 100-lines define to add iterator to enum and another one to allow enum inheritance, and my coworkers begin to hate me ^^)
For example, method "create" can become a class, or class b can be construct using a macro to make somehow the base class aware of derived ones
BASE(b, ...) std::tuple< __VA_ARGS__ > g_ ## b; class b