.. highlight:: c++ .. _developguide: Development Guide ================= The chapter gives guidance on how to extend the functionality of QMCPACK. Future examples will likely include topics such as the addition of a Jastrow function or a new QMC method. .. admonition:: Definitions * **Legacy**: code from previous work usually not in line current coding standards or design. It is mostly functional and correct within the context of legacy operations. Most has been modified piecemeal for years to extend functionality. * **Refactoring**: Process of redesigning code in place through incremental changes toward current design and functionality goals. QMCPACK coding standards ------------------------ This section presents what we collectively have agreed are best practices for the code. This includes formatting style, naming conventions, documentation conventions, and certain prescriptions for C++ language use. At the moment only the formatting can be enforced in an objective fashion. New development should follow these guidelines, and contributors are expected to adhere to them as they represent an integral part of our effort to continue QMCPACK as a world-class, sustainable QMC code. Although some of the source code has a ways to go to live up to these ideas, new code, even in old files, should follow the new conventions not the local conventions of the file whenever possible. Work on the code with continuous improvement in mind rather than a commitment to stasis. Files ~~~~~ Each file should start with the header. .. code:: c++ ////////////////////////////////////////////////////////////////////////////////////// // This file is distributed under the University of Illinois/NCSA Open Source License. // See LICENSE file in top directory for details. // // Copyright (c) 2025 QMCPACK developers // // File developed by: Name, email, affiliation // // File created by: Name, email, affiliation ////////////////////////////////////////////////////////////////////////////////////// If you make significant changes to an existing file, add yourself to the list of "developed by" authors. File organization ^^^^^^^^^^^^^^^^^ Header files should be placed in the same directory as their implementations. Unit tests should be written for all new functionality. These tests should be placed in a ``tests`` subdirectory below the implementations. File names ^^^^^^^^^^ Each class should be defined in a separate file with the same name as the class name. Use separate ``.cpp`` implementation files whenever possible to aid in incremental compilation. The filenames of tests are composed by the filename of the object tested and the prefix ``test_``. The filenames of *fake* and *mock* objects used in tests are composed by the prefixes ``fake_`` and ``mock_``, respectively, and the filename of the object that is imitated. Header files ^^^^^^^^^^^^ All header files should be self-contained (i.e., not dependent on following any other header when it is included). Nor should they include files that are not necessary for their use (i.e., headers needed only by the implementation). Implementation files should not include files only for the benefit of files they include. There are many header files that currently violate this. Each header must use ``#define`` guards to prevent multiple inclusion. The symbol name of the ``#define`` guards should be ``NAMESPACE(s)_CLASSNAME_H``. Includes ^^^^^^^^ Related header files should be included without any path. Header files from external projects and standard libraries should be includes using the ```` convention, while headers that are part of the QMCPACK project should be included using the ``"our_header.h"`` convention. We are now using a new header file inclusion style following the modern CMake transition in QMCPACK, while the legacy code may still use the legacy style. Newly written code and refactored code should be transitioned to the new style. New style for modern CMake """""""""""""""""""""""""" In QMCPACK, include paths are handled by modern CMake target dependency. Every top level folder is at least one target. For example, ``src/Particle/CMakeLists.txt`` defines `qmcparticle` target. It propagates include path ``qmcpack/src/Particle`` to compiling command lines in CMake via .. code:: cmake TARGET_INCLUDE_DIRECTORIES(qmcparticle PUBLIC "${CMAKE_CURRENT_SOURCE_DIR}") For this reason, the file ``qmcpack/src/Particle/Lattice/ParticleBConds3DSoa.h`` should be included as :: #include "Lattice/ParticleBConds3DSoa.h" If the compiled file is not part of the same target as `qmcparticle`, the target it belongs to should have a dependency on `qmcparticle`. For example, test source files under ``qmcpack/src/Particle/tests`` are not part of `qmcparticle` and thus requires the following additional CMake setting .. code:: cmake TARGET_LINK_LIBRARIES(${UTEST_EXE} qmcparticle) Legacy style """""""""""" Header files should be included with the full path based on the ``src`` directory. For example, the file ``qmcpack/src/QMCWaveFunctions/SPOSet.h`` should be included as :: #include "QMCWaveFunctions/SPOSet.h" Even if the included file is located in the same directory as the including file, this rule should be obeyed. Ordering """""""" For readability, we suggest using the following standard order of includes: #. related header #. std C library headers #. std C++ library headers #. Other libraries’ headers #. QMCPACK headers In each section the included files should be sorted in alphabetical order. Naming ~~~~~~ The balance between description and ease of implementation should be balanced such that the code remains self-documenting within a single terminal window. If an extremely short variable name is used, its scope must be shorter than :math:`\sim 40` lines. An exception is made for template parameters, which must be in all CAPS. Legacy code contains a great variety of hard to read code style, read this section and do not imitate existing code that violates it. Namespace names ^^^^^^^^^^^^^^^ Namespace names should be one word, lowercase. Type and class names ^^^^^^^^^^^^^^^^^^^^ Type and class names should start with a capital letter and have a capital letter for each new word. Underscores (``_``) are not allowed. It's redundant to end these names with ``Type`` or ``_t``. .. code:: cpp // no using ValueMatrix_t = Matrix; using RealType = double; Variable names ^^^^^^^^^^^^^^ Variable names should not begin with a capital letter, which is reserved for type and class names. Underscores (``_``) should be used to separate words. Class data members ^^^^^^^^^^^^^^^^^^ Class private/protected data members names should follow the convention of variable names with a trailing underscore (``_``). The use of public member functions is discourage, rethink the need for it in the first place. Instead ``get`` and ``set`` functions are the preferred access method. (Member) function names ^^^^^^^^^^^^^^^^^^^^^^^ Function names should start with a lowercase character and have a capital letter for each new word. The exception are the special cases for prefixed multiwalker (``mw_``) and flex (``flex_``) batched API functions. Coding convention should follow after those prefixes. Template Parameters ^^^^^^^^^^^^^^^^^^^ Template parameters names should be in all caps with (``_``) separating words. It's redundant to end these names with ``_TYPE``, Lambda expressions ^^^^^^^^^^^^^^^^^^ Named lambda expressions follow the naming convention for functions: .. code:: cpp auto myWhatever = [](int i) { return i + 4; }; Macro names ^^^^^^^^^^^ Macro names should be all uppercase and can include underscores (``_``). The underscore is not allowed as first or last character. Test case and test names ^^^^^^^^^^^^^^^^^^^^^^^^ Test code files should be named as follows: .. code:: cpp class DiracMatrix; //leads to test_dirac_matrix.cpp //which contains test cases named TEST_CASE("DiracMatrix_update_row","[wavefunction][fermion]") where the test case covers the ``updateRow`` and ``[wavefunction][fermion]`` indicates the test belongs to the fermion wavefunction functionality. Comments ~~~~~~~~ Comment style ^^^^^^^^^^^^^ Use the ``// Comment`` syntax for actual comments. Use .. code:: cpp /** base class for Single-particle orbital sets * * SPOSet stands for S(ingle)P(article)O(rbital)Set which contains * a number of single-particle orbitals with capabilities of * evaluating \f$ \psi_j({\bf r}_i)\f$ */ or .. code:: cpp ///index in the builder list of sposets int builder_index; Documentation ^^^^^^^^^^^^^ Doxygen will be used for source documentation. Doxygen commands should be used when appropriate guidance on this has been decided. File docs """"""""" Do not put the file name after the ``\file`` Doxygen command. Doxygen will fill it in for the file the tag appears in. .. code:: cpp /** \file * File level documentation */ Class docs """""""""" Every class should have a short description (in the header of the file) of what it is and what is does. Comments for public class member functions follow the same rules as general function comments. Comments for private members are allowed but are not mandatory. Function docs """"""""""""" For function parameters whose type is non-const reference or pointer to non-const memory, it should be specified if they are input (In:), output (Out:) or input-output parameters (InOut:). Example: .. code:: cpp /** Updates foo and computes bar using in_1 .. in_5. * \param[in] in_3 * \param[in] in_5 * \param[in,out] foo * \param[out] bar */ //This is probably not what our clang-format would do void computeFooBar(Type in_1, const Type& in_2, Type& in_3, const Type* in_4, Type* in_5, Type& foo, Type& bar); Variable documentation """""""""""""""""""""" Name should be self-descriptive. If you need documentation consider renaming first. Golden rule of comments ^^^^^^^^^^^^^^^^^^^^^^^ If you modify a piece of code, also adapt the comments that belong to it if necessary. Formatting and "style" (clang-format) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We make use of clang-format for automatic source code formatting. It is widely available, including in all standard package managers. clang-format removes the burden of formatting source code by hand. We provide a clang-format style in ``src/.clang-format``. Use this to format ``.h``, ``.hpp``, ``.cu``, and ``.cpp`` files. Many of the following rules will be applied to the code by clang-format, which should allow you to ignore most of them if you always run it on your modified code. Use of clang-format can be fully automated: there are many guides to the use of clang-format with different editors online, as well as using it manually. In addition to running the formatting in your editor, you can use a Git precommit hook to run clang-format automatically, or you can run clang-format manually. If working on old code, please clang-format the file, and if you see particularly large changes, commit the formatting changes in a separate pull request. Disable the use of clang-format for sections of code that would be more readable when formatted differently, e.g., for matrices or for code blocks where a particular alignment would clearly help readability. .. code:: cpp // clang-format off Matrix my_matrix = { {1, 2, 3}, {4, 5, 6}, {7, 8 ,9} }; // clang-format on Indentation ^^^^^^^^^^^ Indentation consists of two spaces. Do not use tabs in the code. Line length ^^^^^^^^^^^ The length of each line of your code should be at most *120* characters. Horizontal spacing ^^^^^^^^^^^^^^^^^^ No trailing white spaces should be added to any line. Use no space before a comma (``,``) and a semicolon (``;``), and add a space after them if they are not at the end of a line. Preprocessor directives ^^^^^^^^^^^^^^^^^^^^^^^ The preprocessor directives are not indented. The hash is the first character of the line. Binary operators ^^^^^^^^^^^^^^^^ The assignment operators should always have spaces around them. Unary operators ^^^^^^^^^^^^^^^ Do not put any space between an unary operator and its argument. Types ^^^^^ The ``using`` syntax is preferred to ``typedef`` for type aliases. If the actual type is not excessively long or complex, simply use it; renaming simple types makes code less understandable. Pointers and references ^^^^^^^^^^^^^^^^^^^^^^^ Pointer or reference operators should go with the type. But understand the compiler reads them from right to left. .. code:: cpp Type* var; Type& var; //Understand this is incompatible with multiple declarations Type* var1, var2; // var1 is a pointer to Type but var2 is a Type. Templates ^^^^^^^^^ The angle brackets of templates should not have any external or internal padding. .. code:: cpp template class Class1; Class1> object; Vertical spacing ^^^^^^^^^^^^^^^^ Use empty lines when it helps to improve the readability of the code, but do not use too many. Do not use empty lines after a brace that opens a scope or before a brace that closes a scope. Each file should contain an empty line at the end of the file. Some editors add an empty line automatically, some do not. Variable declarations and definitions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - Avoid declaring multiple variables in the same declaration, especially if they are not fundamental types: .. code:: cpp int x, y; // Not recommended Matrix a("my-matrix"), b(size); // Not allowed // Preferred int x; int y; Matrix a("my-matrix"); Matrix b(10); - Use the following order for keywords and modifiers in variable declarations: .. code:: cpp // General type [static] [const/constexpr] Type variable_name; // Pointer [static] [const] Type* [const] variable_name; // Integer // the int is not optional not all platforms support long, etc. [static] [const/constexpr] [signedness] [size] int variable_name; // Examples: static const Matrix a(10); const double* const d(3.14); constexpr unsigned long l(42); Function declarations and definitions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The return type should be on the same line as the function name. Parameters should also be on the same line unless they do not fit on it, in which case one parameter per line aligned with the first parameter should be used. Also include the parameter names in the declaration of a function, that is, .. code:: cpp // calculates a*b+c double function(double a, double b, double c); // avoid double function(double, double, double); // dont do this double function(BigTemplatedSomething a, BigTemplatedSomething b, BigTemplatedSomething c); // do this double function(BigTemplatedSomething a, BigTemplatedSomething b, BigTemplatedSomething c); Conditionals ^^^^^^^^^^^^ Examples: .. code:: cpp if (condition) statement; else statement; if (condition) { statement; } else if (condition2) { statement; } else { statement; } Switch statement ^^^^^^^^^^^^^^^^ Switch statements should always have a default case. Example: .. code:: cpp switch (var) { case 0: statement1; statement2; break; case 1: statement1; statement2; break; default: statement1; statement2; } Loops ^^^^^ Examples: .. code:: cpp for (statement; condition; statement) statement; for (statement; condition; statement) { statement1; statement2; } while (condition) statement; while (condition) { statement1; statement2; } do { statement; } while (condition); .. _class-format: Class format ^^^^^^^^^^^^ ``public``, ``protected``, and ``private`` keywords are not indented. Example: .. code:: cpp class Foo : public Bar { public: Foo(); explicit Foo(int var); void function(); void emptyFunction() {} void setVar(const int var) { var_ = var; } int getVar() const { return var_; } private: bool privateFunction(); int var_; int var2_; }; Constructor initializer lists """"""""""""""""""""""""""""" Examples: .. code:: cpp // When everything fits on one line: Foo::Foo(int var) : var_(var) { statement; } // If the signature and the initializer list do not // fit on one line, the colon is indented by 4 spaces: Foo::Foo(int var) : var_(var), var2_(var + 1) { statement; } // If the initializer list occupies more lines, // they are aligned in the following way: Foo::Foo(int var) : some_var_(var), some_other_var_(var + 1) { statement; } // No statements: Foo::Foo(int var) : some_var_(var) {} Namespace formatting ^^^^^^^^^^^^^^^^^^^^ The content of namespaces is not indented. A comment should indicate when a namespace is closed. (clang-format will add these if absent). If nested namespaces are used, a comment with the full namespace is required after opening a set of namespaces or an inner namespace. Examples: .. code:: c++ namespace ns { void foo(); } // ns namespace ns1 { namespace ns2 { // ns1::ns2:: void foo(); namespace ns3 { // ns1::ns2::ns3:: void bar(); } // ns3 } // ns2 namespace ns4 { namespace ns5 { // ns1::ns4::ns5:: void foo(); } // ns5 } // ns4 } // ns1 QMCPACK C++ guidance -------------------- The guidance here, like any advice on how to program, should not be treated as a set of rules but rather the hard-won wisdom of many hours of suffering development. In the past, many rules were ignored, and the absolute worst results of that will affect whatever code you need to work with. Your PR should go much smoother if you do not ignore them. Encapsulation ~~~~~~~~~~~~~ A class is not just a naming scheme for a set of variables and functions. It should provide a logical set of methods, could contain the state of a logical object, and might allow access to object data through a well-defined interface related variables, while preserving maximally ability to change internal implementation of the class. Do not... ^^^^^^^^^ - use ``struct`` as a way to avoid controlling access to the class. Only when the instantiated objects are public data structure is ``struct`` appropriate. - use inheritance primarily as a means to break encapsulation. If your class could aggregate or compose another class, do that, and access it solely through its public interface. This will reduce dependencies. - pass entire classes as function arguments when what the function actually requires is one or a few arguments available through the public API of a class, encapsulation is harmful when is just used to save keystrokes and hides the logical data dependence of functions. All these "idioms" are common in legacy QMCPACK code and should not be perpetuated. Casting ~~~~~~~ In C++ source, avoid C style casts; they are difficult to search for and imprecise in function. Explicit C++ style casts make it clear what the safety of the cast is and what sort of conversion is expected to be possible. .. code::c++ int c = 2; int d = 3; double a; a = static_cast(c) / d; // Ok const class1 c1; class2* c2; c2 = (class2*)&c1; // NO SPOSetAdvanced* spo_advanced = new SPOSetAdvanced(); SPOSet* spo = (SPOSet*)spo_advanced; // NO SPOSet* spo = static_cast(spo_advanced); // OK if upcast, dangerous if downcast Pre-increment and pre-decrement ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use the pre-increment (pre-decrement) operator when a variable is incremented (decremented) and the value of the expression is not used. In particular, use the pre-increment (pre-decrement) operator for loop counters where i is not used: .. code:: cpp for (int i = 0; i < N; ++i) { doSomething(); } for (int i = 0; i < N; i++) { doSomething(i); } The post-increment and post-decrement operators create an unnecessary copy that the compiler cannot optimize away in the case of iterators or other classes with overloaded increment and decrement operators. Alternative operator representations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Alternative representations of operators and other tokens such as ``and``, ``or``, and ``not`` instead of ``&&``, ``||``, and ``!`` are not allowed. For the reason of consistency, the far more common primary tokens should always be used. Use of const ~~~~~~~~~~~~ - Add the ``const`` qualifier to all function parameters that are not modified in the function body. - For parameters passed by value, add only the keyword in the function definition. - Member functions should be specified const whenever possible. .. code:: cpp // Declaration int computeFoo(int bar, const Matrix& m) // Definition int computeFoo(const int bar, const Matrix& m) { int foo = 42; // Compute foo without changing bar or m. // ... return foo; } class MyClass { int count_ ... int getCount() const { return count_;} } Smart pointers ~~~~~~~~~~~~~~ Use of smart pointers is being adopted to help make QMCPACK memory leak free. Prior to C++11, C++ uses C-style pointers. A C-style pointer can have several meanings and the ownership of a piece of help memory may not be clear. This leads to confusion and causes memory leaks if pointers are not managed properly. Since C++11, smart pointers were introduced to resolve this issue. In addition, it demands developers to think about the ownership and lifetime of declared pointer objects. std::unique_ptr ^^^^^^^^^^^^^^^ A unique pointer is the unique owner of a piece of allocated memory. Pointers in per-walker data structure with distinct contents should be unique pointers. For example, every walker has a trial wavefunction object which contains an SPO object pointer. Because the SPO object has a vector to store SPO evaluation results, it cannot be shared between two trial wavefunction objects. For this reason the SPO object pointer should be an unique pointer. In QMCPACK, most raw pointers can be directly replaced with ``std::unique_ptr``. Corresponding use of ``new`` operator can be replaced with ``std:make_unique``. std::shared_ptr ^^^^^^^^^^^^^^^ A shared pointer is the shared owner of a piece of allocated memory. Moving a pointer ownership from one place to another should not use shared pointers but C++ move semantics. Shared contents between walkers may be candidates for shared pointers. For example, although the Jastrow factor object must be unique per walker, the pointer to the parameter data structure can be a shared pointer. During Jastrow optimization, any update to the parameter data managed by the shared pointer will be effective immediately in all the Jastrow objects. In another example, spline coefficients are managed by a shared pointer which achieves a single copy in memory shared by an SPOSet and all of its clones. Log and error output ~~~~~~~~~~~~~~~~~~~~ Take care to consider the volume of output and avoid unneeded output from all MPI ranks. ``app_log``, ``app_warning``, ``app_err`` and ``app_debug`` print out messages only on rank 0 to avoid repetitive messages from every MPI rank. For this reason, they are only suitable for printing messages identical to all MPI ranks. ``app_debug`` prints only when ``--verbosity=debug`` command line option is used. Messages that come from only one or a few MPI ranks should use ``std::cout`` and ``std::cerr``. Stopping or Aborting QMCPACK ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ If the code needs to be stopped after an unrecoverable error that happens uniformly on all the MPI ranks, a bad input for example, avoid using ``app_err`` together with ``Communicate::abort(msg)`` or ``APP_ABORT(msg)`` because any MPI rank other than rank 0 may stop the whole run before rank 0 is able to print out the error message. To secure the printout before stopping, use ``Communicate::barrier_and_abort(msg)`` if an MPI communicator is available or throw a custom exception ``UniformCommunicateError`` and capture it where ``Communicate::barrier_and_abort()`` can be used. Note that ``UniformCommunicateError`` can only be used for uniform error, improper use may cause QMCPACK hanging. In addition, avoid directly calling C function ``abort()``, ``exit()`` and ``MPI_Abort()`` for stopping the code. Adding and using timers ~~~~~~~~~~~~~~~~~~~~~~~ In your class header file, add ```#include ```. Add a timer enumeration and define the timer names in the header file or preferably cpp file. For example, put the following under "protected/private" .. code:: cpp enum PSTimers { PS_newpos, PS_donePbyP, PS_setActive, PS_update }; TimerList_t myTimers; Initialize timers in the constructor: .. code:: cpp # if you have many timers; const TimerNameList_t PSTimerNames = {{PS_newpos, "ParticleSet::computeNewPosDTandSK"}, {PS_donePbyP, "ParticleSet::donePbyP"}, {PS_setActive, "ParticleSet::setActive"}, {PS_update, "ParticleSet::update"}}; setup_timers(myTimers, PSTimerNames, timer_level_fine); # if you have just one timer to be registered in the manager. myTimers[PS_update] = TimerManager.createTimer("ParticleSet::update", timer_level_fine); Finally, there are two ways of using the timers. Using the ScopedTimer is required when possible. .. code:: cpp // RAII pattern { ScopedTimer update_scope(myTimers[PS_update]); //timed body } // explicit controlling myTimers[PS_update]->start(); //timed body myTimers[PS_update]->stop(); GitHub Pull Request guidance ---------------------------- In the following we provide guidance on creating Pull Requests (PRs) on GitHub that will be rapidly and easily merged into the development branch. The guidance is similar to those of many other open source projects. Most importantly, to the extent possible, we encourage the creation of small PRs focused on individual topics instead of one or a few larger PRs. Large PRs (many hundreds of lines, mixtures of bug fixes, features, and refactoring) have over time proven significantly more challenging to review, to update, and to keep current with other developmental activities. In contrast, small PRs that are focused on achieving a single well-defined task are straightforward to review and merge promptly. It will save you and the maintainers considerable time if you follow this principle, the guidelines below, and ask questions in advance! We aim to be flexible and reasonable but maintainer resources are limited. The `current workflow conventions`_ for the project are currently described in the wiki on the GitHub repository. As a general principle, we aim to keep the mainline development branch QMCPACK in a fully working and deployable/shippable state at all times. Therefore, all PRs must pass the existing tests. If this is not possible, a plan to simultaneously update the existing tests will need to be devised. Please discuss this in advance if possible. The GitHub documentation provides some suggestions for `helping others review your changes`_. Here we'd like to highlight a few practices: Do ~~ * Make simple PRs. 'Simple' doesn't refer to the lines of code or the number of files being touched. It is more about restricting the PR to a focused topic. For example, non-functional changes may affect many lines and files, but they are conceptually simple and easy to review. This category includes refactoring changes like renaming files, classes, functions, variables or code formatting. Marking class member variables private and accessing them via accessor functions are also considered in this category. * Make orthogonal PRs. PRs that mix multiple topics slow down reviewing and merging. When large PRs can be broken into simple PRs, the amount of dependency usually can be reduced. * To achieve large changes, such as the addition of a new major feature, devise a series of PRs and indicate your plan in the first PR. If orthogonality is not achievable, a series of simple dependent PRs are still better than a single large PR. For example, you could add initial input handling, some initial tests, the initial implementation of your feature, and then an integration test. * If you, e.g., spot and fix a bug or improve unrelated documentation while working on your feature, make a separate PR with these improvements. This helps keep the PRs focused on distinct and unrelated topics, making the reviews considerably easier. It will likely also make your improvements available in mainline sooner. * When changes are getting too big on a feature branch, please consider upstreaming certain changes to the develop branch. Once they are accepted, merging the develop branch to the feature branch effectively reduces the size of changes in the feature branch. For example, you are working on a feature and introduced new classes and files. They can be merged to develop even they are just preliminary. You don't need to make them waiting for the full completion of your feature. * Follow the instructions in this section for formatting. You can follow the formatting conventions automatically using clang-format. The mechanics of clang-format setup and use can be found at https://github.com/QMCPACK/qmcpack/wiki/Source-formatting. All modern editors support easy invocation of this tool. The clang-format file found at ``qmcpack/src/.clang-format`` should be run over all code touched in a PR. We also encourage developers to run clang-tidy with the ``qmcpack/src/.clang-tidy`` configuration over all new code. Don't ~~~~~ * Do not mix functional changes with non-functional ones. Mixing them dramatically increases the review challenge of non-functional changes. If non-functional changes touch many files, make sure to upstream them before the functional changes using orthogonal PRs if possible or dependent PRs. * Do not mix bug fixes with feature development. If a bug was surfaced during feature development, make a PR including the fix to the develop branch. * Do not delay potential early mergeable changes until the full completion of a feature. We are happy to merge code that has not been fully polished and for the polishing to occur in subsequent PRs. It is not necessary and they are harder to adjust if they end up conflicting with the rest of the code, plus you will be able to get suggestions in a more time timely manner. As much as possible, try to avoid the "Don't"s. Aim for something that would take a reviewer no more than an hour to read, understand, and review. In this way we can maintain a good collective development velocity. .. _current workflow conventions: https://github.com/QMCPACK/qmcpack/wiki/Development-workflow .. _helping others review your changes: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/getting-started/helping-others-review-your-changes Release Process --------------- This section documents the steps to follow to make a new release of QMCPACK. The examples are given for the 4.1.0 release. This simple process should be followed step-by-step to ensure accuracy and avoid need for a rerelease. 1. Make a fresh clone of the repo and create a release candidate branch labeled rc_Mmp where Mmp are digits following semantic versioning. :: git clone https://github.com/QMCPACK/qmcpack.git cd qmcpack git branch rc_410 git checkout rc_410 2. Update the VERSION numbers in the project section of qmcpack/CMakeLists.txt for the new release, e.g. 4.1.0 3. Update qmcpack/CHANGELOG.md by replacing the unreleased section with the new version and release date. Update the notes section to recommend the update to all users (if appropriate) and note any major headline items such as backwards incompatibility. The preferred way to get a list of all merged PRs is via the gh cli tool. Use the following to generate a starting point, after adjusting the dates to include the day of the previous release. Edit the output to include only "significant" changes. .. code:: shell gh search prs --merged-at 2025-02-05..2025-04-29 -R QMCPACK/qmcpack --json title,url,repository,number -L 1000 --jq '.[]|("* " + .title + " [#" + (.number|tostring) +"]("+ .url + ")\n")' 4. Commit and push the changes to GitHub :: git add qmcpack/CMakeLists.txt qmcpack/CHANGELOG.md git commit -m "Increase version number, update release notes" git push origin rc_410 5. On GitHub, make a pull request into the main branch from the rc. 6. Ensure CI tests run and pass. 7. Make an independent download of the proposed merged PR. 8. Verify code configures and builds on at least one system. Be sure to do this on a clean, freshly cloned repo. 9. Run all the tests apart from the long ones, confirm version reports correctly. e.g. ctest -VV -E long. Confirm the results are appropriately similar to the last nightly and weekly test results. 10. Push any needed fixes to the rc and repeat the tests. 11. Add notes on the test status to the PR. 12. Request reviewer / other PR approver approves the merge. At least one review is required to merge to main. ** The reviewer needs to check the updated version number and CHANGELOG. ** 13. Merge the PR. 14. Create a new release using the release tool on GitHub. Link to the CHANGELOG.md in the release notes. The GitHub link can be copied & updated from previous release. 15. Issue a PR back to develop from main to update with the new version number and any other updates made to the rc. 16. Once the PR is merged to develop, update the QMCPACK_VERSION_PATCH version to 9 in CMakeLists.txt in the develop branch to indicate it is the development version. 17. Delete the rc branch. 18. Verify readthedocs is seeing the new release and update readthedocs configuration if needed. 19. Announce the release on qmcpack.org. Create a new page of type "Release" with title similar to "QMCPACK Release v4.1.0 - 2025-04-30" via https://qmcpack.org/user. The page type is required for https://www.qmcpack.org/releases to update automatically. Also check the documentation pages. 20. Announce the release on Google Groups. 21. Update the spack package https://packages.spack.io/package.html?name=qmcpack .. include:: input_code.txt .. _distance-tables: Particles and distance tables ----------------------------- ParticleSets ~~~~~~~~~~~~ The ``ParticleSet`` class stores particle positions and attributes (charge, mass, etc). The ``R`` member stores positions. For calculations, the ``R`` variable needs to be transferred to the structure-of-arrays (SoA) storage in ``RSoA``. This is done by the ``update`` method. In the future the interface may change to use functions to set and retrieve positions so the SoA transformation of the particle data can happen automatically. For now, it is crucial to call ``P.update()`` to populate ``RSoA`` anytime ``P.R`` is changed. Otherwise, the distance tables associated with ``R`` will be uninitialized or out-of-date. .. code:: cpp const SimulationCell sc; ParticleSet elec(sc), ions(sc); elec.setName("e"); ions.setName("ion0"); // initialize ions ions.create({2}); ions.R[0] = {0.0, 0.0, 0.0}; ions.R[1] = {0.5, 0.5, 0.5}; ions.update(); // transfer to RSoA // initialize elec elec.create({1,1}); elec.R[0] = {0.0, 0.0, 0.0}; elec.R[1] = {0.0, 0.25, 0.0}; const int itab = elec.addTable(ions); elec.update(); // update RSoA and distance tables // d_table is an electron-ion distance table const auto& d_table = elec.getDistTableAB(itab); A particular distance table is retrieved with ``getDistTable``. Use ``addTable`` to add a ``ParticleSet`` and return the index of the distance table. If the table already exists the index of the existing table will be returned. The mass and charge of each particle is stored in ``Mass`` and ``Z``. The flag, ``SameMass``, indicates if all the particles have the same mass (true for electrons). Groups ^^^^^^ Particles can belong to different groups. For electrons, the groups are up and down spins. For ions, the groups are the atomic elements. The group type for each particle can be accessed through the ``GroupID`` member. The number of groups is returned from ``groups()``. The total number particles is accessed with ``getTotalNum()``. The number of particles in a group is ``groupsize(int igroup)``. The particle indices for each group are found with ``first(int igroup)`` and ``last(int igroup)``. Distance tables ~~~~~~~~~~~~~~~ Distance tables store distances between particles. There are symmetric (AA) tables for distance between like particles (electron-electron or ion-ion) and asymmetric (AB) tables for distance between unlike particles (electron-ion) The ``Distances`` and ``Displacements`` members contain the data. The indexing order is target index first, then source. For electron-ion tables, the sources are the ions and the targets are the electrons. Looping over particles ~~~~~~~~~~~~~~~~~~~~~~ Some sample code on how to loop over all the particles in an electron-ion distance table: .. code:: cpp // d_table is an electron-ion distance table for (int jat = 0; j < d_table.targets(); jat++) { // Loop over electrons for (int iat = 0; i < d_table.sources(); iat++) { // Loop over ions d_table.Distances[jat][iat]; } } Interactions sometimes depend on the type of group of the particles. The code can loop over all particles and use ``GroupID[idx]`` to choose the interaction. Alternately, the code can loop over the number of groups and then loop from the first to last index for those groups. This method can attain higher performance by effectively hoisting tests for group ID out of the loop. An example of the first approach is .. code:: cpp // P is a ParticleSet for (int iat = 0; iat < P.getTotalNum(); iat++) { int group_idx = P.GroupID[iat]; // Code that depends on the group index } An example of the second approach is .. code:: cpp // P is a ParticleSet assert(P.IsGrouped == true); // ensure particles are grouped for (int ig = 0; ig < P.groups(); ig++) { // loop over groups for (int iat = P.first(ig); iat < P.last(ig); iat++) { // loop over elements in each group // Code that depends on group } } Walker ------ .. note:: Batched Version Documentation The following documentation section describes the code design and behavior used when the modern ``driver_version == batch`` at runtime. Lightweight representation of a Markov chain walker's state. It is managed during each QMC driver section by ``MCPopulation``. Between sections it is stored in the WalkerConfigurations container class. Walker Identifiers (walker_id) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The ``Walker::walker_id_`` and ``Walker::parent_id_`` allow logging walkers for later constructing trajectories etc. The basic data relations involved are shown in :numref:`fig1`. In each QMC section each walker ID will be unique and is generated using the equation .. math:: :label: eq_walker_id walker_id = walker_id = num_walkers_created_++ * num_ranks_ + rank_ + 1 where ``num_walkers_created_`` is a member variable of the sole ```MCPopulation`` object on the rank and initially set to 0. Each walkers ``parent_id`` is set to 0 when it is constructed and is assigned to ``walker_id`` of the walker it is transferred or copied from. If that assignment is from previous section's or run's ``WalkerConfigurations`` object then the value of the ``Walker::getWalkerID()`` is multiplied by -1. If the Walker's initial configuration comes from the golden particle set the parent_id will be 0. .. _fig1: .. figure:: /uml/WalkerID_chen.pdf :width: 400 :align: center During DMC branching and load-balancing, there are two distinct mechanisms by which new walkers appear on each rank. One is when a walker is transferred from another rank, the other is when walkers of multiplicity >= 2 are split. For a given walker, both of these mechanisms potentially occur in order within each step, transfer first and then split. During the branching stage multiplicity of a walker is derived from its weight. Walker multiplicity summed over all walkers gives the population of walkers for the next step. Based on the total multiplicity on each rank, the highest multiplicity walkers on overpopulated ranks are fully or partially sent to underpopulated ranks for optimal load balance. When possible rank multiplicities are balanced by transferring fewer walkers with more than one unit of multiplicity for minimized transfer traffic. This unit is unfortunately called 'copy' in the source code but it is simply the multiplicity that the walker will have after it is unpacked on the receiving rank. That amount of multiplicity is removed from the walker on the sending rank. When walker transfer happens, the receiving walker overwrites its ``parent_id`` with the received value of ``walker_id`` before assigning a new ID to its ``walker_id``. The multiplicity of the receiving walker is set to the multiplicity that sending walker lost. Walkers with multiplicity < 1 are removed before transfers for creating vacant receiving walkers and after transfers for removing fully displaced walkers. Each rank can still carry walkers with multiplicity >= 2 at this point. Some of these maybe exist before the population balancing and some may have been received. To achieve optimal sampling ergodicity, all the high multiplicity walkers will follow one trajectory for each unit of multiplicity they have which means they need to become independent replicas. For each unit of multiplicity >= 2, a new walker is spawned. The ``parent_id`` of each spawned walker is assigned to the ``walker_id`` of the original high multiplicity walker. At the end of this process all walkers have multiplicity == 1. They keep their ``walker_id``'s from spawn time. The overarching rule of ``parent_id`` is when a newly active walker is transferred or split from an older walker, the older walker's ``walker_id`` becomes the new walker's ``parent_id`` and the new walker's ``walker_id`` is the global unique ID that it was spawned with. Wavefunction ------------ A full ``TrialWaveFunction`` is formulated as a product of all the components. Each component derives from ``WaveFunctionComponent``. .. math:: \psi = \prod_c {\tilde \psi_c} QMCPACK doesn't directly use the product form but mostly uses the log of the wavefunction. It is a natural fit for QMC algorithms and offers a numerical advantage on computers. The log value grows linearly instead of exponentially, beyond the range of double precision, with respect to the electron counts in a Slater-Jastrow wave function. The code contains an example of a wavefunction component for a Helium atom using a simple form and is described in :ref:`helium-wavefunction-example` Mathematical preliminaries ~~~~~~~~~~~~~~~~~~~~~~~~~~ The wavefunction evaluation functions compute the log of the wavefunction, the gradient and the Laplacian of the log of the wavefunction. Expanded, the gradient and Laplacian are .. math:: :label: eq264 \begin{aligned} {\bf G} & = \{ \nabla_i \ln(\psi) \} = \left\{ \sum_c \nabla_i \ln(\tilde \psi_c)\right\} , & {\bf \tilde G} & = \{ \nabla_i \ln(\tilde \psi) \} = \left\{ \frac{\nabla_i \tilde \psi}{\tilde \psi}\right\} \\ {\bf L} & = \{ \nabla^2_i \ln(\psi) \} = \left\{ \sum_c \nabla^2_i \ln(\tilde \psi_c) \right\}, & {\bf \tilde L} & = \{ \nabla^2_i \ln(\tilde \psi) \} = \left\{\frac{{\nabla^2_i} \tilde \psi}{\tilde \psi} - {\tilde G_i} \cdot {\tilde G_i} \right\} \end{aligned} where :math:`i` is the electron index. In this separable form, each wavefunction component computes its :math:`{\bf \tilde G}` ``WaveFunctionComponent::G`` and :math:`{\bf \tilde L}` ``WaveFunctionComponent::L``. The sum over components are stored in ``TrialWaveFunction::G`` and ``TrialWaveFunction::L``. The :math:`\frac{{\nabla ^2} \psi}{\psi}` needed by kinetic part of the local energy can be computed as .. math:: \frac{\nabla^2_i \psi}{\psi} = {\bf L_i} + {\bf G}_i \cdot {\bf G}_i see ``QMCHamiltonians/BareKineticEnergy.h``. Wavefunction evaluation ~~~~~~~~~~~~~~~~~~~~~~~ The process for creating a new wavefunction component class is to derive from WaveFunctionComponent and implement a number pure virtual functions. To start most of them can be empty. The following four functions evaluate the wavefunction values and spatial derivatives: ``evaluateLog`` Computes the log of the wavefunction and the gradient and Laplacian (of the log of the wavefunction) for all particles. The input is the\ ``ParticleSet``\ (``P``) (of the electrons). The log of the wavefunction should be stored in the ``LogValue`` member variable, and used as the return value from the function. The gradient is stored in ``G`` and the Laplacian in ``L``. ``ratio`` Computes the wavefunction ratio (not the log) for a single particle move (:math:`\psi_{new}/\psi_{old}`). The inputs are the ``ParticleSet``\ (``P``) and the particle index (``iat``). ``evalGrad`` Computes the gradient for a given particle. The inputs are the ``ParticleSet``\ (``P``) and the particle index (``iat``). ``ratioGrad`` Computes the wavefunction ratio and the gradient at the new position for a single particle move. The inputs are the ``ParticleSet``\ (``P``) and the particle index (``iat``). The output gradient is in ``grad_iat``; The ``updateBuffer`` function needs to be implemented, but to start it can simply call ``evaluateLog``. The ``put`` function should be implemented to read parameter specifics from the input XML file. Function use ~~~~~~~~~~~~ For debugging it can be helpful to know the under what conditions the various routines are called. The VMC and DMC loops initialize the walkers by calling ``evaluateLog``. For all-electron moves, each timestep advance calls ``evaluateLog``. If the ``use_drift`` parameter is no, then only the wavefunction value is used for sampling. The gradient and Laplacian are used for computing the local energy. For particle-by-particle moves, each timestep advance #. calls ``evalGrad`` #. computes a trial move #. calls ``ratioGrad`` for the wavefunction ratio and the gradient at the trial position. (If the ``use_drift`` parameter is no, the ``ratio`` function is called instead.) The following example shows part of an input block for VMC with all-electron moves and drift. .. code:: xml yes Particle distances ~~~~~~~~~~~~~~~~~~ The ``ParticleSet`` parameter in these functions refers to the electrons. The distance tables that store the inter-particle distances are stored as an array. To get the electron-ion distances, add the ion ``ParticleSet`` using ``addTable`` and save the returned index. Use that index to get the ion-electron distance table. .. code:: cpp const int ei_id = elecs.addTable(ions); // in the constructor only const auto& ei_table = elecs.getDistTable(ei_id); // when consuming a distance table Getting the electron-electron distances is very similar, just add the electron ``ParticleSet`` using ``addTable``. Only the lower triangle for the electron-electron table should be used. It is the only part of the distance table valid throughout the run. During particle-by-particle move, there are extra restrictions. When a move of electron iel is proposed, only the lower triangle parts [0,iel)[0,iel) [iel, Nelec)[iel, Nelec) and the row [iel][0:Nelec) are valid. In fact, the current implementation of distance based two and three body Jastrow factors in QMCPACK only needs the row [iel][0:Nelec). In ``ratioGrad``, the new distances are stored in the ``Temp_r`` and ``Temp_dr`` members of the distance tables. Legacy Setup ~~~~~~~~~~~~ .. warning:: The following describes a deprecated method of handling user input. It is not to be used for new code. A builder processes XML input, creates the wavefunction, and adds it to ``targetPsi``. Builders derive from ``WaveFunctionComponentBuilder``. The new builder hooks into the XML processing in ``WaveFunctionFactory.cpp`` in the ``build`` function. Caching values ~~~~~~~~~~~~~~ The ``acceptMove`` and ``restore`` methods are called on accepted and rejected moves for the component to update cached values. Threading ~~~~~~~~~ The ``makeClone`` function needs to be implemented to work correctly with OpenMP threading. There will be one copy of the component created for each thread. If there is no extra storage, calling the copy constructor will be sufficient. If there are cached values, the clone call may need to create space. Parameter optimization ~~~~~~~~~~~~~~~~~~~~~~ The ``checkInVariables``, ``checkOutVariables``, and ``resetParameters`` functions manage the variational parameters. Optimizable variables also need to be registered when the XML is processed. Variational parameter derivatives are computed in the ``evaluateDerivatives`` function. It computes the derivatives of both the log of the wavefunction and kinetic energy with respect to optimizable parameters and adds the results to the corresponding output arrays. The kinetic energy derivatives are computed as .. math:: \sum_i -\frac{1}{2 m_i}({\partial}_\alpha {\bf L}_i + 2 {\bf G}_i \cdot {\partial}_\alpha {\bf G}_i) with each ``WaveFunctionComponent`` contributing .. math:: -\frac{1}{2}{\partial}_\alpha \tilde L - G \cdot {\partial}_\alpha \tilde G Right now :math:`1/m` factor is applied in ``TrialWaveFunction``. This is a bug when the particle set doesn't hold equal mass particles. .. _helium-wavefunction-example: Helium Wavefunction Example ~~~~~~~~~~~~~~~~~~~~~~~~~~~ The code contains an example of a wavefunction component for a Helium atom using STO orbitals and a Pade Jastrow. The wavefunction is .. math:: :label: eq265 \psi = \frac{1}{\sqrt{\pi}} \exp(-Z r_1) \exp(-Z r_2) \exp(A / (1 + B r_{12})) where :math:`Z = 2` is the nuclear charge, :math:`A=1/2` is the electron-electron cusp, and :math:`B` is a variational parameter. The electron-ion distances are :math:`r_1` and :math:`r_2`, and :math:`r_{12}` is the electron-electron distance. The wavefunction is the same as the one expressed with built-in components in ``examples/molecules/He/he_simple_opt.xml``. The code is in ``src/QMCWaveFunctions/ExampleHeComponent.cpp``. The builder is in ``src/QMCWaveFunctions/ExampleHeBuilder.cpp``. The input file is in ``examples/molecules/He/he_example_wf.xml``. A unit test compares results from the wavefunction evaluation functions for consistency in ``src/QMCWaveFunctions/tests/test_example_he.cpp``. The recommended approach for creating a new wavefunction component is to copy the example and the unit test. Implement the evaluation functions and ensure the unit test passes. Linear Algebra -------------- Like in many methods which solve the Schrödinger equation, linear algebra plays a critical role in QMC algorithms and thus is crucial to the performance of QMCPACK. There are a few components in QMCPACK use BLAS/LAPACK with their own characteristics. Real space QMC ~~~~~~~~~~~~~~ Single particle orbitals ^^^^^^^^^^^^^^^^^^^^^^^^ Spline evaluation as commonly used in solid-state simulations does not use any dense linear algebra library calls. LCAO evaluation as commonly used in molecular calculations relies on BLAS2 GEMV to compute SPOs from a basis set. Slater determinants ^^^^^^^^^^^^^^^^^^^ Slater determinants are calculated on :math:`N \times N` Slater matrices. :math:`N` is the number of electrons for a given spin. In the actually implementation, operations on the inverse matrix of Slater matrix for each walker dominate the computation. To initialize it, DGETRF and DGETRI from LAPACK are called. The inverse matrix can be stored out of place. During random walking, inverse matrices are updated by either Sherman-Morrison rank-1 update or delayed update. Update algorithms heavily relies on BLAS. All the BLAS operations require S,C,D,Z cases. Sherman-Morrison rank-1 update uses BLAS2 GEMV and GER on :math:`N \times N` matrices. Delayed rank-K update uses - BLAS1 SCOPY on :math:`N` array. - BLAS2 GEMV, GER on :math:`k \times N` and :math:`k \times k` matrices. :math:`k` ranges from 1 to :math:`K` when updates are delayed and accumulated. - BLAS3 GEMM at the final update. - ’T’, ’N’, K, N, N - ’N’, ’N’, N, K, K - ’N’, ’N’, N, N, K The optimal K depends on the hardware but it usually ranges from 32 to 256. QMCPACK solves systems with a few to thousands of electrons. To make all the BLAS/LAPACK operation efficient on accelerators. Batching is needed and optimized for :math:`N < 2000`. Non-batched functions needs to be optimized for :math:`N > 500`. Note: 2000 and 500 are only rough estimates. Wavefunction optimizer ^^^^^^^^^^^^^^^^^^^^^^ to be added. Auxiliary field QMC ~~~~~~~~~~~~~~~~~~~ The AFQMC implementation in QMCPACK relies heavily on linear algebra operations from BLAS/LAPACK. The performance of the code is netirely dependent on the performance of these libraries. See below for a detailed list of the main routines used from BLAS/LAPACK. Since the AFQMC code can work with both single and double precision builds, all 4 versions of these routines (S,C,D,Z) are generally needed, for this reason we omit the data type label. - BLAS1: SCAL, COPY, DOT, AXPY - BLAS2: GEMV, GER - BLAS3: GEMM - LAPACK: GETRF, GETRI, GELQF, UNGLQ, ORGLQ, GESVD, HEEVR, HEGVX While the dimensions of the matrix operations will depend entirely on the details of the calculation, typical matrix dimensions range from the 100s, for small system sizes, to over 20000 for the largest calculations attempted so far. For builds with GPU accelerators, we make use of batched and strided implementations of these routines. Batched implementations of GEMM, GETRF, GETRI, GELQF and UNGLQ are particularly important for the performance of the GPU build on small to medium size problems. Batched implementations of DOT, AXPY and GEMV would also be quite useful, but they are not yet generally available. On GPU builds, the code uses batched implementations of these routines when available by default. Slater-backflow wavefunction implementation details --------------------------------------------------- For simplicity, consider :math:`N` identical fermions of the same spin (e.g., up electrons) at spatial locations :math:`\{\mathbf{r}_1,\mathbf{r}_2,\dots,\mathbf{r}_{N}\}`. Then the Slater determinant can be written as .. math:: :label: eq245 S=\det M\:, where each entry in the determinant is an SPO evaluated at a particle position .. math:: :label: eq246 \begin{aligned} M_{ij} = \phi_i(\mathbf{r}_j)\:.\end{aligned} When backflow transformation is applied to the determinant, the particle coordinates :math:`\mathbf{r}_i` that go into the SPOs are replaced by quasi-particle coordinates :math:`\mathbf{x}_i`: .. math:: :label: eq247 \begin{aligned} M_{ij} = \phi_i(\mathbf{x}_j)\:, \end{aligned} where .. math:: :label: eq248 \begin{aligned} \mathbf{x}_i=\mathbf{r}_i+\sum\limits_{j=1,j\neq i}^N\eta(r_{ij})(\mathbf{r}_i-\mathbf{r}_j)\:. \end{aligned} :math:`r_{ij}=\vert\mathbf{r}_i-\mathbf{r}_j\vert`. The integers i,j label the particle/quasi-particle. There is a one-to-one correspondence between the particles and the quasi-particles, which is simplest when :math:`\eta=0`. Value ~~~~~ The evaluation of the Slater-backflow wavefunction is almost identical to that of a Slater wavefunction. The only difference is that the quasi-particle coordinates are used to evaluate the SPOs. The actual value of the determinant is stored during the inversion of the matrix :math:`M` (``cgetrf``\ :math:`\rightarrow`\ ``cgetri``). Suppose :math:`M=LU`, then :math:`S=\prod\limits_{i=1}^N L_{ii} U_{ii}`. .. code:: cpp // In DiracDeterminantWithBackflow::evaluateLog(P,G,L) Phi->evaluate(BFTrans->QP, FirstIndex, LastIndex, psiM,dpsiM,grad_grad_psiM); psiMinv = psiM; LogValue=InvertWithLog(psiMinv.data(),NumPtcls,NumOrbitals ,WorkSpace.data(),Pivot.data(),PhaseValue); QMCPACK represents the complex value of the wavefunction in polar coordinates :math:`S=e^Ue^{i\theta}`. Specifically, ``LogValue`` :math:`U` and ``PhaseValue`` :math:`\theta` are handled separately. In the following, we will consider derivatives of the log value only. Gradient ~~~~~~~~ To evaluate particle gradient of the log value of the Slater-backflow wavefunction, we can use the :math:`\log\det` identity in :eq:`eq249`. This identity maps the derivative of :math:`\log\det M` with respect to a real variable :math:`p` to a trace over :math:`M^{-1}dM`: .. math:: :label: eq249 \begin{aligned} \frac{\partial}{\partial p}\log\det M = \text{tr}\left( M^{-1} \frac{\partial M}{\partial p} \right) .\end{aligned} Following Kwon, Ceperley, and Martin :cite:`Kwon1993backflow`, the particle gradient .. math:: :label: eq250 \begin{aligned} G_i^\alpha \equiv \frac{\partial}{\partial r_i^\alpha} \log\det M = \sum\limits_{j=1}^N \sum\limits_{\beta=1}^3 F_{jj}^\beta A_{jj}^{\alpha\beta}\:, \end{aligned} where the quasi-particle gradient matrix .. math:: :label: eq251 \begin{aligned} A_{ij}^{\alpha\beta} \equiv \frac{\partial x_j^\beta}{\partial r_i^\alpha}\:,\end{aligned} and the intermediate matrix .. math:: :label: eq252 \begin{aligned} F_{ij}^\alpha\equiv\sum\limits_k M^{-1}_{ik} dM_{kj}^\alpha\:,\end{aligned} with the SPO derivatives (w.r. to quasi-particle coordinates) .. math:: :label: eq253 \begin{aligned} dM_{ij}^\alpha \equiv \frac{\partial M_{ij}}{\partial x_j^\alpha}\:.\end{aligned} Notice that we have made the name change of :math:`\phi\rightarrow M` from the notations of ref. :cite:`Kwon1993backflow`. This name change is intended to help the reader associate M with the QMCPACK variable ``psiM``. .. code:: cpp // In DiracDeterminantWithBackflow::evaluateLog(P,G,L) for(int i=0; iAmat(i,FirstIndex+j),Fmat(j,j)); } } } :eq:`eq250` is still relatively simple to understand. The :math:`A` matrix maps changes in particle coordinates :math:`d\mathbf{r}` to changes in quasi-particle coordinates :math:`d\mathbf{x}`. Dotting A into F propagates :math:`d\mathbf{x}` to :math:`dM`. Thus :math:`F\cdot A` is the term inside the trace operator of :eq:`eq249`. Finally, performing the trace completes the evaluation of the derivative. Laplacian ~~~~~~~~~ The particle Laplacian is given in :cite:`Kwon1993backflow` as .. math:: :label: eq254 \begin{aligned} L_i \equiv \sum\limits_{\beta} \frac{\partial^2}{\partial (r_i^\beta)^2} \log\det M = \sum\limits_{j\alpha} B_{ij}^\alpha F_{jj}^\alpha - \sum\limits_{jk}\sum\limits_{\alpha\beta\gamma} A_{ij}^{\alpha\beta}A_{ik}^{\alpha\gamma}\times\left(F_{kj}^\alpha F_{jk}^\gamma -\delta_{jk}\sum\limits_m M^{-1}_{jm} d2M_{mj}^{\beta\gamma}\right), \end{aligned} where the quasi-particle Laplacian matrix .. math:: :label: eq255 \begin{aligned} B_{ij}^{\alpha} \equiv \sum\limits_\beta \frac{\partial^2 x_j^\alpha}{\partial (r_i^\beta)^2}\:,\end{aligned} with the second derivatives of the single-particles orbitals being .. math:: :label: eq256 \begin{aligned} d2M_{ij}^{\alpha\beta} \equiv \frac{\partial^2 M_{ij}}{\partial x_j^\alpha\partial x_j^\beta}\:.\end{aligned} Schematically, :math:`L_i` has contributions from three terms of the form :math:`BF, AAFF, and tr(AA,Md2M)`, respectively. :math:`A, B, M ,d2M,` and :math:`F` can be calculated and stored before the calculations of :math:`L_i`. The first :math:`BF` term can be directly calculated in a loop over quasi-particle coordinates :math:`j\alpha`. .. code:: cpp // In DiracDeterminantWithBackflow::evaluateLog(P,G,L) for(int j=0; jBmat_full(i,FirstIndex+j)[a]*Fmat(j,j)[a]; Notice that :math:`B_{ij}^\alpha` is stored in ``Bmat_full``, NOT ``Bmat``. The remaining two terms both involve :math:`AA`. Thus, it is best to define a temporary tensor :math:`AA`: .. math:: :label: eq257 \begin{aligned} {}_iAA_{jk}^{\beta\gamma} \equiv \sum\limits_\alpha A_{ij}^{\alpha\beta} A_{ij}^{\alpha\gamma}\:,\end{aligned} which we will overwrite for each particle :math:`i`. Similarly, define :math:`FF`: .. math:: :label: eq258 \begin{aligned} FF_{jk}^{\alpha\gamma} \equiv F_{kj}^\alpha F_{jk}^\gamma\:,\end{aligned} which is simply the outer product of :math:`F\otimes F`. Then the :math:`AAFF` term can be calculated by fully contracting :math:`AA` with :math:`FF`. .. code:: cpp // In DiracDeterminantWithBackflow::evaluateLog(P,G,L) for(int j=0; j AA = dot(transpose(BFTrans->Amat(i,FirstIndex+j)),BFTrans->Amat(i,FirstIndex+k)); HessType FF = outerProduct(Fmat(k,j),Fmat(j,k)); myL(i) -= traceAtB(AA,FF); } Finally, define the SPO derivative term: .. math:: :label: eq259 \begin{aligned} Md2M_j^{\beta\gamma} \equiv \sum\limits_m M^{-1}_{jm} d2M_{mj}^\beta\:,\end{aligned} then the last term is given by the contraction of :math:`Md2M` (``q_j``) with the diagonal of :math:`AA`. .. code:: cpp for(int j=0; j AA = dot( transpose(BFTrans->Amat(i,FirstIndex+j)), BFTrans->Amat(i,FirstIndex+j) ); myL(i) += traceAtB(AA,q_j); } } Wavefunction parameter derivative ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To use the robust linear optimization method of :cite:`Toulouse2007linear`, the trial wavefunction needs to know its contributions to the overlap and hamiltonian matrices. In particular, we need derivatives of these matrices with respect to wavefunction parameters. As a consequence, the wavefunction :math:`\psi` needs to be able to evaluate :math:`\frac{\partial}{\partial p} \ln \psi` and :math:`\frac{\partial}{\partial p} \frac{\mathcal{H}\psi}{\psi}`, where :math:`p` is a parameter. When 2-body backflow is considered, a wavefunction parameter :math:`p` enters the :math:`\eta` function only (equation :eq:`eq248`). :math:`\mathbf{r}`, :math:`\phi`, and :math:`M` do not explicitly dependent on :math:`p`. Derivative of the log value is almost identical to particle gradient. Namely, :eq:`eq250` applies upon the substitution :math:`r_i^\alpha\rightarrow p`. .. math:: :label: eq260 \begin{aligned} \frac{\partial}{\partial p} \ln\det M = \sum\limits_{j=1}^N \sum\limits_{\beta=1}^3 F_{jj}^\beta \left({}_pC_{j}^{\beta}\right)\:,\end{aligned} where the quasi-particle derivatives are stored in ``Cmat`` .. math:: :label: eq261 \begin{aligned} {}_pC_{i}^{\alpha} \equiv \frac{\partial}{\partial p} x_{i}^{\alpha}\:.\end{aligned} The change in local kinetic energy is a lot more difficult to calculate .. math:: :label: eq262 \begin{aligned} \frac{\partial T_{\text{local}}}{\partial p} = \frac{\partial}{\partial p} \left\{ \left( \sum\limits_{i=1}^N \frac{1}{2m_i} \nabla^2_i \right) \ln \det M \right\} = \sum\limits_{i=1}^N \frac{1}{2m_i} \frac{\partial}{\partial p} L_i\:, \end{aligned} where :math:`L_i` is the particle Laplacian defined in :eq:`eq254` To evaluate :eq:`eq262`, we need to calculate parameter derivatives of all three terms defined in the Laplacian evaluation. Namely :math:`(B)(F)`, :math:`(AA)(FF)`, and :math:`\text{tr}(AA,Md2M)`, where we have put parentheses around previously identified data structures. After :math:`\frac{\partial}{\partial p}` hits, each of the three terms will split into two terms by the product rule. Each smaller term will contain a contraction of two data structures. Therefore, we will need to calculate the parameter derivatives of each data structure defined in the Laplacian evaluation: .. math:: :label: eq263 \begin{aligned} {}_pX_{ij}^{\alpha\beta} \equiv \frac{\partial}{\partial p} A_{ij}^{\alpha\beta}\:, \\ {}_pY_{ij}^{\alpha} \equiv \frac{\partial}{\partial p} B_{ij}^{\alpha}\:, \\ {}_pdF_{ij}^{\alpha} \equiv \frac{\partial}{\partial p} F_{ij}^{\alpha}\:, \\ {}_{pi}{AA'}_{jk}^{\beta\gamma} \equiv \frac{\partial}{\partial p} {}_iAA_{jk}^{\beta\gamma}\:, \\ {}_p {FF'}_{jk}^{\alpha\gamma} \equiv \frac{\partial}{\partial p} FF_{jk}^{\alpha\gamma}\:, \\ {}_p {Md2M'}_{j}^{\beta\gamma} \equiv \frac{\partial}{\partial p} Md2M_j^{\beta\gamma}\:.\end{aligned} X and Y are stored as ``Xmat`` and ``Ymat_full`` (NOT ``Ymat``) in the code. dF is ``dFa``. :math:`AA'` is not fully stored; intermediate values are stored in ``Aij_sum`` and ``a_j_sum``. :math:`FF'` is calculated on the fly as :math:`dF\otimes F+F\otimes dF`. :math:`Md2M'` is not stored; intermediate values are stored in ``q_j_prime``. Scalar estimator implementation ------------------------------- Introduction: Life of a specialized OperatorBase ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Almost all observables in QMCPACK are implemented as specialized derived classes of the OperatorBase base class. Each observable is instantiated in HamiltonianFactory and added to QMCHamiltonian for tracking. QMCHamiltonian tracks two types of observables: main and auxiliary. Main observables contribute to the local energy. These observables are elements of the simulated Hamiltonian such as kinetic or potential energy. Auxiliary observables are expectation values of matrix elements that do not contribute to the local energy. These Hamiltonians do not affect the dynamics of the simulation. In the code, the main observables are labeled by “physical” flag; the auxiliary observables have“physical” set to false. Initialization ^^^^^^^^^^^^^^ When an ```` tag is present in the ```` section, it is first read by HamiltonianFactory. In general, the ``type`` of the estimator will determine which specialization of OperatorBase should be instantiated, and a derived class with ``myName="est_name"`` will be constructed. Then, the put() method of this specific class will be called to read any other parameters in the ```` XML node. Sometimes these parameters will instead be read by HamiltonianFactory because it can access more objects than OperatorBase. Cloning ^^^^^^^ When ``OpenMP`` threads are spawned, the estimator will be cloned by the ``CloneManager``, which is a parent class of many QMC drivers. .. code:: cpp // In CloneManager.cpp #pragma omp parallel for shared(w,psi,ham) for(int ip=1; ipput(input_xml); return myclone; } Evaluate ^^^^^^^^ After the observable class (derived class of OperatorBase) is constructed and prepared (by the put() method), it is ready to be used in a QMCDriver. A QMCDriver will call ``H.auxHevaluate(W,thisWalker)`` after every accepted move, where H is the QMCHamiltonian that holds all main and auxiliary Hamiltonian elements, W is a MCWalkerConfiguration, and thisWalker is a pointer to the current walker being worked on. As shown in the following, this function goes through each auxiliary Hamiltonian element and evaluates it using the current walker configuration. Under the hood, observables are calculated and dumped to the main particle set's property list for later collection. .. code:: cpp // In QMCHamiltonian.cpp // This is more efficient. // Only calculate auxH elements if moves are accepted. void QMCHamiltonian::auxHevaluate(ParticleSet& P, Walker_t& ThisWalker) { #if !defined(REMOVE_TRACEMANAGER) collect_walker_traces(ThisWalker,P.current_step); #endif for(int i=0; isetHistories(ThisWalker); RealType sink = auxH[i]->evaluate(P); auxH[i]->setObservables(Observables); #if !defined(REMOVE_TRACEMANAGER) auxH[i]->collect_scalar_traces(); #endif auxH[i]->setParticlePropertyList(P.PropertyList,myIndex); } } For estimators that contribute to the local energy (main observables), the return value of evaluate() is used in accumulating the local energy. For auxiliary estimators, the return value is not used (``sink`` local variable above); only the value of Value is recorded property lists by the setObservables() method as shown in the preceding code snippet. By default, the setObservables() method will transfer ``auxH[i]->Value`` to ``P.PropertyList[auxH[i]->myIndex]``. The same property list is also kept by the particle set being moved by QMCDriver. This list is updated by ``auxH[i]->setParticlePropertyList(P.PropertyList,myIndex)``, where myIndex is the starting index of space allocated to this specific auxiliary Hamiltonian in the property list kept by the target particle set P. Collection ^^^^^^^^^^ The actual statistics are collected within the QMCDriver, which owns an EstimatorManager object. This object (or a clone in the case of multithreading) will be registered with each mover it owns. For each mover (such as VMCUpdatePbyP derived from QMCUpdateBase), an accumulate() call is made, which by default, makes an accumulate(walkerset) call to the EstimatorManager it owns. Since each walker has a property set, EstimatorManager uses that local copy to calculate statistics. The EstimatorManager performs block averaging and file I/O. Single scalar estimator implementation guide ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Almost all of the defaults can be used for a single scalar observable. With any luck, only the put() and evaluate() methods need to be implemented. As an example, this section presents a step-by-step guide for implementing a \verb|SpeciesKineticEnergy| estimator that calculates the kinetic energy of a specific species instead of the entire particle set. For example, a possible input to this estimator can be: ```` ````. This should create two extra columns in the ``scalar.dat`` file that contains the kinetic energy of the up and down electrons in two separate columns. If the estimator is properly implemented, then the sum of these two columns should be equal to the default ``Kinetic`` column. Barebone ^^^^^^^^ The first step is to create a barebone class structure for this simple scalar estimator. The goal is to be able to instantiate this scalar estimator with an XML node and have it print out a column of zeros in the ``scalar.dat`` file. To achieve this, first create a header file "SpeciesKineticEnergy.h" in the QMCHamiltonians folder, with only the required functions declared as follows: .. code:: cpp // In SpeciesKineticEnergy.h #ifndef QMCPLUSPLUS_SPECIESKINETICENERGY_H #define QMCPLUSPLUS_SPECIESKINETICENERGY_H #include #include namespace qmcplusplus { class SpeciesKineticEnergy: public OperatorBase { public: SpeciesKineticEnergy(ParticleSet& P):tpset(P){ }; bool put(xmlNodePtr cur); // read input xml node, required bool get(std::ostream& os) const; // class description, required Return_t evaluate(ParticleSet& P); inline Return_t evaluate(ParticleSet& P, std::vector& Txy) { // delegate responsity inline for speed return evaluate(P); } // pure virtual functions require overrider void resetTargetParticleSet(ParticleSet& P) { } // required OperatorBase* makeClone(ParticleSet& qp, TrialWaveFunction& psi); // required private: ParticleSet& tpset; }; // SpeciesKineticEnergy } // namespace qmcplusplus #endif Notice that a local reference ``tpset`` to the target particle set ``P`` is saved in the constructor. The target particle set carries much information useful for calculating observables. Next, make "SpeciesKineticEnergy.cpp," and make vacuous definitions. .. code:: cpp // In SpeciesKineticEnergy.cpp #include namespace qmcplusplus { bool SpeciesKineticEnergy::put(xmlNodePtr cur) { return true; } bool SpeciesKineticEnergy::get(std::ostream& os) const { return true; } SpeciesKineticEnergy::Return_t SpeciesKineticEnergy::evaluate(ParticleSet& P) { Value = 0.0; return Value; } OperatorBase* SpeciesKineticEnergy::makeClone(ParticleSet& qp, TrialWaveFunction& psi) { // no local array allocated, default constructor should be enough return new SpeciesKineticEnergy(*this); } } // namespace qmcplusplus Now, head over to HamiltonianFactory and instantiate this observable if an XML node is found requesting it. Look for "gofr" in HamiltonianFactory.cpp, for example, and follow the if block. .. code:: cpp // In HamiltonianFactory.cpp #include else if(potType =="specieskinetic") { SpeciesKineticEnergy* apot = new SpeciesKineticEnergy(*target_particle_set); apot->put(cur); targetH->addOperator(apot,potName,false); } The last argument of addOperator() (i.e., the ``false`` flag) is **crucial**. This tells QMCPACK that the observable we implemented is not a physical Hamiltonian; thus, it will not contribute to the local energy. Changes to the local energy will alter the dynamics of the simulation. Finally, add "SpeciesKineticEnergy.cpp" to HAMSRCS in "CMakeLists.txt" located in the QMCHamiltonians folder. Now, recompile QMCPACK and run it on an input that requests ```` in the ``hamiltonian`` block. A column of zeros should appear in the ``scalar.dat`` file under the name "ukinetic." Evaluate ^^^^^^^^ The evaluate() method is where we perform the calculation of the desired observable. In a first iteration, we will simply hard-code the name and mass of the particles. .. code:: cpp // In SpeciesKineticEnergy.cpp #include // laplaician() defined here SpeciesKineticEnergy::Return_t SpeciesKineticEnergy::evaluate(ParticleSet& P) { std::string group="u"; RealType minus_over_2m = -0.5; SpeciesSet& tspecies(P.getSpeciesSet()); Value = 0.0; for (int iat=0; iat bool SpeciesKineticEnergy::put(xmlNodePtr cur) { // read in extra parameter "group" OhmmsAttributeSet attrib; attrib.add(group,"group"); attrib.put(cur); // save mass of specified group of particles SpeciesSet& tspecies(tpset.getSpeciesSet()); int group_id = tspecies.findSpecies(group); int massind = tspecies.getAttribute("mass"); minus_over_2m = -1./(2.*tspecies(massind,group_id)); return true; } where we may keep "group" and "minus\_over\_2m" as local variables to our specific class. .. code:: cpp // In SpeciesKineticEnergy.h private: ParticleSet& tpset; std::string group; RealType minus_over_2m; Notice that the previous operations are made possible by the saved reference to the target particle set. Last but not least, compile and run a full example (i.e., a short DMC calculation) with the following XML nodes in your input: ```` ```` Make sure the sum of the “ukinetic" and “dkinetic" columns is **exactly** the same as the Kinetic columns at **every block**. For easy reference, a summary of the complete list of changes follows: .. code:: cpp // In HamiltonianFactory.cpp #include "QMCHamiltonians/SpeciesKineticEnergy.h" else if(potType =="specieskinetic") { SpeciesKineticEnergy* apot = new SpeciesKineticEnergy(*targetPtcl); apot->put(cur); targetH->addOperator(apot,potName,false); } .. code:: cpp // In SpeciesKineticEnergy.h #include #include namespace qmcplusplus { class SpeciesKineticEnergy: public OperatorBase { public: SpeciesKineticEnergy(ParticleSet& P):tpset(P){ }; // xml node is read by HamiltonianFactory, eg. the sum of following should be equivalent to Kinetic // // bool put(xmlNodePtr cur); // read input xml node, required bool get(std::ostream& os) const; // class description, required Return_t evaluate(ParticleSet& P); inline Return_t evaluate(ParticleSet& P, std::vector& Txy) { // delegate responsity inline for speed return evaluate(P); } // pure virtual functions require overrider void resetTargetParticleSet(ParticleSet& P) { } // required OperatorBase* makeClone(ParticleSet& qp, TrialWaveFunction& psi); // required private: ParticleSet& tpset; // reference to target particle set std::string group; // name of species to track RealType minus_over_2m; // mass of the species !! assume same mass // for multiple species, simply initialize multiple estimators }; // SpeciesKineticEnergy } // namespace qmcplusplus #endif .. code:: cpp // In SpeciesKineticEnergy.cpp #include #include // laplaician() defined here #include namespace qmcplusplus { bool SpeciesKineticEnergy::put(xmlNodePtr cur) { // read in extra parameter "group" OhmmsAttributeSet attrib; attrib.add(group,"group"); attrib.put(cur); // save mass of specified group of particles int group_id = tspecies.findSpecies(group); int massind = tspecies.getAttribute("mass"); minus_over_2m = -1./(2.*tspecies(massind,group_id)); return true; } bool SpeciesKineticEnergy::get(std::ostream& os) const { // class description os << "SpeciesKineticEnergy: " << myName << " for species " << group; return true; } SpeciesKineticEnergy::Return_t SpeciesKineticEnergy::evaluate(ParticleSet& P) { Value = 0.0; for (int iat=0; iat species_names; std::vector species_kinetic,vec_minus_over_2m; // In SpeciesKineticEnergy.cpp SpeciesKineticEnergy::SpeciesKineticEnergy(ParticleSet& P):tpset(P) { SpeciesSet& tspecies(P.getSpeciesSet()); int massind = tspecies.getAttribute("mass"); num_species = tspecies.size(); species_kinetic.resize(num_species); vec_minus_over_2m.resize(num_species); species_names.resize(num_species); for (int ispec=0; ispec`` and check that the sum of all columns that start with "skinetic" is equal to the default "Kinetic" column. HDF5 output ~~~~~~~~~~~ If we desire an observable that will output hundreds of scalars per simulation step (e.g., SkEstimator), then it is preferred to output to the ``stat.h5`` file instead of the ``scalar.dat`` file for better organization. A large chunk of data to be registered in the ``stat.h5`` file is called a "Collectable" in QMCPACK. In particular, if a OperatorBase object is initialized with ``UpdateMode.set(COLLECTABLE,1)``, then the "Collectables" object carried by the main particle set will be processed and written to the ``stat.h5`` file, where "UpdateMode" is a bit set (i.e., a collection of flags) with the following enumeration: .. code:: cpp // In OperatorBase.h ///enum for UpdateMode enum {PRIMARY=0, OPTIMIZABLE=1, RATIOUPDATE=2, PHYSICAL=3, COLLECTABLE=4, NONLOCAL=5, VIRTUALMOVES=6 }; As a simple example, to put the two columns we produced in the previous section into the ``stat.h5`` file, we will first need to declare that our observable uses "Collectables." .. code:: cpp // In constructor add: hdf5_out = true; UpdateMode.set(COLLECTABLE,1); Then make some room in the "Collectables" object carried by the target particle set. .. code:: cpp // In addObservables(PropertySetType& plist, BufferType& collectables) add: if (hdf5_out) { h5_index = collectables.size(); std::vector tmp(num_species); collectables.add(tmp.begin(),tmp.end()); } Next, make some room in the ``stat.h5`` file by overriding the registerCollectables() method. .. code:: cpp // In SpeciesKineticEnergy.cpp void SpeciesKineticEnergy::registerCollectables(std::vector& h5desc, hid_t gid) const { if (hdf5_out) { std::vector ndim(1,num_species); observable_helper h5o(myName); h5o.set_dimensions(ndim,h5_index); h5o.open(gid); h5desc.push_back(h5o); } } Finally, edit evaluate() to use the space in the "Collectables" object. .. code:: cpp // In SpeciesKineticEnergy.cpp SpeciesKineticEnergy::Return_t SpeciesKineticEnergy::evaluate(ParticleSet& P) { RealType wgt = tWalker->Weight; // MUST explicitly use DMC weights in Collectables! std::fill(species_kinetic.begin(),species_kinetic.end(),0.0); for (int iat=0; iatwrite(AverageCache.data(),SquaredAverageCache.data()); H5Fflush(h_file,H5F_SCOPE_LOCAL); } ``EstimatorManagerBase::collectBlockAverages`` is triggered from the master-thread estimator via either ``stopBlock(std::vector)`` or ``stopBlock(RealType, true)``. Notice that file writing is NOT triggered by the slave-thread estimator method ``stopBlock(RealType, false)``. .. code:: cpp // In EstimatorManagerBase.cpp void EstimatorManagerBase::stopBlock(RealType accept, bool collectall) { //take block averages and update properties per block PropertyCache[weightInd]=BlockWeight; PropertyCache[cpuInd] = MyTimer.elapsed(); PropertyCache[acceptInd] = accept; for(int i=0; itakeBlockAverage(AverageCache.begin(),SquaredAverageCache.begin()); if(Collectables) { Collectables->takeBlockAverage(AverageCache.begin(),SquaredAverageCache.begin()); } if(collectall) collectBlockAverages(1); } .. code:: cpp // In ScalarEstimatorBase.h template inline void takeBlockAverage(IT first, IT first_sq) { first += FirstIndex; first_sq += FirstIndex; for(int i=0; iaccumulate(W,W.begin(),W.end(),norm); if(Collectables)//collectables are normalized by QMC drivers Collectables->accumulate_all(W.Collectables,1.0); } .. code:: cpp void EstimatorManagerBase::accumulate(MCWalkerConfiguration& W , MCWalkerConfiguration::iterator it , MCWalkerConfiguration::iterator it_end) { // intended to be called slaveEstimator only BlockWeight += it_end-it; RealType norm=1.0/W.getGlobalNumWalkers(); for(int i=0; i< Estimators.size(); i++) Estimators[i]->accumulate(W,it,it_end,norm); if(Collectables) Collectables->accumulate_all(W.Collectables,1.0); } .. code:: cpp // In LocalEnergyEstimator.h inline void accumulate(const Walker_t& awalker, RealType wgt) { // ensemble average W.Properties // expect ePtr to be W.Properties; expect wgt = 1/GlobalNumberOfWalkers const RealType* restrict ePtr = awalker.getPropertyBase(); RealType wwght= wgt* awalker.Weight; scalars[0](ePtr[WP::LOCALENERGY],wwght); scalars[1](ePtr[WP::LOCALENERGY]*ePtr[WP::LOCALENERGY],wwght); scalars[2](ePtr[LOCALPOTENTIAL],wwght); for(int target=3, source=FirstHamiltonian; target inline void saveProperty(IT first) { // expect first to be W.Properties first[LOCALPOTENTIAL]= LocalEnergy-KineticEnergy; copy(Observables.begin(),Observables.end(),first+myIndex); } ``Collectables``'s load stage is combined with its evaluate stage. Evaluate stage ^^^^^^^^^^^^^^ | The master ``QMCHamiltonian::Observables`` is populated by slave ``OperatorBase`` ``::setObservables``. However, the call signature must be ``OperatorBase::setObservables`` ``(QMCHamiltonian::`` | ``Observables)``. This call signature is enforced by ``QMCHamiltonian::evaluate`` and ``QMCHamiltonian::`` | ``auxHevaluate``. .. code:: cpp // In QMCHamiltonian.cpp QMCHamiltonian::Return_t QMCHamiltonian::evaluate(ParticleSet& P) { LocalEnergy = 0.0; for(int i=0; istart(); LocalEnergy += H[i]->evaluate(P); H[i]->setObservables(Observables); #if !defined(REMOVE_TRACEMANAGER) H[i]->collect_scalar_traces(); #endif myTimers[i]->stop(); H[i]->setParticlePropertyList(P.PropertyList,myIndex); } KineticEnergy=H[0]->Value; P.PropertyList[WP::LOCALENERGY]=LocalEnergy; P.PropertyList[LOCALPOTENTIAL]=LocalEnergy-KineticEnergy; // auxHevaluate(P); return LocalEnergy; } .. code:: cpp // In QMCHamiltonian.cpp void QMCHamiltonian::auxHevaluate(ParticleSet& P, Walker_t& ThisWalker) { #if !defined(REMOVE_TRACEMANAGER) collect_walker_traces(ThisWalker,P.current_step); #endif for(int i=0; isetHistories(ThisWalker); RealType sink = auxH[i]->evaluate(P); auxH[i]->setObservables(Observables); #if !defined(REMOVE_TRACEMANAGER) auxH[i]->collect_scalar_traces(); #endif auxH[i]->setParticlePropertyList(P.PropertyList,myIndex); } } Estimator use cases ~~~~~~~~~~~~~~~~~~~ VMCSingleOMP pseudo code ^^^^^^^^^^^^^^^^^^^^^^^^ .. code:: cpp bool VMCSingleOMP::run() { masterEstimator->start(nBlocks); for (int ip=0; ipstartRun(nBlocks,false); // slaveEstimator->start(blocks, record) do // block { #pragma omp parallel { Movers[ip]->startBlock(nSteps); // slaveEstimator->startBlock(steps) RealType cnorm = 1.0/static_cast(wPerNode[ip+1]-wPerNode[ip]); do // step { wClones[ip]->resetCollectables(); Movers[ip]->advanceWalkers(wit, wit_end, recompute); wClones[ip]->Collectables *= cnorm; Movers[ip]->accumulate(wit, wit_end); } // end step Movers[ip]->stopBlock(false); // slaveEstimator->stopBlock(acc, false) } // end omp masterEstimator->stopBlock(estimatorClones); // write files } // end block masterEstimator->stop(estimatorClones); } DMCOMP pseudo code ^^^^^^^^^^^^^^^^^^^ .. code:: cpp bool DMCOMP::run() { masterEstimator->setCollectionMode(true); masterEstimator->start(nBlocks); for(int ip=0; ipstartRun(nBlocks,false); // slaveEstimator->start(blocks, record) do // block { masterEstimator->startBlock(nSteps); for(int ip=0; ipstartBlock(nSteps); // slaveEstimator->startBlock(steps) do // step { #pragma omp parallel { wClones[ip]->resetCollectables(); // advanceWalkers } // end omp //branchEngine->branch { // In WalkerControlMPI.cpp::branch wgt_inv=WalkerController->NumContexts/WalkerController->EnsembleProperty.Weight; walkers.Collectables *= wgt_inv; slaveEstimator->accumulate(walkers); } masterEstimator->stopBlock(acc) // write files } // end for step } // end for block masterEstimator->stop(); } Summary ~~~~~~~ Two ensemble-level data structures, ``ParticleSet::Properties`` and ``::Collectables``, serve as intermediaries between evaluate classes and output classes to ``scalar.dat`` and ``stat.h5``. ``Properties`` appears in both ``scalar.dat`` and ``stat.h5``, whereas ``Collectables`` appears only in ``stat.h5``. ``Properties`` is overwritten by ``QMCHamiltonian::Observables`` at the end of each step. ``QMCHamiltonian::Observables`` is filled upon call to ``QMCHamiltonian::evaluate`` and ``::auxHevaluate``. ``Collectables`` is zeroed at the beginning of each step and accumulated upon call to ``::auxHevaluate``. | Data are output to ``scalar.dat`` in four stages: evaluate, load, unload, and collect. In the evaluate stage, ``QMCHamiltonian::Observables`` is populated by a list of ``OperatorBase``. In the load stage, ``QMCHamiltonian::Observables`` is transferred to ``Properties`` by ``QMCDriver``. In the unload stage, ``Properties`` is copied to ``LocalEnergyEstimator::scalars``. In the collect stage, ``LocalEnergyEstimator::scalars`` is block-averaged to ``EstimatorManagerBase`` | ``::AverageCache`` and dumped to file. For ``Collectables``, the evaluate and load stages are combined in a call to ``QMCHamiltonian::auxHevaluate``. In the unload stage, ``Collectables`` is copied to ``CollectablesEstimator::scalars``. In the collect stage, ``CollectablesEstimator`` | ``::scalars`` is block-averaged to ``EstimatorManagerBase::AverageCache`` and dumped to file. Appendix: dmc.dat ~~~~~~~~~~~~~~~~~ | There is an additional data structure, ``ParticleSet::EnsembleProperty``, that is managed by ``WalkerControlBase::EnsembleProperty`` and directly dumped to ``dmc.dat`` via its own averaging procedure. ``dmc.dat`` is written by ``WalkerControlBase::measureProperties``, which is called by ``WalkerControlBase::branch``, which is called by ``SimpleFixedNodeBranch`` | ``::branch``, for example. .. bibliography:: /bibs/developing.bib