Commit b737dd7d by Maarten L. Hekkelman

Even more documentation

parent 137ffaf7
......@@ -53,7 +53,7 @@ endif()
option(BUILD_SHARED_LIBS "Build a shared library instead of a static one" OFF)
# Build documentation?
option(BUILD_DOC "Build the documentation" OFF)
option(BUILD_DOCUMENTATION "Build the documentation" OFF)
# We do not want to write an export file for all our symbols...
set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
......@@ -480,7 +480,7 @@ if(CIFPP_INSTALL_UPDATE_SCRIPT)
target_compile_definitions(cifpp PUBLIC CACHE_DIR="${CIFPP_CACHE_DIR}")
endif()
if(BUILD_DOC)
if(BUILD_DOCUMENTATION)
add_subdirectory(docs)
endif()
......
libcifpp
========
# libcifpp
This library contains code to work with mmCIF and legacy PDB files.
Synopsis
--------
## Synopsis
```c++
// A simple program counting residues with an OXT atom
......@@ -57,8 +55,14 @@ int main(int argc, char *argv[])
}
```
Requirements
------------
## Installation
You might be able to use libcifpp from a package manager used by your
OS distribution. But most likely this package will be out-of-date.
Therefore it is recommended to build *libcifpp* from code. It is not
hard to do.
### Requirements
The code for this library was written in C++17. You therefore need a
recent compiler to build it. For the development gcc 9.4 and clang 9.0
......@@ -66,6 +70,7 @@ have been used as well as MSVC version 2019.
Other requirements are:
- [cmake](https://cmake.org) A build tool.
- [mrc](https://github.com/mhekkel/mrc), a resource compiler that
allows including data files into the executable making them easier to
install. Strictly speaking this is optional, but at the expense of
......@@ -76,20 +81,20 @@ Other requirements are:
`libeigen3-dev`
- zlib, the development version of this library. On Debian/Ubuntu this
is the package `zlib1g-dev`.
- [boost](https://www.boost.org). The boost libraries are only needed if
you want to build the testing code.
- [boost](https://www.boost.org).
When building using MS Visual Studio, you will also need [libzeep](https://github.com/mhekkel/libzeep)
since MSVC does not yet provide a C++ template required by libcifpp.
Building
--------
The Boost libraries are only needed in case you want to build the test
code or if you are using GCC. That last condition is due to a long
standing bug in the implementation of std::regex. It simply crashes
on the regular expressions used in the mmcif_pdbx dictionary and so
we use the boost regex implementation instead.
This library uses [cmake](https://cmake.org). The usual way of building
and installing is to create a `build` directory and run cmake there.
### Building
On linux e.g. you would issue the following commands to build and install
libcifpp in your `$HOME/.local` folder:
Building the code is as simple as typing:
```console
git clone https://github.com/PDB-REDO/libcifpp.git --recurse-submodules
......@@ -104,4 +109,12 @@ where cmake stores its files. Run a configure, build the code and then
it installs the library and auxiliary files.
If you want to run the tests before installing, you should add `-DENABLE_TESTING=ON`
to the first cmake command.
to the first cmake command. So that would be:
```console
git clone https://github.com/PDB-REDO/libcifpp.git --recurse-submodules
cd libcifpp
cmake -S . -B build -DCMAKE_INSTALL_PREFIX=$HOME/.local -DCMAKE_BUILD_TYPE=Release -DENABLE_TESTING=ON
cmake --build build
ctest --test-dir build
```
Bits & Pieces
=============
The *libcifpp* library offers some extra code that makes the life of developers a bit easier.
gzio
----
To work with compressed data files a *std::streambuf* implemenation was added based on the code in `gxrio <https://github.com/mhekkel/gxrio>`_. This allows you to read and write compressed data streams transparently.
When working with files you can use :cpp:class:`cif::gzio::ifstream` and :cpp:class:`cif::gzio::ofstream`. The selection of whether to use compression or not is based on the file extension. If it is ``.gz`` gzip compression is used:
.. code-block:: cpp
cif::gzio::ifstream file("my-file.txt.gz");
std::string line;
while (std::getline(file, line))
std::cout << line << '\n';
Writing is equally easy:
.. code-block:: cpp
cif::gzio::ofstream file("/tmp/output.txt.gz");
file << "Hello, world!";
file.close();
You can also use the :cpp:class:`cif::gzio:istream` and feed it a *std::streambuf* object that may or may not contain compressed data. In that case the first bytes of the input are sniffed and if it is gzip compressed data, decompression will be done.
A progress bar
--------------
Applications based on *libcifpp* may have a longer run time. To give some feedback to the user running your application in a terminal you can use the :cif:class:`cif::progress_bar`. This class will display an ASCII progress bar along with optional status messages, but only if output is to a real TTY (terminal).
A progress bar is also shown only if the duration is more than two seconds. To avoid having flashing progress bars for short actions.
The progress bar uses an internal progress counter that starts at zero and ends when the max value has been reached after which it will be removed from the screen. Updating this internal progress counter can be done by adding a number of steps calling :cpp:func:`cif::progress_bar::consumed` or by setting the exact value for the counter by calling :cpp:func:`cif::progress_bar::progress`.
Colouring output
----------------
It is also nice to emphasise some output in the terminal by using colours. For this you can create output manipulators using :cpp:func:`cif::coloured`. To write a string in white, and bold letters on a red background you can do:
.. code-block:: cpp
using namespace cif::colour;
std::cout << cif::coloured("Hello, world!", white, red, bold) << '\n';
Chemical Compounds
==================
The data in *CIF* and *mmCIF* files often describes the structure of some chemical compounds. The structure is recorded in the categories *atom_site* and friends. Records in these categories refer to chemical compounds using a compound ID. This compound ID is the ID field of the *chem_comp* category. For all of the known compounds in the PDB there is an entry in the Chemical Compounds Dictionary or `CCD <https://www.wwpdb.org/data/ccd>`_. If *libcifpp* was properly installed you have a copy of this file somewhere on your disk. And if you have installed the update scripts, a fresh version of this file will be retrieved weekly.
As an alternative to CCD there are the monomer library files from `CCP4 <https://www.ccp4.ac.uk/>`_. These contain somewhat different data but the overlap is good enough for usage in *libcifpp*.
Information about compounds is captured in the :cpp:class:`cif::compound`. An instance of a compound object for a certain compound ID can be obtained by using the singleton :cpp:class:`cif::compound_factory`.
If the compound you want to use is not available in the CCD or in CCP4, you can add that information yourself. For this you can use the method :cpp:func:`cif::compound_factory::push_dictionary`.
So, given that we have CCD, CCP4 monomer library and used defined compound definitions, what will you get when you try to retrieve such a compound by ID? The answer is, the factory has a stack of compound generators. The first thrown on the stack is the one for a CCD file (*components.cif*) if it can be found. Then, if the *CLIBD_MON* environmental variable is defined, a generator for monomer library files is added to the stack. And then all generators for files you added using *push_dictionary* are added in order. The generators are searched in the reverse order in which they were added to see if it creates a compound object for the ID. If no compound was created at all, nullptr is returned.
\ No newline at end of file
......@@ -36,8 +36,11 @@ Using *libcifpp* is easy, if you are familiar with modern C++:
self
basics.rst
compound.rst
model.rst
resources.rst
symmetry.rst
bitsandpieces.rst
api/library_root.rst
genindex.rst
Molecular Model
===============
Theoretically it is possible to get along with only the classes *cif::file*, *cif::datablock* and *cif::category*. But to keep your data complete and valid you then have to update lots of categories for all but the simplest manipulations. For this *libcifpp* comes with a higher level API modelling atoms, residues, monomers, polymers and complete structures in their respective classes.
Note that these classes only work properly if you are using *mmCIF* files and have an mmcif_pdbx dictionary available, either compiled in using `mrc <https://github.com/mhekkel/mrc.git>`_ or installed in the proper location.
.. note::
This part of *libcifpp* is the least developed part. What is available should work but functionality should eventually be extended.
Atom
----
The :cpp:class:`cif::mm::atom` is a lightweight proxy class giving access to the data stored in *atom_site* and *atom_site_anisotrop*. It only caches the most often used item data and every modification is directly written back into the *mmCIF* categories.
Atoms can be copied by value with low cost. The atom class only contains a pointer to an implementation that is reference counted.
Residue, Monomer and Polymer
----------------------------
The :cpp:class:`cif::mm::residue`, :cpp:class:`cif::mm::monomer` and :cpp:class:`cif::mm::polymer` implement what you'd expect. A monomer is a residue that is part of a polymer and thus has a sequence number and siblings.
Sugars & Branches
-----------------
There are also classes for modelling sugars and sugar branches. You can create sugar branches
Structure
---------
The :cpp:class:`cif::mm::structure` can be used to load one of the models from an *mmCIF* file. By default the first model is loaded. (Multiple models are often only available files containing structures defined using NMR).
A structure holds a reference to a *cif::datablock* and retrieves its data from this datablock and writes any modification back into that datablock.
One of the most useful parts of the structure class is the ability to create and modify residues. This updates related *chem_comp* and *entity* categories as well.
\ No newline at end of file
......@@ -48,6 +48,9 @@
/// may also be generated from the CCP4 monomer library.
///
/// Note that the information in CCP4 and CCD is not equal.
///
/// See also :doc:`/compound` for more information.
namespace cif
{
......
......@@ -1031,17 +1031,17 @@ class structure
/// \brief Create a new and empty (sugar) branch
branch &create_branch();
/// \brief Create a new (sugar) branch with one first NAG containing atoms constructed from \a atoms
branch &create_branch(std::vector<row_initializer> atoms);
/// \brief Extend an existing (sugar) branch identified by \a asymID with one sugar containing atoms constructed from \a atom_info
///
/// \param asym_id The asym id of the branch to extend
/// \param atom_info Array containing the info for the atoms to construct for the new sugar
/// \param link_sugar The sugar to link to, note: this is the sugar number (1 based)
/// \param link_atom The atom id of the atom linked in the sugar
branch &extend_branch(const std::string &asym_id, std::vector<row_initializer> atom_info,
int link_sugar, const std::string &link_atom);
// /// \brief Create a new (sugar) branch with one first NAG containing atoms constructed from \a atoms
// branch &create_branch(std::vector<row_initializer> atoms);
// /// \brief Extend an existing (sugar) branch identified by \a asymID with one sugar containing atoms constructed from \a atom_info
// ///
// /// \param asym_id The asym id of the branch to extend
// /// \param atom_info Array containing the info for the atoms to construct for the new sugar
// /// \param link_sugar The sugar to link to, note: this is the sugar number (1 based)
// /// \param link_atom The atom id of the atom linked in the sugar
// branch &extend_branch(const std::string &asym_id, std::vector<row_initializer> atom_info,
// int link_sugar, const std::string &link_atom);
/// \brief Remove \a branch
void remove_branch(branch &branch);
......
......@@ -2840,7 +2840,8 @@ void reconstruct_pdbx(datablock &db)
if (db.get("atom_site") == nullptr)
throw std::runtime_error("Cannot reconstruct PDBx file, atom data missing");
assert(false);
throw std::runtime_error("not implemented yet");
}
} // namespace pdbx
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment