Merge pull request #13 from rmcgibbo/package

Package up PDBFixer

9f149d4e · peastman · 111fd4c0 · 4c104519 · 9f149d4e · 9f149d4e
Commit 9f149d4e authored Mar 25, 2014 by peastman
49 changed files
--- a/.gitignore
+++ b/.gitignore
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+env/
+bin/
+build/
+develop-eggs/
+dist/
+eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.coverage
+.cache
+nosetests.xml
+coverage.xml
+
+# Translations
+*.mo
+
+# Mr Developer
+.mr.developer.cfg
+.project
+.pydevproject
+
+# Rope
+.ropeproject
+
+# Django stuff:
+*.log
+*.pot
+
+# Sphinx documentation
+docs/_build/
--- a/Manual.html
+++ b/Manual.html
@@ -23,14 +23,25 @@ Protein Data Bank (PDB) files often have a number of problems that must be fixed
 PDBFixer can fix all of these problems for you in a fully automated way.  You simply select a file, tell it which problems to fix, and it does everything else.
 <p>
 PDBFixer can be used in three different ways: as a desktop application with a graphical user interface; as a command line application; or as a Python API.  This allows you to use it in whatever way best matches your own needs for flexibility, ease of use, and scriptability.  The following sections describe how to use it in each of these ways.
+
+
+<h1>2. Installation</h1>
+
+To install PDBFixer, navigate to the root directory of the source distribution you've downloaded and type
+<p>
+<tt>python setup.py install</tt>
 <p>
-Before running PDBFixer, you must first install <a href="https://simtk.org/home/openmm">OpenMM</a> 5.2 or later.  Follow the installation instructions in the OpenMM manual.  It is also highly recommended that you install CUDA or OpenCL.  In principle PDBFixer can use the OpenMM reference platform, but it will be prohibitively slow.  Finally, PDBFixer requires that <a href="http://www.numpy.org">NumPy</a> be installed.
+This will install the PDBFixer Python package as well as the command line program <tt>pdbfixer</tt>.
+
+<p>
+Before running PDBFixer, you must first install <a href="https://simtk.org/home/openmm">OpenMM</a> 5.2 or later.  Follow the installation instructions in the OpenMM manual.  It is also highly recommended that you install CUDA or OpenCL.  In principle PDBFixer can use the OpenMM reference platform, but it will be prohibitively slow. PDBFixer requires that <a href="http://www.numpy.org">NumPy</a> be installed.
+

-<h1>2. PDBFixer as a Desktop Application</h1>
+<h1>3. PDBFixer as a Desktop Application</h1>

 To run PDBFixer as a desktop application, type
 <p>
-<tt>python pdbfixer.py</tt>
+<tt>pdbfixer</tt>
 <p>
 on the command line.  PDBFixer displays its user interface through a web browser, but it is still a single user desktop application.  It should automatically launch a web browser and open a new window displaying the user interface.  If for any reason this does not happen, you can launch a web browser yourself and point it to <a href="http://localhost:8000">http://localhost:8000</a>.
 <p>
@@ -70,18 +81,18 @@ This page gives you the chance to make several other optional changes:

 You're all done!  Click "Save File" to save the processed PDB file to disk.  Then click "Process Another File" if you have more files to process, or "Quit" (in the top right corner of the page) if you are finished.

-<h1>3. PDBFixer as a Command Line Application</h1>
+<h1>4. PDBFixer as a Command Line Application</h1>

 PDBFixer provides a simple command line interface that is especially useful if you want to script it to process many files at once.  This interface is significantly less flexible than either the desktop interface or the Python API, but it is still powerful enough for many purposes.
 <p>
 To get usage instructions for the command line interface, type
 <p>
-<tt>python pdbfixer.py --help</tt>
+<tt>pdbfixer --help</tt>
 <p>
 This displays the following information:
 <tt><pre>
-Usage: pdbfixer.py
-       pdbfixer.py [options] filename
+Usage: pdbfixer
+       pdbfixer [options] filename

 When run with no arguments, it launches the user interface.  If any arguments are specified, it runs in command line mode.

@@ -110,11 +121,11 @@ Options:
 </pre></tt>
 For example, consider the following command line:
 <p>
-<tt>python pdbfixer.py --keep-heterogens=water --replace-nonstandard --water-box=4.0 4.0 3.0 myfile.pdb</tt>
+<tt>pdbfixer --keep-heterogens=water --replace-nonstandard --water-box=4.0 4.0 3.0 myfile.pdb</tt>
 <p>
 This will load the file "myfile.pdb".  It will add any missing atoms to existing residues, but not add any missing residues (because we did not specify <tt>--add-residues</tt>).  Hydrogens will be added that are appropriate at pH 7.0 (the default).  If the file contains any nonstandard amino acids or nucleotides, they will be replaced with the closest equivalent standard ones.  Any water molecules present in the file will be kept, but all other heterogens will be deleted.  Finally a water box of size 4 by 4 by 3 nanometers will be added surrounding the structure.  If necessary, counterions will be added to neutralize it (Na+ or Cl-), but no other ions will be added (because we accepted the default ionic strength of 0.0).  After making all these changes, the result will be written to a file called "output.pdb".

-<h1>4. PDBFixer as a Python API</h1>
+<h1>5. PDBFixer as a Python API</h1>

 This is the most powerful way to use PDBFixer.  It allows you to script the processing of PDB files while maintaining precise programmatic control of every part of the process.
 <p>

--- a/pdbfixer/__init__.py
+++ b/pdbfixer/__init__.py
--- a/createSoftForcefield.py
+++ b/createSoftForcefield.py
@@ -29,6 +29,7 @@ DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
 OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
 USE OR OTHER DEALINGS IN THE SOFTWARE.
 """
+from __future__ import print_function
 __author__ = "Peter Eastman"
 __version__ = "1.0"

@@ -42,11 +43,11 @@ angleK = 10.0

 # Create the new force field file.

-print '<ForceField>'
+print('<ForceField>')

 # Print the atom types, while identifying types and classes to omit.

-print ' <AtomTypes>'
+print(' <AtomTypes>')
 omitTypes = set()
 omitClasses = set()
 for atomType in forcefield._atomTypes:
@@ -55,94 +56,94 @@ for atomType in forcefield._atomTypes:
        omitTypes.add(atomType)
        omitClasses.add(atomClass)
    else:
-        print '  <Type name="%s" class="%s" element="%s" mass="%g"/>' % (atomType, atomClass, element.symbol, mass)
-print ' </AtomTypes>'
+        print('  <Type name="%s" class="%s" element="%s" mass="%g"/>' % (atomType, atomClass, element.symbol, mass))
+print(' </AtomTypes>')

 # Print the residue templates.

-print ' <Residues>'
-for template in forcefield._templates.itervalues():
-    print '  <Residue name="%s">' % template.name
+print(' <Residues>')
+for template in forcefield._templates.values():
+    print('  <Residue name="%s">' % template.name)
    atomIndex = {}
    for i, atom in enumerate(template.atoms):
        if atom.type not in omitTypes:
-            print '   <Atom name="%s" type="%s"/>' % (atom.name, atom.type)
+            print('   <Atom name="%s" type="%s"/>' % (atom.name, atom.type))
            atomIndex[i] = len(atomIndex)
    for (a1, a2) in template.bonds:
        if a1 in atomIndex and a2 in atomIndex:
-            print '   <Bond from="%d" to="%d"/>' % (atomIndex[a1], atomIndex[a2])
+            print('   <Bond from="%d" to="%d"/>' % (atomIndex[a1], atomIndex[a2]))
    for atom in template.externalBonds:
        if atom in atomIndex:
-            print '   <ExternalBond from="%d"/>' % atomIndex[atom]
-    print '  </Residue>'
-print ' </Residues>'
+            print('   <ExternalBond from="%d"/>' % atomIndex[atom])
+    print('  </Residue>')
+print(' </Residues>')

 # Print the harmonic bonds.

-print ' <HarmonicBondForce>'
+print(' <HarmonicBondForce>')
 bonds = [f for f in forcefield._forces if isinstance(f, ff.HarmonicBondGenerator)][0]
 for i in range(len(bonds.types1)):
-    type1 = iter(bonds.types1[i]).next()
-    type2 = iter(bonds.types2[i]).next()
+    type1 = next(iter(bonds.types1[i]))
+    type2 = next(iter(bonds.types2[i]))
    if type1 not in omitTypes and type2 not in omitTypes:
        class1 = forcefield._atomTypes[type1][0]
        class2 = forcefield._atomTypes[type2][0]
-        print '  <Bond class1="%s" class2="%s" length="%g" k="%g"/>' % (class1, class2, bonds.length[i], bondK)
-print ' </HarmonicBondForce>'
+        print('  <Bond class1="%s" class2="%s" length="%g" k="%g"/>' % (class1, class2, bonds.length[i], bondK))
+print(' </HarmonicBondForce>')

 # Print the harmonic angles.

-print ' <HarmonicAngleForce>'
+print(' <HarmonicAngleForce>')
 angles = [f for f in forcefield._forces if isinstance(f, ff.HarmonicAngleGenerator)][0]
 for i in range(len(angles.types1)):
-    type1 = iter(angles.types1[i]).next()
-    type2 = iter(angles.types2[i]).next()
-    type3 = iter(angles.types3[i]).next()
+    type1 = next(iter(angles.types1[i]))
+    type2 = next(iter(angles.types2[i]))
+    type3 = next(iter(angles.types3[i]))
    if type1 not in omitTypes and type2 not in omitTypes and type3 not in omitTypes:
        class1 = forcefield._atomTypes[type1][0]
        class2 = forcefield._atomTypes[type2][0]
        class3 = forcefield._atomTypes[type3][0]
-        print '  <Angle class1="%s" class2="%s" class3="%s" angle="%g" k="%g"/>' % (class1, class2, class3, angles.angle[i], angleK)
-print ' </HarmonicAngleForce>'
+        print('  <Angle class1="%s" class2="%s" class3="%s" angle="%g" k="%g"/>' % (class1, class2, class3, angles.angle[i], angleK))
+print(' </HarmonicAngleForce>')

 # Print the periodic torsions.

-print ' <PeriodicTorsionForce>'
+print(' <PeriodicTorsionForce>')
 torsions = [f for f in forcefield._forces if isinstance(f, ff.PeriodicTorsionGenerator)][0]
 for torsion in torsions.proper:
-    type1 = iter(torsion.types1).next()
-    type2 = iter(torsion.types2).next()
-    type3 = iter(torsion.types3).next()
-    type4= iter(torsion.types4).next()
+    type1 = next(iter(torsion.types1))
+    type2 = next(iter(torsion.types2))
+    type3 = next(iter(torsion.types3))
+    type4= next(iter(torsion.types4))
    if type1 not in omitTypes and type2 not in omitTypes and type3 not in omitTypes and type4 not in omitTypes:
        class1 = forcefield._atomTypes[type1][0]
        class2 = forcefield._atomTypes[type2][0]
        class3 = forcefield._atomTypes[type3][0]
        class4 = forcefield._atomTypes[type4][0]
-        print '  <Proper class1="%s" class2="%s" class3="%s" class4="%s"' % (class1, class2, class3, class4),
+        print('  <Proper class1="%s" class2="%s" class3="%s" class4="%s"' % (class1, class2, class3, class4), end=' ')
        for i in range(len(torsion.k)):
-            print ' periodicity%d="%d" phase%d="%g" k%d="%g"' % (i+1, torsion.periodicity[i], i+1, torsion.phase[i], i+1, torsion.k[i]),
-        print '/>'
+            print(' periodicity%d="%d" phase%d="%g" k%d="%g"' % (i+1, torsion.periodicity[i], i+1, torsion.phase[i], i+1, torsion.k[i]), end=' ')
+        print('/>')
 for torsion in torsions.improper:
-    type1 = iter(torsion.types1).next()
-    type2 = iter(torsion.types2).next()
-    type3 = iter(torsion.types3).next()
-    type4= iter(torsion.types4).next()
+    type1 = next(iter(torsion.types1))
+    type2 = next(iter(torsion.types2))
+    type3 = next(iter(torsion.types3))
+    type4= next(iter(torsion.types4))
    if type1 not in omitTypes and type2 not in omitTypes and type3 not in omitTypes and type4 not in omitTypes:
        class1 = forcefield._atomTypes[type1][0]
        class2 = forcefield._atomTypes[type2][0]
        class3 = forcefield._atomTypes[type3][0]
        class4 = forcefield._atomTypes[type4][0]
-        print '  <Improper class1="%s" class2="%s" class3="%s" class4="%s"' % (class1, class2, class3, class4),
+        print('  <Improper class1="%s" class2="%s" class3="%s" class4="%s"' % (class1, class2, class3, class4), end=' ')
        for i in range(len(torsion.k)):
-            print ' periodicity%d="%d" phase%d="%g" k%d="%g"' % (i+1, torsion.periodicity[i], i+1, torsion.phase[i], i+1, torsion.k[i]),
-        print '/>'
-print ' </PeriodicTorsionForce>'
+            print(' periodicity%d="%d" phase%d="%g" k%d="%g"' % (i+1, torsion.periodicity[i], i+1, torsion.phase[i], i+1, torsion.k[i]), end=' ')
+        print('/>')
+print(' </PeriodicTorsionForce>')

 # Print the script to add the soft-core nonbonded force.

-print ' <Script>'
-print """import simtk.openmm as mm
+print(' <Script>')
+print("""import simtk.openmm as mm
 nb = mm.CustomNonbondedForce('C/((r/0.2)^4+1)')
 nb.addGlobalParameter('C', 1.0)
 sys.addForce(nb)
@@ -154,7 +155,7 @@ for bond in data.bonds:
 for angle in data.angles:
    exclusions.add((min(angle[0], angle[2]), max(angle[0], angle[2])))
 for a1, a2 in exclusions:
-    nb.addExclusion(a1, a2)"""
-print ' </Script>'
+    nb.addExclusion(a1, a2)""")
+print(' </Script>')

-print '</ForceField>'
\ No newline at end of file
+print('</ForceField>')
\ No newline at end of file
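
Note on the createSoftForcefield.py changes: the hunks above apply three Python 2 to 3 idioms, namely `print(...)` with `end=' '` in place of a trailing comma, `next(iter(x))` in place of `iter(x).next()`, and `.values()` in place of `.itervalues()`. The following is a minimal sketch of the first two, not code from this commit. The helper names `first_member` and `write_proper` and the sample inputs are hypothetical, and the sketch writes to an in-memory buffer rather than standard output so the result can be inspected.

```python
import io

# The script in the diff adds `from __future__ import print_function` so the
# same print(...) calls also work on Python 2; on Python 3 it is not needed.


def first_member(collection):
    """Return one element of a set or other iterable (replaces iter(x).next())."""
    return next(iter(collection))


def write_proper(out, classes, terms):
    """Write one <Proper .../> element; classes and terms are made-up inputs."""
    # end=' ' keeps the output on the same line, like the old trailing comma.
    print('  <Proper class1="%s" class2="%s"' % classes, end=' ', file=out)
    for i, (periodicity, k) in enumerate(terms):
        print('periodicity%d="%d" k%d="%g"' % (i + 1, periodicity, i + 1, k),
              end=' ', file=out)
    # The final print uses the default end='\n' and closes the element.
    print('/>', file=out)


buffer = io.StringIO()
write_proper(buffer, ("CT", "N"), [(2, 1.5)])
line = buffer.getvalue()
```

Each `print(..., end=' ')` appends a single space instead of a newline, so the attributes accumulate on one line until the closing `/>` is printed.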
--- a/html/addHeavyAtoms.html
+++ b/html/addHeavyAtoms.html
--- a/html/addHydrogens.html
+++ b/html/addHydrogens.html
--- a/html/addResidues.html
+++ b/html/addResidues.html
--- a/html/convertResidues.html
+++ b/html/convertResidues.html
--- a/html/error.html
+++ b/html/error.html
--- a/html/header.html
+++ b/html/header.html
--- a/html/quit.html
+++ b/html/quit.html
--- a/html/removeChains.html
+++ b/html/removeChains.html
--- a/html/saveFile.html
+++ b/html/saveFile.html
--- a/html/start.html
+++ b/html/start.html
--- a/pdbfixer.py
+++ b/pdbfixer.py
@@ -626,7 +626,7 @@ class PDBFixer(object):
        return nearest


-if __name__=='__main__':
+def main():
    if len(sys.argv) < 2:
        # Display the UI.
        
@@ -674,3 +674,6 @@ if __name__=='__main__':
        if options.box is not None:
            fixer.addSolvent(options.box*unit.nanometer, options.positiveIon, options.negativeIon, options.ionic*unit.molar)
        app.PDBFile.writeFile(fixer.topology, fixer.positions, open(options.output, 'w'))
+
+if __name__ == '__main__':
+    main()
\ No newline at end of file
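
Note on the pdbfixer.py change: moving the script body into `main()` is what lets the `console_scripts` entry `pdbfixer = pdbfixer.pdbfixer:main` in setup.py work, because the generated `pdbfixer` command imports the named module and calls the named function. The sketch below shows roughly how such a `name = module:function` string maps to a callable. It is a simplified illustration, not the wrapper setuptools actually generates, and `resolve_entry_point` is a hypothetical helper name.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve 'name = package.module:function' to (name, callable)."""
    # Split off the command name on the left of '='.
    name, _, target = spec.partition("=")
    # Split the target into a module path and an attribute name.
    module_name, _, attr = target.strip().partition(":")
    module = importlib.import_module(module_name)
    return name.strip(), getattr(module, attr)


# Demonstrated with a standard-library function so the sketch runs anywhere.
command, func = resolve_entry_point("joinpath = os.path:join")
```

With PDBFixer installed, the same helper applied to `pdbfixer = pdbfixer.pdbfixer:main` would return the `main` function defined in this hunk.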
--- a/soft.xml
+++ b/soft.xml
--- a/templates/A.pdb
+++ b/templates/A.pdb
--- a/templates/ACE.pdb
+++ b/templates/ACE.pdb
--- a/templates/ALA.pdb
+++ b/templates/ALA.pdb
--- a/templates/ARG.pdb
+++ b/templates/ARG.pdb
--- a/templates/ASN.pdb
+++ b/templates/ASN.pdb
--- a/templates/ASP.pdb
+++ b/templates/ASP.pdb
--- a/templates/C.pdb
+++ b/templates/C.pdb
--- a/templates/CYS.pdb
+++ b/templates/CYS.pdb
--- a/templates/DA.pdb
+++ b/templates/DA.pdb
--- a/templates/DC.pdb
+++ b/templates/DC.pdb
--- a/templates/DG.pdb
+++ b/templates/DG.pdb
--- a/templates/DT.pdb
+++ b/templates/DT.pdb
--- a/templates/G.pdb
+++ b/templates/G.pdb
--- a/templates/GLN.pdb
+++ b/templates/GLN.pdb
--- a/templates/GLU.pdb
+++ b/templates/GLU.pdb
--- a/templates/GLY.pdb
+++ b/templates/GLY.pdb
--- a/templates/HIS.pdb
+++ b/templates/HIS.pdb
--- a/templates/ILE.pdb
+++ b/templates/ILE.pdb
--- a/templates/LEU.pdb
+++ b/templates/LEU.pdb
--- a/templates/LYS.pdb
+++ b/templates/LYS.pdb
--- a/templates/MET.pdb
+++ b/templates/MET.pdb
--- a/templates/NME.pdb
+++ b/templates/NME.pdb
--- a/templates/PHE.pdb
+++ b/templates/PHE.pdb
--- a/templates/PRO.pdb
+++ b/templates/PRO.pdb
--- a/templates/SER.pdb
+++ b/templates/SER.pdb
--- a/templates/THR.pdb
+++ b/templates/THR.pdb
--- a/templates/TRP.pdb
+++ b/templates/TRP.pdb
--- a/templates/TYR.pdb
+++ b/templates/TYR.pdb
--- a/templates/U.pdb
+++ b/templates/U.pdb
--- a/templates/VAL.pdb
+++ b/templates/VAL.pdb
--- a/ui.py
+++ b/ui.py
@@ -6,6 +6,7 @@ import uiserver
 import webbrowser
 import os.path
 import gzip
+import time
 from io import BytesIO
 try:
    from urllib.request import urlopen
@@ -204,4 +205,15 @@ def launchUI():
    uiserver.beginServing()
    uiserver.setCallback(controlsCallback, "/controls")
    displayStartPage()
-    webbrowser.open('http://localhost:'+str(uiserver.server.server_address[1]))
+    url = 'http://localhost:'+str(uiserver.server.server_address[1])
+    print("PDBFixer running: %s " % url)
+    webbrowser.open(url)
+
+    # the uiserver is running in a background daemon thread that dies whenever
+    # the main thread exits. So, to keep the whole process alive, we just sleep
+    # here in the main thread. When Control-C is called, the main thread shuts
+    # down and then the uiserver exits. Without this daemon/sleep combo, the
+    # process cannot be killed with Control-C. Reference stack overflow link:
+    # http://stackoverflow.com/a/11816038/1079728
+    while True:
+        time.sleep(0.5)
--- a/uiserver.py
+++ b/uiserver.py
@@ -69,7 +69,9 @@ callback = {}
 server = _ThreadingHTTPServer(("localhost", 8000), _Handler)

 def beginServing():
-    Thread(target=server.serve_forever).start()
+    t = Thread(target=server.serve_forever)
+    t.daemon = True
+    t.start()

 def setContent(newContent):
    global content
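
Note on the ui.py and uiserver.py changes: together they implement the pattern described in the comment above, a server running in a daemon thread while the main thread sleeps so that Control-C can end the process. The following is a minimal, self-contained sketch of that pattern using only the standard library. `HelloHandler`, `begin_serving`, `wait_for_interrupt`, and `fetch_once` are hypothetical names, the handler stands in for PDBFixer's `_Handler`, and the sketch binds to an ephemeral port instead of port 8000. Unlike the diff, `wait_for_interrupt` catches `KeyboardInterrupt` rather than letting it propagate.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler standing in for PDBFixer's _Handler."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def begin_serving(server):
    """Run server.serve_forever in a daemon thread and return the thread."""
    t = threading.Thread(target=server.serve_forever)
    t.daemon = True  # the thread is killed when the main thread exits
    t.start()
    return t


def wait_for_interrupt(poll_seconds=0.5):
    """Sleep in the main thread until Control-C raises KeyboardInterrupt."""
    try:
        while True:
            time.sleep(poll_seconds)
    except KeyboardInterrupt:
        pass


def fetch_once():
    """Start a server on an ephemeral port, fetch one page, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = begin_serving(server)
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", "/")
        body = conn.getresponse().read()
        conn.close()
        return thread.daemon, body
    finally:
        server.shutdown()
        server.server_close()
```

A long-running program would call `begin_serving(server)` followed by `wait_for_interrupt()`; `fetch_once()` exists only to exercise the server without blocking.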

--- a/setup.py
+++ b/setup.py
+"""pdbfixer: Fixes problems in PDB files
+
+Protein Data Bank (PDB) files often have a number of problems that must be
+fixed before they can be used in a molecular dynamics simulation. The details
+vary depending on how the file was generated. Here are some of the most common
+ones:
+
+- If the structure was generated by X-ray crystallography, most or all of the
+  hydrogen atoms will usually be missing.
+- There may also be missing heavy atoms in flexible regions that could not be
+  clearly resolved from the electron density. This may include anything from a
+  few atoms at the end of a sidechain to entire loops.
+- Many PDB files are also missing terminal atoms that should be present at the 
+  ends of chains.
+- The file may include nonstandard residues that were added for crystallography
+  purposes, but are not present in the naturally occurring molecule you want to
+  simulate.
+- The file may include more than what you want to simulate. For example, there
+  may be salts, ligands, or other molecules that were added for experimental
+  purposes. Or the crystallographic unit cell may contain multiple copies of a
+  protein, but you only want to simulate a single copy.
+- There may be multiple locations listed for some atoms.
+- If you want to simulate the structure in explicit solvent, you will need to
+  add a water box surrounding it.
+
+PDBFixer can fix all of these problems for you in a fully automated way. You
+simply select a file, tell it which problems to fix, and it does everything else.
+"""
+from __future__ import print_function
+import os
+import sys
+from os.path import relpath, join
+from setuptools import setup, find_packages
+DOCLINES = __doc__.split("\n")
+
+########################
+__version__ = '1.0'
+VERSION = __version__
+ISRELEASED = False
+########################
+CLASSIFIERS = """\
+Development Status :: 3 - Alpha
+Intended Audience :: Science/Research
+Intended Audience :: Developers
+License :: OSI Approved :: MIT License
+Programming Language :: Python
+Programming Language :: Python :: 3
+Topic :: Scientific/Engineering :: Bio-Informatics
+Topic :: Scientific/Engineering :: Chemistry
+Operating System :: Microsoft :: Windows
+Operating System :: POSIX
+Operating System :: Unix
+Operating System :: MacOS
+"""
+
+
+def find_package_data():
+    files = []
+    for root, dirnames, filenames in os.walk('pdbfixer'):
+        for fn in filenames:
+            files.append(relpath(join(root, fn), 'pdbfixer'))
+    return files
+
+
+def check_dependencies():
+    from distutils.version import StrictVersion
+    found_openmm = True
+    found_openmm_52_or_later = True
+    found_numpy = True
+
+    try:
+        from simtk import openmm
+        openmm_version = StrictVersion(openmm.Platform.getOpenMMVersion())
+        if openmm_version < StrictVersion('5.2'):
+            found_openmm_52_or_later = False
+    except ImportError as err:
+        found_openmm = False
+
+    try:
+        import numpy
+    except ImportError:
+        found_numpy = False
+
+    msg = None
+    bar = ('-' * 70) + "\n" + ('-' * 70)
+    if found_openmm:
+        if not found_openmm_52_or_later:
+            msg = [bar, '[Unmet Dependency] PDBFixer requires OpenMM version 5.2 or later. You have version %s.' % openmm_version, bar]
+    else:
+        msg = [bar, '[Unmet Dependency] PDBFixer requires the OpenMM python package. Refer to <http://openmm.org> for details and installation instructions.', bar]
+
+    if not found_numpy:
+        msg = [bar, '[Unmet Dependency] PDBFixer requires the numpy python package. Refer to <http://www.scipy.org/scipylib/download.html> for numpy installation instructions.', bar]
+
+    if msg is not None:
+        import textwrap
+        print()
+        print(os.linesep.join([line for e in msg for line in textwrap.wrap(e)]), file=sys.stderr)
+        #print('\n'.join(list(textwrap.wrap(e) for e in msg)))
+
+setup(
+    name='pdbfixer',
+    author='Peter Eastman',
+    description=DOCLINES[0],
+    long_description="\n".join(DOCLINES[2:]),
+    version=__version__,
+    license='MIT',
+    url='https://github.com/peastman/pdbfixer',
+    platforms=['Linux', 'Mac OS-X', 'Unix', 'Windows'],
+    classifiers=CLASSIFIERS.splitlines(),
+    packages=find_packages(),
+    package_data={'pdbfixer': find_package_data()},
+    zip_safe=False,
+    entry_points={'console_scripts': ['pdbfixer = pdbfixer.pdbfixer:main']})
+
+check_dependencies()
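
Note on `check_dependencies()`: as written in the diff, a missing NumPy message replaces any OpenMM message already stored in `msg`, so only one unmet dependency is reported. In addition, `distutils` (and with it `StrictVersion`) was removed from the standard library in Python 3.12. The sketch below is one possible alternative, not the code in this commit: it compares versions as integer tuples and collects every problem. `parse_version` and `unmet_dependencies` are hypothetical helpers, and the parser assumes purely numeric dotted versions such as `5.2` or `5.2.1`.

```python
def parse_version(text):
    """Turn '5.2' or '5.2.1' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def unmet_dependencies(openmm_version, numpy_found, minimum="5.2"):
    """Return a list of messages, one per unmet dependency.

    openmm_version is a version string, or None when OpenMM is not installed.
    """
    problems = []
    if openmm_version is None:
        problems.append("PDBFixer requires the OpenMM python package.")
    elif parse_version(openmm_version) < parse_version(minimum):
        problems.append(
            "PDBFixer requires OpenMM version %s or later. You have version %s."
            % (minimum, openmm_version))
    if not numpy_found:
        problems.append("PDBFixer requires the numpy python package.")
    return problems
```

Comparing tuples rather than strings matters for versions such as `5.10`, which sorts before `5.2` as text but after it numerically.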