------------------------------------------------------------------- Wed Nov 20 20:17:31 UTC 2019 - Todd R - Update to version 1.75 * The restriction enzyme list in Bio.Restriction has been updated to the August 2019 release of REBASE. * ``Bio.SeqIO`` now supports reading and writing files in the native format of Christian Marck's DNA Strider program ("xdna" format, also used by Serial Cloner), as well as reading files in the native formats of GSL Biotech's SnapGene ("snapgene") and Textco Biosoftware's Gene Construction Kit ("gck"). * ``Bio.AlignIO`` now supports GCG MSF multiple sequence alignments as the "msf" format (work funded by the National Marrow Donor Program). * The main ``Seq`` object now has string-like ``.index()`` and ``.rindex()`` methods, matching the existing ``.find()`` and ``.rfind()`` implementations. The ``MutableSeq`` object retains its more list-like ``.index()`` behaviour. * The ``MMTFIO`` class has been added that allows writing of MMTF file format files from a Biopython structure object. ``MMTFIO`` has a similar interface to ``PDBIO`` and ``MMCIFIO``, including the use of a ``Select`` class to write out a specified selection. This final addition to read/write support for PDB/mmCIF/MMTF in Biopython allows conversion between all three file formats. * Values from mmCIF files are now read in as a list even when they consist of a single value. This change improves consistency and reduces the likelihood of making an error, but will require user code to be updated accordingly. * ``Bio.PDB`` has been updated to support parsing REMARK 99 header entries from PDB-style Astral files. * A new keyword parameter ``full_sequences`` was added to ``Bio.pairwise2``'s pretty print method ``format_alignment`` to restore the output of local alignments to the 'old' format (showing the whole sequences including the un-aligned parts instead of only showing the aligned parts). * A new function ``charge_at_pH(pH)`` has been added to ``ProtParam`` and ``IsoelectricPoint`` in ``Bio.SeqUtils``. * The ``PairwiseAligner`` in ``Bio.Align`` was extended to allow generalized pairwise alignments, i.e. alignments of any Python object, for example three-letter amino acid sequences, three-nucleotide codons, and arrays of integers. * A new module ``substitution_matrices`` was added to ``Bio.Align``, which includes an ``Array`` class that can be used as a substitution matrix. As the ``Array`` class is a subclass of a numpy array, mathematical operations can be applied to it directly, and C code that makes use of substitution matrices can directly access the numerical values stored in the substitution matrices. This module is intended as a replacement of ``Bio.SubsMat``, which is currently unmaintained. * As in recent releases, more of our code is now explicitly available under either our original "Biopython License Agreement", or the very similar but more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details. * Additionally, a number of small bugs and typos have been fixed with further additions to the test suite, and there has been further work to follow the Python PEP8, PEP257 and best practice standard coding style. We have also started to use the ``black`` Python code formatting tool. ------------------------------------------------------------------- Tue Jul 23 01:23:01 UTC 2019 - Todd R - Update to version 1.74 * Our core sequence objects (``Seq``, ``UnknownSeq``, and ``MutableSeq``) now have a string-like ``.join()`` method. * The NCBI now allows longer accessions in the GenBank file LOCUS line, meaning the fields may not always follow the historical column based positions. We no longer give a warning when parsing these. We now allow writing such files (although with a warning as support for reading them is not yet widespread). * Support for the ``mysqlclient`` package, a fork of MySQLdb, has been added. * We now capture the IDcode field from PDB Header records. * ``Bio.pairwise2``'s pretty-print output from ``format_alignment`` has been optimized for local alignments: If they do not consist of the whole sequences, only the aligned section of the sequences are shown, together with the start positions of the sequences (in 1-based notation). Alignments of lists will now also be prettily printed. * ``Bio.SearchIO`` now supports parsing the text output of the HHsuite protein sequence search tool. The format name is ``hhsuite2-text`` and ``hhsuite3-text``, for versions 2 and 3 of HHsuite, respectively. * ``Bio.SearchIO`` HSP objects has a new attribute called ``output_index``. This attribute is meant for capturing the order by which the HSP were output in the parsed file and is set with a default value of -1 for all HSP objects. It is also used for sorting the output of ``QueryResult.hsps``. * ``Bio.SeqIO.AbiIO`` has been updated to preserve bytes value when parsing. The goal of this change is make the parser more robust by being able to extract string-values that are not utf-8-encoded. This affects all tag values, except for ID and description values, where they need to be extracted as strings to conform to the ``SeqRecord`` interface. In this case, the parser will attempt to decode using ``utf-8`` and fall back to the system encoding if that fails. This change affects Python 3 only. * ``Bio.motifs.mast`` has been updated to parse XML output files from MAST over the plain-text output file. The goal of this change is to parse a more structured data source with minimal loss of functionality upon future MAST releases. Class structure remains the same plus an additional attribute ``Record.strand_handling`` required for diagram parsing. * ``Bio.Entrez`` now automatically retries HTTP requests on failure. The maximum number of tries and the sleep between them can be configured by changing ``Bio.Entrez.max_tries`` and ``Bio.Entrez.sleep_between_tries``. (The defaults are 3 tries and 15 seconds, respectively.) * All tests using the older print-and-compare approach have been replaced by unittests following Python's standard testing framework. * On the documentation side, all the public modules, classes, methods and functions now have docstrings (built in help strings). Furthermore, the PDF version of the *Biopython Tutorial and Cookbook* now uses syntax coloring for code snippets. * Additionally, a number of small bugs and typos have been fixed with further additions to the test suite, and there has been further work to follow the Python PEP8, PEP257 and best practice standard coding style. ------------------------------------------------------------------- Fri Jan 4 17:31:38 UTC 2019 - Todd R - Update to version 1.73 * As in recent releases, more of our code is now explicitly available under either our original "Biopython License Agreement", or the very similar but more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details. * The dictionary-like indexing in SeqIO and SearchIO will now explicitly preserve record order to match a behaviour change in the Python standard dict object. This means looping over the index will load the records in the on-disk order, which will be much faster (previously it would be effectively at random, based on the key hash sorting). * The "grant" matrix in Bio.SubsMat.MatrixInfo has been replaced as our original values taken from Gerhard Vogt's old webpages at EMBL Heidelberg were discovered to be in error. The new values have been transformed following Vogt's approach, taking the global maximum 215 minus the similarity scores from the original paper Grantham (1974), to give a distance measure. * Additionally, a number of small bugs and typos have been fixed with further additions to the test suite, and there has been further work to follow the Python PEP8, PEP257 and best practice standard coding style. * Double-quote characters in GenBank feature qualifier values in ``Bio.SeqIO`` are now escaped as per the NCBI standard. Improperly escaped values trigger a warning on parsing. * There is a new command line wrapper for the BWA-MEM sequence mapper. * The string-based FASTA parsers in ``Bio.SeqIO.FastaIO`` have been optimised, which also speeds up parsing FASTA files using ``Bio.SeqIO.parse()``. - Update to version 1.72 * Internal changes to Bio.SeqIO have sped up the SeqRecord .format method and SeqIO.write (especially when used in a for loop). * The MAF alignment indexing in Bio.AlignIO.MafIO has been updated to use inclusive end co-ordinates to better handle searches at end points. This will require you to rebuild any existing MAF index files. * In this release more of our code is now explicitly available under either our original "Biopython License Agreement", or the very similar but more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details. * The Entrez module now supports the NCBI API key. Also you can now set a custom directory for DTD and XSD files. This allows Entrez to be used in environments like AWS Lambda, which restricts write access to specific directories. Improved support for parsing NCBI Entrez XML files that use XSD schemas. * Internal changes to our C code mean that NumPy is no longer required at compile time - only at run time (and only for those modules which use NumPy). * Seq, UnknownSeq, MutableSeq and derived classes now support integer multiplication methods, matching native Python string methods. * A translate method has been added to Bio.SeqFeature that will extract a feature and translate it using the codon_start and transl_table qualifiers of the feature if they are present. * Bio.SearchIO is no longer considered experimental, and so it does not raise warnings anymore when imported. * A new pairwise sequence aligner is available in Bio.Align, as an alternative to the existing pairwise sequence aligner in Bio.pairwise2. ------------------------------------------------------------------- Wed May 9 03:23:14 UTC 2018 - toddrme2178@gmail.com - Update to version 1.71 * Encoding issues have been fixed in several parsers when reading data files with non-ASCII characters, like accented letters in people's names. This would raise ``UnicodeDecodeError: 'ascii' codec can't decode byte ...`` under some system locale settings. * Bio.KEGG can now parse Gene files. * The multiple-sequence-alignment object used by Bio.AlignIO etc now supports a per-column annotation dictionary, useful for richly annotated alignments in the Stockholm/PFAM format. * The SeqRecord object now has a translate method, following the approach used for its existing reverse_complement method etc. * The output of function ``format_alignment`` in ``Bio.pairwise2`` for displaying a pairwise sequence alignment as text now indicates gaps and mis-matches. * Bio.SeqIO now supports reading and writing two-line-per-record FASTA files under the format name "fasta-2line", useful if you wish to work without line-wrapped sequences. * Bio.PDB now contains a writer for the mmCIF file format, which has been the standard PDB archive format since 2014. This allows structural objects to be written out and facilitates conversion between the PDB and mmCIF file formats. * Bio.Emboss.Applications has been updated to fix a wrong parameter in fuzznuc wrapper and include a new wrapper for fuzzpro. * The restriction enzyme list in Bio.Restriction has been updated to the November 2017 release of REBASE. * New codon tables 27-31 from NCBI (NCBI genetic code table version 4.2) were added to Bio.Data.CodonTable. Note that tables 27, 28 and 31 contain no dedicated stop codons; the stop codons in these codes have a context dependent encoding as either STOP or as amino acid. * In this release more of our code is now explicitly available under either our original "Biopython License Agreement", or the very similar but more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details. * IO functions such as ``SeqIO.parse`` now accept any objects which can be passed to the builtin ``open`` function. Specifically, this allows using ``pathlib.Path`` objects under Python 3.6 and newer, as per `PEP 519 `_. * Bio.SearchIO can now parse InterProScan XML files. * For Python 3 compatibility, comparision operators for the entities within a Bio.PDB Structure object were implemented. These allow the comparison of models, chains, residues, and atoms with the common operators (==, !=, >, ...) Comparisons are based on IDs and take the parents of the entity up to the model level into account. For consistent behaviour of all entities the operators for atoms were modified to also consider the parent IDs. NOTE: this represents a change in behaviour in respect to v1.70 for Atom comparisons. In order to mimic the behaviour of previous versions, comparison will have to be done for Atom IDs and alternative locations specifically. * Additionally, a number of small bugs and typos have been fixed with further additions to the test suite, and there has been further work to follow the Python PEP8, PEP257 and best practice standard coding style. - Update to version 1.70 * Biopython now has a new logo, contributed by Patrick Kunzmann. Drawing on our original logo and the current Python logo, this shows a yellow and blue snake forming a double helix. * For installation Biopython now assumes ``setuptools`` is present, and takes advantage of this to declare we require NumPy at install time (except under Jython). This should help ensure ``pip install biopython`` works smoothly. * Bio.AlignIO now supports Mauve's eXtended Multi-FastA (XMFA) file format under the format name "mauve" (contributed by Eric Rasche). * Bio.ExPASy was updated to fix fetching PROSITE and PRODOC records, and return text-mode handles for use under Python 3. * Two new arguments for reading and writing blast-xml files have been added to the Bio.SearchIO functions (read/parse and write, respectively). They are 'use_raw_hit_ids' and 'use_raw_query_ids'. Check out the relevant SearchIO.BlastIO documentation for a complete description of what these arguments do. * Bio.motifs was updated to support changes in MEME v4.11.4 output. * The Bio.Seq sequence objects now have a ``.count_overlap()`` method to supplement the Python string like non-overlap based ``.count()`` method. * The Bio.SeqFeature location objects can now be compared for equality. * Bio.Phylo.draw_graphviz is now deprecated. We recommend using Bio.Phylo.draw instead, or another library or program if more advanced plotting functionality is needed. * In Bio.Phylo.TreeConstruction, the DistanceMatrix class (previously _DistanceMatrix) has a new method 'format_phylip' to write Phylip-compatible distance matrix files (contributed by Jordan Willis). * Additionally, a number of small bugs have been fixed with further additions to the test suite, and there has been further work to follow the Python PEP8, PEP257 and best practice standard coding style. - Use license tag ------------------------------------------------------------------- Wed May 24 14:28:23 UTC 2017 - toddrme2178@gmail.com - Implement single-spec version - Fix source URL. - updated to version 1.69 * We now expect and take advantage of NumPy under PyPy, and compile most of the Biopython C code modules as well. * Bio.AlignIO now supports the UCSC Multiple Alignment Format (MAF) under the format name "maf", using new module Bio.AlignIO.MafIO which also offers indexed access to these potentially large files using SQLite3 (contributed by Andrew Sczesnak, with additional refinements from Adam Novak). * Bio.SearchIO.AbiIO has been extended to support parsing FSA files. The underlying format (ABIF) remains the same as AB1 files and so the string 'abif' is the expected format argument in the main SeqIO functions. AbiIO determines whether the file is AB1 or FSA based on the presence of specific tags. * The Uniprot parser is now able to parse "submittedName" elements in XML files. * The NEXUS parser handling of internal node comments has been improved, which should help if working with tools like the BEAST TreeAnnotator. Slashes are now also allowed in identifiers. * New parser for ExPASy Cellosaurus, a cell line database, cell line catalogue, and cell line ontology (contributed by Steve Marshall). * For consistency the Bio.Seq module now offers a complement function (already available as a method on the Seq and MutableSeq objects). * The SeqFeature object's qualifiers is now an explicitly ordered dictionary (note that as of Python 3.6 the Python dict is ordered by default anyway). This helps reproduce GenBank/EMBL files on input/output. * The Bio.SeqIO UniProt-XML parser was updated to cope with features with unknown locations which can be found in mass spec data. * The Bio.SeqIO GenBank, EMBL, and IMGT parsers now record the molecule type from the LOCUS/ID line explicitly in the record.annotations dictionary. The Bio.SeqIO EMBL parser was updated to cope with more variants seen in patent data files, and the related IMGT parser was updated to cope with IPD-IMGT/HLA database files after release v3.16.0 when their ID line changed. The GenBank output now uses colon space to match current NCBI DBLINK lines. * The Bio.Affy package supports Affymetrix version 4 of the CEL file format, in addition to version 3. * The restriction enzyme list in Bio.Restriction has been updated to the February 2017 release of REBASE. * Bio.PDB.PDBList now can download PDBx/mmCif (new default), PDB (old default), PDBML/XML and mmtf format protein structures. This is inline with the RCSB recommendation to use PDBx/mmCif and deprecate the PDB file format. Biopython already has support for parsing mmCif files. * Additionally, a number of small bugs have been fixed with further additions to the test suite, and there has been further work to follow the Python PEP8, PEP257 and best practice standard coding style. ------------------------------------------------------------------- Thu Nov 17 10:10:59 UTC 2016 - alinm.elena@gmail.com - updated to version 1.68 ------------------------------------------------------------------- Mon Dec 9 16:00:01 UTC 2013 - toddrme2178@gmail.com - Update to version 1.63 * 2to3 no longer needed for python 3 - Added additional dependencies ------------------------------------------------------------------- Thu Sep 19 02:06:32 UTC 2013 - highwaystar.ru@gmail.com - upgrade to version 1.62 * The translation functions will give a warning on any partial codons * Phylo module now supports the file formats NeXML and CDAO * New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats from UniProt-GOA. * The BioSQL module is now supported in Jython. * Feature labels on circular GenomeDiagram figures now support the label_position argument (start, middle or end) * The code for parsing 3D structures in mmCIF files was updated to use the Python standard library's shlex module instead of C code using flex. * The Bio.Sequencing.Applications module now includes a BWA command line wrapper. * Bio.motifs supports JASPAR format files with multiple position-frequence matrices. ------------------------------------------------------------------- Wed Feb 1 14:09:33 UTC 2012 - saschpe@suse.de - Ran spec-cleaner - Set license to MIT (looks like it) ------------------------------------------------------------------- Wed Jan 11 14:56:08 UTC 2012 - toddrme2178@gmail.com - Cleaned up spec file ------------------------------------------------------------------- Thu Sep 8 19:36:32 UTC 2011 - alinm.elena@gmail.com - Initial commit