python-biopython/python-biopython.changes
Dirk Mueller 754d74b384 - update to 1.81

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-biopython?expand=0&rev=19
2023-02-15 14:04:46 +00:00

-------------------------------------------------------------------
Wed Feb 15 14:03:39 UTC 2023 - Dirk Müller <dmueller@suse.com>
- update to 1.81:
* The API documentation and the `Biopython Tutorial and
Cookbook` have been updated to better annotate use and
application of the ``Bio.PDB.internal_coords`` module.
* ``Bio.Phylo`` now supports ``Alignment`` and
``MultipleSeqAlignment`` objects as input.
* Several improvements and bug fixes to the snapgene parser
-------------------------------------------------------------------
Wed Jan 4 14:19:49 UTC 2023 - Dirk Müller <dmueller@suse.com>
- update to 1.80:
* This release of Biopython supports Python 3.7, 3.8, 3.9, 3.10, 3.11. It
has also been tested on PyPy3.7 v7.3.5.
* Functions ``read``, ``parse``, and ``write`` were added to ``Bio.Align``
to read and write ``Alignment`` objects.
* Because dict retains the item order by default since Python 3.6, all
instances of ``collections.OrderedDict`` have been replaced by either standard
``dict`` or where appropriate by ``collections.defaultdict``.
* The ``Bio.motifs.jaspar.db`` now returns ``tf_family`` and ``tf_class``
as a string array since the JASPAR 2018 release.
* The Local Composition Complexity functions from ``Bio.SeqUtils`` now
use base 4 log instead of 2, as stated in the original reference: Konopka
(2005), *Sequence Complexity and Composition*. https://doi.org/10.1038/npg.els.0005260
* Append mode is now supported in ``Bio.bgzf`` (and a bug parsing blocked
GZIP files with an internal empty block fixed).
* The experimental warning was dropped from ``Bio.phenotype`` (which was
new in Biopython 1.67).
* Sequences now have a ``defined`` attribute that returns a boolean
indicating if the underlying data is defined or not.
* The ``Bio.PDB`` module now includes a structural alignment module, using
the combinatorial extension algorithm of Shindyalov and Bourne, commonly
known as CEAlign. The module allows for two structures to be aligned based solely
on their 3D conformation, i.e. in a sequence-independent manner. The method
is particularly powerful when the structures share a very low degree of
sequence similarity. The new module is available in ``Bio.PDB.CEAligner`` with an
interface similar to other 3D superimposition modules.
* A new module ``Bio.PDB.qcprot`` implements the QCP superposition
algorithm in pure Python, deprecating the existing C implementation. This leads to a
slight performance improvement and to much better maintainability. The
refactored ``qcprot.QCPSuperimposer`` class has small changes to its API, to better
mirror that of ``Bio.PDB.Superimposer``.
* The ``Bio.PDB.PDBList`` module now allows downloading biological
assemblies, for one or more entries of the wwPDB.
* In the ``Bio.Restriction`` module, each restriction enzyme now includes
an `id` property giving the numerical identifier of the REBASE database
entry from which the enzyme object was created, and a `uri` property with a
canonical `identifiers.org` link to the database, for use in linked-data
representations.
* Added a new ``gc_fraction`` function in ``SeqUtils`` and marked ``GC`` for
future deprecation.
* Support for the old format (dating back to 2004) of the GN line in
SwissProt files was dropped in ``Bio.SwissProt``.
* Additionally, a number of small bugs and typos have been fixed with
additions to the test suite.
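
To illustrate the ``gc_fraction`` entry above, here is a plain-Python stand-in.
It is a sketch, not Biopython's implementation: the helper name
``gc_fraction_sketch`` is hypothetical, it assumes the result is expressed as
a fraction between 0 and 1, and it ignores ambiguity codes entirely.

```python
def gc_fraction_sketch(sequence: str) -> float:
    """Return the share of G and C letters in ``sequence`` as a value in [0, 1].

    Hypothetical helper for illustration only; ambiguity codes are not handled.
    """
    if not sequence:
        # Avoid dividing by zero for an empty sequence.
        return 0.0
    upper = sequence.upper()
    gc_count = upper.count("G") + upper.count("C")
    return gc_count / len(upper)


# Six of the eight letters are G or C.
fraction = gc_fraction_sketch("ACGTGGCC")
```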
-------------------------------------------------------------------
Sun Mar 27 13:57:19 UTC 2022 - Dirk Müller <dmueller@suse.com>
- update to 1.79:
* This is intended to be our final release supporting Python 3.6. It also
supports Python 3.7, 3.8 and 3.9, and has also been tested on PyPy3.6.1 v7.1.1.
* For a detailed list of changes, see
https://github.com/biopython/biopython/blob/biopython-179/NEWS.rst#1-june-2021-biopython-179
-------------------------------------------------------------------
Sat Feb 20 19:29:23 UTC 2021 - andy great <andythe_great@pm.me>
- Update to version 1.78.
* The main change is that Bio.Alphabet is no longer used. In some
cases you will now have to specify expected letters, molecule
type (DNA, RNA, protein), or gap character explicitly.
* Bio.SeqIO.parse() is faster with "fastq" format due to small
improvements in the Bio.SeqIO.QualityIO module.
* The SeqFeature object's .extract() method can now be used for
trans-spliced locations via an optional dictionary of references.
* As in recent releases, more of our code is now explicitly
available under either our original "Biopython License Agreement",
or the very similar but more commonly used "3-Clause BSD License".
See the LICENSE.rst file for more details.
* Additionally, a number of small bugs and typos have been fixed
with additions to the test suite. There has been further work to
follow the Python PEP8, PEP257 and best practice standard coding
style, and all of the tests have been reformatted with the black
tool to match the main code base.
- Skip python36 because numpy no longer supports it.
-------------------------------------------------------------------
Tue Nov 3 15:58:16 UTC 2020 - Matej Cepl <mcepl@suse.com>
- Remove ridiculously wide find commands in %prep, which break a lot of
(binary) files.
-------------------------------------------------------------------
Wed Jul 8 07:31:29 UTC 2020 - Marketa Calabkova <mcalabkova@suse.com>
- Update to version 1.77
* **We have dropped support for Python 2 now.**
* ``pairwise2`` now allows the input of parameters with keywords and returns the
alignments as a list of ``namedtuples``.
* The codon tables have been updated to NCBI genetic code table version 4.5,
which adds Cephalodiscidae mitochondrial as table 33.
* Updated ``Bio.Restriction`` to the January 2020 release of REBASE.
* A major contribution by Rob Miller to ``Bio.PDB`` provides new methods to
handle protein structure transformations using dihedral angles (internal
coordinates). The new framework supports lossless interconversion between
internal and cartesian coordinates, which, among other uses, simplifies the
analysis and manipulation of coordinates of protein structures.
* ``PDBParser`` and ``PDBIO`` now support PQR format file parsing and
input/output.
* In addition to the mainstream ``x86_64`` aka ``AMD64`` CPU architecture, we
now also test every contribution on the ``ARM64``, ``ppc64le``, and ``s390x``
CPUs under Linux thanks to Travis CI. Further post-release testing done by
Debian and other packagers and distributors of Biopython also covers these
CPUs.
* ``Bio.motifs.PositionSpecificScoringMatrix.search()`` method has been
re-written: it now applies ``.calculate()`` to chunks of the sequence
to maintain a low memory footprint for long sequences.
* Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite. There has been further work to follow the Python
PEP8, PEP257 and best practice standard coding style, and more of the code
style has been reformatted with the ``black`` tool.
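
The chunked ``search()`` entry above can be pictured with the following
sketch. It is an assumption-laden illustration, not the Biopython code: the
names ``score_window`` and ``chunked_search`` are hypothetical, the scoring
matrix is a plain list of dicts, and only the forward strand is scanned.
Consecutive chunks overlap by ``motif_length - 1`` letters so that no window
is lost at a chunk boundary, while only one chunk is scored at a time.

```python
def score_window(window, scores):
    """Sum the per-position scores for one window of the sequence."""
    return sum(scores[pos][letter] for pos, letter in enumerate(window))


def chunked_search(sequence, scores, threshold, chunk_size=1000):
    """Yield (position, score) for every window scoring at least ``threshold``.

    ``chunk_size`` is the number of window start positions handled per chunk.
    """
    motif_length = len(scores)
    last_start = len(sequence) - motif_length
    for chunk_start in range(0, last_start + 1, chunk_size):
        # Extend the chunk so its final start position still has a full window.
        chunk = sequence[chunk_start:chunk_start + chunk_size + motif_length - 1]
        for offset in range(len(chunk) - motif_length + 1):
            window = chunk[offset:offset + motif_length]
            score = score_window(window, scores)
            if score >= threshold:
                yield chunk_start + offset, score


# Toy two-column matrix that rewards the motif "AC".
toy_scores = [
    {"A": 1, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 0},
]
hits_small_chunks = list(chunked_search("ACGACAC", toy_scores, 2, chunk_size=2))
hits_one_chunk = list(chunked_search("ACGACAC", toy_scores, 2, chunk_size=1000))
```

Both calls report the same hits; the chunk size only changes how much of the
sequence is held and scored at once.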
-------------------------------------------------------------------
Wed Nov 20 20:17:31 UTC 2019 - Todd R <toddrme2178@gmail.com>
- Update to version 1.75
* The restriction enzyme list in Bio.Restriction has been updated to the August
2019 release of REBASE.
* ``Bio.SeqIO`` now supports reading and writing files in the native format of
Christian Marck's DNA Strider program ("xdna" format, also used by Serial
Cloner), as well as reading files in the native formats of GSL Biotech's
SnapGene ("snapgene") and Textco Biosoftware's Gene Construction Kit ("gck").
* ``Bio.AlignIO`` now supports GCG MSF multiple sequence alignments as the "msf"
format (work funded by the National Marrow Donor Program).
* The main ``Seq`` object now has string-like ``.index()`` and ``.rindex()``
methods, matching the existing ``.find()`` and ``.rfind()`` implementations.
The ``MutableSeq`` object retains its more list-like ``.index()`` behaviour.
* The ``MMTFIO`` class has been added that allows writing of MMTF file format
files from a Biopython structure object. ``MMTFIO`` has a similar interface to
``PDBIO`` and ``MMCIFIO``, including the use of a ``Select`` class to write
out a specified selection. This final addition to read/write support for
PDB/mmCIF/MMTF in Biopython allows conversion between all three file formats.
* Values from mmCIF files are now read in as a list even when they consist of a
single value. This change improves consistency and reduces the likelihood of
making an error, but will require user code to be updated accordingly.
* ``Bio.PDB`` has been updated to support parsing REMARK 99 header entries from
PDB-style Astral files.
* A new keyword parameter ``full_sequences`` was added to ``Bio.pairwise2``'s
pretty print method ``format_alignment`` to restore the output of local
alignments to the 'old' format (showing the whole sequences including the
un-aligned parts instead of only showing the aligned parts).
* A new function ``charge_at_pH(pH)`` has been added to ``ProtParam`` and
``IsoelectricPoint`` in ``Bio.SeqUtils``.
* The ``PairwiseAligner`` in ``Bio.Align`` was extended to allow generalized
pairwise alignments, i.e. alignments of any Python object, for example
three-letter amino acid sequences, three-nucleotide codons, and arrays of
integers.
* A new module ``substitution_matrices`` was added to ``Bio.Align``, which
includes an ``Array`` class that can be used as a substitution matrix. As
the ``Array`` class is a subclass of a numpy array, mathematical operations
can be applied to it directly, and C code that makes use of substitution
matrices can directly access the numerical values stored in the substitution
matrices. This module is intended as a replacement of ``Bio.SubsMat``,
which is currently unmaintained.
* As in recent releases, more of our code is now explicitly available under
either our original "Biopython License Agreement", or the very similar but
more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for
more details.
* Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style. We have also
started to use the ``black`` Python code formatting tool.
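
For the ``.index()`` / ``.rindex()`` entry above, "string-like" refers to the
behaviour of Python's built-in ``str``. The sketch below uses a plain ``str``
rather than a ``Seq`` object, on the assumption that ``Seq`` mirrors these
semantics as the entry describes.

```python
dna = "ATGGCCATTG"

# find() reports a missing substring by returning -1 ...
missing_position = dna.find("TTT")

# ... whereas index() raises ValueError instead.
try:
    dna.index("TTT")
    index_raised = False
except ValueError:
    index_raised = True

# Both agree when the substring is present; rindex() searches from the right.
first_at = dna.index("AT")
last_at = dna.rindex("AT")
```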
-------------------------------------------------------------------
Tue Jul 23 01:23:01 UTC 2019 - Todd R <toddrme2178@gmail.com>
- Update to version 1.74
* Our core sequence objects (``Seq``, ``UnknownSeq``, and ``MutableSeq``) now
have a string-like ``.join()`` method.
* The NCBI now allows longer accessions in the GenBank file LOCUS line, meaning
the fields may not always follow the historical column based positions. We
no longer give a warning when parsing these. We now allow writing such files
(although with a warning as support for reading them is not yet widespread).
* Support for the ``mysqlclient`` package, a fork of MySQLdb, has been added.
* We now capture the IDcode field from PDB Header records.
* ``Bio.pairwise2``'s pretty-print output from ``format_alignment`` has been
optimized for local alignments: If they do not consist of the whole sequences,
only the aligned section of the sequences is shown, together with the start
positions of the sequences (in 1-based notation). Alignments of lists will now
also be prettily printed.
* ``Bio.SearchIO`` now supports parsing the text output of the HHsuite protein
sequence search tool. The format names are ``hhsuite2-text`` and
``hhsuite3-text``, for versions 2 and 3 of HHsuite, respectively.
* ``Bio.SearchIO`` HSP objects have a new attribute called ``output_index``. This
attribute is meant for capturing the order by which the HSP were output in the
parsed file and is set with a default value of -1 for all HSP objects. It is
also used for sorting the output of ``QueryResult.hsps``.
* ``Bio.SeqIO.AbiIO`` has been updated to preserve bytes value when parsing. The
goal of this change is to make the parser more robust by being able to extract
string-values that are not utf-8-encoded. This affects all tag values, except
for ID and description values, where they need to be extracted as strings
to conform to the ``SeqRecord`` interface. In this case, the parser will
attempt to decode using ``utf-8`` and fall back to the system encoding if that
fails. This change affects Python 3 only.
* ``Bio.motifs.mast`` has been updated to parse XML output files from MAST over
the plain-text output file. The goal of this change is to parse a more
structured data source with minimal loss of functionality upon future MAST
releases. Class structure remains the same plus an additional attribute
``Record.strand_handling`` required for diagram parsing.
* ``Bio.Entrez`` now automatically retries HTTP requests on failure. The
maximum number of tries and the sleep between them can be configured by
changing ``Bio.Entrez.max_tries`` and ``Bio.Entrez.sleep_between_tries``.
(The defaults are 3 tries and 15 seconds, respectively.)
* All tests using the older print-and-compare approach have been replaced by
unittests following Python's standard testing framework.
* On the documentation side, all the public modules, classes, methods and
functions now have docstrings (built in help strings). Furthermore, the PDF
version of the *Biopython Tutorial and Cookbook* now uses syntax coloring
for code snippets.
* Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style.
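
The retry behaviour described for ``Bio.Entrez`` above follows a common
pattern, sketched here in plain Python. This is not the Biopython code: the
function ``fetch_with_retries`` and the ``sleep`` parameter are hypothetical,
and treating ``OSError`` as the retryable failure is an assumption. Only the
default values (3 tries, 15 seconds) are taken from the entry.

```python
import time


def fetch_with_retries(request, max_tries=3, sleep_between_tries=15, sleep=time.sleep):
    """Call ``request()`` until it succeeds or ``max_tries`` attempts are used."""
    for attempt in range(1, max_tries + 1):
        try:
            return request()
        except OSError:
            # Give up after the final attempt; otherwise wait and try again.
            if attempt == max_tries:
                raise
            sleep(sleep_between_tries)


# Usage with a stand-in request that fails twice before succeeding.
attempts = []


def flaky_request():
    attempts.append(1)
    if len(attempts) < 3:
        raise OSError("temporary network failure")
    return "ok"


# A no-op sleep keeps the demonstration fast.
result = fetch_with_retries(flaky_request, sleep=lambda seconds: None)
```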
-------------------------------------------------------------------
Fri Jan 4 17:31:38 UTC 2019 - Todd R <toddrme2178@gmail.com>
- Update to version 1.73
* As in recent releases, more of our code is now explicitly available under
either our original "Biopython License Agreement", or the very similar but
more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for
more details.
* The dictionary-like indexing in SeqIO and SearchIO will now explicitly preserve
record order to match a behaviour change in the Python standard dict object.
This means looping over the index will load the records in the on-disk order,
which will be much faster (previously it would be effectively at random, based
on the key hash sorting).
* The "grant" matrix in Bio.SubsMat.MatrixInfo has been replaced as our original
values taken from Gerhard Vogt's old webpages at EMBL Heidelberg were
discovered to be in error. The new values have been transformed following
Vogt's approach, taking the global maximum 215 minus the similarity scores
from the original paper Grantham (1974), to give a distance measure.
* Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style.
* Double-quote characters in GenBank feature qualifier values in ``Bio.SeqIO``
are now escaped as per the NCBI standard. Improperly escaped values trigger a
warning on parsing.
* There is a new command line wrapper for the BWA-MEM sequence mapper.
* The string-based FASTA parsers in ``Bio.SeqIO.FastaIO`` have been optimised,
which also speeds up parsing FASTA files using ``Bio.SeqIO.parse()``.
- Update to version 1.72
* Internal changes to Bio.SeqIO have sped up the SeqRecord .format method and
SeqIO.write (especially when used in a for loop).
* The MAF alignment indexing in Bio.AlignIO.MafIO has been updated to use
inclusive end co-ordinates to better handle searches at end points. This
will require you to rebuild any existing MAF index files.
* In this release more of our code is now explicitly available under either our
original "Biopython License Agreement", or the very similar but more commonly
used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details.
* The Entrez module now supports the NCBI API key. Also you can now set a custom
directory for DTD and XSD files. This allows Entrez to be used in environments
like AWS Lambda, which restricts write access to specific directories.
Improved support for parsing NCBI Entrez XML files that use XSD schemas.
* Internal changes to our C code mean that NumPy is no longer required at
compile time - only at run time (and only for those modules which use NumPy).
* Seq, UnknownSeq, MutableSeq and derived classes now support integer
multiplication methods, matching native Python string methods.
* A translate method has been added to Bio.SeqFeature that will extract a
feature and translate it using the codon_start and transl_table qualifiers
of the feature if they are present.
* Bio.SearchIO is no longer considered experimental, and so it does not raise
warnings anymore when imported.
* A new pairwise sequence aligner is available in Bio.Align, as an alternative
to the existing pairwise sequence aligner in Bio.pairwise2.
-------------------------------------------------------------------
Wed May 9 03:23:14 UTC 2018 - toddrme2178@gmail.com
- Update to version 1.71
* Encoding issues have been fixed in several parsers when reading data files
with non-ASCII characters, like accented letters in people's names. This would
raise ``UnicodeDecodeError: 'ascii' codec can't decode byte ...`` under some
system locale settings.
* Bio.KEGG can now parse Gene files.
* The multiple-sequence-alignment object used by Bio.AlignIO etc now supports
a per-column annotation dictionary, useful for richly annotated alignments
in the Stockholm/PFAM format.
* The SeqRecord object now has a translate method, following the approach used
for its existing reverse_complement method etc.
* The output of function ``format_alignment`` in ``Bio.pairwise2`` for displaying
a pairwise sequence alignment as text now indicates gaps and mis-matches.
* Bio.SeqIO now supports reading and writing two-line-per-record FASTA files
under the format name "fasta-2line", useful if you wish to work without
line-wrapped sequences.
* Bio.PDB now contains a writer for the mmCIF file format, which has been the
standard PDB archive format since 2014. This allows structural objects to be
written out and facilitates conversion between the PDB and mmCIF file formats.
* Bio.Emboss.Applications has been updated to fix a wrong parameter in fuzznuc
wrapper and include a new wrapper for fuzzpro.
* The restriction enzyme list in Bio.Restriction has been updated to the
November 2017 release of REBASE.
* New codon tables 27-31 from NCBI (NCBI genetic code table version 4.2)
were added to Bio.Data.CodonTable. Note that tables 27, 28 and 31 contain
no dedicated stop codons; the stop codons in these codes have a context
dependent encoding as either STOP or as amino acid.
* In this release more of our code is now explicitly available under either our
original "Biopython License Agreement", or the very similar but more commonly
used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details.
* IO functions such as ``SeqIO.parse`` now accept any objects which can be passed
to the builtin ``open`` function. Specifically, this allows using
``pathlib.Path`` objects under Python 3.6 and newer, as per `PEP 519
<https://www.python.org/dev/peps/pep-0519/>`_.
* Bio.SearchIO can now parse InterProScan XML files.
* For Python 3 compatibility, comparison operators for the entities within a
Bio.PDB Structure object were implemented. These allow the comparison of
models, chains, residues, and atoms with the common operators (==, !=, >, ...).
Comparisons are based on IDs and take the parents of the entity up to the
model level into account. For consistent behaviour of all entities the operators
for atoms were modified to also consider the parent IDs. NOTE: this represents a
change in behaviour in respect to v1.70 for Atom comparisons. In order to mimic
the behaviour of previous versions, comparison will have to be done for Atom IDs
and alternative locations specifically.
* Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style.
- Update to version 1.70
* Biopython now has a new logo, contributed by Patrick Kunzmann. Drawing on our
original logo and the current Python logo, this shows a yellow and blue snake
forming a double helix.
* For installation Biopython now assumes ``setuptools`` is present, and takes
advantage of this to declare we require NumPy at install time (except under
Jython). This should help ensure ``pip install biopython`` works smoothly.
* Bio.AlignIO now supports Mauve's eXtended Multi-FastA (XMFA) file format
under the format name "mauve" (contributed by Eric Rasche).
* Bio.ExPASy was updated to fix fetching PROSITE and PRODOC records, and return
text-mode handles for use under Python 3.
* Two new arguments for reading and writing blast-xml files have been added
to the Bio.SearchIO functions (read/parse and write, respectively). They
are 'use_raw_hit_ids' and 'use_raw_query_ids'. Check out the relevant
SearchIO.BlastIO documentation for a complete description of what these
arguments do.
* Bio.motifs was updated to support changes in MEME v4.11.4 output.
* The Bio.Seq sequence objects now have a ``.count_overlap()`` method to
supplement the Python string like non-overlap based ``.count()`` method.
* The Bio.SeqFeature location objects can now be compared for equality.
* Bio.Phylo.draw_graphviz is now deprecated. We recommend using Bio.Phylo.draw
instead, or another library or program if more advanced plotting functionality
is needed.
* In Bio.Phylo.TreeConstruction, the DistanceMatrix class (previously
_DistanceMatrix) has a new method 'format_phylip' to write Phylip-compatible
distance matrix files (contributed by Jordan Willis).
* Additionally, a number of small bugs have been fixed with further additions
to the test suite, and there has been further work to follow the Python PEP8,
PEP257 and best practice standard coding style.
- Use license tag
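
The difference between ``.count()`` and ``.count_overlap()`` noted in the 1.70
changes above can be shown with plain strings. The helper
``count_overlap_sketch`` is hypothetical and is not Biopython's
implementation; it only illustrates what counting with overlaps means.

```python
def count_overlap_sketch(sequence: str, sub: str) -> int:
    """Count occurrences of ``sub`` in ``sequence``, allowing overlaps."""
    if not sub:
        raise ValueError("sub must not be empty")
    count = 0
    start = sequence.find(sub)
    while start != -1:
        count += 1
        # Advance by one position only, so overlapping matches are found.
        start = sequence.find(sub, start + 1)
    return count


# str.count() resumes after each match, so overlapping matches are skipped.
non_overlapping = "AAAA".count("AA")
overlapping = count_overlap_sketch("AAAA", "AA")
```

Here ``non_overlapping`` is 2, while ``overlapping`` is 3 because the match
starting at the second letter is also counted.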
-------------------------------------------------------------------
Wed May 24 14:28:23 UTC 2017 - toddrme2178@gmail.com
- Implement single-spec version
- Fix source URL.
- updated to version 1.69
* We now expect and take advantage of NumPy under PyPy, and compile most of the
Biopython C code modules as well.
* Bio.AlignIO now supports the UCSC Multiple Alignment Format (MAF) under the
format name "maf", using new module Bio.AlignIO.MafIO which also offers
indexed access to these potentially large files using SQLite3 (contributed by
Andrew Sczesnak, with additional refinements from Adam Novak).
* Bio.SeqIO.AbiIO has been extended to support parsing FSA files. The
underlying format (ABIF) remains the same as AB1 files and so the string
'abif' is the expected format argument in the main SeqIO functions. AbiIO
determines whether the file is AB1 or FSA based on the presence of specific
tags.
* The Uniprot parser is now able to parse "submittedName" elements in XML files.
* The NEXUS parser handling of internal node comments has been improved, which
should help if working with tools like the BEAST TreeAnnotator. Slashes are
now also allowed in identifiers.
* New parser for ExPASy Cellosaurus, a cell line database, cell line catalogue,
and cell line ontology (contributed by Steve Marshall).
* For consistency the Bio.Seq module now offers a complement function (already
available as a method on the Seq and MutableSeq objects).
* The SeqFeature object's qualifiers is now an explicitly ordered dictionary
(note that as of Python 3.6 the Python dict is ordered by default anyway).
This helps reproduce GenBank/EMBL files on input/output.
* The Bio.SeqIO UniProt-XML parser was updated to cope with features with
unknown locations which can be found in mass spec data.
* The Bio.SeqIO GenBank, EMBL, and IMGT parsers now record the molecule type
from the LOCUS/ID line explicitly in the record.annotations dictionary.
The Bio.SeqIO EMBL parser was updated to cope with more variants seen in
patent data files, and the related IMGT parser was updated to cope with
IPD-IMGT/HLA database files after release v3.16.0 when their ID line changed.
The GenBank output now uses colon space to match current NCBI DBLINK lines.
* The Bio.Affy package supports Affymetrix version 4 of the CEL file format,
in addition to version 3.
* The restriction enzyme list in Bio.Restriction has been updated to the
February 2017 release of REBASE.
* Bio.PDB.PDBList now can download PDBx/mmCif (new default), PDB (old default),
PDBML/XML and mmtf format protein structures. This is in line with the RCSB
recommendation to use PDBx/mmCif and deprecate the PDB file format. Biopython
already has support for parsing mmCif files.
* Additionally, a number of small bugs have been fixed with further additions
to the test suite, and there has been further work to follow the Python PEP8,
PEP257 and best practice standard coding style.
-------------------------------------------------------------------
Thu Nov 17 10:10:59 UTC 2016 - alinm.elena@gmail.com
- updated to version 1.68
-------------------------------------------------------------------
Mon Dec 9 16:00:01 UTC 2013 - toddrme2178@gmail.com
- Update to version 1.63
* 2to3 no longer needed for python 3
- Added additional dependencies
-------------------------------------------------------------------
Thu Sep 19 02:06:32 UTC 2013 - highwaystar.ru@gmail.com
- upgrade to version 1.62
* The translation functions will give a warning on any partial codons
* Phylo module now supports the file formats NeXML and CDAO
* New module Bio.UniProt adds parsers for the GAF, GPA and GPI
formats from UniProt-GOA.
* The BioSQL module is now supported in Jython.
* Feature labels on circular GenomeDiagram figures now support
the label_position argument (start, middle or end)
* The code for parsing 3D structures in mmCIF files was updated
to use the Python standard library's shlex module instead of C code
using flex.
* The Bio.Sequencing.Applications module now includes a BWA
command line wrapper.
* Bio.motifs supports JASPAR format files with multiple
position-frequency matrices.
-------------------------------------------------------------------
Wed Feb 1 14:09:33 UTC 2012 - saschpe@suse.de
- Ran spec-cleaner
- Set license to MIT (looks like it)
-------------------------------------------------------------------
Wed Jan 11 14:56:08 UTC 2012 - toddrme2178@gmail.com
- Cleaned up spec file
-------------------------------------------------------------------
Thu Sep 8 19:36:32 UTC 2011 - alinm.elena@gmail.com
- Initial commit