-------------------------------------------------------------------
Fri Jan 6 15:32:43 UTC 2023 - Yogalakshmi Arunachalam <yarunachalam@suse.com>

- Update to 3.8
  * Refactor dispersion plot (#3082)
  * Provide type hints for LazyCorpusLoader variables (#3081)
  * Throw warning when LanguageModel is initialized with incorrect vocabulary (#3080)
  * Fix WordNet's all_synsets() function (#3078)
  * Resolve TreebankWordDetokenizer inconsistency with end-of-string contractions (#3070)
  * Support both iso639-3 codes and BCP-47 language tags (#3060)
  * Avoid DeprecationWarning in Regexp tokenizer (#3055)
  * Fix many doctests, add doctests to CI (#3054, #3050, #3048)
  * Fix bool field not being read in VerbNet (#3044)
  * Greatly improve time efficiency of SyllableTokenizer when tokenizing numbers (#3042)
  * Fix encodings of Polish udhr corpus reader (#3038)
  * Allow TweetTokenizer to tokenize emoji flag sequences (#3034)
  * Prevent LazyModule from increasing the size of nltk.__dict__ (#3033)
  * Fix CoreNLPServer non-default port issue (#3031)
  * Add "acion" suffix to the Spanish SnowballStemmer (#3030)
  * Allow loading WordNet without OMW (#3026)
  * Use input() in nltk.chat.chatbot() for Jupyter support (#3022)
  * Fix edit_distance_align() in distance.py (#3017)
  * Tackle performance and accuracy regression of sentence tokenizer since NLTK 3.6.6 (#3014)
  * Add the Iota operator to semantic logic (#3010)
  * Resolve critical errors in WordNet app (#3008)
  * Resolve critical error in CHILDES Corpus (#2998)
  * Make WordNet information_content() accept adjective satellites (#2995)
  * Add "strict=True" parameter to CoreNLP (#2993, #3043)
  * Resolve issue with WordNet's synset_from_sense_key (#2988)
  * Handle WordNet synsets that were lost in mapping (#2985)
  * Resolve TypeError in Boxer (#2979)
  * Add function to retrieve WordNet synonyms (#2978)
  * Warn about nonexistent OMW offsets instead of raising an error (#2974)
  * Fix missing ic argument in res, jcn and lin similarity functions of WordNet (#2970)
  * Add support for the extended OMW (#2946)
  * Fix LC cutoff policy of text tiling (#2936)
  * Optimize ConditionalFreqDist.__add__ performance (#2939)
  * Add Markdown corpus reader (#2902)

-------------------------------------------------------------------
Mon Dec 26 10:41:22 UTC 2022 - Matej Cepl <mcepl@suse.com>

- Complete nltk_data.tar.xz for offline testing
- Fix failing tests (gh#nltk/nltk#2969) by adding patches:
  - port-2to3.patch
  - skip-networked-test.patch
- Clean up the SPEC to get rid of rpmlint warnings.

-------------------------------------------------------------------
Tue Mar 22 07:48:14 UTC 2022 - Matej Cepl <mcepl@suse.com>

- Update to 3.7
  - Improve and update the NLTK team page on nltk.org (#2855,
    #2941)
  - Drop support for Python 3.6, support Python 3.10 (#2920)
- Update to 3.6.7
  - Resolve IndexError in `sent_tokenize` and `word_tokenize`
    (#2922)
- Update to 3.6.6
  - Refactor `gensim.doctest` to work for gensim 4.0.0 and up
    (#2914)
  - Add Precision, Recall, F-measure, Confusion Matrix to Taggers
    (#2862)
  - Added warnings if .zip files exist without any corresponding
    .csv files. (#2908)
  - Fix `FileNotFoundError` when the `download_dir` is
    a non-existing nested folder (#2910)
  - Rename omw to omw-1.4 (#2907)
  - Resolve ReDoS opportunity by fixing incorrectly specified
    regex (#2906, bsc#1191030, CVE-2021-3828).
  - Support OMW 1.4 (#2899)
  - Deprecate Tree get and set node methods (#2900)
  - Fix broken inaugural test case (#2903)
  - Use Multilingual Wordnet Data from OMW with newer Wordnet
    versions (#2889)
  - Keep NLTK's "tokenize" module working with pathlib (#2896)
  - Make prettyprinter more readable (#2893)
  - Update links to the nltk book (#2895)
  - Add `CITATION.cff` to nltk (#2880)
  - Resolve serious ReDoS in PunktSentenceTokenizer (#2869)
  - Delete old CI config files (#2881)
  - Improve Tokenize documentation + add TokenizerI as superclass
    for TweetTokenizer (#2878)
  - Fix expected value for BLEU score doctest after changes from
    #2572
  - Add multi Bleu functionality and tests (#2793)
  - Deprecate 'return_str' parameter in NLTKWordTokenizer and
    TreebankWordTokenizer (#2883)
  - Allow empty string in CFGs + more (#2888)
  - Partition `tree.py` module into `tree` package + pickle fix
    (#2863)
  - Fix several TreebankWordTokenizer and NLTKWordTokenizer bugs
    (#2877)
  - Rewind Wordnet data file after each lookup (#2868)
  - Correct __init__ call for SyntaxCorpusReader subclasses
    (#2872)
  - Documentation fixes (#2873)
  - Fix Levenshtein distance for duplicated letters (#2849)
  - Support alternative Wordnet versions (#2860)
  - Remove hundreds of formatting warnings for nltk.org (#2859)
  - Modernize `nltk.org/howto` pages (#2856)
  - Fix Bleu Score smoothing function from taking log(0) (#2839)
  - Update third-party tools to newer versions and remove the
    fixed MaltParser version (#2832)
  - Fix TypeError: _pretty() takes 1 positional argument but 2
    were given in sem/drt.py (#2854)
  - Replace `http` with `https` in most URLs (#2852)
- Update to 3.6.5
  - modernised nltk.org website
  - addressed LGTM.com issues
  - support ZWJ sequences emoji and skin tone modifier emoji in
    TweetTokenizer
  - METEOR evaluation now requires pre-tokenized input
  - Code linting and type hinting
  - implement get_refs function for DrtLambdaExpression
  - Enable automated CoreNLP, Senna, Prover9/Mace4, Megam,
    MaltParser CI tests
  - specify minimum regex version that supports regex.Pattern
  - avoid re.Pattern and regex.Pattern which fail for Python 3.6,
    3.7
- Update to 3.6.4
  - deprecate `nltk.usage(obj)` in favor of `help(obj)`
  - resolve ReDoS vulnerability in Corpus Reader
  - solidify performance tests
  - improve phone number recognition in tweet tokenizer
  - refactored CISTEM stemmer for German
  - identify NLTK Team as the author
  - replace travis badge with github actions badge
  - add SECURITY.md
- Update to 3.6.3
  - Dropped support for Python 3.5
  - Run CI tests on Windows, too
  - Moved from Travis CI to GitHub Actions
  - Code and comment cleanups
  - Visualize WordNet relation graphs using Graphviz
  - Fixed large error in METEOR score
  - Apply isort, pyupgrade, black, added as pre-commit hooks
  - Prevent debug_decisions in Punkt from throwing IndexError
  - Resolved ZeroDivisionError in RIBES with dissimilar sentences
  - Initialize WordNet IC total counts with smoothing value
  - Fixed AttributeError for Arabic ARLSTem2 stemmer
  - Many fixes and improvements to lm language model package
  - Fix bug in nltk.metrics.aline, C_skip = -10
  - Improvements to TweetTokenizer
  - Optional show arg for FreqDist.plot, ConditionalFreqDist.plot
  - edit_distance now computes Damerau-Levenshtein edit-distance
- Update to 3.6.2
  - move test code to nltk/test
  - fix bug in NgramAssocMeasures (order preserving fix)
- Update to 3.6
  - add support for Python 3.9
  - add Tree.fromlist
  - compute Minimum Spanning Tree of unweighted graph using BFS
  - fix bug with infinite loop in Wordnet closure and tree
  - fix bug in calculating BLEU using smoothing method 4
  - Wordnet synset similarities work for all pos
  - new Arabic light stemmer (ARLSTem2)
  - new syllable tokenizer (LegalitySyllableTokenizer)
  - remove nose in favor of pytest

-------------------------------------------------------------------
Thu Apr 23 13:54:08 UTC 2020 - John Vandenberg <jayvdb@gmail.com>

- Update to v3.5
  * add support for Python 3.8
  * drop support for Python 2
  * create NLTK's own Tokenizer class distinct from the Treebank
    reference tokeniser
  * update Vader sentiment analyser
  * fix JSON serialization of some PoS taggers
  * minor improvements in grammar.CFG, Vader, pl196x corpus reader,
    StringTokenizer
  * change implementation <= and >= for FreqDist so they are partial
    orders
  * make FreqDist iterable
  * correctly handle Penn Treebank trees with an unlabeled branching
    top node

-------------------------------------------------------------------
Sat Mar 14 09:07:16 UTC 2020 - Tomáš Chvátal <tchvatal@suse.com>

- Fix build without python2

-------------------------------------------------------------------
Mon Oct 14 14:00:43 UTC 2019 - Matej Cepl <mcepl@suse.com>

- Replace %fdupes -s with plain %fdupes; hardlinks are better.

-------------------------------------------------------------------
Wed Sep 11 11:05:01 UTC 2019 - Tomáš Chvátal <tchvatal@suse.com>

- Update to 3.4.5 (bsc#1146427, CVE-2019-14751):
  * Fixed security bug in downloader: Zip slip vulnerability - for the
    unlikely situation where a user configures their downloader to use
    a compromised server CVE-2019-14751

-------------------------------------------------------------------
Tue Jul 23 13:52:24 UTC 2019 - Tomáš Chvátal <tchvatal@suse.com>

- Update to 3.4.4:
  * fix bug in plot function (probability.py)
  * add improved PanLex Swadesh corpus reader
  * add Text.generate()
  * add QuadgramAssocMeasures
  * add SSP to tokenizers
  * return confidence of best tag from AveragedPerceptron
  * make plot methods return Axes objects
  * don't require list arguments to PositiveNaiveBayesClassifier.train
  * fix Tree classes to work with native Python copy library
  * fix inconsistency for NomBank
  * fix random seeding in LanguageModel.generate
  * fix ConditionalFreqDist mutation on tabulate/plot call
  * fix broken links in documentation
  * fix misc Wordnet issues
  * update installation instructions

-------------------------------------------------------------------
Thu May 23 12:41:31 UTC 2019 - pgajdos@suse.com

- version update to 3.4.1
  * add chomsky_normal_form for CFGs
  * add meteor score
  * add minimum edit/Levenshtein distance based alignment function
  * allow access to collocation list via text.collocation_list()
  * support corenlp server options
  * drop support for Python 3.4
  * other minor fixes

-------------------------------------------------------------------
Sun Feb 10 16:19:17 UTC 2019 - John Vandenberg <jayvdb@gmail.com>

- Remove Python 3 dependency on singledispatch

-------------------------------------------------------------------
Sat Feb 9 16:16:11 UTC 2019 - John Vandenberg <jayvdb@gmail.com>

- Update to v3.4
  + Support Python 3.7
  + New Language Modeling package
  + Cistem Stemmer for German
  + Support Russian National Corpus incl POS tag model
  + Krippendorff's alpha inter-rater reliability test
  + Comprehensive code clean-ups
  + Switch continuous integration from Jenkins to Travis
- from v3.3
  + Support Python 3.6
  + New interface to CoreNLP
  + Support synset retrieval by sense key
  + Minor fixes to CoNLL Corpus Reader
  + AlignedSent
  + Fixed minor inconsistencies in APIs and API documentation
  + Better conformance to PEP8
  + Drop Moses Tokenizer (incompatible license)

-------------------------------------------------------------------
Wed Feb 6 09:44:56 UTC 2019 - John Vandenberg <jayvdb@gmail.com>

- Add missing dependency six
- Remove unnecessary build dependency six
- Recommend all optional dependencies

-------------------------------------------------------------------
Tue Mar 6 20:35:00 UTC 2018 - jengelh@inai.de

- Trim redundant wording from description.

-------------------------------------------------------------------
Mon Mar 5 15:02:00 UTC 2018 - badshah400@gmail.com

- Use \%license instead of \%doc to install License.txt.

-------------------------------------------------------------------
Tue Jan 30 17:16:13 UTC 2018 - guigo.lourenco@gmail.com

- Depend on the full python interpreter to fix sqlite3 import
  during %check

-------------------------------------------------------------------
Tue Jan 16 11:02:13 UTC 2018 - guigo.lourenco@gmail.com

- Depend on python-rpm-macros
- Build for both Python2 and Python3

-------------------------------------------------------------------
Tue Dec 19 15:50:13 UTC 2017 - badshah400@gmail.com

- Update to version 3.2.5:
  * Arabic stemmers (ARLSTem, Snowball)
  * NIST MT evaluation metric and added NIST
    international_tokenize
  * Moses tokenizer
  * Document Russian tagger
  * Fix to Stanford segmenter
  * Improve treebank detokenizer, VerbNet, Vader
  * Misc code and documentation cleanups
  * Implement fixes suggested by LGTM
- Convert specfile to python single-spec style.
- Drop unneeded BuildRequires: python-PyYAML, python-xml,
  python-devel; not required for building.
- Change existing Requires to Recommends: these are really needed
  for additional features, and not required for basic nltk usage.
- Add new Recommends: python-scipy, python-matplotlib,
  python-pyparsing, and python-gensim; enables other optional
  features.
- Run fdupes to link-up duplicate files.
- Remove exec permissions for a file not intended to be executed
  (not in exec path, no hashbang, etc.)
- Remove hashbangs from non-executable files.
- Run tests following the suggestion from
  http://www.nltk.org/install.html.

-------------------------------------------------------------------
Tue Feb 21 13:11:31 UTC 2017 - stephan.barth@suse.com

- update to version 3.2.2
  Upstream changelog:
  Support for Aline, ChrF and GLEU MT evaluation metrics, Russian POS tagger
  model, Moses detokenizer, rewrite Porter Stemmer and FrameNet corpus reader,
  update FrameNet Corpus to version 1.7, fixes: stanford_segmenter.py,
  SentiText, CoNLL Corpus Reader, BLEU, naivebayes, Krippendorff’s alpha,
  Punkt, Moses tokenizer, TweetTokenizer, ToktokTokenizer; improvements to
  testing framework

-------------------------------------------------------------------
Fri Oct 14 00:31:15 UTC 2016 - toddrme2178@gmail.com

- Update to version 3.2.1
  + No changelog available

-------------------------------------------------------------------
Thu May 21 14:53:43 UTC 2015 - toddrme2178@gmail.com

- Remove upstreamed nltk-2.0.4-dont-use-python-distribute.patch
- Update to version 3.0.2
  + No changelog available

-------------------------------------------------------------------
Sun Dec 8 13:33:14 UTC 2013 - p.drouand@gmail.com

- Update to version 2.0.4
  + No changelog available
- Add nltk-2.0.4-dont-use-python-distribute.patch ; force use of
  python-setuptools instead of python-distribute

-------------------------------------------------------------------
Thu Oct 24 11:09:19 UTC 2013 - speilicke@suse.com

- Require python-setuptools instead of distribute (upstreams merged)

-------------------------------------------------------------------
Fri Sep 23 12:29:05 UTC 2011 - saschpe@suse.de

- Update to version 2.0.1rc1

-------------------------------------------------------------------
Sun Feb 7 18:51:07 CST 2010 - oddrationale@gmail.com

- fixed copyright and license statements
- removed PyYAML, and added dependency to installers and download
  instructions
- updated to LogicParser, DRT (Dan Garrette)
- WordNet similarity metrics return None instead of -1 when
  they fail to find a path (Steve Bethard)
- shortest_path_distance uses instance hypernyms (Jordan
  Boyd-Graber)
- clean_html improved (Bjorn Maeland)
- batch_parse, batch_interpret and batch_evaluate functions allow
  grammar or grammar filename as argument
- more Portuguese examples (portuguese_en.doctest, examples/pt.py)

-------------------------------------------------------------------
Thu Dec 10 17:23:51 CST 2009 - oddrationale@gmail.com

- added python-nltk-remove-yaml.patch to prevent conflict with
  python-yaml
- added Requires: python-yaml

-------------------------------------------------------------------
Wed Dec 9 15:39:35 CST 2009 - oddrationale@gmail.com

- Initial Release (Version 2.0b7): Sun Feb 7 18:50:18 CST 2010