- Update to 3.8
* Refactor dispersion plot (#3082)
* Provide type hints for LazyCorpusLoader variables (#3081)
* Throw warning when LanguageModel is initialized with incorrect vocabulary (#3080)
* Fix WordNet's all_synsets() function (#3078)
* Resolve TreebankWordDetokenizer inconsistency with end-of-string contractions (#3070)
* Support both iso639-3 codes and BCP-47 language tags (#3060)
* Avoid DeprecationWarning in Regexp tokenizer (#3055)
* Fix many doctests, add doctests to CI (#3054, #3050, #3048)
* Fix bool field not being read in VerbNet (#3044)
* Greatly improve time efficiency of SyllableTokenizer when tokenizing numbers (#3042)
* Fix encodings of Polish udhr corpus reader (#3038)
* Allow TweetTokenizer to tokenize emoji flag sequences (#3034)
* Prevent LazyModule from increasing the size of nltk.__dict__ (#3033)
* Fix CoreNLPServer non-default port issue (#3031)
* Add "acion" suffix to the Spanish SnowballStemmer (#3030)
* Allow loading WordNet without OMW (#3026)
* Use input() in nltk.chat.chatbot() for Jupyter support (#3022)
* Fix edit_distance_align() in distance.py (#3017)
* Tackle performance and accuracy regression of sentence tokenizer since NLTK 3.6.6 (#3014)
* Add the Iota operator to semantic logic (#3010)
* Resolve critical errors in WordNet app (#3008)
* Resolve critical error in CHILDES Corpus (#2998)
* Make WordNet information_content() accept adjective satellites (#2995)
* Add "strict=True" parameter to CoreNLP (#2993, #3043)
* Resolve issue with WordNet's synset_from_sense_key (#2988)
* Handle WordNet synsets that were lost in mapping (#2985)
* Resolve TypeError in Boxer (#2979)
* Add function to retrieve WordNet synonyms (#2978)
* Warn about nonexistent OMW offsets instead of raising an error (#2974)
OBS-URL: https://build.opensuse.org/request/show/1056422
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-nltk?expand=0&rev=41
- Update to 3.7
- Improve and update the NLTK team page on nltk.org (#2855,
#2941)
- Drop support for Python 3.6, support Python 3.10 (#2920)
- Update to 3.6.7
- Resolve IndexError in `sent_tokenize` and `word_tokenize`
(#2922)
- Update to 3.6.6
- Refactor `gensim.doctest` to work for gensim 4.0.0 and up
(#2914)
- Add Precision, Recall, F-measure, Confusion Matrix to Taggers
(#2862)
- Add warnings if .zip files exist without any corresponding
.csv files (#2908)
- Fix `FileNotFoundError` when the `download_dir` is
a non-existing nested folder (#2910)
- Rename omw to omw-1.4 (#2907)
- Resolve ReDoS opportunity by fixing incorrectly specified
regex (#2906, bsc#1191030, CVE-2021-3828).
- Support OMW 1.4 (#2899)
- Deprecate Tree get and set node methods (#2900)
- Fix broken inaugural test case (#2903)
- Use Multilingual Wordnet Data from OMW with newer Wordnet
versions (#2889)
- Keep NLTK's "tokenize" module working with pathlib (#2896)
- Make prettyprinter more readable (#2893)
- Update links to the nltk book (#2895)
- Add `CITATION.cff` to nltk (#2880)
- Resolve serious ReDoS in PunktSentenceTokenizer (#2869)
- Delete old CI config files (#2881)
OBS-URL: https://build.opensuse.org/request/show/965220
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-nltk?expand=0&rev=11
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-nltk?expand=0&rev=36
- Update to v3.5
* add support for Python 3.8
* drop support for Python 2
* create NLTK's own Tokenizer class distinct from the Treebank
reference tokeniser
* update Vader sentiment analyser
* fix JSON serialization of some PoS taggers
* minor improvements in grammar.CFG, Vader, pl196x corpus reader,
StringTokenizer
* change implementation of <= and >= for FreqDist so they are partial
orders
* make FreqDist iterable
* correctly handle Penn Treebank trees with an unlabeled branching
top node
OBS-URL: https://build.opensuse.org/request/show/812413
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-nltk?expand=0&rev=10
OBS-URL: https://build.opensuse.org/request/show/812178
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-nltk?expand=0&rev=33
- Update to 3.4.4:
* fix bug in plot function (probability.py)
* add improved PanLex Swadesh corpus reader
* add Text.generate()
* add QuadgramAssocMeasures
* add SSP to tokenizers
* return confidence of best tag from AveragedPerceptron
* make plot methods return Axes objects
* don't require list arguments to PositiveNaiveBayesClassifier.train
* fix Tree classes to work with native Python copy library
* fix inconsistency for NomBank
* fix random seeding in LanguageModel.generate
* fix ConditionalFreqDist mutation on tabulate/plot call
* fix broken links in documentation
* fix misc Wordnet issues
* update installation instructions
OBS-URL: https://build.opensuse.org/request/show/717915
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-nltk?expand=0&rev=5
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-nltk?expand=0&rev=22
- Update to v3.4
+ Support Python 3.7
+ New Language Modeling package
+ Cistem Stemmer for German
+ Support Russian National Corpus incl POS tag model
+ Krippendorff's Alpha inter-rater reliability test
+ Comprehensive code clean-ups
+ Switch continuous integration from Jenkins to Travis
- from v3.3
+ Support Python 3.6
+ New interface to CoreNLP
+ Support synset retrieval by sense key
+ Minor fixes to CoNLL Corpus Reader and AlignedSent
+ Fixed minor inconsistencies in APIs and API documentation
+ Better conformance to PEP8
+ Drop Moses Tokenizer (incompatible license)
OBS-URL: https://build.opensuse.org/request/show/673106
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-nltk?expand=0&rev=17
- Update to version 3.2.5:
* Arabic stemmers (ARLSTem, Snowball)
* NIST MT evaluation metric and added NIST
international_tokenize
* Moses tokenizer
* Document Russian tagger
* Fix to Stanford segmenter
* Improve treebank detokenizer, VerbNet, Vader
* Misc code and documentation cleanups
* Implement fixes suggested by LGTM
- Convert specfile to python single-spec style.
- Drop unneeded BuildRequires: python-PyYAML, python-xml,
python-devel; not required for building.
- Change existing Requires to Recommends: these are really needed
for additional features, and not required for basic nltk usage.
- Add new Recommends: python-scipy, python-matplotlib,
python-pyparsing, and python-gensim; enables other optional
features.
- Run fdupes to link-up duplicate files.
- Remove exec permissions for a file not intended to be executed
(not in exec path, no hashbang, etc.)
- Remove hashbangs from non-executable files.
- Run tests following the suggestion from
http://www.nltk.org/install.html.
OBS-URL: https://build.opensuse.org/request/show/558587
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-nltk?expand=0&rev=9