- Update to 1.4.6:
* API classes now support C++11 move semantics when using a compiler which
we are confident supports them (currently compilers which define
__cplusplus >= 201103 plus a special check for MSVC 2015 or later).
C++11 move semantics provide a clean and efficient way for threaded code to
hand off Xapian objects to worker threads, but in this case it's very
unhelpful for availability of these semantics to vary by compiler as it
quietly leads to a build with non-threadsafe behaviour. To address this,
user code can #define XAPIAN_MOVE_SEMANTICS before #include <xapian.h> to
force this on, and will then get a compilation failure if the compiler
lacks suitable support.
* MSet::snippet():
+ We were only escaping output for HTML/XML in some cases, which would
potentially allow HTML to be injected into output (this fixes
bnc#1099925, CVE-2018-0499).
+ Include certain leading non-word characters in snippets. Previously we
started the snippet at the start of the first actual word, but there are
various cases where including non-word characters in front of the actual
word adds useful context or otherwise aids comprehension.
* Add MSetIterator::get_sort_key() method. The sort key has always been
available internally, but wasn't exposed via the public API before, which
seems like an oversight as the collapse key has long been available.
* Database::compact():
+ Allow Compactor::resolve_duplicate_metadata() implementations to delete
entries. Previously if an implementation returned an empty string this
would result in a user meta-data entry with an empty value, which isn't
normally achievable (empty meta-data values aren't stored), and so will
cause odd behaviour. We now handle an empty returned value by
interpreting it in the natural way - it means that the merged result is
to not set a value for that key in the output database.
OBS-URL: https://build.opensuse.org/request/show/620422
OBS-URL: https://build.opensuse.org/package/show/server:search/xapian-core?expand=0&rev=80
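The MSet::snippet() escaping fix in 1.4.6 concerns output where only some
pieces of the text were escaped. The following is a minimal standalone sketch
of the general idea, not Xapian's implementation: escape_html() and
highlight() are hypothetical helpers, and the <b> markers are an assumed
choice of highlight markup.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Escape the characters that are significant in HTML/XML text content.
std::string escape_html(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char ch : text) {
        switch (ch) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            default: out += ch;
        }
    }
    return out;
}

// Wrap text[pos, pos + len) in <b>...</b>. Every piece of the input text goes
// through escape_html(); only the markup added here is emitted verbatim.
std::string highlight(const std::string& text, std::size_t pos, std::size_t len) {
    assert(pos <= text.size() && len <= text.size() - pos);
    return escape_html(text.substr(0, pos)) + "<b>" +
           escape_html(text.substr(pos, len)) + "</b>" +
           escape_html(text.substr(pos + len));
}
```

Skipping escape_html() on any one of the three pieces would let markup from
the input reach the output unchanged, which is the kind of gap the entry
describes.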
- Update to 1.4.5:
* Add Database::get_total_length() method. Previously you had to calculate
this from get_avlength() and get_doccount(), taking into account rounding
issues. But even then you couldn't reliably get the exact value when total
length is large since a double's mantissa has more limited precision than an
unsigned long long.
* Add Xapian::iterator_rewound() for bidirectional iterators, to test if the
iterator is at the start (useful for testing whether we're done when
iterating backwards).
* DatabaseOpeningError exceptions now provide errno via get_error_string()
rather than turning it into a string and including it in the exception
message.
* WritableDatabase::replace_document(): when passed a Document object which
came from a database and has unmodified values, we used to always read
those values into a memory structure. Now we only do this if the document
is being replaced to the same document ID which it came from, which should
make other cases a bit more efficient.
* Enquire::get_eset(): When approximating term frequencies we now round to the
nearest integer - previously we always rounded down.
* See also https://xapian.org/docs/xapian-core-1.4.5/NEWS
OBS-URL: https://build.opensuse.org/request/show/557120
OBS-URL: https://build.opensuse.org/package/show/server:search/xapian-core?expand=0&rev=78
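The precision limit behind Database::get_total_length() in 1.4.5 can be shown
without Xapian. This is a standalone sketch assuming IEEE 754 doubles (53-bit
mantissa); total_from_average() and survives_double() are hypothetical names,
with the first standing in for the old get_avlength() * get_doccount()
workaround.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the old workaround: rebuild the total length
// from an average length (a double) and a document count.
unsigned long long total_from_average(double avlength, unsigned long long doccount) {
    assert(avlength >= 0.0);
    return static_cast<unsigned long long>(
        std::llround(avlength * static_cast<double>(doccount)));
}

// True if `total` is unchanged by a round trip through double.
bool survives_double(unsigned long long total) {
    assert(total < (1ULL << 63));  // keeps the cast back to integer well-defined
    return static_cast<unsigned long long>(static_cast<double>(total)) == total;
}
```

With these helpers, 2^53 survives the round trip but 2^53 + 1 does not, so a
total length of that size cannot be recovered exactly from a double average.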
- Update to 1.4.4:
* Database::check():
+ Fix checking a single table - changes in 1.4.2 broke such checks unless
you specified the table without any extension.
+ Errors from failing to find the file specified are now thrown as
DatabaseOpeningError (was DatabaseError, of which DatabaseOpeningError is
a subclass so existing code should continue to work). Also improved the
error message when the file doesn't exist.
* Drop OP_SCALE_WEIGHT over OP_VALUE_RANGE, OP_VALUE_GE and OP_VALUE_LE in
the Query constructor. These operators always return weight 0 so
OP_SCALE_WEIGHT over them has no effect. Eliminating it at query
construction time is cheap (we only need to check the type of the
subquery), eliminates the confusing "0 * " from the query description,
and means the OP_SCALE_WEIGHT Query object can be released sooner.
Inspired by Shivanshu Chauhan asking about the query description on IRC.
* Drop OP_SCALE_WEIGHT on the right side of OP_AND_NOT in the Query
constructor. OP_AND_NOT takes no weight from the right so OP_SCALE_WEIGHT
has no effect there. Eliminating it at query construction time is cheap
(just need to check the subquery's type), eliminates the confusing "0 * "
from the query description, and means the OP_SCALE_WEIGHT object can be
released sooner.
* See also https://xapian.org/docs/xapian-core-1.4.4/NEWS
OBS-URL: https://build.opensuse.org/request/show/507409
OBS-URL: https://build.opensuse.org/package/show/server:search/xapian-core?expand=0&rev=76
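The OP_SCALE_WEIGHT simplification in 1.4.4 amounts to a type check at query
construction time. The sketch below uses a toy node type to illustrate that
check; Op, Node, always_zero_weight() and make_scaled() are hypothetical and
only mirror the operator names, they are not Xapian's classes.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy query node types, named after the operators in the entry above.
enum class Op { Term, ValueRange, ValueGE, ValueLE, ScaleWeight };

struct Node {
    Op op;
    double factor;                // used only by ScaleWeight nodes
    std::shared_ptr<Node> child;  // used only by ScaleWeight nodes
};

// Operators which always return weight 0, so scaling them changes nothing.
bool always_zero_weight(Op op) {
    return op == Op::ValueRange || op == Op::ValueGE || op == Op::ValueLE;
}

// Build "factor * sub", but return sub itself when it can't carry any weight.
std::shared_ptr<Node> make_scaled(double factor, std::shared_ptr<Node> sub) {
    assert(sub);
    if (always_zero_weight(sub->op)) return sub;
    return std::make_shared<Node>(Node{Op::ScaleWeight, factor, std::move(sub)});
}
```

Because the scale node is never created for a zero-weight subquery, there is
nothing extra to describe or to release later.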
- Update to 1.4.3:
* MSet::snippet(): Favour candidate snippets which contain a greater diversity
of matching terms by discounting the relevance of repeated terms using an
exponential decay. A snippet which contains more terms from the query is
likely to be better than one which contains the same term or terms multiple
times, but a repeated term is still interesting, just less with each
additional appearance. Diversity issue highlighted by Robert Stepanek's
patch in https://github.com/xapian/xapian/pull/117 - testcases taken from his
patch.
* MSet::snippet(): New flag SNIPPET_EMPTY_WITHOUT_MATCH to get an empty snippet
if there are no matches in the text passed in. Implemented by Robert
Stepanek.
* Round MSet::get_matches_estimated() to an appropriate number of significant
figures. The algorithm used looks at the lower and upper bound and where the
estimate sits between them, and then picks an appropriate number of
significant figures. Thanks to Sébastien Le Callonnec for help sorting out a
portability issue on OS X.
* Add Database::locked() method - where possible this non-invasively checks if
the database is currently open for writing, which can be useful for
dashboards and other status reporting tools.
* See also https://xapian.org/docs/xapian-core-1.4.3/NEWS
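The exponential decay used for snippet diversity in 1.4.3 can be sketched as
follows. This is an illustration under assumptions, not Xapian's code:
snippet_score() is a hypothetical function, and the decay factor of 0.5 and
the per-term relevance values are made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Score a candidate snippet: each occurrence of a query term contributes its
// relevance multiplied by decay^(number of earlier occurrences of that term).
double snippet_score(const std::vector<std::string>& matched_terms,
                     const std::map<std::string, double>& relevance,
                     double decay = 0.5) {
    assert(decay > 0.0 && decay < 1.0);
    std::map<std::string, double> multiplier;  // current multiplier per term
    double score = 0.0;
    for (const std::string& term : matched_terms) {
        auto rel = relevance.find(term);
        if (rel == relevance.end()) continue;  // not a query term
        auto it = multiplier.emplace(term, 1.0).first;
        score += rel->second * it->second;
        it->second *= decay;  // the next repeat of this term counts for less
    }
    return score;
}
```

With equal relevance for "cat" and "dog", a candidate matching both scores
higher than one matching "cat" twice, while the repeat still adds something.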
- Update to 1.4.2:
* Add XAPIAN_AT_LEAST(A,B,C) macro.
* MSet::snippet(): Optimise snippet generation - it's now ~46% faster in a
simple test.
* Add Xapian::DOC_ASSUME_VALID flag which tells Database::get_document() that
it doesn't need to check that the passed docid is valid. Fixes #739,
reported by Germán M. Bravo.
OBS-URL: https://build.opensuse.org/request/show/453942
OBS-URL: https://build.opensuse.org/package/show/server:search/xapian-core?expand=0&rev=74
- Update to 1.4.1:
* Constructing a Query for a non-reference counted PostingSource object will
now try to clone the PostingSource object (as happened in 1.3.4 and
earlier). This clone code was removed as part of the changes in 1.3.5 to
support optional reference counting of PostingSource objects, but that breaks
the case when the PostingSource object is on the stack and goes out of scope
before the Query object is used. Issue reported by Till Schäfer and analysed
by Daniel Vrátil in a bug report against Akonadi:
https://bugs.kde.org/show_bug.cgi?id=363741
* Add BM25PlusWeight class implementing the BM25+ weighting scheme, implemented
by Vivek Pal (https://github.com/xapian/xapian/pull/104).
* Add PL2PlusWeight class implementing the PL2+ weighting scheme, implemented
by Vivek Pal (https://github.com/xapian/xapian/pull/108).
* LMWeight: Implement Dir+ weighting scheme as DIRICHLET_PLUS_SMOOTHING.
Patch from Vivek Pal.
* Add CoordWeight class implementing coordinate matching. This can be useful
for specialised uses - e.g. to implement sorting by the number of matching
filters.
* DLHWeight,DPHWeight,PL2Weight: With these weighting schemes, the formulae
can give a negative weight contribution for a term in extreme cases. We
used to try to handle this by calculating a per-term lower bound on the
contribution and subtracting this from the contribution, but this idea
is fundamentally flawed as the total offset it adds to a document depends on
what combination of terms that document matches, meaning in general the
offset isn't the same for every matching document. So instead we now clamp
each term's weight contribution to be >= 0.
* TfIdfWeight: Always scale term weight by wqf - this seems the logical
approach as it matches the weighting we'd get if we weighted every non-unique
term in the query, as well as being explicit in the Piv+ formula.
OBS-URL: https://build.opensuse.org/request/show/439920
OBS-URL: https://build.opensuse.org/package/show/server:search/xapian-core?expand=0&rev=72
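The DLHWeight/DPHWeight/PL2Weight change in 1.4.1 can be illustrated with a
small standalone sketch. TermMatch, offset_added() and clamped_weight() are
hypothetical names and the numbers used with them are invented; the point is
only that the old offset depends on which terms a document matches, while
clamping does not add an offset at all.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct TermMatch {
    double raw;          // raw weight contribution, may be negative
    double lower_bound;  // per-term lower bound on the raw contribution
};

// Old approach: subtract each matching term's lower bound. This returns the
// total offset that ends up added to the document's weight.
double offset_added(const std::vector<TermMatch>& matches) {
    double offset = 0.0;
    for (const TermMatch& m : matches) {
        assert(m.raw >= m.lower_bound);
        offset -= m.lower_bound;
    }
    return offset;
}

// New approach: clamp each term's contribution to be >= 0.
double clamped_weight(const std::vector<TermMatch>& matches) {
    double total = 0.0;
    for (const TermMatch& m : matches) total += std::max(0.0, m.raw);
    return total;
}
```

A document matching one term with lower bound -0.25 gets an offset of 0.25,
while a document also matching a second term with lower bound -0.5 gets 0.75,
so the two documents are no longer compared on the same scale.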
- update to 1.2.8:
* Add support to TermGenerator and QueryParser for indexing and searching CJK
text using n-grams. Currently this is only enabled when the environment
variable XAPIAN_CJK_NGRAM is set to a non-empty value.
* overview.html,quickstart.html: Fix several factual errors.
* Improve documentation comments for several methods.
* Add documentation for function parameters which didn't have it.
OBS-URL: https://build.opensuse.org/request/show/98786
OBS-URL: https://build.opensuse.org/package/show/server:search/xapian-core?expand=0&rev=42