forked from pool/lucene
facet and expressions enabled OBS-URL: https://build.opensuse.org/request/show/1219020 OBS-URL: https://build.opensuse.org/package/show/Java:packages/lucene?expand=0&rev=90
1518 lines
76 KiB
Plaintext
-------------------------------------------------------------------
Tue Oct 29 12:04:26 UTC 2024 - Fridrich Strba <fstrba@suse.com>

- Enable build and packaging of modules facet and expressions

-------------------------------------------------------------------
Tue Oct 29 07:19:57 UTC 2024 - Fridrich Strba <fstrba@suse.com>

- Upgrade to version 8.11.4
  * Bug Fixes
    + LUCENE-9580: Fix bug in the polygon tessellator when
      introducing collinear edges during polygon splitting.
    + LUCENE-10470: Check if polygon has been successfully
      tessellated before we fail (we are failing some valid
      tessellations) and allow filtering edges that fold on top of
      the previous one.
    + LUCENE-10563: Fix failure to tessellate complex polygon.
    + LUCENE-10678: Fix potential overflow when building a BKD tree
      with more than 4 billion points. The overflow occurs when
      computing the partition point.
    + GITHUB#11986: Fix algorithm that chooses the bridge between a
      polygon and a hole when there is a common vertex.
    + GITHUB#12020: Fixes bug whereby very flat polygons can
      incorrectly contain intersecting geometries.
    + GITHUB#12352: [Tessellator] Improve the checks that validate
      the diagonal between two polygon nodes so the resulting
      polygons are valid counter clockwise polygons.
  * Optimizations
    + GITHUB#12604: Estimate the block size of FST BytesStore in
      BlockTreeTermsWriter to reduce GC load during indexing.
- Modified patch:
  * s2-geometry-library-java-2.0.0.patch
    + rediff to changed context
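
The LUCENE-10678 entry above concerns integer overflow once a point
count no longer fits in 32 bits. As a minimal Java sketch of that class
of bug (this is not Lucene's BKD code; the class and method names are
hypothetical and chosen for illustration only), narrowing a long count
to int before halving it silently wraps, while keeping the arithmetic
in long does not:

```java
// Illustrative sketch only: NOT Lucene's BKD writer code.
// All names here are hypothetical.
final class PartitionPointSketch {

    // Buggy variant: the cast binds tighter than '/', so `count` is
    // truncated to 32 bits before it is halved.
    static long partitionPointNarrowed(long from, long count) {
        int half = (int) count / 2;
        return from + half;
    }

    // Fixed variant: every intermediate value stays in 64 bits.
    static long partitionPointWide(long from, long count) {
        return from + count / 2;
    }

    public static void main(String[] args) {
        long count = 5_000_000_000L; // more than 4 billion points
        System.out.println("narrowed: " + partitionPointNarrowed(0L, count));
        System.out.println("wide:     " + partitionPointWide(0L, count));
    }
}
```

With 5,000,000,000 points the narrowed variant yields 352516352 rather
than 2500000000; for counts that fit in an int both variants agree,
which is why such a defect only shows up on very large indexes.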

-------------------------------------------------------------------
Tue Oct 29 06:42:43 UTC 2024 - Fridrich Strba <fstrba@suse.com>

- Build and distribute additional modules:
  * analyzers-icu,
  * analyzers-phonetic,
  * spatial-extras and
  * suggest
- Added patch:
  * s2-geometry-library-java-2.0.0.patch
    + build against the com.google.geometry:s2-geometry instead of
      the io.sgr:s2-geometry-library-java fork

-------------------------------------------------------------------
Wed Feb 21 10:49:02 UTC 2024 - Gus Kenion <gus.kenion@suse.com>

- Use %patch -P N instead of deprecated %patchN.

-------------------------------------------------------------------
Tue Sep 19 17:09:29 UTC 2023 - Fridrich Strba <fstrba@suse.com>

- Upgrade to version 8.11.2
  * API Changes
    + LUCENE-9265: SimpleFSDirectory is deprecated in favor of
      NIOFSDirectory.
    + LUCENE-9304: Removed ability to set
      DocumentsWriterPerThreadPool on IndexWriterConfig.
      The DocumentsWriterPerThreadPool is a package-private final
      class which made it impossible to customize.
    + LUCENE-9339: MergeScheduler#merge no longer accepts a
      parameter indicating whether a new merge was found.
    + LUCENE-9330: SortFields are now responsible for writing
      themselves into index headers if they are used as index sorts.
    + LUCENE-9340: Deprecate SimpleBindings#add(SortField).
    + LUCENE-9345: MergeScheduler is now decoupled from IndexWriter.
      Instead it accepts a MergeSource interface that offers the
      basic methods to acquire pending merges, run the merge and do
      accounting around it.
    + LUCENE-9349: QueryVisitor.consumeTermsMatching() now takes a
      Supplier<ByteRunAutomaton> to enable queries that build large
      automata to provide them lazily. TermsInSetQuery switches to
      using this method to report matching terms.
    + LUCENE-9366: DocValues.emptySortedNumeric() no longer takes a
      maxDoc parameter
    + LUCENE-7822: CodecUtil#checkFooter(IndexInput, Throwable) now
      throws a CorruptIndexException if checksums mismatch or if
      checksums can't be verified.
    + LUCENE-7020: TieredMergePolicy#setMaxMergeAtOnceExplicit is
      deprecated and the number of segments that get merged via
      explicit merges is unlimited by default.
    + LUCENE-9437: Lucene's facet module's
      DocValuesOrdinalsReader.decode method is now public, making it
      easier for applications to decode facet ordinals into their
      corresponding labels
    + LUCENE-9449: Field comparators for numeric fields and _doc
      were moved to their own package. TopFieldCollector sets
      TotalHits.relation to GREATER_THAN_OR_EQUAL_TO, as soon as the
      requested total hits threshold is reached, even though in some
      cases no skipping optimization is applied and all hits are
      collected.
    + LUCENE-9515: IndexingChain now accepts individual primitives
      rather than a DocumentsWriterPerThread instance in order to
      create a new DocConsumer.
    + LUCENE-9680: Removed deprecation warning from
      IndexWriter#getFieldNames().
    + LUCENE-9902: Change the getValue method from IntTaxonomyFacets
      to be protected instead of private. Users can now access the
      count of an ordinal directly without constructing an extra
      FacetLabel. Also use variable length arguments for the
      getOrdinal call in TaxonomyReader.
    + LUCENE-9962: DrillSideways allows sub-classes to provide
      "drill down" FacetsCollectors. They may provide a null
      collector if they choose to bypass "drill down" facet
      collection.
    + LUCENE-10027: Add a new Directory reader open API from
      indexCommit and a custom comparator for sorting leaf readers
    + LUCENE-10036: Replaced the ScoreCachingWrappingScorer ctor
      with a static factory method that ensures unnecessary wrapping
      doesn't occur.
  * New Features
    + LUCENE-7889: Grouping by range based on values from
      DoubleValuesSource and LongValuesSource
    + LUCENE-8962: Add IndexWriter merge-on-commit feature to
      selectively merge small segments on commit, subject to a
      configurable timeout, to improve search performance by
      reducing the number of small segments for searching
    + LUCENE-8962: Add IndexWriter merge-on-refresh feature to
      selectively merge small segments on getReader, subject to a
      configurable timeout, to improve search performance by
      reducing the number of small segments for searching.
    + LUCENE-9378: Doc values now allow configuring how to trade
      compression for retrieval speed.
    + LUCENE-9385: Add FacetsConfig option to control which
      drill-down terms are indexed for a FacetLabel
    + LUCENE-9386: RegExpQuery added case insensitive matching
      option.
    + LUCENE-9413: Add CJKWidthCharFilter and its factory
    + LUCENE-9444: Add utility class to retrieve facet labels from
      the taxonomy index for a facet field so such fields do not
      also have to be redundantly stored
    + LUCENE-9484: Allow sorting an index after it was created.
      With SortingCodecReader, existing unsorted segments can be
      wrapped and merged into a fresh index using
      IndexWriter#addIndexes API.
    + LUCENE-9507: Custom order for leaves in IndexReader and
      IndexWriter
    + LUCENE-9537: Added smoothingScore method and default
      implementation to Scorable abstract class. The smoothing score
      allows scorers to calculate a score for a document where the
      search term or subquery is not present. The smoothing score
      acts like an idf so that documents that do not have terms or
      subqueries that are more frequent in the index are not
      penalized as much as documents that do not have less frequent
      terms or subqueries and prevents scores which are the product
      of terms or subqueries from going to zero. Added the
      implementation of the Indri AND and the
      IndriDirichletSimilarity from the academic Indri search
      engine: http://www.lemurproject.org/indri.php.
    + LUCENE-9552: New LatLonPoint query that accepts an array of
      LatLonGeometries.
    + LUCENE-9553: New XYPoint query that accepts an array of
      XYGeometries.
    + LUCENE-9572: TypeAsSynonymFilter has been enhanced to support
      ignoring some types, and to allow the generated synonyms to
      copy some or all flags from the original token
    + LUCENE-9574: A token filter to drop tokens that match all
      specified flags.
    + LUCENE-9575: PatternTypingFilter has been added to allow
      setting a type attribute on tokens based on a configured set
      of regular expressions
    + LUCENE-9594: FeatureField supports newLinearQuery, which uses
      the raw indexed feature values for scoring, without any
      transformation.
    + LUCENE-9641: LatLonPoint query support for spatial
      relationships.
    + LUCENE-9694: New tool for creating a deterministic index to
      enable benchmarking changes on a consistent multi-segment
      index even when they require re-indexing.
    + LUCENE-9950: New facet counting implementation for general
      string doc value fields (SortedSetDocValues / SortedDocValues)
      not created through FacetsConfig
    + LUCENE-10035: The SimpleText codec now writes skip lists.
    + LUCENE-10083: Analyzer and stemmer for Telugu language
  * Improvements
    + LUCENE-9276: Use same code-path for updateDocuments and
      updateDocument in IndexWriter and DocumentsWriter.
    + LUCENE-9279: Update dictionary version for Ukrainian analyzer
      to 4.9.1
    + LUCENE-8050: PerFieldDocValuesFormat should not get the
      DocValuesFormat on a field that has no doc values.
    + LUCENE-9304: Removed ThreadState abstraction from
      DocumentsWriter which allows pooling of DWPT directly and
      improves the approachability of the IndexWriter code.
    + LUCENE-9324: Add an ID to SegmentCommitInfo in order to
      compare commits for equality and make snapshots incremental on
      generational files.
    + LUCENE-9342: TotalHits' relation will be EQUAL_TO when the
      number of hits is lower than TopDocsCollector's numHits
    + LUCENE-9353: Metadata of the terms dictionary moved to its own
      file, with the '.tmd' extension. This allows checksums of
      metadata to be verified when opening indices and helps save
      seeks when opening an index.
    + LUCENE-9359: SegmentInfos#readCommit now always returns a
      CorruptIndexException if the content of the file is invalid.
    + LUCENE-9393: Make FunctionScoreQuery use ScoreMode.COMPLETE
      for creating the inner query weight when ScoreMode.TOP_DOCS
      is requested.
    + LUCENE-9392: Make FacetsConfig.DELIM_CHAR publicly accessible
    + LUCENE-9397: UniformSplit supports encodable fields metadata.
    + LUCENE-9396: Improved truncation detection for points.
    + LUCENE-9402: Let MultiCollector handle minCompetitiveScore
    + LUCENE-8574: Add a new ExpressionValueSource which will
      enforce only one value per name per hit in dependencies,
      ExpressionFunctionValues will no longer recompute already
      computed values
    + LUCENE-9416: Fix CheckIndex to print an invalid non-zero norm
      as unsigned long when detecting corruption.
    + LUCENE-9440: FieldInfo#checkConsistency called twice from
      Lucene50(60)FieldInfosFormat#read; Removed the (redundant?)
      assert and do these checks for real.
    + LUCENE-9446: In BooleanQuery rewrite, always remove
      MatchAllDocsQuery filter clauses when possible.
    + LUCENE-9501: Improve coverage for Asserting* test classes:
      make sure to handle singleton doc values, and sometimes
      exercise Weight#scorer instead of Weight#bulkScorer for
      top-level queries.
    + LUCENE-9511: Include StoredFieldsWriter in DWPT accounting
      to ensure that its heap consumption is taken into account
      when IndexWriter stalls or should flush DWPTs.
    + LUCENE-9514: Include TermVectorsWriter in DWPT accounting to
      ensure that its heap consumption is taken into account when
      IndexWriter stalls or should flush DWPTs.
    + LUCENE-9523: In query shapes over shape fields, skip points
      while traversing the BKD tree when the relationship with the
      document is already known.
    + LUCENE-9539: Use more compact datastructures to represent
      sorted doc-values in memory when sorting a segment before
      flush and in SortingCodecReader.
    + LUCENE-9458: WordDelimiterGraphFilter should order tokens at
      the same position by endOffset to emit longer tokens first.
      The same graph is produced.
    + LUCENE-5309: Optimize facet counting for single-valued
      SSDV / StringValueFacetCounts.
    + LUCENE-9023: GlobalOrdinalsWithScore should not compute
      occurrences when the provided min is 1.
    + LUCENE-9177: ICUNormalizer2CharFilter no longer requires
      normalization-inert characters as boundaries for incremental
      processing, vastly improving worst-case performance.
    + LUCENE-9455: ExitableTermsEnum should sample timeout and
      interruption check before calling next().
    + LUCENE-9662: Make CheckIndex concurrent by parallelizing
      index check across segments.
    + LUCENE-9663: Add compression to terms dict from
      SortedSet/Sorted DocValues.
    + LUCENE-9675: Binary doc values fields now expose their
      configured compression mode in the attributes of the field
      info.
    + LUCENE-9725: BM25FQuery was extended to handle similarities
      beyond BM25Similarity. It was renamed to CombinedFieldQuery to
      reflect its more general scope.
    + LUCENE-9877: Reduce index size by increasing allowable
      exceptions in PForUtil from 3 to 7.
    + LUCENE-9687: Hunspell support improvements: add API for
      spell-checking and suggestions, support compound words, fix
      various behavior differences between Java and C++
      implementations, improve performance
    + LUCENE-9917: The BEST_SPEED compression mode now trades more
      compression ratio in exchange for faster reads.
    + LUCENE-9935: Enable bulk merge for stored fields with index sort.
    + LUCENE-9944: Allow DrillSideways users to provide their own
      CollectorManager without also requiring them to provide an
      ExecutorService.
    + LUCENE-9945: Extend DrillSideways to support exposing
      FacetCollectors directly.
    + LUCENE-9946: Support for multi-value fields in
      LongRangeFacetCounts and DoubleRangeFacetCounts.
    + LUCENE-9965: Added QueryProfilerIndexSearcher and
      ProfilerCollector to support debugging query execution
      strategy and timing.
    + LUCENE-9981: Operations.getCommonSuffix/Prefix(Automaton) is
      now much more efficient, from a worst case exponential down to
      quadratic cost in the number of states + transitions in the
      Automaton. These methods no longer use the costly determinize
      method, removing the risk of TooComplexToDeterminizeException
    + LUCENE-9981: Operations.determinize now throws
      TooComplexToDeterminizeException based on too much "effort"
      spent determinizing rather than a precise state count on the
      resulting returned automaton, to better handle adversarial
      cases like det(rev(regexp("(.*a){2000}"))) that spend lots of
      effort but result in smallish eventual returned automata.
    + LUCENE-9983: Stop sorting determinize powersets unnecessarily.
    + LUCENE-10030: Lazily evaluate score in
      DrillSidewaysScorer.doQueryFirstScoring
    + LUCENE-10043: Decrease default for LRUQueryCache's
      skipCacheFactor to 10. This prevents caching a query clause
      when it is much more expensive than running the top-level
      query.
    + LUCENE-10103: Make QueryCache respect Accountable queries
  * Optimizations
    + LUCENE-9254: UniformSplit keeps FST off-heap.
    + LUCENE-8103: DoubleValuesSource and QueryValueSource now use a
      TwoPhaseIterator if one is provided by the Query.
    + LUCENE-9287: UsageTrackingQueryCachingPolicy no longer caches
      DocValuesFieldExistsQuery.
    + LUCENE-9286: FST.Arc.BitTable reads directly FST bytes. Arc is
      lightweight again and FSTEnum traversal faster.
    + LUCENE-7788: fail precommit on unparameterised log messages
      and examine for wasted work/objects
    + LUCENE-9273: Speed up geometry queries by specialising
      Component2D spatial operations. Instead of using a generic
      relate method for all relations, we use specialized methods
      for each one. In addition, the type of triangle is computed at
      deserialization time, therefore we can be more selective when
      decoding points of a triangle.
    + LUCENE-9087: Always build trees with full leaves and lower the
      default value for maxPointsPerLeafNode to 512.
    + LUCENE-9148: Points now write their index in a separate file.
    + LUCENE-9280: Add an ability for field comparators to skip
      non-competitive documents. Creating a TopFieldCollector with
      totalHitsThreshold less than Integer.MAX_VALUE instructs
      Lucene to skip non-competitive documents whenever possible.
      For numeric sort fields the skipping functionality works when
      the same field is indexed both with doc values and points.
      To indicate that the same data is stored in these points and
      doc values SortField#setCanUsePoints method should be used.
    + LUCENE-9395: ConstantValuesSource now shares a single
      DoubleValues instance across all segments
    + LUCENE-9447, LUCENE-9486: Stored fields now get higher
      compression ratios on highly compressible data.
    + LUCENE-9373: FunctionMatchQuery now accepts a "matchCost"
      optimization hint.
    + LUCENE-9510: Indexing with an index sort is now faster by not
      compressing temporary representations of the data.
    + LUCENE-9449: Enhance DocComparator to provide an iterator over
      competitive documents when searching with "after". This
      iterator can quickly position on the desired "after" document
      skipping all documents and segments before "after".
    + LUCENE-9021: QueryParser: re-use the LookaheadSuccess
      exception.
    + LUCENE-9346: WANDScorer now supports queries that have a
      'minimumNumberShouldMatch' configured.
    + LUCENE-9536: Reduced memory usage for OrdinalMap when a
      segment has all values.
    + LUCENE-9636: Faster decoding of postings for some numbers of
      bits per value.
    + LUCENE-9673: Substantially improve RAM efficiency of how
      MemoryIndex stores postings in memory, and reduced a bit of
      RAM overhead in IndexWriter's internal postings book-keeping
    + LUCENE-9827: Speed up merging of stored fields and term
      vectors for smaller segments.
    + LUCENE-9932: Performance improvement for BKD index building
    + LUCENE-9996: Improved memory efficiency of IndexWriter's RAM
      buffer, in particular in the case of many fields and many
      indexing threads.
    + LUCENE-10014: Lucene90DocValuesFormat was using too many bits
      per value when compressing via gcd, unnecessarily wasting
      index storage.
    + LUCENE-10022: Rewrite empty DisjunctionMaxQuery to
      MatchNoDocsQuery.
    + LUCENE-10031: Slightly faster segment merging for sorted
      indices.
    + LUCENE-10196: Improve IntroSorter with 3-ways partitioning
    + LUCENE-10481: FacetsCollector will not request scores if it
      does not use them
  * Bug Fixes
    + LUCENE-9300: Fix corruption of the new gen field infos when
      doc values updates are applied on a segment created externally
      and added to the index with IndexWriter#addIndexes(Directory).
    + LUCENE-9350: Partial reversion of LUCENE-9068; holding
      levenshtein automata on FuzzyQuery can end up blowing up query
      caches which use query objects as cache keys, so building the
      automata is now delayed to search time again.
    + LUCENE-9259: Fix wrong NGramFilterFactory argument name for
      preserveOriginal option
    + LUCENE-8849: DocValuesRewriteMethod.visit wasn't visiting its
      embedded query
    + LUCENE-9258: DocTermsIndexDocValues assumed it was operating
      on a SortedDocValues (single valued) field when it could be
      a multi-valued field used with a SortedSetSelector
    + LUCENE-9164: Ensure IW processes all internal events before it
      closes itself on a rollback.
    + LUCENE-8908: Return default value from objectVal when doc
      doesn't match the query in QueryValueSource
    + LUCENE-9133: Fix for potential NPE in TermFilteredPresearcher
      for empty fields
    + LUCENE-9309: Wait for #addIndexes merges when aborting merges.
    + LUCENE-9337: Ensure CMS updates its thread accounting
      datastructures consistently. CMS today releases its lock
      after finishing a merge before it re-acquires it to update the
      thread accounting datastructures. This causes threading issues
      where concurrently finishing threads fail to pick up pending
      merges causing potential thread starvation on forceMerge calls
    + LUCENE-9314: Single-document monitor runs were using the less
      efficient MultiDocumentBatch implementation.
    + LUCENE-9362: Fix equality check in
      ExpressionValueSource#rewrite. This fixes rewriting of inner
      value sources.
    + LUCENE-9405: IndexWriter incorrectly calls closeMergeReaders
      twice when the merged segment is 100% deleted.
    + LUCENE-9400: Tessellator might build illegal polygons when
      several holes share the same vertex.
    + LUCENE-9417: Tessellator might build illegal polygons when
      several holes are connected to the same vertex.
    + LUCENE-9418: Fix ordered intervals over interleaved terms
    + LUCENE-9443: The UnifiedHighlighter was closing the underlying
      reader when there were multiple term-vector fields. This was a
      regression in 8.6.0.
    + LUCENE-9478: Prevent DWPTDeleteQueue from referencing itself
      and leaking memory. The queue passed an implicit this
      reference to the next queue instance on flush which leaked
      about 500 bytes of memory on each full flush, commit or
      getReader call.
    + LUCENE-9427: Fix a regression where the unified highlighter
      didn't produce highlights on fuzzy queries that correspond to
      exact matches.
    + LUCENE-9467: Fix NRTCachingDirectory to use
      Directory#fileLength to check if a file already exists instead
      of opening an IndexInput on the file which might throw an
      AccessDeniedException in some Directory implementations.
    + LUCENE-9501: Fix a bug in
      IndexSortSortedNumericDocValuesRangeQuery where it could
      violate the DocIdSetIterator contract.
    + LUCENE-9401: Include field in ComplexPhraseQuery's toString()
    + LUCENE-9578: Fix TermRangeQuery when there is no upper bound
      and the lower bound is the empty string excluded. This would
      previously match no strings at all while it should match all
      non-empty strings.
    + LUCENE-9524: Fix NPE in SpanWeight#explain when no scoring is
      required and SpanWeight has null Similarity.SimScorer.
    + LUCENE-9508: DocumentsWriter was only stalling threads for 1
      second allowing documents to be indexed even when the
      DocumentsWriter wasn't able to keep up flushing. Unless IW
      can't make progress due to an ill-behaving DWPT this issue
      was barely noticeable.
    + LUCENE-9581: Japanese tokenizer should discard the compound
      token instead of disabling the decomposition of long tokens
      when discardCompoundToken is activated.
    + LUCENE-9595: Make Component2D#withinPoint implementations
      consistent with ShapeQuery logic.
    + LUCENE-9606: Wrap boolean queries generated by shape fields
      with a Constant score query.
    + LUCENE-9617: Fix per-field memory leak in
      IndexWriter.deleteAll(). Reset next available internal field
      number to 0 on FieldInfos.clear(), to avoid wasting FieldInfo
      references.
    + LUCENE-9635: BM25FQuery - Mask encoded norm long value in
      array lookup.
    + LUCENE-9642: When encoding triangles in ShapeField, make sure
      generated triangles are CCW by rotating triangle points before
      checking triangle orientation.
    + LUCENE-9661: Fix deadlock in TermsEnum.EMPTY that occurs when
      trying to initialize TermsEnum and BaseTermsEnum at the same
      time
    + LUCENE-9744: NPE on a degenerate query in
      MinimumShouldMatchIntervalsSource
      $MinimumMatchesIterator.getSubMatches().
    + LUCENE-9762: DoubleValuesSource.fromQuery (also used by
      FunctionScoreQuery.boostByQuery) could throw an exception when
      the query implements TwoPhaseIterator and when the score is
      requested repeatedly.
    + LUCENE-9791: BytesRefHash.equals/find is now thread safe,
      fixing a Luwak/Monitor bug causing registered queries to
      sometimes fail to match.
    + LUCENE-9870: Fix Circle2D intersectsLine t-value (distance)
      range clamp
    + LUCENE-9887: Fixed parameter use in RadixSelector.
    + LUCENE-9953: LongValueFacetCounts should count each document
      at most once when determining the total count for a dimension.
      Prior to this fix, multi-value docs could contribute a > 1
      count to the dimension count.
    + LUCENE-9958: Fixed performance regression for boolean queries
      that configure a minimum number of matching clauses.
    + LUCENE-9963: FlattenGraphFilter is now more robust when
      handling incoming holes in the input token graph
    + LUCENE-9964: Duplicate long values in a document field should
      only be counted once when using SortedNumericDocValuesFields
    + LUCENE-9967: Do not throw NullPointerException while trying
      to handle another exception in ReplicaNode.start
    + LUCENE-9988: Fix DrillSideways correctness bug introduced in
      LUCENE-9944
    + LUCENE-9991: Fix edge case failure in
      TestStringValueFacetCounts
    + LUCENE-9999: CombinedFieldQuery can fail with an exception
      when document is missing some fields.
    + LUCENE-10008: Respect ignoreCase in CommonGramsFilterFactory
    + LUCENE-10020: DocComparator should not skip docs with the same
      docID on multiple sorts with search after
    + LUCENE-10026: Fix CombinedFieldQuery equals and hashCode,
      which ensures query rewrites don't drop CombinedFieldQuery
      clauses.
    + LUCENE-10039: Correct CombinedFieldQuery scoring when there is
      a single field.
    + LUCENE-10046: Counting bug fixed in StringValueFacetCounts.
    + LUCENE-10060: Ensure DrillSidewaysQuery instances never get
      cached.
    + LUCENE-10070: Skip deleted docs when accumulating facet counts
      for all docs
    + LUCENE-10081: KoreanTokenizer should check the max backtrace
      gap on whitespaces.
    + LUCENE-10106: Sort optimization can wrongly skip the first
      document of each segment
    + LUCENE-10110: MultiCollector now handles single leaf collector
      that wants to skip low-scoring hits but the combined score
      mode doesn't allow it
    + LUCENE-10111: Fix missing accounting of the bytes used by
      DocsWithFieldSet in NormValuesWriter
    + LUCENE-10116: Fix missing accounting of the bytes used by
      DocsWithFieldSet and currentValues in
      SortedSetDocValuesWriter
    + LUCENE-10119: Sort optimization with search_after can wrongly
      skip documents whose values are equal to the last value of the
      previous page
    + LUCENE-10126: Sort optimization with a chunked bulk scorer can
      wrongly skip documents
    + LUCENE-10134: ConcurrentSortedSetDocValuesFacetCounts
      shouldn't share liveDocs Bits across threads
    + LUCENE-10154: NumericLeafComparator to define getPointValues
    + LUCENE-10208: Ensure that the minimum competitive score does
      not decrease in concurrent search
    + LUCENE-10477: Highlighter:
      WeightedSpanTermExtractor.extractWeightedSpanTerms to
      Query#rewrite multiple times if necessary
    + LUCENE-10564: Make sure SparseFixedBitSet#or updates
      ramBytesUsed
  * Documentation
    + LUCENE-9424: Add a performance warning to
      AttributeSource.captureState javadocs
  * Changes in runtime behaviour
    + LUCENE-9539: SortingCodecReader now doesn't cache doc values
      fields anymore. Previously, SortingCodecReader used to cache
      all doc values fields after they were loaded into memory.
      This reader should only be used to sort segments after the
      fact using IndexWriter#addIndexes.
  * Other
    + LUCENE-9257: Always keep FST off-heap. FSTLoadMode, Reader
      attributes and openedFromWriter removed.
    + LUCENE-9272: Checksums of the terms index are now verified
      when LeafReader#checkIntegrity is called rather than when
      opening the index.
    + LUCENE-9270: Update Javadoc about normalizeEntry in the
      Kuromoji DictionaryBuilder.
    + LUCENE-9275: Make TestLatLonMultiPolygonShapeQueries more
      resilient for CONTAINS queries.
    + LUCENE-9244: Adjust
      TestLucene60PointsFormat#testEstimatePointCount2Dims so it
      does not fail when a point is shared by multiple leaves.
    + LUCENE-9271: ByteBufferIndexInput was refactored to work on
      top of the ByteBuffer API.
    + LUCENE-9191: Make LineFileDocs's random seeking more
      efficient, making tests using LineFileDocs faster
    + LUCENE-9338: Refactors SimpleBindings to improve type safety
      and cycle detection
    + LUCENE-9358: Change the way the multi-dimensional BKD tree
      builder generates the intermediate tree representation to be
      equal to the one dimensional case to avoid unnecessary tree
      and leaves rotation.
    + LUCENE-9288: poll_mirrors.py release script can handle HTTPS
      mirrors.
    + LUCENE-9232: Fix or suppress 13 resource leak precommit
      warnings in lucene/replicator
    + LUCENE-9398: Always keep BKD index off-heap. BKD reader does
      not implement Accountable any more.
    + LUCENE-9292: Refactor BKD point configuration into its own
      class.
    + LUCENE-9470: Make TestXYMultiPolygonShapeQueries more
      resilient for CONTAINS queries.
    + LUCENE-9512: Move LockFactory stress test to be a
      unit/integration test.
    + LUCENE-9637: Removes some unused code and replaces the Point
      implementation on ShapeField/ShapeQuery random tests.
    + LUCENE-9836: Removed the pure Maven build. It is no longer
      possible to build artifacts using Maven (this feature was no
      longer working correctly). Due to migration to Gradle for
      Lucene/Solr 9.0, the maintenance of the Maven build was no
      longer reasonable. POM files are generated for deployment to
      Maven Central only. Please use "ant generate-maven-artifacts"
      to produce and deploy artifacts to any repository.
    + LUCENE-9836: Migrate Maven tasks to use
      "maven-resolver-ant-tasks" instead of the no longer maintained
      "maven-ant-tasks".
    + LUCENE-9985: Upgrade jetty to 9.4.41
    + LUCENE-9976: Fix WANDScorer assertion error.
    + LUCENE-10098: Add docs/links to GermanAnalyzer describing how
      to decompound nouns
    + SOLR-14995: Update Jetty to 9.4.34
  * Build
    + Upgrade forbiddenapis to version 3.0.1.
    + LUCENE-9376: Fix or suppress 20 resource leak precommit
      warnings in lucene/search
    + LUCENE-9380: Fix auxiliary class warnings in Lucene
    + LUCENE-9389: Enhance gradle logging calls validation:
      eliminate getMessage()
    + Upgrade forbiddenapis to version 3.1.
    + LUCENE-10104, SOLR-15631: Upgrade forbiddenapis to version 3.2
- Removed patch:
  * lucene-java8compat.patch
    + not needed in this version, since the compatibility is handled
      by --release option for javac versions that support it
- Added patch:
  * lucene-timestamps.patch
    + use SOURCE_DATE_EPOCH for timestamps and for pseudo-random
      seeds
    + improves reproducibility of builds using lucene for indexing
- Modified patches:
  * lucene-missing-dependencies.patch
  * lucene-nodoclint.patch
  * lucene-osgi-manifests.patch
    + rediff to changed context
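
For readers unfamiliar with SOURCE_DATE_EPOCH, the following Java
sketch shows one way a tool could honor it for both timestamps and
pseudo-random seeds, as lucene-timestamps.patch is described as doing
above. It is an illustration under assumptions, not the contents of
the patch: the class name, the method names and the fallback to the
current time are all hypothetical.

```java
import java.time.Instant;
import java.util.Map;
import java.util.Random;

// Illustrative sketch only: NOT the code from lucene-timestamps.patch.
final class ReproducibleBuildSketch {

    // Returns the value of SOURCE_DATE_EPOCH (seconds since the Unix
    // epoch) from the given environment, or null if unset or empty.
    private static Long sourceDateEpoch(Map<String, String> env) {
        String value = env.get("SOURCE_DATE_EPOCH");
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        return Long.parseLong(value.trim());
    }

    // Fixed timestamp when SOURCE_DATE_EPOCH is set, wall clock otherwise.
    static Instant buildTimestamp(Map<String, String> env) {
        Long epoch = sourceDateEpoch(env);
        return epoch == null ? Instant.now() : Instant.ofEpochSecond(epoch);
    }

    // Deterministic seed when SOURCE_DATE_EPOCH is set, default seeding
    // otherwise.
    static Random seededRandom(Map<String, String> env) {
        Long epoch = sourceDateEpoch(env);
        return epoch == null ? new Random() : new Random(epoch);
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println("timestamp:   " + buildTimestamp(env));
        System.out.println("first value: " + seededRandom(env).nextLong());
    }
}
```

Running the sketch twice with the same SOURCE_DATE_EPOCH exported
prints identical lines both times, which is the property a
reproducible build relies on.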
|
|
|
|
-------------------------------------------------------------------
|
|
Mon Aug 21 23:13:03 UTC 2023 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Avoid xerces-j2 on classpath
|
|
* fixes build after apache-ivy upgrade to 2.5.2
|
|
|
|
-------------------------------------------------------------------
|
|
Mon Jul 24 19:46:27 UTC 2023 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Do not depend on jtidy, since it is not used during build
|
|
|
|
-------------------------------------------------------------------
|
|
Sun Mar 20 16:11:33 UTC 2022 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Added patch:
|
|
* lucene-nodoclint.patch
|
|
+ Do not abort compilation on html5 errors with javadoc 17
|
|
|
|
-------------------------------------------------------------------
|
|
Thu Apr 9 09:20:58 UTC 2020 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Upgrade to version 8.5.0
|
|
* API Changes:
|
|
+ LUCENE-9093: Change in behavior of the UnifiedHighlighter's
|
|
LengthGoalBreakIterator that will yield Passages sized a
|
|
little differently because the sizing pivot is now
|
|
the center of the first match and not its left edge.
|
|
+ LUCENE-9116: PostingsWriterBase and PostingsReaderBase no
|
|
longer support setting a field's metadata via a 'long[]'.
|
|
+ LUCENE-9116: The FSTOrd postings format has been removed.
|
|
+ LUCENE-8369: Remove obsolete spatial module.
|
|
+ LUCENE-8621: Refactor LatLonShape, XYShape, and all query and
|
|
utility classes to core.
|
|
+ LUCENE-9218: XY geometries API works in float space.
|
|
+ LUCENE-9212: Intervals.multiterm() takes CompiledAutomaton
|
|
rather than plain Automaton
|
|
+ LUCENE-9150: Restore support for dynamic PlanetModel in
|
|
spatial3d.
|
|
+ LUCENE-9171: QueryBuilder.newTermQuery() and
|
|
.newSynonymQuery() now take boost parameters.
|
|
+ LUCENE-9029: Deprecate SloppyMath toRadians/toDegrees in
|
|
favor of Java Math.
|
|
+ LUCENE-8620: Add CONTAINS support for LatLonShape and XYShape.
|
|
+ LUCENE-9050: MultiTermIntervalsSource.visit() was not calling
|
|
back to its visitor.
|
|
+ LUCENE-8909: IndexWriter#getFieldNames() method is used to
|
|
get fields present in index. After LUCENE-8316, this method is
|
|
no longer required. Hence, deprecate
|
|
IndexWriter#getFieldNames() method.
|
|
+ LUCENE-8755: SpatialPrefixTreeFactory now consumes the
|
|
"version" parsed with Lucene's Version class. The quad and
|
|
packed quad prefix trees are sensitive to this. It's
|
|
recommended to pass the version, just as you should
|
|
for analysis components for tokenized text, or else changes to
|
|
the encoding in future versions may be incompatible with older
|
|
indexes.
|
|
+ LUCENE-8956: QueryRescorer now only sorts the first topN hits
|
|
instead of all initial hits.
|
|
+ LUCENE-8921: IndexSearcher.termStatistics() no longer takes a
|
|
TermStates; it takes the docFreq and totalTermFreq. And don't
|
|
call if docFreq <= 0. The previous implementation survives as
|
|
deprecated and final. It's removed in 9.0.
|
|
+ LUCENE-8990: PointValues#estimateDocCount(visitor) estimates
|
|
the number of documents that would be matched by the given
|
|
IntersectVisitor. The method is used to compute the cost() of
|
|
ScorerSuppliers instead of
|
|
PointValues#estimatePointCount(visitor).
|
|
+ LUCENE-8865: IndexSearcher now uses Executor instead of
|
|
ExecutorService. This change is fully backwards compatible
|
|
since ExecutorService directly implements Executor.
|
|
+ LUCENE-8856: Intervals queries have moved from the sandbox to
|
|
the queries module.
|
|
+ LUCENE-8893: Intervals.wildcard() and Intervals.prefix()
|
|
methods now take BytesRef rather than String.
|
|
+ LUCENE-3041: A query introspection API has been added.
|
|
Queries should implement a visit() method, taking a
|
|
QueryVisitor, and either pass the visitor down to any child
|
|
queries, or call a visitX() or consumeX() method on it. All
|
|
locations in the code that called Weight.extractTerms() have
|
|
been changed to use this API, and the extractTerms() method
|
|
has been deprecated.
|
|
+ LUCENE-8735: Directory.getPendingDeletions is now abstract to
|
|
ensure subclasses override it. FilterDirectory now delegates
|
|
the call, ensuring correct default behaviour for subclasses.
|
|
+ LUCENE-8662: Change TermsEnum.seekExact(BytesRef) to abstract and
|
|
delegate seekExact(BytesRef) in
|
|
FilterLeafReader.FilterTermsEnum.
|
|
+ LUCENE-8469: Deprecated StringHelper.compare has been removed.
|
|
+ LUCENE-8039: Introduce a "delta distance" method set to
|
|
GeoDistance. This allows distance calculations, especially for
|
|
paths, to take into account an "excursion" to include the
|
|
specified point.
|
|
+ LUCENE-8007: Index statistics Terms.getSumDocFreq(),
|
|
Terms.getDocCount() are now required to be stored by codecs.
|
|
Additionally, TermsEnum.totalTermFreq() and
|
|
Terms.getSumTotalTermFreq() are now required: if frequencies
|
|
are not stored they are equal to TermsEnum.docFreq() and
|
|
Terms.getSumDocFreq(), respectively, because all freq() values
|
|
equal 1.
|
|
+ LUCENE-8038: Deprecated PayloadScoreQuery constructors have
|
|
been removed
|
|
+ LUCENE-8014: Similarity.computeSlopFactor() and
|
|
Similarity.computePayloadFactor() have been removed
|
|
+ LUCENE-7996: Queries are now required to produce positive
|
|
scores.
|
|
+ LUCENE-8099: CustomScoreQuery, BoostedQuery and BoostingQuery
|
|
have been removed
|
|
+ LUCENE-8012: Explanation now takes Number rather than float
|
|
+ LUCENE-8116: SimScorer now only takes a frequency and a norm
|
|
as per-document scoring factors.
|
|
+ LUCENE-8113: TermContext has been renamed to TermStates, and
|
|
can now be constructed lazily if term statistics are not
|
|
required
|
|
+ LUCENE-8242: Deprecated method
|
|
IndexSearcher#createNormalizedWeight() has been removed
|
|
+ LUCENE-8267: Memory codecs removed from the codebase
|
|
(MemoryPostings, MemoryDocValues).
|
|
+ LUCENE-8144: Moved QueryCachingPolicy.ALWAYS_CACHE to the
|
|
test framework.
|
|
+ LUCENE-8356: StandardFilter and StandardFilterFactory have
|
|
been removed
|
|
+ LUCENE-8373: StandardAnalyzer.ENGLISH_STOP_WORD_SET has been
|
|
removed
|
|
+ LUCENE-8388: Unused PostingsEnum#attributes() method has been
|
|
removed
|
|
+ LUCENE-8405: TopDocs.maxScore is removed. IndexSearcher and
|
|
TopFieldCollector no longer have an option to compute the
|
|
maximum score when sorting by field.
|
|
+ LUCENE-8411: TopFieldCollector no longer takes a fillFields
|
|
option, it now always fills fields.
|
|
+ LUCENE-8412: TopFieldCollector no longer takes a
|
|
trackDocScores option. Scores need to be set on top hits via
|
|
TopFieldCollector#populateScores instead.
|
|
+ LUCENE-6228: A new Scorable abstract class has been added,
|
|
containing only those methods from Scorer that should be
|
|
called from Collectors. LeafCollector.setScorer() now takes a
|
|
Scorable rather than a Scorer.
|
|
+ LUCENE-8475: Deprecated constants have been removed from
|
|
RamUsageEstimator.
|
|
+ LUCENE-8483: Scorers may no longer take null as a Weight
|
|
+ LUCENE-8352: TokenStreamComponents is now final, and can take
|
|
a Consumer<Reader> in its constructor
|
|
+ LUCENE-8498: LowerCaseTokenizer has been removed, and
|
|
CharTokenizer no longer takes a normalizer function.
|
|
+ LUCENE-7875: Moved MultiFields static methods out of the
|
|
class. getLiveDocs is now in MultiBits which is now public.
|
|
getMergedFieldInfos and getIndexedFields are now in
|
|
FieldInfos. getTerms is now in MultiTerms.
|
|
getTermPositionsEnum and getTermDocsEnum were collapsed and
|
|
renamed to just getTermPostingsEnum and moved to MultiTerms.
|
|
+ LUCENE-8513: MultiFields.getFields is now removed. Please
|
|
avoid this class, and Fields in general, when possible.
|
|
+ LUCENE-8497: MultiTermAwareComponent has been removed, and in
|
|
its place TokenFilterFactory and CharFilterFactory now expose
|
|
type-safe normalize() methods. This decouples normalization
|
|
from tokenization entirely.
|
|
+ LUCENE-8597: IntervalIterator now exposes a gaps() method
|
|
that reports the number of gaps between its component
|
|
sub-intervals. This can be used in a new filter available via
|
|
Intervals.maxgaps().
|
|
+ LUCENE-8609: Remove IndexWriter#numDocs() and
|
|
IndexWriter#maxDoc() in favor of IndexWriter#getDocStats().
|
|
* Changes in Runtime Behavior
|
|
+ LUCENE-8671: Load FST off-heap also for ID-like fields if
|
|
reader is not opened from an IndexWriter.
|
|
+ LUCENE-8730: WordDelimiterGraphFilter always emits its
|
|
original token first. This brings its behaviour into line with
|
|
the deprecated WordDelimiterFilter, so that the only
|
|
difference in output between the two is in the position length
|
|
attribute.
|
|
+ LUCENE-7386: Disjunctions nested in disjunctions are now
|
|
flattened. This might trigger changes in the produced scores
|
|
due to changes to the order in which scores of sub clauses are
|
|
summed up.
|
|
+ LUCENE-8756: MoreLikeThisQuery now respects custom term
|
|
frequencies (TermFrequencyAttribute) at search time
|
|
+ LUCENE-8333: Switch MoreLikeThis.setMaxDocFreqPct to use
|
|
maxDoc instead of numDocs.
|
|
+ LUCENE-7837: Indices that were created before the previous
|
|
major version will now fail to open even if they have been
|
|
merged with the previous major version.
|
|
+ LUCENE-8020: Similarities are no longer passed terms that
|
|
don't exist by queries such as SpanOrQuery, so scoring
|
|
formulas no longer require divide-by-zero hacks.
|
|
IndexSearcher.termStatistics/collectionStatistics return null
|
|
instead of returning bogus values for a non-existent term or
|
|
field.
|
|
+ LUCENE-7996: FunctionQuery and FunctionScoreQuery now return
|
|
a score of 0 when the function produces a negative value.
|
|
+ LUCENE-8116: Similarities now score fields that omit norms as
|
|
if the norm was 1. This might change score values on fields
|
|
that omit norms.
|
|
+ LUCENE-8134: Index options are no longer automatically
|
|
downgraded.
|
|
+ LUCENE-8031: Length normalization correctly reflects omission
|
|
of term frequencies.
|
|
+ LUCENE-7444: StandardAnalyzer no longer defaults to removing
|
|
English stopwords
|
|
+ LUCENE-8060: IndexSearcher's search and searchAfter methods
|
|
now only compute total hit counts accurately up to 1,000 in
|
|
order to enable top-hits optimizations such as block-max WAND
|
|
(LUCENE-8135).
|
|
+ LUCENE-8505: IndexWriter#addIndices will now fail if the
|
|
target index is sorted but the candidate is not.
|
|
+ LUCENE-8535: Highlighter and FVH don't support ToParent and
|
|
ToChildBlockJoinQuery out of the box anymore. In order to
|
|
highlight on Block-Join Queries a custom
|
|
WeightedSpanTermExtractor / FieldQuery should be used.
|
|
+ LUCENE-8563: BM25 scores don't include the (k1+1) factor in
|
|
their numerator anymore. This doesn't affect ordering as this
|
|
is a constant factor which is the same for every document.
|
|
+ LUCENE-8509: WordDelimiterGraphFilter will no longer set the
|
|
offsets of internal tokens by default, preventing a number of
|
|
bugs when the filter is chained with tokenfilters that change
|
|
the length of their tokens
|
|
+ LUCENE-8633: IntervalQuery scores do not use term weighting
|
|
any more, the score is instead calculated as a function of the
|
|
sloppy frequency of the matching intervals.
|
|
+ LUCENE-8635: FSTs can now remain off-heap, accessed via
|
|
IndexInput, and the default codec's term dictionary
|
|
(BlockTreeTermsReader) will now leave the FST for the terms
|
|
index off-heap for non-primary-key fields using MMapDirectory,
|
|
reducing heap usage for such fields.
|
|
* New Features:
|
|
+ LUCENE-8903: Add LatLonShape and XYShape point query.
|
|
+ LUCENE-8707: Add LatLonShape and XYShape distance query.
|
|
+ LUCENE-9238: New XYPointField field and Queries for indexing,
|
|
searching and sorting cartesian points.
|
|
+ LUCENE-8936: Add SpanishMinimalStemFilter
|
|
+ LUCENE-8764 LUCENE-8945: Add "export all terms and doc freqs"
|
|
feature to Luke with delimiters.
|
|
+ LUCENE-8747: Composite Matches from multiple subqueries now
|
|
allow access to their submatches, and a new NamedMatches API
|
|
allows marking of subqueries and a simple way to find which
|
|
subqueries have matched on a given document
|
|
+ LUCENE-8769: Introduce Range Query For Multiple Connected
|
|
Ranges
|
|
+ LUCENE-8960: Introduce LatLonDocValuesPointInPolygonQuery for
|
|
LatLonDocValuesField
|
|
+ LUCENE-8753: New UniformSplitPostingsFormat (name
|
|
"UniformSplit") primarily benefiting in simplicity and
|
|
extensibility. New STUniformSplitPostingsFormat (name
|
|
"SharedTermsUniformSplit") that shares a single internal term
|
|
dictionary across fields.
|
|
+ LUCENE-8632: New XYShape Field and Queries for indexing and
|
|
searching general cartesian geometries.
|
|
+ LUCENE-8891: Snowball stemmer/analyzer for the Estonian
|
|
language.
|
|
+ LUCENE-8815: Provide a DoubleValues implementation for
|
|
retrieving the value of features without requiring a separate
|
|
numeric field. Note that as feature values are stored with
|
|
only 8 bits of mantissa the values returned may have a delta
|
|
from the original values indexed.
|
|
+ LUCENE-8803: Provide a FeatureSortfield to allow sorting
|
|
search hits by descending value of a feature. This is exposed
|
|
via the factory method FeatureField#newFeatureSort.
|
|
+ LUCENE-8784: The KoreanTokenizer now preserves punctuations
|
|
if discardPunctuation is set to false (defaults to true).
|
|
+ LUCENE-8812: Add new KoreanNumberFilter that can change
|
|
Hangul character to number and process decimal point. It is
|
|
similar to the JapaneseNumberFilter.
|
|
+ LUCENE-8362: Add doc-value support to range fields.
|
|
+ LUCENE-8766: Add monitor subproject (previously Luwak
|
|
monitoring library). This allows a stream of documents to be
|
|
matched against a set of registered queries in an efficient
|
|
manner, for use as a monitoring or classification tool.
|
|
+ LUCENE-7714: Add a numeric range query in sandbox that takes
|
|
advantage of index sorting.
|
|
+ LUCENE-8859: The completion suggester's postings format now
|
|
has an option to load its internal FST off-heap.
|
|
+ LUCENE-2562: The well-known graphical user interface for
|
|
inspecting Lucene indexes "Luke" was added as a Lucene module.
|
|
It can be started from the binary distribution by calling the
|
|
shell scripts in the module folder or from the source checkout
|
|
by using 'ant -f lucene/luke/build.xml run'. Luke provides a
|
|
Swing-based user interface and can be used to open Lucene or
|
|
Solr (or Elasticsearch) indexes, inspect documents, check
|
|
index commits and segments, or test (custom) analyzers. It
|
|
also has maintenance functions to check index structures and
|
|
force merge indexes for archival.
|
|
+ LUCENE-8340: LongPoint#newDistanceFeatureQuery may be used to
|
|
boost scores based on how close a value of a long field is
|
|
to a configurable origin. This is typically useful to boost
|
|
by recency.
|
|
+ LUCENE-8482: LatLonPoint#newDistanceFeatureQuery may be used
|
|
to boost scores based on the haversine distance of a
|
|
LatLonPoint field to a provided point. This is typically
|
|
useful to boost by distance.
|
|
+ LUCENE-8216: Added a new BM25FQuery in sandbox to blend
|
|
statistics across several fields using the BM25F formula.
|
|
+ LUCENE-8564: GraphTokenFilter is an abstract class useful for
|
|
token filters that need to read-ahead in the token stream and
|
|
take into account graph structures. This also changes
|
|
FixedShingleFilter to extend GraphTokenFilter
|
|
+ LUCENE-8612: Intervals.extend() treats an interval as if it
|
|
covered a wider span than it actually does, allowing users to
|
|
force minimum gaps between intervals in a phrase.
|
|
+ LUCENE-8629: New interval functions: Intervals.before(),
|
|
Intervals.after(), Intervals.within() and
|
|
Intervals.overlapping().
|
|
+ LUCENE-8622: Adds a minimum-should-match interval function
|
|
that produces intervals spanning a subset of a set of sources.
|
|
+ LUCENE-8645: Intervals.fixField() allows you to report
|
|
intervals from one field as if they came from another.
|
|
+ LUCENE-8646: New interval functions: Intervals.prefix() and
|
|
Intervals.wildcard()
|
|
+ LUCENE-8655: Add a getter in FunctionScoreQuery class in
|
|
order to access the underlying DoubleValuesSource.
|
|
+ LUCENE-8697: GraphTokenStreamFiniteStrings correctly handles
|
|
side paths containing gaps
|
|
+ LUCENE-8702: Simplify intervals returned from vararg
|
|
Intervals factory methods
|
|
* Improvements:
|
|
+ LUCENE-9149: Increase data dimension limit in BKD.
|
|
+ LUCENE-9102: Add maxQueryLength option to DirectSpellchecker.
|
|
+ LUCENE-9091: UnifiedHighlighter HTML escaping should only
|
|
escape essentials
|
|
+ LUCENE-9105: UniformSplit postings format detects corrupted
|
|
index and better handles IO exceptions.
|
|
+ LUCENE-9106: UniformSplit postings format allows extension of
|
|
block/line serializers.
|
|
+ LUCENE-9093: UnifiedHighlighter's LengthGoalBreakIterator has
|
|
a new fragmentAlignment option to better center the first
|
|
match in the passage. Also the sizing point now pivots at the
|
|
center of the first match term and not its left edge. This
|
|
yields Passages that won't be identical to the previous
|
|
behavior.
|
|
+ LUCENE-9153: Allow WhitespaceAnalyzer to set a maxTokenLength
|
|
other than the default of 255
|
|
+ LUCENE-9152: Improve line intersections with polygons when
|
|
they are touching from the outside.
|
|
+ LUCENE-9123: Add new JapaneseTokenizer constructors with
|
|
discardCompoundToken option that controls whether the
|
|
tokenizer emits original (compound) tokens when the mode is
|
|
not NORMAL.
|
|
+ LUCENE-9253: KoreanTokenizer now supports custom
|
|
dictionaries(system, unknown).
|
|
+ LUCENE-9171: QueryBuilder can now use BoostAttributes on
|
|
input token streams to selectively boost particular terms or
|
|
synonyms in parsed queries.
|
|
+ LUCENE-9002: Skip costly caching clause in LRUQueryCache if
|
|
it makes the query many times slower.
|
|
+ LUCENE-9006: WordDelimiterGraphFilter's catenateAll token is
|
|
now ordered before any token parts, like WDF did.
|
|
+ LUCENE-9028: introducing Intervals.multiterm()
|
|
+ LUCENE-9018: ConcatenateGraphFilter now has a configurable
|
|
separator.
|
|
+ LUCENE-9036: ExitableDirectoryReader may interrupt scanning
|
|
over DocValues
|
|
+ LUCENE-9062: QueryVisitor now has a consumeTermsMatching()
|
|
method, allowing queries that match a class of terms to pass a
|
|
ByteRunAutomaton matching that class back to the visitor.
|
|
+ LUCENE-9073: IntervalQuery now reports its field in toString() and
|
|
explain()
|
|
+ LUCENE-8874: Show SPI names instead of class names in Luke
|
|
Analysis tab.
|
|
+ LUCENE-8894: Add APIs to find SPI names for
|
|
Tokenizer/CharFilter/TokenFilter factory classes.
|
|
+ LUCENE-8914: move the logic for discarding inner nodes in
|
|
FloatPointNearestNeighbor to the IntersectVisitor so we take
|
|
advantage of the change introduced in LUCENE-7862.
|
|
+ LUCENE-8955: move the logic for discarding inner nodes in
|
|
LatLonPoint NearestNeighbor to the IntersectVisitor so we take
|
|
advantage of the change introduced in LUCENE-7862.
|
|
+ LUCENE-8918: PhraseQuery throws exceptions at construction
|
|
time if it is passed null arguments.
|
|
+ LUCENE-8916: GraphTokenStreamFiniteStrings preserves all
|
|
Token attributes through its finite strings TokenStreams
|
|
+ LUCENE-8933: Check kuromoji user dictionary beforehand to
|
|
avoid unexpected runtime exceptions. (Tomoko Uchida)
|
|
+ LUCENE-8906: Expose Lucene50PostingsFormat.IntBlockTermState
|
|
as public so that other postings formats can re-use it.
|
|
+ LUCENE-8942: Remove redundant parameters and improve
|
|
visibility strictness in LRUQueryCache
|
|
+ SOLR-13663: Introduce <SpanPositionRange> into XML Query
|
|
Parser
|
|
+ LUCENE-8952: Use a sort key instead of true distance in
|
|
NearestNeighbor
|
|
+ LUCENE-8620: Tessellator labels the edges of the generated
|
|
triangles whether they belong to the original polygon. This
|
|
information is added to the triangle encoding.
|
|
+ LUCENE-8964: Fix geojson shape parsing on string arrays in
|
|
properties
|
|
+ LUCENE-8976: Use exact distance between point and bounding
|
|
rectangle in FloatPointNearestNeighbor.
|
|
+ LUCENE-8966: The Korean analyzer now splits tokens on
|
|
boundaries between digits and alphabetic characters.
|
|
+ LUCENE-8984: MoreLikeThis MLT is biased for uncommon fields
|
|
+ LUCENE-7840: Non-scoring BooleanQuery now removes SHOULD
|
|
clauses before building the scorer supplier as opposed to
|
|
eliminating them during scoring construction.
|
|
+ LUCENE-8770: BlockMaxConjunctionScorer now leverages
|
|
two-phase iterators in order to avoid executing the second
|
|
phase when scorers don't intersect.
|
|
+ LUCENE-8781: FST lookup performance has been improved in many
|
|
cases by encoding Arcs using full-sized arrays with gaps. The
|
|
new encoding is enabled for postings in the default codec and
|
|
for suggesters.
|
|
+ LUCENE-8818: Fix smokeTestRelease.py encoding bug
|
|
+ LUCENE-8845: Allow Intervals.prefix() and
|
|
Intervals.wildcard() to specify their maximum allowed expansions
|
|
+ LUCENE-8875: Introduce a Collector optimized for use cases
|
|
when a large number of hits is requested
|
|
+ LUCENE-8848 LUCENE-7757 LUCENE-8492: The UnifiedHighlighter
|
|
now detects that parts of the query are not understood by it,
|
|
and thus it should not make optimizations that result in no
|
|
highlights or slow highlighting. This generally works best for
|
|
WEIGHT_MATCHES mode. Consequently queries produced by
|
|
ComplexPhraseQueryParser and the surround QueryParser will now
|
|
highlight correctly.
|
|
+ LUCENE-8793: Luke enhanced UI for CustomAnalyzer: show
|
|
detailed analysis steps.
|
|
+ LUCENE-8855: Add Accountable to some Query implementations
|
|
+ LUCENE-8673: Use radix partitioning when merging dimensional
|
|
points instead of sorting all dimensions beforehand.
|
|
+ LUCENE-8687: Optimise radix partitioning for points on heap.
|
|
+ LUCENE-8699: Change HeapPointWriter to use a single byte
|
|
array instead of a list of byte arrays. In addition a new
|
|
interface PointValue is added to abstract out the different
|
|
formats between offline and on-heap writers.
|
|
+ LUCENE-8703: Build point writers in the BKD tree only when
|
|
they are needed.
|
|
+ LUCENE-8652: SynonymQuery can now deboost the document
|
|
frequency of each term when blending the score of the synonym.
|
|
+ LUCENE-8631: The Korean user dictionary now picks the
|
|
longest-matching word and discards the other matches.
|
|
+ LUCENE-8732: ConstantScoreQuery can now early terminate the
|
|
query if the minimum score is greater than the constant score
|
|
and total hits are not requested.
|
|
+ LUCENE-8750: Implements setMissingValue() on sort fields
|
|
produced from DoubleValuesSource and LongValuesSource
|
|
+ LUCENE-8701: ToParentBlockJoinQuery now creates a child
|
|
scorer that disallows skipping over non-competitive documents
|
|
if the score of a parent depends on the score of multiple
|
|
children (avg, max, min). Additionally the score mode 'none'
|
|
that assigns a constant score to each parent can early
|
|
terminate top scores' collection.
|
|
+ LUCENE-8751: Weight#matches now uses the ScorerSupplier to
|
|
build scorers with a lead cost of 1 (single document).
|
|
+ LUCENE-8752: Japanese new era name '令和' (Reiwa) is added to
|
|
the dictionary used in JapaneseTokenizer so that the analyzer
|
|
handles the era name correctly. Reiwa is set to replace the
|
|
Heisei Era on May 1, 2019.
|
|
+ LUCENE-8671: Newly introduced reader attributes allow a per
|
|
IndexReader configuration of codec internals. This enables a
|
|
per reader configuration if FSTs are on- or off-heap on a per
|
|
field basis
|
|
+ LUCENE-8787: spatial-extras DateRangePrefixTree used to only
|
|
parse ISO-8601 timestamps with 0 or 3 digits of milliseconds
|
|
precision but now parses other lengths (although > 3 not
|
|
used).
|
|
+ LUCENE-7997: Add BaseSimilarityTestCase to sanity check
|
|
similarities. SimilarityBase switches to 64-bit doubles
|
|
internally to help avoid common numeric issues. Add missing
|
|
range checks for similarity parameters. Improve BM25 and
|
|
ClassicSimilarity's explanations.
|
|
+ LUCENE-8011: Improved similarity explanations.
|
|
+ LUCENE-4198: Codecs now have the ability to index score
|
|
impacts.
|
|
+ LUCENE-8135: Boolean queries now implement the block-max WAND
|
|
algorithm in order to speed up selection of top scored
|
|
documents.
|
|
+ LUCENE-8279: CheckIndex now cross-checks terms with norms.
|
|
+ LUCENE-8660: TopDocsCollectors now return an accurate count
|
|
(instead of a lower bound) if the total hit count is equal to
|
|
the provided threshold.
|
|
* Optimizations
|
|
+ LUCENE-9211: Add compression for Binary doc value fields.
|
|
+ LUCENE-4702: Better compression of terms dictionaries.
|
|
+ LUCENE-9228: Sort dvUpdates in the term order before applying
|
|
if they all update a single field to the same value. This
|
|
optimization can reduce the flush time by around 20% for the
|
|
docValues update use cases.
|
|
+ LUCENE-9245: Reduce AutomatonTermsEnum memory usage.
|
|
+ LUCENE-9237: Faster UniformSplit intersect TermsEnum.
|
|
+ LUCENE-9068: FuzzyQuery builds its Automaton up-front
|
|
+ LUCENE-9113: Faster merging of SORTED/SORTED_SET doc values.
|
|
+ LUCENE-9125: Optimize Automaton.step() with binary search and
|
|
introduce Automaton.next().
|
|
+ LUCENE-9147: The index of stored fields and term vectors is
|
|
now off-heap.
|
|
+ LUCENE-8928: When building a kd-tree for dimensions n > 2,
|
|
compute exact bounds for an inner node every N splits to
|
|
improve the quality of the tree. N is defined by
|
|
SPLITS_BEFORE_EXACT_BOUNDS which is set to 4.
|
|
+ BaseDirectoryReader no longer sums up the
|
|
'LeafReader#numDocs' of its leaves eagerly. This especially
|
|
helps when creating views of readers that hide documents,
|
|
since computing the number of live documents is an expensive
|
|
operation.
|
|
+ LUCENE-8992: TopFieldCollector and TopScoreDocCollector can
|
|
now share minimum scores across leaves concurrently.
|
|
+ LUCENE-8932: BKDReader's index is now stored off-heap when
|
|
the IndexInput is an instance of ByteBufferIndexInput.
|
|
+ LUCENE-9024: IntroSelector now falls back to the median of
|
|
medians algorithm instead of sorting when the maximum
|
|
recursion level is exceeded, providing better worst-case
|
|
runtime.
|
|
+ LUCENE-8920: The denser arcs of FST now index labels with a
|
|
bitset in order to provide near constant time access.
|
|
+ LUCENE-9027: Use SIMD instructions to decode postings.
|
|
+ LUCENE-9049: Remove FST cached root arcs now redundant with
|
|
labels indexed by bitset. This frees some on-heap FST space.
|
|
+ LUCENE-9045: Do not use TreeMap/TreeSet in BlockTree and
|
|
PerFieldPostingsFormat.
|
|
+ LUCENE-8922: DisjunctionMaxQuery more efficiently leverages
|
|
impacts to skip non-competitive hits.
|
|
+ LUCENE-8935: BooleanQuery with no scoring clause can now
|
|
early terminate the query when the total hits is not requested.
|
|
+ LUCENE-8941: Matches on wildcard queries will defer building
|
|
their full disjunction until a MatchesIterator is pulled
|
|
+ LUCENE-8755: spatial-extras quad and packed quad prefix trees
|
|
now index points faster.
|
|
+ LUCENE-8860: add additional leaf node level optimizations in
|
|
LatLonShapeBoundingBoxQuery.
|
|
+ LUCENE-8968: Improve performance of WITHIN and DISJOINT
|
|
queries for Shape queries by doing just one pass whenever
|
|
possible.
|
|
+ LUCENE-8939: Introduce shared count based early termination
|
|
across multiple slices
|
|
+ LUCENE-8980: Blocktree's seekExact now short-circuits false
|
|
if the term isn't in the min-max range of the segment. Large
|
|
perf gain for ID/time like data when populated sequentially.
|
|
+ LUCENE-8796: Use exponential search instead of binary search
|
|
in IntArrayDocIdSet#advance method
|
|
+ LUCENE-8865: Use incoming thread for execution if
|
|
IndexSearcher has an executor. Now caller threads execute at
|
|
least one search on an index even if there is an executor
|
|
provided to minimize thread context switching.
|
|
+ LUCENE-8868: New storing strategy for BKD tree leaves with
|
|
low cardinality. It stores the distinct values once with the
|
|
cardinality value reducing the storage cost.
|
|
+ LUCENE-8885: Optimise BKD reader by exploiting cardinality
|
|
information stored on leaves.
|
|
+ LUCENE-8896: Override default implementation of
|
|
IntersectVisitor#visit(DocIDSetBuilder, byte[]) for several queries.
|
|
+ LUCENE-8901: Load frequencies lazily only when needed in
|
|
BlockDocsEnum and BlockImpactsEverythingEnum
|
|
+ LUCENE-8888: Optimize distribution of points with data
|
|
dimensions in BKD tree leaves.
|
|
+ LUCENE-8311: Phrase queries now leverage impacts.
|
|
+ LUCENE-8040: Optimize IndexSearcher.collectionStatistics,
|
|
avoiding MultiFields/MultiTerms
|
|
+ LUCENE-4100: Disjunctions now support faster collection of
|
|
top hits when the total hit count is not required.
|
|
+ LUCENE-7993: Phrase queries are now faster if total hit
|
|
counts are not required.
|
|
+ LUCENE-8109: Boolean queries propagate information about the
|
|
minimum competitive score in order to make collection faster
|
|
if there are disjunctions or phrase queries as sub queries,
|
|
which know how to leverage this information to run faster.
|
|
+ LUCENE-8439: Disjunction max queries can skip blocks to
|
|
select the top documents if the total hit count is not required.
|
|
+ LUCENE-8204: Boolean queries with a mix of required and
|
|
optional clauses are now faster if the total hit count is not
|
|
required.
|
|
+ LUCENE-8448: Boolean queries now propagate the minimum score
|
|
to their sub-scorers.
|
|
+ LUCENE-8511: MultiFields.getIndexedFields is now optimized;
|
|
does not call getMergedFieldInfos
|
|
+ LUCENE-8507: TopFieldCollector can now update the minimum
|
|
competitive score if the primary sort is by relevancy and the
|
|
total hit count is not required.
|
|
+ LUCENE-8464: ConstantScoreScorer now implements
|
|
setMinCompetitiveScore in order to early terminate the iterator
|
|
if the minimum score is greater than the constant score.
|
|
+ LUCENE-8607: MatchAllDocsQuery can shortcut when total hit
|
|
count is not required
|
|
+ LUCENE-8585: Index-time jump-tables for DocValues, for O(1)
|
|
advance when retrieving doc values.
|
|
* Bug Fixes
|
|
+ LUCENE-9084: Fix potential deadlock due to circular
|
|
synchronization in AnalyzingInfixSuggester
|
|
+ LUCENE-9115: NRTCachingDirectory no longer caches files of
|
|
unknown size.
|
|
+ LUCENE-9144: Fix error message on OneDimensionBKDWriter when
|
|
too many points are added to the writer.
|
|
+ LUCENE-9135: Make UniformSplit FieldMetadata counters long.
|
|
+ LUCENE-9200: Fix TieredMergePolicy to use double (not float)
|
|
math to make its merging decisions, fixing a corner-case bug
|
|
uncovered by fun randomized tests
|
|
+ LUCENE-9099: Unordered and Ordered interval queries now
|
|
correctly handle repeated subterms - ordered intervals could
|
|
supply an 'extra' minimized interval, resulting in odd
|
|
matches when combined with e.g. CONTAINS queries; and unordered
|
|
intervals would match duplicate subterms on the same position,
|
|
so a query for UNORDERED(foo, foo) would match a document
|
|
containing 'foo' only once.
|
|
+ LUCENE-9250: Add support for Circle2d#intersectsLine around
|
|
the dateline.
|
|
+ LUCENE-9243: Add fudge factor when creating a bounding box of
|
|
a XYCircle.
|
|
+ LUCENE-9239: Circle2D#WithinTriangle detects properly if a
|
|
triangle is Within distance.
|
|
+ LUCENE-9251: Fix bug in the polygon tessellator where edges
|
|
with different value on #isEdgeFromPolygon were not filtered
|
|
out properly.
|
|
+ LUCENE-9263: Fix wrong transformation of distance in meters
|
|
to radians in Geo3DPoint.
|
|
+ LUCENE-9001: Fix race condition in SetOnce.
|
|
+ LUCENE-9030: Fix WordnetSynonymParser behaviour so it behaves
|
|
similar to SolrSynonymParser.
|
|
+ LUCENE-9054: Fix reproduceJenkinsFailures.py to not overwrite
|
|
junit XML files when retrying
|
|
+ LUCENE-9031: UnsupportedOperationException on
|
|
MatchesIterator.getQuery()
|
|
+ LUCENE-8996: maxScore was sometimes missing from distributed
|
|
grouped responses.
|
|
+ LUCENE-9055: Fix the detection of lines crossing triangles
|
|
through edge points.
|
|
+ LUCENE-9103: Disjunctions can miss some hits in some rare
|
|
conditions.
|
|
+ LUCENE-8755: spatial-extras quad and packed quad prefix trees
|
|
could throw a NullPointerException for certain cell edge
|
|
coordinates
|
|
+ LUCENE-9005: BooleanQuery.visit() would pull subVisitors from
|
|
its parent visitor, rather than from a visitor for its own
|
|
specific query. This could cause problems when BQ was nested
|
|
under another BQ. Instead, we now pull a MUST subvisitor, pass
|
|
it to any MUST subclauses, and then pull SHOULD, MUST_NOT and
|
|
FILTER visitors from it rather than from the parent.
|
|
+ LUCENE-8831: Fixed LatLonShapeBoundingBoxQuery .hashCode
|
|
methods.
|
|
+ LUCENE-8775: Improve tessellator to handle better cases where
|
|
a hole shares a vertex with the polygon.
|
|
+ LUCENE-8785: Ensure new threadstates are locked before
|
|
retrieving the number of active threadstates. This causes
|
|
assertion errors and potentially broken field attributes in
|
|
the IndexWriter when IndexWriter#deleteAll is called while
|
|
actively indexing.
|
|
+ LUCENE-8804: Forbid calls to putAttribute on frozen FieldType
|
|
instances.
|
|
+ LUCENE-8828: Removes the buggy 'disallow overlaps' boolean
|
|
from Intervals.unordered(), and replaces it with a new
|
|
Intervals.unorderedNoOverlaps() method
|
|
+ LUCENE-8843: Don't ignore exceptions that are thrown when
|
|
trying to open a file in IOUtils#fsync.
|
|
+ LUCENE-8835: FileSwitchDirectory now respects the file
|
|
extension when listing directory contents to ensure we don't
|
|
expose pending deletes if both directories point to the same
|
|
underlying filesystem directory.
|
|
+ LUCENE-8853: FileSwitchDirectory now applies best effort to
|
|
place tmp files in the same directory as the target files.
|
|
+ LUCENE-8892: Add missing closing parentheses in
|
|
MultiBoolFunction's description()
|
|
+ LUCENE-8736: LatLonShapePolygonQuery returns incorrect WITHIN
|
|
results with shared boundaries. Point in Polygon now correctly
|
|
includes boundary points. Box and Polygon relations with
|
|
triangles have also been improved to correctly include
|
|
boundary points.
|
|
+ LUCENE-8712: Polygon2D does not detect crossings through
|
|
segment edges.
|
|
+ LUCENE-8720: NameIntCacheLRU (in the facets module) had an
|
|
int overflow bug that disabled cleaning of the cache
|
|
+ LUCENE-8726: ValueSource.asDoubleValuesSource() could leak a
|
|
reference to IndexSearcher
|
|
+ LUCENE-8719: FixedShingleFilter can miss shingles at the end
|
|
of a token stream if there are multiple paths with different
|
|
lengths.
|
|
+ LUCENE-8688: TieredMergePolicy#findForcedMerges now tries to
|
|
create the cheapest merges that allow the index to go down to
|
|
'maxSegmentCount' segments or less.
|
|
+ LUCENE-8477: Interval disjunctions could miss valid hits if
|
|
some of the clauses of the disjunction are minimized away. We
|
|
now rewrite intervals if a source contains a disjunction and
|
|
the internal gaps matter for matching. This behaviour can be
|
|
disabled if users are more interested in speed rather than
|
|
accuracy of matching.
|
|
+ LUCENE-8741: ValueSource.fromDoubleValuesSource() was casting
|
|
to Scorer instead of Scorable, leading to ClassCastExceptions
|
|
+ LUCENE-8754: Fix ConcurrentModificationException in
|
|
SegmentInfo if attributes are accessed in MergePolicy while
|
|
the merge is running
|
|
+ LUCENE-8765: Fixed validation of the number of added points
|
|
in KD trees.
|
|
* Other
|
|
+ LUCENE-9109: Backport some changes from master (except
|
|
StackWalker) to improve TestSecurityManager
|
|
+ LUCENE-9110: Backport refactored stack analysis in tests to
|
|
use generalized LuceneTestCase methods
|
|
+ LUCENE-9141: Simplify LatLonShapeXQuery API by adding a new
|
|
abstract class called LatLonGeometry. Queries are executed
|
|
with input objects that extend such interface.
|
|
+ LUCENE-9194: Simplify XYShapeXQuery API by adding a new
|
|
abstract class called XYGeometry. Queries are executed with
|
|
input objects that extend such interface.
|
|
+ LUCENE-9096: Simplification of
|
|
CompressingTermVectorsWriter#flushOffsets.
|
|
+ LUCENE-9225: Rectangle extends LatLonGeometry so it can be
|
|
used in a geometry collection.
|
|
+ LUCENE-8979: Code Cleanup: Use entryset for map iteration
|
|
wherever possible. - Part 2
|
|
+ LUCENE-8746: Refactor EdgeTree - Introduce a Component tree
|
|
that represents the tree of components (e.g. polygons). Edge
|
|
tree is now just a tree of edges.
|
|
+ LUCENE-8994: Code Cleanup - Pass values to list constructor
|
|
instead of empty constructor followed by addAll().
|
|
+ LUCENE-9046: Fix wrong example in Javadoc of TermInSetQuery
|
|
+ LUCENE-8983: Add sandbox PhraseWildcardQuery to control
|
|
multi-terms expansions in a phrase.
|
|
+ LUCENE-9067: Polygon2D#contains() is now thread safe.
|
|
+ LUCENE-8778 LUCENE-8911 LUCENE-8957: Define analyzer SPI
|
|
names as static final fields and document the names in Javadocs.
|
|
+ LUCENE-8758: QuadPrefixTree: removed levelS and levelN fields
|
|
which weren't used.
|
|
+ LUCENE-8975: Code Cleanup: Use entryset for map iteration
|
|
wherever possible.
|
|
+ LUCENE-8993, LUCENE-8807: Changed all repository and download
|
|
references in build files to HTTPS.
|
|
+ LUCENE-8998: Fix OverviewImplTest.testIsOptimized
|
|
reproducible failure.
|
|
+ LUCENE-8999: LuceneTestCase.expectThrows now propagates
|
|
assert/assumption failures up to the test w/o wrapping in a
|
|
new assertion failure unless the caller has explicitly
|
|
expected them
|
|
+ LUCENE-8062: GlobalOrdinalsWithScoreQuery is no longer
|
|
eligible for query caching.
|
|
+ LUCENE-8847: Code Cleanup: Remove StringBuilder.append with
|
|
concatenated strings.
|
|
+ LUCENE-8861: Script to find open GitHub PRs that need
|
|
attention
|
|
+ LUCENE-8852: ReleaseWizard tool for release managers
|
|
+ LUCENE-8838: Remove support for Steiner points on Tessellator.
|
|
+ LUCENE-8879: Improve BKDRadixSelector tests.
|
|
+ LUCENE-8886: Fix TestMutablePointsReaderUtils tests.
|
|
+ LUCENE-8680: Refactor EdgeTree#relateTriangle method.
|
|
+ LUCENE-8685: Refactor LatLonShape tests.
|
|
+ LUCENE-8713: Add Line2D tests.
|
|
+ LUCENE-8729: Workaround: Disable accessibility doclints (Java
|
|
13+), so compilation with recent JDK succeeds.
|
|
+ LUCENE-8725: Make TermsQuery.SeekingTermSetTermsEnum a top
|
|
level class and public
|
|
* Build
|
|
+ Upgrade forbiddenapis to version 2.7; upgrade Groovy to
|
|
2.4.17.
|
|
+ LUCENE-9041: Upgrade ecj to 3.19.0 to fix sporadic precommit
|
|
javadoc issues
|
|
* Test Framework
|
|
+ LUCENE-8825: CheckHits now displays the shard index in case of
|
|
mismatch between top hits.
|
|
- Modified patches:
|
|
* 0001-Disable-ivy-settings.patch
|
|
* 0002-Dependency-generation.patch
|
|
* lucene-java8compat.patch
|
|
* lucene-osgi-manifests.patch
|
|
+ rediff to changed context
|
|
- Added patch:
|
|
* lucene-missing-dependencies.patch
|
|
+ patch out dependencies that are not needed for modules
|
|
that we distribute
|
|
+ patch out dependencies on jars that we don't build
|
|
+ add target for the new monitor jars
|
|
|
|
-------------------------------------------------------------------
|
|
Mon Mar 23 11:35:30 UTC 2020 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Modified patch:
|
|
* lucene-osgi-manifests.patch
|
|
+ add the OSGi manifest to queryparser module too
|
|
|
|
-------------------------------------------------------------------
|
|
Fri Oct 11 13:39:04 UTC 2019 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Modified patch:
|
|
* lucene-osgi-manifests.patch
|
|
+ add the OSGi manifests also to modules that are currently
|
|
not built due to missing dependencies
|
|
|
|
-------------------------------------------------------------------
|
|
Tue Oct 1 11:25:47 UTC 2019 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Remove a bogus log4j build dependency
|
|
|
|
-------------------------------------------------------------------
|
|
Thu Sep 26 18:45:50 UTC 2019 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Fix property Provides and Obsoletes in order to make upgrade
|
|
smooth
|
|
- Added patch:
|
|
* lucene-osgi-manifests.patch
|
|
+ Patch the build to produce OSGi manifests needed by eclipse
|
|
- Install the artifacts to "lucene" subdirectory and create
|
|
compatibility symlinks
|
|
- Install lucene-misc as archful artifact, since it contains
|
|
JNI code
|
|
|
|
-------------------------------------------------------------------
|
|
Thu Sep 26 07:26:14 UTC 2019 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Upgrade to version 7.1.0
|
|
- Added patches:
|
|
* 0001-Disable-ivy-settings.patch
|
|
* 0002-Dependency-generation.patch
|
|
+ Sync with Fedora's 7.1.0
|
|
* lucene-java8compat.patch
|
|
+ Avoid using java9+ only functions
|
|
|
|
-------------------------------------------------------------------
|
|
Mon Jun 24 12:26:21 UTC 2019 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Remove the parent references from the pom files, since we are not
|
|
building lucene using maven.
|
|
- Overhaul the packaging to distribute the artifacts and the
|
|
corresponding metadata and pom files in the same package
|
|
- Specify runtime dependencies of the different packages
|
|
- Remove version information from the artifact names
|
|
|
|
-------------------------------------------------------------------
|
|
Mon Jun 24 10:44:08 UTC 2019 - Ismail Dönmez <idonmez@suse.com>
|
|
|
|
- Remove the JPP prefix from pom filenames
|
|
|
|
-------------------------------------------------------------------
|
|
Tue Feb 12 16:41:36 UTC 2019 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Remove dependency on jline, because nothing in the build uses it
|
|
|
|
-------------------------------------------------------------------
|
|
Sat Dec 22 05:31:12 UTC 2018 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Require the different apache-commons-* packages instead of
|
|
jakarta-commons-*
|
|
|
|
-------------------------------------------------------------------
|
|
Thu Nov 1 12:55:48 UTC 2018 - Fridrich Strba <fstrba@suse.com>
|
|
|
|
- Do not require asm to build. Nothing depends on it
|
|
|
|
-------------------------------------------------------------------
|
|
Fri Sep 29 08:44:29 UTC 2017 - fstrba@suse.com
|
|
|
|
- Minimum supported java is 1.8
|
|
|
|
-------------------------------------------------------------------
|
|
Mon Jul 10 14:07:49 UTC 2017 - jengelh@inai.de
|
|
|
|
- Remove unused "%package javadoc" declaration block.
|
|
- Trim filler words from descriptions.
|
|
Say a thing about features.
|
|
|
|
-------------------------------------------------------------------
|
|
Thu Jun 29 16:17:26 UTC 2017 - badshah400@gmail.com
|
|
|
|
- Update to version 6.6.0:
|
|
+ See https://lucene.apache.org/core/6_6_0/changes/Changes.html
|
|
for a full list of changes.
|
|
- Drop patches that are no longer applicable or needed:
|
|
+ lucene-no-classpath-in-manifest.patch
|
|
+ lucene-no-get.patch
|
|
+ lucene-2.3.0-db-javadoc.patch
|
|
- Add BuildRequires: antlr-java, apache-commons-codec, apache-ivy,
|
|
asm, fdupes, git
|
|
- Replace SOURCE0 by full source URL.
|
|
- Update to changed list of non-core modules:
|
|
+ Update source URLs for corresponding pom files.
|
|
+ Update %%install section to reflect changed list
|
|
+ Each module corresponds to a subpackage, named according to
|
|
its jar file (except lucene which corresponds to the main
|
|
jar file lucene-core-%{version}.jar).
|
|
- Adapt file list to changes.
|
|
|
|
-------------------------------------------------------------------
|
|
Fri May 19 09:11:42 UTC 2017 - dziolkowski@suse.com
|
|
|
|
- New build dependency: javapackages-local
|
|
|
|
-------------------------------------------------------------------
|
|
Wed Mar 18 09:46:17 UTC 2015 - tchvatal@suse.com
|
|
|
|
- Fix build with new javapackages-tools
|
|
|
|
-------------------------------------------------------------------
|
|
Fri Jun 27 14:02:20 UTC 2014 - tchvatal@suse.com
|
|
|
|
- Remove java-javadoc to build on sle11 again as the javadoc is
|
|
also pulled in regardless.
|
|
|
|
-------------------------------------------------------------------
|
|
Tue Sep 10 14:00:29 UTC 2013 - mvyskocil@suse.com
|
|
|
|
- use add_maven_depmap from javapackages-tools
|
|
|
|
-------------------------------------------------------------------
|
|
Mon Sep 9 11:06:13 UTC 2013 - tchvatal@suse.com
|
|
|
|
- Move from jpackage-utils to javapackages-tools
|
|
|
|
-------------------------------------------------------------------
|
|
Tue Jun 26 13:53:26 UTC 2012 - mvyskocil@suse.cz
|
|
|
|
- build require java-javadoc >= 1.6.0
|
|
|
|
-------------------------------------------------------------------
|
|
Thu Dec 10 13:23:15 UTC 2009 - mvyskocil@suse.cz
|
|
|
|
- refreshed patches
|
|
* lucene-2.3.0-db-javadoc.patch
|
|
* lucene-no-get.patch
|
|
|
|
-------------------------------------------------------------------
|
|
Tue Sep 29 12:57:08 UTC 2009 - mvyskocil@suse.cz
|
|
|
|
- fixed requires
|
|
|
|
-------------------------------------------------------------------
|
|
Tue May 26 13:59:07 CEST 2009 - mvyskocil@suse.cz
|
|
|
|
- fixed bnc#507014: removed all jars from source tarball
|
|
|
|
-------------------------------------------------------------------
|
|
Tue May 12 10:03:38 CEST 2009 - mvyskocil@suse.cz
|
|
|
|
- Initial SUSE packaging of lucene 2.4.1 (from jpp 5.0)
|
|
|