This is a follow-up commit for commit
6dbf103e10 ("Use html.escape instead
removed cgi.escape"), which breaks the python2 backward compatibility
(since the "html" module is not available by default) and also breaks
the code in general (due to missing html imports).
The fix is based on the proposed fix in [1].
Fixes: boo#1166537 ("osc rq accept - forwarding request causes backtrace")
[1] https://github.com/openSUSE/osc/pull/764
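A minimal sketch of a python2/python3-compatible import, assuming the fix follows the usual try/except fallback pattern (the actual change in [1] may differ in detail):

```python
try:
    # python3: html.escape replaces the removed cgi.escape
    from html import escape
except ImportError:
    # python2: the "html" module is not available by default
    from cgi import escape

escaped = escape('<a & b>')
print(escaped)
```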
The correct zst magic is b'(\xb5/\xfd' (4 bytes) (that's what obs-build
is also using).
Kudos to Tobias Ellinghaus for spotting this.
Fixes: #756 ("zst detection fails")
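As an illustration, a magic check along these lines could look as follows. The names ZSTD_MAGIC and looks_like_zstd are hypothetical and not taken from the osc sources:

```python
# The 4 magic bytes quoted above: 0x28 0xb5 0x2f 0xfd
ZSTD_MAGIC = b'(\xb5/\xfd'


def looks_like_zstd(header):
    """Return True if the given leading bytes start with the zst magic."""
    return header[:4] == ZSTD_MAGIC


print(looks_like_zstd(b'\x28\xb5\x2f\xfd\x00\x00'))
```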
In some rare cases the chardet encoding detection detects
the wrong encoding. In that case we switch to latin-1, which
covers most cases where utf-8 does not work.
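A rough sketch of such a fallback chain. It assumes the detector's guess is passed in as a plain string (guessed_encoding is a hypothetical parameter, chardet itself is not called here), and the exact order used in osc may differ:

```python
def decode_with_fallback(data, guessed_encoding=None):
    """Decode bytes, falling back to latin-1 if nothing else works."""
    for enc in (guessed_encoding, 'utf-8'):
        if enc is None:
            continue
        try:
            return data.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # wrong guess or unknown codec name: try the next candidate
            pass
    # latin-1 maps every byte to a code point, so this cannot fail
    return data.decode('latin-1')
```

Since latin-1 accepts any byte sequence, the function always returns a str, although the result may not be what the author of the data intended.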
No functional changes. Note that we cannot simply decode the control's
fields as ascii/utf-8 because a field is not necessarily a valid
ascii/utf-8 encoding (it is possible to register _arbitrary_ custom
fields via a 'register-custom-fields' hook when building a deb
package).
Note: DebQuery.debvercmp really deserves a cleanup:/
cmp(a, b) returns
    -1 if a < b
     0 if a == b
     1 if a > b
This is needed since python3 has no cmp function anymore.
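One common way to write such a replacement (a sketch; the implementation in PR#483 may be written differently):

```python
def cmp(a, b):
    """Three-way comparison: -1 if a < b, 0 if a == b, 1 if a > b."""
    # bool is a subclass of int, so the difference is -1, 0 or 1
    return (a > b) - (a < b)
```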
All credits for this go to Marco Strigl <mstrigl@suse.com> (see
PR#483 [1]).
[1] https://github.com/openSUSE/osc/pull/483
The None argument is always <= the other argument. We need this
in case of a broken/pathological package where version() or release()
return None (see vercmp (which calls rpmvercmp)).
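The rule can be sketched as a small wrapper around a three-way comparison. cmp_with_none is a hypothetical name, and treating two None values as equal is an assumption that is merely consistent with the "<=" rule above:

```python
def cmp_with_none(a, b):
    """Three-way comparison in which None sorts before everything else."""
    if a is None and b is None:
        return 0
    if a is None:
        return -1
    if b is None:
        return 1
    return (a > b) - (a < b)
```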
Returning None breaks ArchQuery.vercmp. Returning b'0' is ok because
an epoch, if present, is always supposed to be an integer (at least
in a "valid" arch package (see scripts/libmakepkg/lint_pkgbuild/epoch.sh.in
in the pacman sources)). Hence, if we compare the epoch of a package,
which has no explicit epoch set, with the epoch of a package, which
has an explicit epoch set, we always have a <= relation.
Now, CpioWrite provides a bytes-only API. It would be also possible
that the API accepts bytes and str (we would need to explicitly
encode the latter) but this would be a bit inconsistent wrt.
cpio.CpioRead (which is bytes-only).
Also, by using a bytearray instead of a [] we avoid several
intermediate ''.join(...)s.
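To illustrate the bytearray point, a generic sketch (not the CpioWrite code itself; build_with_bytearray is a hypothetical helper):

```python
def build_with_bytearray(parts):
    """Concatenate bytes chunks in place instead of joining a list."""
    buf = bytearray()
    for part in parts:
        # extend() appends to the existing buffer, no intermediate join
        buf.extend(part)
    return bytes(buf)


archive = build_with_bytearray([b'header', b'name\0', b'content'])
```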
This is a bytes only API because a filename in a cpio archive can
contain, for instance, illegal utf-8 sequences. A user can decode
the filename/content as she wishes.
A ValueError is more appropriate because there is no issue with the
ar archive itself. Also, the old codepath never worked because the
fn parameter was missing.
Since an ar archive can contain arbitrary filenames (that is, a
filename can be an invalid utf-8 encoding (for instance,
"foo\xff\xffbar")), the ar module provides a bytes only API. A
user can decode filenames as she wishes.
Note: if a "fn" parameter is passed to Ar.__init__ it should be a
bytes (a str is also ok, but then be aware that an ArError's file
attribute might be a str or a bytes).
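With a bytes-only API, the decoding policy is left to the caller. Two options a user might pick for the filename from the example above (a python3 sketch; neither is prescribed by the ar module):

```python
raw_name = b'foo\xff\xffbar'

# Lossy: every invalid byte becomes U+FFFD, good enough for display
display_name = raw_name.decode('utf-8', errors='replace')

# Lossless: invalid bytes are smuggled through as lone surrogates
# and can be turned back into the original bytes later
roundtrip_name = raw_name.decode('utf-8', errors='surrogateescape')
restored = roundtrip_name.encode('utf-8', errors='surrogateescape')
```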
There is no need to unpack a single byte because it is not
affected by (byte) endianness (and that's what struct.unpack is
about). Moreover, rpmquery.unpack_string now supports an optional
encoding parameter, which could be used by the python3 port to
decode a string. Note: in general we cannot assume that all strings
in a rpm are utf-8 encoded (it is possible to build a rpm that
contains illegal utf-8 sequences).
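A sketch of both points. The function below is a simplified stand-in written for illustration, not a copy of rpmquery.unpack_string:

```python
import struct


def unpack_string(data, encoding=None):
    """Return the NUL-terminated string at the start of data.

    The result is bytes unless an encoding is given.
    """
    end = data.find(b'\0')
    if end == -1:
        raise ValueError('string is not NUL-terminated')
    raw = data[:end]
    if encoding is not None:
        return raw.decode(encoding)
    return raw


data = b'\x2a\x00'
# A single byte has no byte order, so struct.unpack adds nothing here;
# bytearray indexing yields an int on python2 and python3 alike
first = bytearray(data)[0]
same = struct.unpack('!B', data[:1])[0]
```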
These functions are used throughout the code and are
mandatory for the python3 support to work. In the python2
case nothing is touched.
* cmp_to_key:
    converts a cmp= function into a key= function
* decode_list:
    decodes each element of a list. This is needed if
    we have a mixed list with strings and bytes.
* decode_it:
    takes the input and checks if it is not a string.
    If so, it uses chardet to get the encoding.
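How such helpers are typically used, as a sketch. It relies on functools.cmp_to_key from the standard library, and decode_list_sketch is a simplified stand-in that assumes a fixed encoding instead of chardet:

```python
from functools import cmp_to_key


def compare_len(a, b):
    """Old-style cmp= function: order by length."""
    return (len(a) > len(b)) - (len(a) < len(b))


# python3's sorted() has no cmp= argument, so wrap the function
names = sorted(['ccc', 'a', 'bb'], key=cmp_to_key(compare_len))


def decode_list_sketch(items, encoding='utf-8'):
    """Decode the bytes elements of a mixed list, keep str as is."""
    return [i.decode(encoding) if isinstance(i, bytes) else i
            for i in items]


mixed = decode_list_sketch([b'foo', 'bar'])
```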
Storing the error encoding in an "encoding" attribute "breaks" the
python3 "input" function: In essence, builtin_input_impl does a
getattr(sys.stdout, 'encoding'), which returns our error encoding
instead of the "real" stdout encoding. In order to avoid this, we
store the error encoding in an "_encoding" attribute.
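The effect can be reproduced with a small delegating wrapper. SafeWriterSketch is a hypothetical, reduced version of the class, and 'unicode_escape' is only a placeholder value:

```python
import io


class SafeWriterSketch(object):
    """Wraps a stream and keeps the error encoding out of 'encoding'."""

    def __init__(self, writer, encoding='unicode_escape'):
        self._writer = writer
        # stored as "_encoding" so that it does not shadow the
        # wrapped stream's own "encoding" attribute
        self._encoding = encoding

    def __getattr__(self, name):
        # only called for attributes not found on the wrapper itself
        return getattr(self._writer, name)


stream = io.TextIOWrapper(io.BytesIO(), encoding='utf-8')
wrapped = SafeWriterSketch(stream)
```

Here getattr(wrapped, 'encoding') is delegated to the wrapped stream and yields its real encoding, while the error encoding stays available as wrapped._encoding.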
Making SafeWriter a new-style class simplifies the code a lot.
The following abstract methods are added to the PackageQueryResult
class: recommends(), suggests(), supplements(), and enhances().
Note that not all package/metadata formats have a notion of these
weak dependencies.
                 rpm   rpmmd   deb   arch
    recommends    x      x      x
    suggests      x      x      x     x
    supplements   x      x
    enhances      x      x      x
(where "x" represents "supported"). In case of an unsupported weak
dependency, the implementation returns an empty list.
We need the weak dependency support in order to fix #363 ("osc build
-p ../rpms/tw doesnt send recommends to the server which makes client
side build behave differently to server side build").
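A sketch of the pattern described above for the weak dependency methods. Both classes are hypothetical stand-ins, and the format shown is an imaginary one that only knows "suggests":

```python
class PackageQueryResultSketch(object):
    """Stand-in for the base class with the four abstract methods."""

    def recommends(self):
        raise NotImplementedError

    def suggests(self):
        raise NotImplementedError

    def supplements(self):
        raise NotImplementedError

    def enhances(self):
        raise NotImplementedError


class SuggestsOnlyResult(PackageQueryResultSketch):
    """A format that supports suggests but no other weak dependency."""

    def __init__(self, suggests):
        self._suggests = list(suggests)

    def recommends(self):
        return []  # unsupported: empty list

    def suggests(self):
        return self._suggests

    def supplements(self):
        return []  # unsupported: empty list

    def enhances(self):
        return []  # unsupported: empty list


result = SuggestsOnlyResult(['foo'])
```

Callers can then iterate over all four lists without checking which format they are dealing with.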
Similar to recent fixes in libsolv and obs-build. Since tarfile
on python2 doesn't do lzma, decompress the file into memory and
feed it as a fake file via StringIO to tarfile.
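The approach can be sketched as follows. This is a python3 illustration that uses the standard lzma module and io.BytesIO in place of python2's StringIO, and open_xz_tar is a hypothetical helper name:

```python
import io
import lzma
import tarfile


def open_xz_tar(compressed):
    """Decompress into memory and hand tarfile a file-like object."""
    raw = lzma.decompress(compressed)
    # 'r:' = plain tar, the data is already decompressed
    return tarfile.open(fileobj=io.BytesIO(raw), mode='r:')


# Build a tiny xz-compressed tar archive in memory for the demo
plain = io.BytesIO()
with tarfile.open(fileobj=plain, mode='w') as tar:
    payload = b'hi'
    info = tarfile.TarInfo(name='hello.txt')
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
archive = lzma.compress(plain.getvalue())

with open_xz_tar(archive) as tar:
    names = tar.getnames()
```

The whole archive is held in memory, which is fine for small metadata files but would be a poor choice for large payloads.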
The most visible change in python3 is the removal of the print
statement, and with it all the crufty
    print >> sys.stderr, foo,
constructs. The "from __future__ import print_function" statement makes
the print function available in python 2.6.
Some modules (httplib, StringIO, ...) were renamed in python3. This
patch tries to import the proper symbols from python3 first and falls
back to python2 if an ImportError is raised.
There is one exception: python 2.7 got the io module with StringIO, but
it accepts unicode arguments only. Therefore the old module is tried
before the new one.
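The import pattern might look like this (a sketch; the exact set of modules and the module names used in the patch are not reproduced here):

```python
try:
    # python3 name first
    from http.client import HTTPConnection
except ImportError:
    # python2 fallback
    from httplib import HTTPConnection

try:
    # The exception: try the python2 module first, because
    # python 2.7's io.StringIO accepts unicode arguments only
    from cStringIO import StringIO
except ImportError:
    from io import StringIO

buf = StringIO()
buf.write('hello')
```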
This patch
1.) removes iteritems/itervalues, which were dropped in py3;
    items/values are used instead
2.) adds an extra list() in the cases where list-based access is needed
    (including appending, indexing and so on)
3.) changes a sorting idiom in a few places:
    instead of
        foo = dict.keys()
        foo.sort()
        for i in foo:
    the recommended form is
        for i in sorted(dict.keys()):
4.) in one occasion replaces an if dict.has_key() with the simpler
    dict.get(key, default)
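The four idioms side by side, as a small self-contained sketch (the dictionary and variable names are made up for illustration):

```python
counts = {'b': 2, 'a': 1}

# 1.) items()/values() instead of iteritems()/itervalues()
total = sum(value for value in counts.values())

# 2.) an explicit list() where list operations are needed
keys = list(counts.keys())
keys.append('c')

# 3.) sorted() instead of keys() followed by an in-place sort()
ordered = [key for key in sorted(counts.keys())]

# 4.) dict.get(key, default) instead of a has_key() check
missing = counts.get('z', 0)
```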
- util/rpmquery:
* added new methods "is_src", "is_nosrc" to check if the package is
a src rpm or nosrc rpm
* fixed "canonname": this never worked for src- or nosrc rpms
- minor code restructuring
Note:
in order to fetch the cpio archives osc uses "getbinarylist". The
drawback is that "getbinarylist" doesn't generate an ".errors" file
if we're requesting a non-existent filename.
Any directory passed to --prefer-pkgs will be searched for a repodata
directory. If the directory does not contain a repodata directory, then
each ancestor directory is checked. This allows for the user error of
specifying an individual architecture directory (e.g. x86_64) instead of the
parent repository directory that contains the repodata:
    repository/
        x86_64/
            *.rpm
        repodata/
            *.xml.gz
The use case for this feature is that it allows snapshots of the OBS repositories
to be offloaded to a network-attached filesystem. repodata directories are
used because the xml.gz files are faster to read than the 100s of rpms in a given
snapshot. These snapshots are used to track older rpm sets that may be
deployed for testing.
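The ancestor search described above can be sketched like this. find_repodata is a hypothetical helper name, and the real implementation may stop the search earlier or handle errors differently:

```python
import os


def find_repodata(path):
    """Return the nearest repodata directory at or above path, or None."""
    path = os.path.abspath(path)
    while True:
        candidate = os.path.join(path, 'repodata')
        if os.path.isdir(candidate):
            return candidate
        parent = os.path.dirname(path)
        if parent == path:
            # reached the filesystem root without finding repodata
            return None
        path = parent
```

With the layout shown above, passing repository/x86_64 yields repository/repodata, just as passing repository itself does.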
- util/packagequery.py: added vercmp(pkgq) method
- util/debquery.py: currently vercmp(degq) is only a dummy method. The real implementation will follow soon.
* util/packagequery.py: it's used to query an RPM or DEB package. It also contains a
base class for all package types (PackageQuery())
* util/debquery.py: query a DEB package (name, version, release, provides, requires etc.)
- adapted util/rpmquery.py to use PackageQuery() as a base class
- minor changes in util/ar.py