In commit 276d6e2439 ("Do not use the
chardet module in util.helper.decode_it") util.helper.decode_it was
changed to always decode the passed object if it has a decode method.
Since a python2 str has a decode method, the new code tries to utf-8
decode the passed str. As a result, a unicode object is returned (if
the decoding worked). Since a unicode object is not an instance of
type str, all subsequent isinstance(decoded_obj, str) checks evaluate
to False, which break some codepaths.
In order to fix this, restore the old python2 behavior (that is, if
the passed object is a str, it is not decode it). This change does not
affect the python3 codepaths.
Fixes: #814 ("osc log | fails")
Improve "osc rdiff --issues-only ..." output: now, it shows the added,
deleted and changed issues. Also, add a new "osc rdiff --xml ..." option,
which only works in combination with the "--issues-only" option: it prints
the raw xml.
Note: server_diff_noex has no option for the "full" parameter. Hence,
with the addition of the "xml=False" parameter, the signatures of the
server_diff_noex and server_diff functions are going to differ "forever".
That's OK (IMHO) because it is probably more sane to simply specify the
additional args via the kwargs syntax.
Remove dead code from the fetch module. Actually, it should have
been removed in commit 95ec7dee7b
('- fixed#590606 ("osc/fetch.py does not support authenticated URLs")').
In general, decode_it is used to get a str from an arbitrary bytes
instance. For this, decode_it used the chardet module (if present)
to detect the underlying encoding (if the bytes instance corresponds
to a "supported" encoding). The drawback of this detection is that
it can take quite some time in case of a large bytes instance, which
represents no "supported" encoding (see #669 and #746).
Instead of doing a potentially "time consuming" detection, either
assume an utf-8 encoding or a latin-1 encoding. Rationale: it is just
not worth the effort to detect a _potential_ encoding because we have
no clue what the _correct_ encoding is. For instance, consider the
following bytes instance:
b'This character group is not supported: [abc\xc3\xbf]'
It represents a valid utf-8 and latin-1 encoding. What is the "correct"
one? We don't know... Even if you interpret the bytes instance as a
human you cannot give a definite answer (implicit assumption: there is
no additional context available).
That is, if we cannot give a definite answer in case of two potential
encodings, there is no point in bringing even more potential encodings
into play. Hence, do not use the chardet module.
Note: the rationale for trying utf-8 first is that utf-8 is pretty
much in vogue these days and, hence, the chances are "high" that we
guess the "correct" encoding.
Fixes: #669 ("check in huge shell archives is insanely slow")
Fixes: #746 ("Very slow local buildlog parsing")
Old xml.etree.cElementTree versions (python2) reorder the attributes
while recent xml.etree.cElementTree versions (python3) keep the
document order.
Example:
python3:
>>> ET.tostring(ET.fromstring('<foo y="foo" x="bar"/>'))
b'<foo y="foo" x="bar" />'
>>>
python2:
>>> ET.tostring(ET.fromstring('<foo y="foo" x="bar"/>'))
'<foo x="bar" y="foo" />'
>>>
So far, the testsuite compared two serialized XML documents via a simple
string comparison. For instance via,
self.assertEqual(actual_serialized_xml, expected_serialized_xml) where
the expected_serialized_xml is, for instance, a hardcoded str. Obviously,
this would only work for python2 or python3.
In order to support both python versions, we first parse both XML
documents and then compare the corresponding trees (this is OK because
we do not care about comments etc.).
A related issue is the way how the testsuite compares data that is "send"
to the API. So far, this was a plain bytes comparison. Again, this won't
work in case of XML documents (see above). Moreover, we have currently
no notion to "indicate" that the transmitted data is an XML document.
As a workaround, we keep the plain bytes comparison and in case it fails,
we try an xml comparison (see above) as a last resort. Strictly speaking,
this is "wrong" (there might be cases (in the future) where we want to
ensure that the transmitted XML data is bit identical to a fixture file)
but a reasonable comprise for now.
Fixes: #751 ("[python3.8] Testsuite fails")
For now, we assume that if the "exp" keyword argument is specified,
then it is a str. In this case, we simply encode it (using the utf-8
encoding).
Also, simplify the code a bit (get rid of the if-statement that is
always executed).
Importing `cElementTree` has been deprecated since Python 3.3 -
importing `ElementTree` automatically uses the fastest
implementation available - and is finally removed in Python 3.9.
Importing cElementTree directly (not as part of xml) is an even
older relic, it's for Ye Time Before ElementTree Was Added To
Python and it was instead an external module...which was before
Python 2.5.
We still need to work with Python 2.7 for now, so we use a try/
except to handle both 2.7 and 3.9 cases. Also, let's not repeat
this import 12 times in one file for some reason.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This checks if the filename of a downloaded file has
been modified (for example by a MITM attack) to contain
slashes. This could mean that the file is compromised
and that the attacker tries to overwrite system files.
split the code of do_dependson into two separate commands (just for
the osc help overview)
They are doing the opposite of each other.
Duplicate code was moved to _dependson()
do_whatdependson and do_dependson just call _dependson with an option
reverse set to None or 1.
add new regex and check for missing arguments.
The error message in python3 differs from the one in python2.
python3:
do_api() missing 1 required positional argument: 'url'
python2:
do_api() takes exactly 4 arguments (3 given)
To be compatible with python2 two checks are needed.
Callers of core.print_buildlog should properly quote the project, package,
repo, and arch parameters. Actually, doing this at the caller's level
is pretty insane but this way, we do not break the existing API (of
core.print_buildlog).
In the future, we should move all the quoting logic into core.makeurl
(even if this breaks the API).
Add ccache option to the oscrc (default: disabled). If enabled, the
--ccache option is always passed to the build script (if invoked via
osc.build.main).
Improve deployment via travis. Unfortunately, style + semantics changes
are in a single commit... but let's not be too picky. See also the
discussion in [1].
[1] https://github.com/openSUSE/osc/pull/739
* `.travis.yml`
- Reformatted for easier reading
- Used `before_*` statements instead of script chains
- Publish only source packages
* `setup.py`
- Reformatted for easier reading
- Use README contents for `long_description` to have a nice description on PyPI
- Added classifiers
- Added explicit package dependencies
* fixes#658
* fixes#708