The old code only supports a file whose size is less then or equal
to INT_MAX (due to a reasonable(!) limit in M2Crypto). The actual
issue is in core.http_request which mmap(...)s the file, wraps it
into a memoryview/buffer and then passes the memoryview/buffer to
urlopen. Eventually, the whole memoryview/buffer is read into memory
(see m2_PyObject_GetBufferInt). If the file is too large (> INT_MAX),
m2_PyObject_GetBufferInt raises a ValueError (which is perfectly
fine!).
Reading a whole file into memory is completely insane. In order to
avoid this, we now simply pass a file-like object to urlopen (more
precisely, the file-like object is associated with the Request
instance that is passed to urlopen). The advantange is that the
file-like object is processed in chunks of 8192 bytes (see
http.client.HTTPConnection) (that is, only 8192 bytes are read into
memory (instead of the whole file)).
There are two pitfalls when passing a file-like object to urlopen:
* By default, a chunked Transfer-Encoding is applied. It seems that
some servers (like api.o.o) do not like this (PUTing a file with
a chunked Transfer-Encoding to api.o.o results in status 400). In
order to avoid a chunked Transfer-Encoding, we explicitly set a
Content-Length header (we also do this in the non-file case (just
for the sake of completeness)).
* If the request fails with status 401, it is retried with an
appropriate Authorization header. When retrying the request, the
file's offset has to be repositioned to the beginning of the file
(otherwise, a 0-length body is sent which most likely does not
match the Content-Length header).
Note: core.http_request's "data" and "file" parameters are now mutually
exclusive because specifying both makes no sense (only one of them
is considered) and it simplifies the implementation a bit.
Fixes: #202 ("osc user authentification seems to be broken with last
commit")
Fixes: #304 ("osc ci - cannot handle more than 2 GB file uploads")
This kind of guessing can not really work here and leads to failing
builds when using KVM. (eg. when using a preinstallimage)
Removing the code, since we have a now a way to allow the user to
specify building as user via su-wrapper config
Element.getchildren is deprecated and not available on python39
anymore. Instead, iterate over the element itself (which iterates
over the element's children).
Fixes: #903 ("AttributeError: 'xml.etree.ElementTree.Element' object
has no attribute 'getchildren'")
Most osc commands support slash notation for the specification of
a project package pair. That is, "osc <cmd> prj/pkg" has the same
semantics as "osc <cmd> prj pkg" (in most cases).
For consistency reasons, "osc creq" should also support the slash
notation for the action type's arguments. That is, for instance,
"osc creq -a submit src_prj/src_pkg dst_prj/dst_pkg" should have the
same effect as "osc creq -a submit src_prj src_pkg dst_prj dst_pkg".
Proposed-by: darix
If there are existing requests that should be superseded, the old
code stores the Request instances in the myreqs list, which is
returned to the caller. However, the caller expects only request
ids instead of instances of class Request. Eventually, this results
in a type error - excerpt:
...
File "/usr/lib/python3.8/site-packages/osc/commandline.py", line 1892, in do_createrequest
change_request_state(apiurl, srid, 'superseded',
File "/usr/lib/python3.8/site-packages/osc/core.py", line 4322, in change_request_state
u = makeurl(apiurl,
File "/usr/lib/python3.8/site-packages/osc/core.py", line 3326, in makeurl
return urlunsplit((scheme, netloc, '/'.join([path] + list(l)), query, ''))
TypeError: sequence item 2: expected str instance, Request found
Hence, simply return the request ids instead of the Request instances.
Note: this changes the API of the Osc._submit_request method but
this is OK because it is not part of the public API.
When calling "osc creq -a prj1 foo prj2 bar -a submit prj1 bar prj2 bar",
the requests that could be superseded are calculated two times for the
prj2/bar package. Hence, they could end up two times in the "supersede"
list (see do_createrequest) In order to avoid duplicates, use a set
instead of a list.
Kudos to darix for pointing this out!
Note: it is a bit questionable if osc's current semantics makes sense
in the above example.
When creating a new request via the core.Request.create method, there is
no need to escape the data that is assigned to the "description" attribute
of a core.Request instance. Internally, core.Request.create ensures that
the data, which is POSTed to the api, is correctly escaped (the escaping
is implicitly done by ET (see core.Request.to_str)). Manually escaping the
description results in a double escaping (the escaped description is
escaped by ET again) - this is not the desired behavior.
Analogously, there is no need to escape the data that is passed to the
message parameter of the core.create_submit_request function because
core.create_submit_request takes care of escaping it.
Fixes: #869 ("Silly encoding of htmlencodable entities")
Using os.getcwd() in combination with a subsequent .encode() is error
prone:
marcus@linux:~> mkdir illegal_utf-8_encoding_$'\xff'_dir
marcus@linux:~> cd illegal_utf-8_encoding_$'\xff'_dir/
marcus@linux:~/illegal_utf-8_encoding_ÿ_dir> python3
Python 3.8.6 (default, Nov 09 2020, 12:09:06) [GCC] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.getcwd().encode()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcff' in position 36: surrogates not allowed
>>>
Hence, use os.getcwdb(), which returns a bytes, instead of
os.getcwd().encode().
Fixes: commit 36f7b8ffe9 ("Fix a
potential TypeError in CpioRead.copyin and CpioRead.copyin_file")
If no dir is passed to util.ArFile.saveTo, dir is set to os.getcwd(),
which returns a str. Since self.name is a bytes, the subsequent
os.path.join(dir, self.name) results in a TypeError.
To fix this, use os.getcwdb(), which returns a bytes instead of a
str.
This allows to utilise support for systemd-nspawn backend in build engine.
Like LXC, systemd-nspawn creates isolated lightweight container.
Signed-off-by: Oleg Girko <ol@infoserver.lv>
If no "dest" argument is specified when calling CpioRead.copyin or
CpioRead.copyin_file, a TypeError occurs in CpioRead._copyin_file
because os.getcwd(), which returns a str, is used as dest and, hence,
the subsequent os.path.join(...) fails (because it tries to join a
str and a bytes).
In order to avoid this, encode the result of os.getcwd().
Note that the existing
archive.copyin_file(hdr.filename,
os.path.dirname(tmpfile),
os.path.basename(tmpfile))
was OK because CpioRead._copyin_file os.path.join()s "dest" and
"new_fn", which are both str. It is just changed to stress that
CpioRead is a bytes-only API.
Fixes: #865 ("Traceback in osc/util/cpio.py line 128: TypeError:
Can't mix strings and bytes in path components")
Currently, when trying to initialize a non existent (server-side)
project via "osc init <prj>", osc errors out (after creating the wc)
because it fails to retrieve the package list. However, there is no
need to retrieve the package list in the "osc init <prj>" case. Hence,
skip the package list retrieval. As a result, osc does not error out.
For the background, see the discussion in #858 ("osc fails to check
out an empty project as project") [1].
[1] https://github.com/openSUSE/osc/issues/858#issuecomment-722330024
If meta=True is passed to checkRevision, the meta parameter is used
as a revision in the show_upstream_rev call. Instead, it should be
bound to show_upstream_rev's meta parameter.
Some services expect "old" service files (that is, files from a
previous service run) to be present in an ".old" dir. Hence, osc
should support that.
Instead of removing all files from a previous service run, move them
to the ".old" dir, run the services, and, finally, remove the ".old"
dir.
Unfortunately, the location of the ".old" dir is hardcoded in the
specific services. That is, we have to be careful if an ".old" dir
exists (in this case, we error out).
Based on [1].
[1] https://github.com/openSUSE/osc/pull/846
Currently, if the --offline option is passed to "osc build ...", a
preinstallimage is not used (even if it exists). Instead, a
preinstallimage should be used (if it exists) even if the --offline
option is specified.
This is faster in best case since the binary search does not need
to be executed on the server.
It also finds package names where no binary with that name exists.
(as for some multibuild cases)
Replace usage with better explanation. It was missing that it requires a
prefixed hostarch. Also workername is instead called workerid in the
API.
Usage help was before: osc workerinfo WORKER
Add actual example.
See also the fix for this in OBS API docs:
https://github.com/openSUSE/open-build-service/pull/10024
In the API a new request action release was implemented. This changes
enables the user to create a release request for non-maintenance projects
and to review / view the release requests
Without this patch, running an individual service that has parameters
defined in the _service file fails:
$ osc service run obs_scm
Please specify valid --scm=... options
Aborting: service call failed: /usr/lib/obs/service/obs_scm --outdir [snipped]
This is because although the service is defined in the _service file and
the "scm" parameter is set in it, the service wasn't being found in the
data structure and so the service executable wasn't being called with
the parameters supplied in the _service file. This patch corrects the
issue with the services data structure so that the service data isn't
overridden if it is defined in the _service file.
A side effect of this correction for services defined in the _service
file is that instead of overriding the service mode with '', the mode is
taken from the _service file. When using the "run" command, this would
mean that the call mode of None may not be in agreement with the service
mode defined in the file, e.g. "manual", and so the "run" command would
no longer cause it to run when it would before. We can take this
opportunity to define this as the correct behavior - the "run" command
now only runs services with "trylocal", "localonly", or no mode set -
and also ensure that other call mode commands result in sensible
behavior when called with a service name, for instance "osc service
manualrun download_files" will run only services with mode="manual" and
name="download"files" instead of all services with mode="manual".
Additionally, services that aren't defined in the _service file can be
called with a call mode command and will use that call mode rather than
None.
Add a "manual" service mode. A service with mode "manual" is not executed
by default (that is, via "osc service run"). As of now, "manual" is in
some sense just an alias for "disabled". However, this might change in the
future [1]. Also, "localrun" now executes services with mode "serveronly".
Moreover, the documentation of "disabledrun" is updated ("disabledrun"
never executed services with mode "serveronly"). Additionally, "localrun"
and "disabledrun" are marked as "[n]ot for common usage anymore" in the
service command's description.
The rationale for these changes is (partly) described in [1]. The main
motivation is to add some clarity (in contrast to the "disabled" mode,
it is probably easier to get/guess the semantics of the "manual" mode).
[1] https://github.com/openSUSE/osc/pull/826
The "disabledrun" service commands is marked as deprecated but has no
explicit replacement. It is still a useful command for updating packages
manually or through a CI system without being forced to run all defined
services with the "runall" command. This change adds a new command
"manualrun" and a new mode "manual" which behave the same as the
deprecated "disabledrun" command and "disabled" mode but have clearer
meaning. "manualrun" does not attempt backwards-compatible behavior with
the "disabledrun" mode for "disabled" services because "disabled" mode
may eventually be removed or change meaning. The "localrun" command is
enhanced to consider the "serveronly" mode. Since "disabledrun" never
executed services with mode "serveronly", its docs are updated
accordingly.
Improve error message in do_service a bit. The old "Too few arguments."
was misleading (for instance, if a non-remote command was not executed
in a package wc).
Note: with the new logic we could also get rid of the
"raise oscerr.WrongArgs('Local directory is no package')" statement.
Add a --status-filter option to "osc results" that can be used to
show, for instance, only the repos where a package failed to build. As
a short circuit, a -f/--failed option is added, too.
Add a --brief option to "osc prjresults" and "osc results" that can be
used to get a more compact representation of the results. In case of
"osc results", --brief is ignored if the results for a package are
requested.
in do_results:
* add --brief option on prj level:
[packagename] [repo] [arch] [buildstatus]
* filter by --status-filter <long status name>
works on prj and pkg level
in do_prjresults:
* --brief
* assume len(state)>1 as long state
core.py
* filter packages by build status
* long status handling in get_prj_results
* brief output generation in get_prj_results
Interprete unicode escape characters in a "--message" option. In some
cases, this breaks the existing UI (but that's OK because it can be
fixed by properly escaping the escape character(s)).
Note: if we are going to do more advanced stuff in the future, we should
move the logic into a separate function.
In commit 276d6e2439 ("Do not use the
chardet module in util.helper.decode_it") util.helper.decode_it was
changed to always decode the passed object if it has a decode method.
Since a python2 str has a decode method, the new code tries to utf-8
decode the passed str. As a result, a unicode object is returned (if
the decoding worked). Since a unicode object is not an instance of
type str, all subsequent isinstance(decoded_obj, str) checks evaluate
to False, which break some codepaths.
In order to fix this, restore the old python2 behavior (that is, if
the passed object is a str, it is not decode it). This change does not
affect the python3 codepaths.
Fixes: #814 ("osc log | fails")
Improve "osc rdiff --issues-only ..." output: now, it shows the added,
deleted and changed issues. Also, add a new "osc rdiff --xml ..." option,
which only works in combination with the "--issues-only" option: it prints
the raw xml.
Note: server_diff_noex has no option for the "full" parameter. Hence,
with the addition of the "xml=False" parameter, the signatures of the
server_diff_noex and server_diff functions are going to differ "forever".
That's OK (IMHO) because it is probably more sane to simply specify the
additional args via the kwargs syntax.
Remove dead code from the fetch module. Actually, it should have
been removed in commit 95ec7dee7b
('- fixed#590606 ("osc/fetch.py does not support authenticated URLs")').
In general, decode_it is used to get a str from an arbitrary bytes
instance. For this, decode_it used the chardet module (if present)
to detect the underlying encoding (if the bytes instance corresponds
to a "supported" encoding). The drawback of this detection is that
it can take quite some time in case of a large bytes instance, which
represents no "supported" encoding (see #669 and #746).
Instead of doing a potentially "time consuming" detection, either
assume an utf-8 encoding or a latin-1 encoding. Rationale: it is just
not worth the effort to detect a _potential_ encoding because we have
no clue what the _correct_ encoding is. For instance, consider the
following bytes instance:
b'This character group is not supported: [abc\xc3\xbf]'
It represents a valid utf-8 and latin-1 encoding. What is the "correct"
one? We don't know... Even if you interpret the bytes instance as a
human you cannot give a definite answer (implicit assumption: there is
no additional context available).
That is, if we cannot give a definite answer in case of two potential
encodings, there is no point in bringing even more potential encodings
into play. Hence, do not use the chardet module.
Note: the rationale for trying utf-8 first is that utf-8 is pretty
much in vogue these days and, hence, the chances are "high" that we
guess the "correct" encoding.
Fixes: #669 ("check in huge shell archives is insanely slow")
Fixes: #746 ("Very slow local buildlog parsing")
Importing `cElementTree` has been deprecated since Python 3.3 -
importing `ElementTree` automatically uses the fastest
implementation available - and is finally removed in Python 3.9.
Importing cElementTree directly (not as part of xml) is an even
older relic, it's for Ye Time Before ElementTree Was Added To
Python and it was instead an external module...which was before
Python 2.5.
We still need to work with Python 2.7 for now, so we use a try/
except to handle both 2.7 and 3.9 cases. Also, let's not repeat
this import 12 times in one file for some reason.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This checks if the filename of a downloaded file has
been modified (for example by a MITM attack) to contain
slashes. This could mean that the file is compromised
and that the attacker tries to overwrite system files.
split the code of do_dependson into two separate commands (just for
the osc help overview)
They are doing the opposite of each other.
Duplicate code was moved to _dependson()
do_whatdependson and do_dependson just call _dependson with an option
reverse set to None or 1.
add new regex and check for missing arguments.
The error message in python3 differs from the one in python2.
python3:
do_api() missing 1 required positional argument: 'url'
python2:
do_api() takes exactly 4 arguments (3 given)
To be compatible with python2 two checks are needed.
The repodata.RepoDataQueryResult is supposed to be a bytes API and
that's what our users (see build module) expect.
Note that the repodata.RepoDataQueryResult.path method still returns
a str. That's what the rpmquery.RpmQuery, debquery.DebQuery, and
archquery.ArchQuery classes also do (if the "path" was initially
passed as a str).
Fixes: #760 ("osc build fails when called with --prefer-pkgs where the
passed directory is a repodata repository or a subdirectory of one")
The packagequery.PackageQueryResult class is supposed to provide a
bytes API. Hence, packagequery.PackageQueryResult.evr() should return
bytes instead of a str. Also, adjust the single caller in the build
module.
This is a follow-up commit for commit
6dbf103e10 ("Use html.escape instead
removed cgi.escape"), which breaks the python2 backward compatibility
(since the "html" module is not available by default) and also breaks
the code in general (due to missing html imports).
The fix is based on the proposed fix in [1].
Fixes: boo#1166537 ("osc rq accept - forwarding request causes backtrace")
[1] https://github.com/openSUSE/osc/pull/764
Fixes:
`Traceback (most recent call last):
File "/usr/bin/osc", line 41, in <module>
r = babysitter.run(osccli)
File "/usr/lib/python3.8/site-packages/osc/babysitter.py", line 64, in run
return prg.main(argv)
File "/usr/lib/python3.8/site-packages/osc/cmdln.py", line 344, in main
return self.cmd(args)
File "/usr/lib/python3.8/site-packages/osc/cmdln.py", line 367, in cmd
retval = self.onecmd(argv)
File "/usr/lib/python3.8/site-packages/osc/cmdln.py", line 501, in onecmd
return self._dispatch_cmd(handler, argv)
File "/usr/lib/python3.8/site-packages/osc/cmdln.py", line 1232, in _dispatch_cmd
return handler(argv[0], opts, *args)
File "/usr/lib/python3.8/site-packages/osc/commandline.py", line 1458, in do_submitrequest
result = create_submit_request(apiurl,
File "/usr/lib/python3.8/site-packages/osc/core.py", line 4244, in create_submit_request
cgi.escape(message))
AttributeError: module 'cgi' has no attribute 'escape'
`
`cgi.escape` was deprecated in python 3.2
On Tumbleweed, `zypper in python-keyring` installs python2 version, while `osc` runs on python3.
After this change, user will be pointed to the correct version.
The correct zst magic is b'(\xb5/\xfd' (4 bytes) (that's what obs-build
is also using).
Kudos to Tobias Ellinghaus for spotting this.
Fixes: #756 ("zst detection fails")
osc importsrcpkg -n <pacname> does not work. If the option is supplied, osc
mistakenly trys to "decode" the pac object. This patch limit the decode
call when pac is not a string.
Refactored fix based on suggestions from marcus-h