github.com_openSUSE_osc

mirror of https://github.com/openSUSE/osc.git synced 2024-09-20 09:16:16 +02:00

Author	SHA1	Message	Date
Marcus Huewe	1933da5bcc	Use os.getcwdb() instead of os.getcwd().encode() in util.cpio.CpioRead Using os.getcwd() in combination with a subsequent .encode() is error prone: marcus@linux:~> mkdir illegal_utf-8_encoding_$'\xff'_dir marcus@linux:~> cd illegal_utf-8_encoding_$'\xff'_dir/ marcus@linux:~/illegal_utf-8_encoding_ÿ_dir> python3 Python 3.8.6 (default, Nov 09 2020, 12:09:06) [GCC] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import os >>> os.getcwd().encode() Traceback (most recent call last): File "<stdin>", line 1, in <module> UnicodeEncodeError: 'utf-8' codec can't encode character '\udcff' in position 36: surrogates not allowed >>> Hence, use os.getcwdb(), which returns a bytes, instead of os.getcwd().encode(). Fixes: commit `36f7b8ffe9` ("Fix a potential TypeError in CpioRead.copyin and CpioRead.copyin_file")	2020-11-22 17:39:54 +01:00
Marcus Huewe	674ea78815	Avoid a potential TypeError in util.ArFile.saveTo If no dir is passed to util.ArFile.saveTo, dir is set to os.getcwd(), which returns a str. Since self.name is a bytes, the subsequent os.path.join(dir, self.name) results in a TypeError. To fix this, use os.getcwdb(), which returns a bytes instead of a str.	2020-11-22 17:36:17 +01:00
Marcus Huewe	36f7b8ffe9	Fix a potential TypeError in CpioRead.copyin and CpioRead.copyin_file If no "dest" argument is specified when calling CpioRead.copyin or CpioRead.copyin_file, a TypeError occurs in CpioRead._copyin_file because os.getcwd(), which returns a str, is used as dest and, hence, the subsequent os.path.join(...) fails (because it tries to join a str and a bytes). In order to avoid this, encode the result of os.getcwd(). Note that the existing archive.copyin_file(hdr.filename, os.path.dirname(tmpfile), os.path.basename(tmpfile)) was OK because CpioRead._copyin_file os.path.join()s "dest" and "new_fn", which are both str. It is just changed to stress that CpioRead is a bytes-only API. Fixes: #865 ("Traceback in osc/util/cpio.py line 128: TypeError: Can't mix strings and bytes in path components")	2020-11-20 09:55:09 +01:00
Marcus Huewe	d85030b72d	Fix python2 regression in util.helper.decode_it In commit `276d6e2439` ("Do not use the chardet module in util.helper.decode_it") util.helper.decode_it was changed to always decode the passed object if it has a decode method. Since a python2 str has a decode method, the new code tries to utf-8 decode the passed str. As a result, a unicode object is returned (if the decoding worked). Since a unicode object is not an instance of type str, all subsequent isinstance(decoded_obj, str) checks evaluate to False, which breaks some codepaths. In order to fix this, restore the old python2 behavior (that is, if the passed object is a str, it is not decoded). This change does not affect the python3 codepaths. Fixes: #814 ("osc log | fails")	2020-06-25 15:38:14 +02:00
Marcus Huewe	276d6e2439	Do not use the chardet module in util.helper.decode_it In general, decode_it is used to get a str from an arbitrary bytes instance. For this, decode_it used the chardet module (if present) to detect the underlying encoding (if the bytes instance corresponds to a "supported" encoding). The drawback of this detection is that it can take quite some time in case of a large bytes instance, which represents no "supported" encoding (see #669 and #746). Instead of doing a potentially "time consuming" detection, either assume a utf-8 encoding or a latin-1 encoding. Rationale: it is just not worth the effort to detect a _potential_ encoding because we have no clue what the _correct_ encoding is. For instance, consider the following bytes instance: b'This character group is not supported: [abc\xc3\xbf]' It represents a valid utf-8 and latin-1 encoding. What is the "correct" one? We don't know... Even if you interpret the bytes instance as a human you cannot give a definite answer (implicit assumption: there is no additional context available). That is, if we cannot give a definite answer in case of two potential encodings, there is no point in bringing even more potential encodings into play. Hence, do not use the chardet module. Note: the rationale for trying utf-8 first is that utf-8 is pretty much in vogue these days and, hence, the chances are "high" that we guess the "correct" encoding. Fixes: #669 ("check in huge shell archives is insanely slow") Fixes: #746 ("Very slow local buildlog parsing")	2020-06-04 13:12:22 +02:00
Adam Williamson	13a13a87c4	Fix ElementTree imports for Python 3.9 Importing `cElementTree` has been deprecated since Python 3.3 - importing `ElementTree` automatically uses the fastest implementation available - and is finally removed in Python 3.9. Importing cElementTree directly (not as part of xml) is an even older relic, it's for Ye Time Before ElementTree Was Added To Python and it was instead an external module...which was before Python 2.5. We still need to work with Python 2.7 for now, so we use a try/ except to handle both 2.7 and 3.9 cases. Also, let's not repeat this import 12 times in one file for some reason. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2020-06-02 15:13:10 -07:00
Marcus Huewe	55aef1a014	Convert repodata.RepoDataQueryResult to a bytes API The repodata.RepoDataQueryResult is supposed to be a bytes API and that's what our users (see build module) expect. Note that the repodata.RepoDataQueryResult.path method still returns a str. That's what the rpmquery.RpmQuery, debquery.DebQuery, and archquery.ArchQuery classes also do (if the "path" was initially passed as a str). Fixes: #760 ("osc build fails when called with --prefer-pkgs where the passed directory is a repodata repository or a subdirectory of one")	2020-03-15 18:30:28 +01:00
Marcus Huewe	cd51f47a77	Return bytes in packagequery.PackageQueryResult.evr() instead of a str The packagequery.PackageQueryResult class is supposed to provide a bytes API. Hence, packagequery.PackageQueryResult.evr() should return bytes instead of a str. Also, adjust the single caller in the build module.	2020-03-15 18:30:00 +01:00
Marcus Huewe	33bbc57b5f	Fix the previously introduced escaping via the html module This is a follow-up commit for commit `6dbf103e10` ("Use html.escape instead removed cgi.escape"), which breaks the python2 backward compatibility (since the "html" module is not available by default) and also breaks the code in general (due to missing html imports). The fix is based on the proposed fix in [1]. Fixes: boo#1166537 ("osc rq accept - forwarding request causes backtrace") [1] https://github.com/openSUSE/osc/pull/764	2020-03-12 23:00:47 +01:00
Marcus Huewe	4e8e0492e8	Fix arch zst magic in util.packagequery The correct zst magic is b'(\xb5/\xfd' (4 bytes) (that's what obs-build is also using). Kudos to Tobias Ellinghaus for spotting this. Fixes: #756 ("zst detection fails")	2020-02-26 20:04:26 +01:00
lethliel	95c68dc3f0	import oscerr in helper.py	2020-02-20 08:45:02 +01:00
Adrian Schröter	5f2721d8f6	- support zstd arch linux files in local build Note: This requires a tar executable supporting zstd	2020-01-09 15:49:54 +01:00
lethliel	c9d85ac248	move raw_input function to helper module	2019-08-27 15:17:53 +02:00
Marcus Huewe	e5c4a10673	Merge branch 'dont_decode_None' of https://github.com/lethliel/osc Do not try to decode None in decode_it (in this case None is returned).	2019-07-26 14:35:03 +02:00
lethliel	a802df15ad	return the obj if None type is passed to decode_it If an obj of type None is passed to decode_it, just return it and do not try to decode it, as this would fail	2019-07-26 14:22:26 +02:00
lethliel	2aa6e998d2	fix and unify building of local package cache * all filename functions now return bytes-like objects * the caller does the decoding * the caller in build.py passes encoded arguments	2019-07-26 13:38:45 +02:00
lethliel	5841bf759f	add exception if encoding fails and try ISO-8859-1 In some rare cases the chardet encoding detection detects a wrong encoding standard. Then we switch to latin-1 which covers most if utf-8 does not work.	2019-04-16 14:40:13 +02:00
Marco Strigl	71770555ac	Merge pull request #526 from lethliel/python3_fix_debquery_decoding [python3] fix decoding issue in debquery.py	2019-04-15 15:10:21 +02:00
Marco Strigl	6c074fce20	Merge pull request #524 from lethliel/python3_packagequery_fix_decoding [python3] fix decoding for packagequery.py	2019-04-15 15:09:59 +02:00
Marco Strigl	41ced89fcc	Merge pull request #522 from lethliel/python3_repodata_module [python3] fix epoch encoding in repodata.py	2019-04-15 15:07:50 +02:00
Marco Strigl	0086bcfa64	Merge pull request #483 from lethliel/python3_rpmquery_module [python3] rpmquery.py now python3 ready	2019-04-15 15:02:59 +02:00
Marco Strigl	d5108c7536	Merge pull request #464 from lethliel/python3_utils_helper [python3] add helper functions for python3 support	2019-04-15 15:00:58 +02:00
lethliel	60c6ec2b52	[python3] fix decoding issue in debquery.py name, version, release and arch are strings, not bytes	2019-04-07 14:44:25 -05:00
lethliel	c235148180	[python3] now python3 ready: * new function cmp (not available in python3) * fix decoding in canonname function	2019-04-07 11:05:13 -05:00
lethliel	c6d3870942	[python3] fix decoding for packagequery.py name, arch, version and release need to be decoded	2019-04-07 10:31:23 -05:00
lethliel	87628a4150	[python3] fix epoch encoding in repodata.py other.epoch() needs to be encoded to work with the vercmp callers.	2019-04-07 10:19:47 -05:00
Marcus Huewe	c534d7e990	Fix logic error in DebQuery.vercmp res is never None, because DebQuery.rpmvercmp always returns -1, 0, or 1.	2019-01-27 19:43:38 +01:00
Marcus Huewe	cd5f46984d	Port debquery module to python3 No functional changes. Note that we cannot simply decode the control's fields as ascii/utf-8 because a field is not necessarily a valid ascii/utf-8 encoding (it is possible to register _arbitrary_ custom fields via a 'register-custom-fields' hook when building a deb package). Note: DebQuery.debvercmp really deserves a cleanup:/	2019-01-27 19:31:47 +01:00
Marcus Huewe	bb9f9a7fde	Refactor DebQuery.__parse_control a bit No functional changes. This just simplifies the upcoming python3 port a bit.	2019-01-27 17:35:47 +01:00
Marcus Huewe	f63a0957af	Remove superfluous try-except block in the archquery module ArchQuery.query never raises an ArchError exception.	2019-01-27 16:51:58 +01:00
Marcus Huewe	2074a1c01d	Make ArchQuery.canonname more robust against None values Use ArchQuery.filename to construct the filename and raise an ArchError exception if we are unable to construct a filename.	2019-01-27 16:46:52 +01:00
Marcus Huewe	8c1cb190bd	Port the missing pieces of the archquery module to python3 This is a follow-up commit for commit `21eca9e3f1` ("[python3] switch ArchQuery to bytestrings").	2019-01-27 16:27:30 +01:00
Marcus Huewe	2d0c974296	Add cmp function to packagequery module cmp(a, b) returns -1 if a < b, 0 if a == b, and 1 if a > b. This is needed since python3 has no cmp function anymore. All credits for this go to Marco Strigl <mstrigl@suse.com> (see PR#483 [1]). [1] https://github.com/openSUSE/osc/pull/483	2019-01-27 16:12:57 +01:00
Marcus Huewe	a3720c5286	Fix ArchQuery.rpmvercmp if one of its arguments is None The None argument is always <= the other argument. We need this in case of a broken/pathological package where version() or release() return None (see vercmp (which calls rpmvercmp)).	2019-01-27 15:50:35 +01:00
Marcus Huewe	5c639db805	ArchQuery.epoch should never return None Returning None breaks ArchQuery.vercmp. Returning b'0' is ok because an epoch, if present, is always supposed to be an integer (at least in a "valid" arch package (see scripts/libmakepkg/lint_pkgbuild/epoch.sh.in in the pacman sources)). Hence, if we compare the epoch of a package, which has no explicit epoch set, with the epoch of a package, which has an explicit epoch set, we always have a <= relation.	2019-01-27 15:39:07 +01:00
Marcus Huewe	deee8ef6cb	Fix logic error in ArchQuery.vercmp res is never None, because ArchQuery.rpmvercmp always returns -1, 0, or 1.	2019-01-27 15:00:36 +01:00
Marcus Huewe	562374f045	Simplify ArchQuery.read a bit No functional changes - just to improve readability.	2019-01-27 14:57:47 +01:00
Marcus Huewe	e580769757	Merge branch 'python3_archquery_module' of https://github.com/lethliel/osc Initial port of the archquery module to python3 (ArchQuery.__init__, ArchQuery.read, and ArchQuery.canonname are ported - the rest is missing).	2019-01-27 14:55:01 +01:00
lethliel	21eca9e3f1	[python3] switch ArchQuery to bytestrings decode explicit (ascii)	2019-01-23 22:59:55 +01:00
Marco Strigl	f233066448	Merge pull request #482 from lethliel/python3_packagequery_module [python3] magic is now a bytestring in python3	2019-01-18 14:34:43 +01:00
Marcus Huewe	e60af6f120	Use with statement in CpioRead._copyin_file This makes sure that the file is closed in case of an exception.	2019-01-15 20:49:26 +01:00
Marcus Huewe	5387744d36	Port CpioWrite to python3 Now, CpioWrite provides a bytes-only API. It would also be possible for the API to accept bytes and str (we would need to explicitly encode the latter) but this would be a bit inconsistent wrt. cpio.CpioRead (which is bytes-only). Also, by using a bytearray instead of a [] we avoid several intermediate ''.join(...)s.	2019-01-15 20:48:42 +01:00
Marcus Huewe	3e326b1bb4	Port CpioRead and CpioHdr to python3 This is a bytes only API because a filename in a cpio archive can contain, for instance, illegal utf-8 sequences. A user can decode the filename/content as she wishes.	2019-01-15 20:05:47 +01:00
Marcus Huewe	54ac438eb0	Do not mmap a cpio archive There is simply no need for a mmap.	2019-01-15 19:47:27 +01:00
Marcus Huewe	1c4385a579	Run a small demo when the cpio module is invoked as a script It just reads in a cpio archive and prints the headers.	2019-01-15 19:46:00 +01:00
Marcus Huewe	5c19425c9b	Use with statement in ArFile.saveTo This makes sure that the file is closed in case of an exception.	2019-01-15 17:18:50 +01:00
Marcus Huewe	b26a4a967d	Raise a ValueError if neither fn nor fh is passed to Ar.__init__ A ValueError is more appropriate because there is no issue with the ar archive itself. Also, the old codepath never worked because the fn parameter was missing.	2019-01-15 17:18:50 +01:00
Marcus Huewe	6fdce86fc9	Port the ar module to python3 Since an ar archive can contain arbitrary filenames (that is, a filename can be an invalid utf-8 encoding (for instance, "foo\xff\xffbar")), the ar module provides a bytes only API. A user can decode filenames as she wishes. Note: if a "fn" parameter is passed to Ar.__init__ it should be a bytes (a str is also ok, but then be aware that an ArError's file attribute might be a str or a bytes).	2019-01-15 17:18:37 +01:00
Marcus Huewe	68cf974c78	Do not mmap the ar archive There is really no need for a mmap here. Also, the comment in the docstr does not apply/is nonsense (there is no performance gain).	2019-01-15 17:18:19 +01:00
Marcus Huewe	e12181b11d	An ext fn header in an ar file has no mode Use a dummy mode of 0 in this case (internally, the mode is never used).	2019-01-15 17:18:19 +01:00
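
Commits 1933da5bcc, 674ea78815, and 36f7b8ffe9 all deal with the same pitfall: joining a str directory with a bytes filename. The following is a minimal sketch of that pitfall and of the os.getcwdb() default, not the actual osc code; `save_to` is a hypothetical helper introduced only for illustration.

```python
import os
import tempfile


def save_to(name, data, dest=None):
    """Write data to dest/name, where name is bytes (hypothetical helper)."""
    if dest is None:
        # os.getcwdb() returns bytes, so it can be joined with a bytes name
        # without an intermediate .encode() that may fail on odd directory names.
        dest = os.getcwdb()
    path = os.path.join(dest, name)
    with open(path, 'wb') as f:
        f.write(data)
    return path


# Joining a str directory with a bytes filename raises a TypeError on python3.
try:
    os.path.join(os.getcwd(), b'member.txt')
    mixed_join_failed = False
except TypeError:
    mixed_join_failed = True

# With a bytes directory the join works and the result is a bytes path.
with tempfile.TemporaryDirectory() as tmp:
    written = save_to(b'member.txt', b'payload', dest=os.fsencode(tmp))
    with open(written, 'rb') as f:
        content = f.read()
```

Keeping both the directory and the filename as bytes avoids having to pick an encoding for names that may not be valid utf-8.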

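The decode strategy described in commits 276d6e2439, 5841bf759f, and a802df15ad (try utf-8 first, fall back to latin-1, pass None through) can be sketched as follows. This is an illustration of the idea for python3 only, not the real util.helper.decode_it; the name `decode_bytes` is hypothetical.

```python
def decode_bytes(obj):
    """Return a str for bytes input; pass None and str through unchanged."""
    if obj is None or isinstance(obj, str):
        return obj
    try:
        return obj.decode('utf-8')
    except UnicodeDecodeError:
        # latin-1 maps every byte value to a code point, so this cannot fail.
        return obj.decode('latin-1')


# The example from the commit message is valid in both encodings;
# utf-8 is tried first, so it wins.
sample = b'This character group is not supported: [abc\xc3\xbf]'
as_utf8 = decode_bytes(sample)
as_latin1 = sample.decode('latin-1')

# A lone 0xe9 byte is not valid utf-8, so the latin-1 fallback is used.
fallback = decode_bytes(b'caf\xe9')
```

The two decodings of `sample` differ only in the last character group, which shows why no detection heuristic can pick the "correct" one without additional context.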

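Commit 2d0c974296 adds a cmp function because python3 no longer has one. A common way to write such a replacement is shown below; this is a sketch of the usual idiom and is not necessarily identical to the function in the packagequery module.

```python
def cmp(a, b):
    """Return -1 if a < b, 0 if a == b, and 1 if a > b."""
    # Booleans are integers in Python, so the difference is -1, 0, or 1.
    return (a > b) - (a < b)


results = [cmp(1, 2), cmp(2, 2), cmp(3, 2)]
bytes_result = cmp(b'1.0', b'1.1')
```

The same function works for bytes operands, which matters for the bytes-only query APIs mentioned in the log.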
122 Commits
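
Commit 13a13a87c4 mentions a try/except import that works on both Python 2.7 and Python 3.9. One possible shape of such an import is sketched below; the order of the two branches and the alias `ET` are assumptions, not taken from the osc sources.

```python
try:
    # Older interpreters: request the C implementation explicitly.
    import xml.etree.cElementTree as ET
except ImportError:
    # Python 3.9 and later: cElementTree is gone, and ElementTree
    # already uses the fastest implementation available.
    import xml.etree.ElementTree as ET

root = ET.fromstring('<package name="osc"/>')
name = root.get('name')
```

Doing this once in a shared place avoids repeating the import in every function that parses XML.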