mirror of https://github.com/openSUSE/osc.git synced 2025-01-14 09:36:21 +01:00
Marcus Huewe 276d6e2439 Do not use the chardet module in util.helper.decode_it
In general, decode_it is used to get a str from an arbitrary bytes
instance. For this, decode_it used the chardet module (if present)
to detect the underlying encoding (if the bytes instance corresponds
to a "supported" encoding). The drawback of this detection is that
it can take a long time for a large bytes instance that does not
correspond to any "supported" encoding (see #669 and #746).
Instead of running a potentially time-consuming detection, assume either
a utf-8 or a latin-1 encoding. Rationale: it is just
not worth the effort to detect a _potential_ encoding because we have
no clue what the _correct_ encoding is. For instance, consider the
following bytes instance:

b'This character group is not supported: [abc\xc3\xbf]'

It is valid both as utf-8 and as latin-1. Which one is the "correct"
one? We don't know... Even if you interpret the bytes instance as a
human you cannot give a definite answer (implicit assumption: there is
no additional context available).
That is, if we cannot give a definite answer in case of two potential
encodings, there is no point in bringing even more potential encodings
into play. Hence, do not use the chardet module.
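
To make the ambiguity concrete, the sample bytes from above can be decoded
both ways. This is only an illustrative sketch; the variable names are made
up here and are not part of osc.

```python
# Sample bytes taken from the commit message above.
data = b'This character group is not supported: [abc\xc3\xbf]'

# Both decodings succeed, but they disagree on the last two bytes.
as_utf8 = data.decode('utf-8')      # \xc3\xbf becomes one character (U+00FF)
as_latin1 = data.decode('latin-1')  # \xc3\xbf becomes two characters (U+00C3, U+00BF)

# The latin-1 result is one character longer than the utf-8 result.
print(len(as_utf8), len(as_latin1))
```

Neither call raises, so the bytes alone do not tell us which result was
intended.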

Note: the rationale for trying utf-8 first is that utf-8 is pretty
much in vogue these days and, hence, the chances are "high" that we
guess the "correct" encoding.
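
The resulting strategy (try utf-8 first, then fall back to latin-1) can be
sketched roughly as follows. This is an assumption-laden illustration, not
the actual body of util.helper.decode_it: the name decode_sketch is
hypothetical, and passing str objects through unchanged is an assumption
made here.

```python
def decode_sketch(obj):
    """Hypothetical stand-in for a utf-8-then-latin-1 decoder (not osc code)."""
    # Assumption: values that are already str are returned unchanged.
    if isinstance(obj, str):
        return obj
    try:
        # Try utf-8 first, since it is the most likely encoding these days.
        return obj.decode('utf-8')
    except UnicodeDecodeError:
        # latin-1 maps every byte value to a code point, so this cannot fail.
        return obj.decode('latin-1')
```

Because the latin-1 fallback accepts any byte sequence, the cost stays
linear in the input size, with no detection pass over the data.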

Fixes: #669 ("check in huge shell archives is insanely slow")
Fixes: #746 ("Very slow local buildlog parsing")
2020-06-04 13:12:22 +02:00
util Do not use the chardet module in util.helper.decode_it 2020-06-04 13:12:22 +02:00
__init__.py replace urlgrabber to enable python3 compatibility 2018-10-19 09:31:37 +02:00
.gitignore convert svn:ignore to gitignore 2009-12-03 19:19:53 +01:00
babysitter.py custom exception if importing m2crypto fails 2020-02-18 17:35:29 +00:00
build.py Fix ElementTree imports for Python 3.9 2020-06-02 15:13:10 -07:00
checker.py Resolve PEP8 issue W291 2014-08-12 15:01:16 +02:00
cmdln.py add regex for python3 missing arguments err 2020-05-18 19:46:22 +02:00
commandline.py Merge branch 'ccache' of https://github.com/sjamgade/osc 2020-05-27 16:09:06 +02:00
conf.py Add ccache argument for oscrc 2020-04-14 14:50:24 +10:00
core.py Fix ElementTree imports for Python 3.9 2020-06-02 15:13:10 -07:00
credentials.py fix list of backends for old python-keyring 2020-02-14 09:35:07 +01:00
fetch.py fix security issue (bsc#1122675) no / in filename 2020-05-27 11:17:40 +02:00
grabber.py fix broken URLError handling in OscMirrorGroup.urlgrab() 2018-11-06 13:29:17 +01:00
meter.py Fix ZeroDivisionException in meter.PBTextMeter 2019-01-23 15:37:01 +01:00
OscConfigParser.py Rename SafeConfigParser to ConfigParser 2020-02-07 12:04:49 +01:00
oscerr.py Improve password handling 2019-08-29 16:11:17 +02:00
oscssl.py Use correct appname for trusted-certs store 2019-07-28 14:59:14 +02:00
oscsslexcp.py - remove shebang line to make rpmlint happy 2010-03-21 22:57:06 +01:00