* speed up unaligned copies by always using word shifts (combined
with the builtin 64-bit byte swap when available) when
bit-endianness and machine byte-order are opposite
* add ``HAVE_BUILTIN_BSWAP64`` to header
* avoid misaligned pointers when casting to ``(uint64_t *)``
* add tests
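
The items above can be illustrated with a minimal C sketch. It is not the package's actual code: the helper names (``load_u64``, ``store_u64``, ``swap_u64``, ``shift_right_be``) are hypothetical, and the way ``HAVE_BUILTIN_BSWAP64`` is defined here is an assumption, since the entry only says the macro was added to the header. The sketch loads a 64-bit word through ``memcpy`` rather than through a ``(uint64_t *)`` cast, so the address may be unaligned, and it byte-swaps the word around the shift when the bit order is big-endian but the machine is little-endian.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: the real header may detect the builtin differently. */
#ifndef HAVE_BUILTIN_BSWAP64
#  if defined(__GNUC__) || defined(__clang__)
#    define HAVE_BUILTIN_BSWAP64 1
#  else
#    define HAVE_BUILTIN_BSWAP64 0
#  endif
#endif

/* Hypothetical helper: reverse the byte order of a 64-bit word. */
static uint64_t swap_u64(uint64_t w)
{
#if HAVE_BUILTIN_BSWAP64
    return __builtin_bswap64(w);
#else
    /* Portable fallback: swap bytes, then 16-bit pairs, then halves. */
    w = ((w & 0x00ff00ff00ff00ffULL) << 8) | ((w >> 8) & 0x00ff00ff00ff00ffULL);
    w = ((w & 0x0000ffff0000ffffULL) << 16) | ((w >> 16) & 0x0000ffff0000ffffULL);
    return (w << 32) | (w >> 32);
#endif
}

/* Hypothetical helpers: memcpy keeps the access valid at any alignment,
   unlike dereferencing a (uint64_t *) cast of a byte pointer. */
static uint64_t load_u64(const unsigned char *p)
{
    uint64_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

static void store_u64(unsigned char *p, uint64_t w)
{
    memcpy(p, &w, sizeof w);
}

/* Runtime check of the machine byte order. */
static int machine_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Shift 8 bytes, read as one big-endian bit string (p[0] holds the most
   significant bits), right by k bits, with 0 <= k < 64. */
static void shift_right_be(unsigned char *p, int k)
{
    const int swap = machine_is_little_endian();
    uint64_t w = load_u64(p);

    if (swap)
        w = swap_u64(w);   /* make p[0] the most significant byte */
    w >>= k;
    if (swap)
        w = swap_u64(w);   /* restore the in-memory byte order */
    store_u64(p, w);
}
```

Calling ``shift_right_be(buf + 1, 1)`` on a byte buffer works even though ``buf + 1`` is generally not 8-byte aligned, which is the case the misaligned-pointer item is concerned with.
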
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-bitarray?expand=0&rev=46