python - Efficient byte by float multiplication
On input I have a signed array of bytes barr (usually little endian, but that probably doesn't matter) and a float f to multiply barr with.
My approach is to convert barr into an integer val (using the int.from_bytes function), multiply it, perform overflow checks and "crop" the multiplied val if needed, then convert it back into an array of bytes.

    def multiply(barr, f):
        val = int.from_bytes(barr, byteorder='little', signed=True)
        val *= f
        val = int(val)
        val = cropint(val, bitlen=len(barr) * 8)
        barr = val.to_bytes(len(barr), byteorder='little', signed=True)
        return barr

    def cropint(integer, bitlen, signed=True):
        maxvalue = (2**(bitlen-1)-1) if signed else (2**(bitlen)-1)
        minvalue = -maxvalue-1 if signed else 0
        if integer > maxvalue:
            integer = maxvalue
        if integer < minvalue:
            integer = minvalue
        return integer

However, the process is extremely slow when processing a large amount of data. Is there a better, more efficient way to do that?
Pure Python is rather ineffective for numeric calculations: because each number is treated as an object, each operation involves a lot of "under the hood" steps.
On the other hand, Python can be very effective for numeric calculation if you use the appropriate set of third-party libraries.
In your case, since performance matters, you can make use of NumPy, the de facto Python package for numeric processing.
With it, the casting, multiplication and recasting will each be done in native code in one pass (and, for someone who knows NumPy better than I do, probably in even fewer steps), and that should give you an improvement of 3-4 orders of magnitude in speed for this task:
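
    import numpy as np

    def multiply(all_bytes, f, bitlen, signed=True):
        # Works for 8, 16, 32 and 64 bit integers:
        dtype = "%sint%d" % ("" if signed else "u", bitlen)
        max_value = 2 ** (bitlen - (1 if signed else 0)) - 1
        # Signed types need a negative lower bound; unsigned types start at 0.
        min_value = -max_value - 1 if signed else 0
        input_data = np.frombuffer(all_bytes, dtype=dtype)
        # Multiply and clip in native code, then cast back to the integer type.
        processed = np.clip(input_data * f, min_value, max_value)
        return bytes(processed.astype(dtype))
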
Please note that this example takes all your byte data at once, not one number at a time as you pass to your original "multiply" function. Therefore, you also have to pass it the size in bits of your integers.
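To illustrate what "all the byte data at once" means, here is a minimal sketch that packs four 16-bit values into one buffer, scales them, and unpacks the result. The sample values, the factor 1.5 and the explicit little-endian dtype string "<i2" are assumptions chosen for the illustration, not part of the original answer.

```python
import struct

import numpy as np

# Four little-endian signed 16-bit integers packed into one bytes object.
raw = struct.pack("<4h", 100, -200, 30000, -30000)

# Interpret the whole buffer as an int16 array without copying.
samples = np.frombuffer(raw, dtype="<i2")

# Multiply as float64, clip to the int16 range, cast back to int16.
scaled = np.clip(samples.astype(np.float64) * 1.5, -32768, 32767).astype("<i2")

# Back to bytes, then unpack only to inspect the values.
result = scaled.tobytes()
values = struct.unpack("<4h", result)
```

The two values that would overflow after scaling end up saturated at the limits of the 16-bit range instead of wrapping around.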
The line that goes dtype = "%sint%d" % ("" if signed else "u", bitlen) creates the data-type name, as used by NumPy, from the number of bits passed in. Since the name is just a string, it interpolates a string, adding or not a "u" prefix depending on whether the datatype is unsigned, and puts the number of bits at the end. NumPy datatypes can be checked at: https://docs.scipy.org/doc/numpy/user/basics.types.html
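As a quick check of how that string interpolation behaves, the following sketch builds a few names and asks NumPy about them. The helper name dtype_name is hypothetical, and using np.iinfo to read the value range is an alternative I am suggesting here, not something the original answer does.

```python
import numpy as np

def dtype_name(bitlen, signed=True):
    # Hypothetical helper: same interpolation as in the answer's multiply().
    return "%sint%d" % ("" if signed else "u", bitlen)

signed_name = dtype_name(8)                    # "int8"
unsigned_name = dtype_name(16, signed=False)   # "uint16"

# NumPy accepts these strings wherever a dtype is expected.
item_size = np.dtype(unsigned_name).itemsize   # size of one element in bytes

# np.iinfo reports the representable range for an integer dtype.
info = np.iinfo(np.dtype(signed_name))
bounds = (int(info.min), int(info.max))
```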
Running it with an array of 500000 8-bit signed integers, I get these timings:

    In [99]: %time y = numpy_multiply(data, 1.7, 8)
    CPU times: user 3.01 ms, sys: 4.96 ms, total: 7.97 ms
    Wall time: 7.38 ms

    In [100]: %time x = original_multiply(data, 1.7, 8)
    CPU times: user 11.3 s, sys: 1.86 ms, total: 11.3 s
    Wall time: 11.3 s

(That is after modifying your function to operate on all the bytes at a time as well.) That is a speedup of about 1500 times, as I had stated in the first draft.
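The modified version of the original function used for that timing is not shown. The sketch below is only a guess at what such a per-number loop could look like, placed next to a NumPy version so the two can be compared for equal output. The function bodies, the little-endian dtype strings and the test data are all assumptions made for this illustration.

```python
import numpy as np

def cropint(integer, bitlen, signed=True):
    max_value = 2 ** (bitlen - 1) - 1 if signed else 2 ** bitlen - 1
    min_value = -max_value - 1 if signed else 0
    return max(min_value, min(integer, max_value))

def original_multiply(all_bytes, f, bitlen, signed=True):
    # Hypothetical reconstruction: apply the question's approach
    # to each number in the buffer, one at a time.
    size = bitlen // 8
    out = bytearray()
    for i in range(0, len(all_bytes), size):
        val = int.from_bytes(all_bytes[i:i + size], byteorder="little", signed=signed)
        val = cropint(int(val * f), bitlen, signed)
        out += val.to_bytes(size, byteorder="little", signed=signed)
    return bytes(out)

def numpy_multiply(all_bytes, f, bitlen, signed=True):
    # Explicit little-endian dtype, e.g. "<i1" or "<u2" (size given in bytes).
    dtype = "<%s%d" % ("i" if signed else "u", bitlen // 8)
    max_value = 2 ** (bitlen - (1 if signed else 0)) - 1
    min_value = -max_value - 1 if signed else 0
    data = np.frombuffer(all_bytes, dtype=dtype).astype(np.float64)
    return np.clip(data * f, min_value, max_value).astype(dtype).tobytes()

# Every possible byte value, read as signed 8-bit integers.
data = bytes(range(256))
same = original_multiply(data, 1.7, 8) == numpy_multiply(data, 1.7, 8)
```

Both versions truncate toward zero and saturate at the type limits, so for this input they should produce identical bytes; only the time taken differs.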