C - Upgrading image analysis from 8 -> 16 bit quanta; store calculations in Int32, float or double?
I am doing scientific calculations on images in C. I use ImageMagick to read the file and output an 8 bit quanta raw RGB file, which I then analyze. I am seeing recurring squares in the output, as well as garbled, fuzzy regions. I suspect integer truncation is causing the artifacts.
After I read the entire 8 bit RGB file into memory, I allocate a parallel array of unsigned shorts to hold intermediate results during the analysis. The calculations are done in doubles and stored as unsigned shorts so they can be used in other calculations. At the end, the shorts are dumped to disk and read by either Photoshop or ImageMagick as raw, 16 bit RGB images.
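If any double result ever falls outside 0..65535, a plain cast to unsigned short has undefined behavior in C and in practice tends to wrap or produce garbage. A minimal sketch of a saturating store follows. The helper name `store_u16` and its self-check are hypothetical and not taken from the original program.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the original program): round a double
   result half up and saturate it to the unsigned short range. */
uint16_t store_u16(double v)
{
    if (!(v > 0.0))       /* negatives, zero and NaN all map to 0 */
        return 0;
    if (v >= 65535.0)     /* saturate instead of wrapping around */
        return UINT16_MAX;
    return (uint16_t)(v + 0.5);
}

/* Small self-check of the edge cases; doubles as a usage example. */
void store_u16_selfcheck(void)
{
    assert(store_u16(-3.0) == 0);
    assert(store_u16(1234.4) == 1234);
    assert(store_u16(1234.5) == 1235);
    assert(store_u16(70000.0) == 65535);
}
```

With a store like this, an out-of-range intermediate shows up as a flat black or white patch rather than as wrapped values, which may make it easier to tell overflow apart from other causes of artifacts.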
With 8 bits of input data, the granularity is chunky in the darker areas, where single digit RGB values are common. When working with 8 neighboring data points in an operation such as averaging them all, I can store the full values of the points and still have 16 - 8 - 3 = 5 bits left before overflowing the unsigned shorts. In my experience, I get ugly, muddy, mangled areas when overflow causes loss of the higher level bits.
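The headroom arithmetic (storage bits, minus sample bits, minus the bits needed to sum N samples) can be sketched as a small helper. The function names `ceil_log2` and `headroom_bits` are made up for illustration. Summing N values of s bits each needs at most s + ceil(log2(N)) bits.

```c
#include <assert.h>

/* Smallest b such that 2^b >= n, i.e. the extra bits needed to sum n samples. */
unsigned ceil_log2(unsigned n)
{
    assert(n > 0 && n <= (1u << 31));
    unsigned bits = 0;
    while ((1u << bits) < n)
        bits++;
    return bits;
}

/* Spare bits left after summing `count` samples of `sample_bits` each
   in an accumulator with `storage_bits` usable bits. Negative means overflow. */
int headroom_bits(unsigned storage_bits, unsigned sample_bits, unsigned count)
{
    return (int)storage_bits - (int)sample_bits - (int)ceil_log2(count);
}
```

For example, `headroom_bits(16, 8, 8)` reproduces the 5 spare bits above, and `headroom_bits(32, 16, 24)` gives the margin for a 32 bit integer accumulator with 16 bit input and 24 neighbors.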
I plan on upgrading the input to full 16 bit, unsigned short, TIFF type quanta. Upgrading the parallel working array to full 32 bit integers would make the possibility of overflow vanishingly remote. Doubles would work fine, but they are soooo large and seem like overkill. I am already doubling the memory requirements by working with 2 byte data rather than 8 bit, and doubles would pile 8 bytes on top of that per channel per pixel. I am dreading the performance hit that is bound to cause.
I think floats might suffice. With 16 bit raw data being stored in 23 bit significands, there are 7 bits of space available. Using my current next-neighbor, 8 pixel cache, I would still have 4 bits of spare space before overflowing. An operation like simple averaging 2 neighbors deep, with 24 total neighbors, would require 5 bits, leaving 2 bits to spare before overflowing.
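One detail worth checking on the target platform: where `float` is IEEE-754 single precision, 23 is the number of stored fraction bits, and an implicit leading bit brings the precision to 24 bits. Under that assumption the headroom would be 8 bits rather than 7, leaving 5 spare bits with 8 neighbors and 3 with 24. A minimal sketch for verifying this follows; the function names are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* Returns 1 if the integer v survives a round trip through float unchanged. */
int float_holds_exactly(uint32_t v)
{
    volatile float f = (float)v;   /* volatile forces a real float store */
    return (double)f == (double)v;
}

/* Number of significand bits in float; every integer in 0..2^n
   is exactly representable for this n. */
int float_exact_bits(void)
{
    assert(FLT_RADIX == 2);        /* assumption: binary floating point */
    return FLT_MANT_DIG;
}
```

The largest sum of 24 neighbors at 16 bits is 24 * 65535 = 1572840, which is well below 2^24 = 16777216, so `float_holds_exactly(24u * 65535u)` should return 1 on such a platform. Sums of integers stay exact in that range, but the division in an average can still round.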
The current memory requirements for a 36 Mpix image are 36 * (3 bytes image + 6 bytes calculations) = 36 x 9 = 324 MB, with 108 MB read from disk and 216 MB written to a different file, both on a fast SSD.
Upgrading to 16 bit input and using a 4 byte Int32 or float parallel array: 36 Mpix * (6 bytes/pixel primary + 12 bytes/pixel parallel) = 36 * 18 = 648 MB, plus 216 MB of disk read/write.
With doubles: 36 * (6 + 24) = 1080 MB, which is 3.33 x the current memory, plus 216 MB of disk read/write.
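The three memory estimates follow one formula: megapixels times 3 channels times (bytes per input channel + bytes per working channel). A small sketch makes the figures easy to recompute for other image sizes. The function name `working_set_mb` is hypothetical, and MB here means 10^6 bytes, as in the figures above.

```c
#include <assert.h>

/* In-memory footprint in MB (10^6 bytes) for an RGB image of `mpix`
   megapixels, given bytes per channel for the input and the working array. */
unsigned working_set_mb(unsigned mpix, unsigned input_bytes, unsigned work_bytes)
{
    assert(input_bytes > 0 && work_bytes > 0);
    return mpix * 3u * (input_bytes + work_bytes);
}
```

For 36 Mpix, `working_set_mb(36, 1, 2)` gives the current 324 MB, `working_set_mb(36, 2, 4)` gives 648 MB for Int32 or float, and `working_set_mb(36, 2, 8)` gives 1080 MB for doubles.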
Doing purely integer operations now, I can crunch a 108 MB, 36 Mpix file in 3.41 seconds with gcc/32. Adding pow() to the computational mix raises the time to 28.48 s for the same binary.
Short of writing 3 variations and benchmarking them all, any ideas on which storage would be fastest: Int32, float or double? A better compiler?
===========================================================================
Results for the current 8 bit input, 16 bit storage and 16 bit output. Fastest is integer, slowest is floating point.
Ran 12 times, 4 binaries: tcc, gcc32, gcc64 and msvc.
Test "0" is purely integer; test "1" uses integer and double pow().
1498 sec for 12 runs

   %time    binary        test  average  compiler + optimizations
22.815% ->  raw2cv.gcc32  1     28.48s
22.804% ->  raw2cv.gcc64  1     28.47s
19.246% ->  raw2cv.tcc    1     24.03s
18.489% ->  raw2cv        1     23.08s
 6.513% ->  raw2cv.tcc    0      8.13s   tcc, no optimizations
 4.661% ->  raw2cv        0      5.82s   msvc /O2
 2.741% ->  raw2cv.gcc64  0      3.42s   mingw -O3 -ffast-math -m64
 2.732% ->  raw2cv.gcc32  0      3.41s   mingw -O3 -ffast-math

tcc version 0.9.26 (x86-64 Win64)
gcc version 4.7.3 (release patches / build 20130526 strawberryperl.com)
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86