Thursday, 15 August 2013

android - Why Exynos Octa 5420 is unusually slow -




My code:

    #include <ctime>
    #include <cstdio>

    int main() {
        struct timespec t, mt1, mt2;
        unsigned long long int mt;

        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &mt1);
        // measured block begin
        for (int i = 0; i < 1000000; i++)
            clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t);
        // measured block end
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &mt2);

        // elapsed thread CPU time in nanoseconds
        mt = (mt2.tv_sec - mt1.tv_sec) * 1000000000LL + mt2.tv_nsec - mt1.tv_nsec;
        printf("%llu\n", mt);
        return 0;
    }

I'm using a standalone ARMv7-A toolchain generated from Android NDK r9d, which resides under /opt/android-toolchain.

Configuration 1:

These are the default flags generated by the toolchain file in https://github.com/taka-no-me/android-cmake.

Compiler configuration:

    /opt/android-toolchain/bin/arm-linux-androideabi-g++ \
      -DANDROID -Wno-psabi --sysroot=/opt/android-toolchain/sysroot \
      -fpic -funwind-tables -finline-limit=64 -fsigned-char \
      -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp \
      -mfpu=vfpv3-d16 -fdata-sections -ffunction-sections \
      -Wa,--noexecstack -mthumb -fomit-frame-pointer \
      -fno-strict-aliasing -O3 -DNDEBUG \
      -isystem /opt/android-toolchain/sysroot/usr/include \
      -isystem /opt/android-toolchain/include/c++/4.8 \
      -isystem /opt/android-toolchain/include/c++/4.8/arm-linux-androideabi/armv7-a \
      -o my-object-file.o -c my-source-file.cpp

Linker configuration:

    /opt/android-toolchain/bin/arm-linux-androideabi-gcc \
      -Wno-psabi --sysroot=/opt/android-toolchain/sysroot \
      -fpic -funwind-tables -finline-limit=64 -fsigned-char \
      -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp \
      -mfpu=vfpv3-d16 -fdata-sections -ffunction-sections \
      -Wa,--noexecstack -mthumb -fomit-frame-pointer \
      -fno-strict-aliasing -O3 -DNDEBUG -Wl,--fix-cortex-a8 \
      -Wl,--no-undefined -Wl,-allow-shlib-undefined -Wl,--gc-sections \
      -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now \
      -Wl,-z,nocopyreloc my-object-file.o -o my-executable \
      -L/libs/armeabi-v7a -rdynamic \
      "/opt/android-toolchain/arm-linux-androideabi/lib/armv7-a/thumb/libstdc++.a" \
      "/opt/android-toolchain/arm-linux-androideabi/lib/armv7-a/thumb/libsupc++.a" \
      -lm

On a Samsung Galaxy Note 10.1 2014 Edition with Exynos Octa 5420 @ 1.9 GHz running the Samsung stock 4.4.2 ROM, the code takes 2.0 seconds.

On a Samsung Galaxy Note II with Exynos 4412 @ 1.6 GHz running CyanogenMod 11 based on Android 4.4.4, the code takes 0.75 seconds.

On a Samsung Galaxy S3 with Exynos 4412 @ 1.4 GHz running CyanogenMod 11 based on Android 4.4.4, the code takes 1.1 seconds.

Configuration 2:

Nearly all of the flags from before are removed.

Compiler configuration:

    /opt/android-toolchain/bin/arm-linux-androideabi-g++ \
      -DANDROID --sysroot=/opt/android-toolchain/sysroot \
      -O3 -DNDEBUG \
      -isystem /opt/android-toolchain/sysroot/usr/include \
      -isystem /opt/android-toolchain/include/c++/4.8 \
      -isystem /opt/android-toolchain/include/c++/4.8/arm-linux-androideabi/armv7-a \
      -o my-object-file.o -c my-source-file.cpp

Linker configuration:

    /opt/android-toolchain/bin/arm-linux-androideabi-gcc \
      --sysroot=/opt/android-toolchain/sysroot -O3 -DNDEBUG \
      -Wl,-z,nocopyreloc my-object-file.o -o my-executable \
      -L/libs/armeabi-v7a -rdynamic \
      "/opt/android-toolchain/arm-linux-androideabi/lib/armv7-a/thumb/libstdc++.a" \
      "/opt/android-toolchain/arm-linux-androideabi/lib/armv7-a/thumb/libsupc++.a" \
      -lm

On a Samsung Galaxy Note 10.1 2014 Edition with Exynos Octa 5420 @ 1.9 GHz running the Samsung stock 4.4.2 ROM, the code takes 2.2 seconds.

On a Samsung Galaxy Note II with Exynos 4412 @ 1.6 GHz running CyanogenMod 11 based on Android 4.4.4, the code takes 0.94 seconds.

On a Samsung Galaxy S3 with Exynos 4412 @ 1.4 GHz running CyanogenMod 11 based on Android 4.4.4, the code takes 1.1 seconds.

Notes for both configurations:

I set the lowest CPU clock frequency to the highest possible value, i.e. 1.9 GHz, with a CPU tuning app.

I made sure there are no background processes hogging the CPU.

I tried the -mcpu=cortex-a15 flag; it doesn't alter the execution time significantly.

I also tried -mfpu=neon -marm -mtune=cortex-a15; this doesn't alter the execution time significantly either.

clock_gettime() is not the culprit; the code is visibly slower.

Other pieces of code I tried, including parts of OpenCV imgproc and STL calls such as std::map::find() and std::sort(), are visibly, and clock_gettime()-measurably, slower on the Exynos Octa 5420 compared to the two others listed above.

My hypotheses:

My thread is somehow getting stuck on one of the Cortex-A7 cores instead of hopping onto one of the Cortex-A15 ones. If this might be the case, how can I make sure that it is, or how can I force my threads onto the Cortex-A15 cores?

I failed to set the CPU clock frequency lower limit and the CPU is being throttled. If this might be the case, how can I make sure that it is?

Samsung's kernel is somehow worse compared to CM's. Can this cause so much difference in execution time?

I'm pretty much stumped at this point. Any advice and insights on how I can get my money's worth out of this device?

Edit: I flashed a custom tweaked kernel (http://forum.xda-developers.com/showthread.php?t=2725193) and set the governor to performance, and the execution time went down to 1.3 seconds, so I think the 3rd hypothesis is a bit stronger now. It is still slower than the older CPUs though...

android c++ performance android-ndk arm
