android - Why is the Exynos Octa 5420 unusually slow?
My code:
    #include <ctime>
    #include <cstdio>

    int main() {
        struct timespec t, mt1, mt2;
        unsigned long long int mt;

        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &mt1);
        // measured block begin
        for (int i = 0; i < 1000000; i++)
            clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t);
        // measured block end
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &mt2);

        // elapsed thread CPU time in nanoseconds
        mt = (mt2.tv_sec - mt1.tv_sec) * 1000000000ll + mt2.tv_nsec - mt1.tv_nsec;
        printf("%llu\n", mt);
        return 0;
    }

I'm using a standalone ARMv7-A toolchain generated from Android NDK r9d, which resides under /opt/android-toolchain.

Configuration 1:
These are the default flags generated by the toolchain file in https://github.com/taka-no-me/android-cmake.
Compiler configuration:
    /opt/android-toolchain/bin/arm-linux-androideabi-g++ \
      -DANDROID -Wno-psabi --sysroot=/opt/android-toolchain/sysroot \
      -fpic -funwind-tables -finline-limit=64 -fsigned-char \
      -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp \
      -mfpu=vfpv3-d16 -fdata-sections -ffunction-sections \
      -Wa,--noexecstack -mthumb -fomit-frame-pointer \
      -fno-strict-aliasing -O3 -DNDEBUG \
      -isystem /opt/android-toolchain/sysroot/usr/include \
      -isystem /opt/android-toolchain/include/c++/4.8 \
      -isystem /opt/android-toolchain/include/c++/4.8/arm-linux-androideabi/armv7-a \
      -o my-object-file.o -c my-source-file.cpp

Linker configuration:

    /opt/android-toolchain/bin/arm-linux-androideabi-gcc \
      -Wno-psabi --sysroot=/opt/android-toolchain/sysroot \
      -fpic -funwind-tables -finline-limit=64 -fsigned-char \
      -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp \
      -mfpu=vfpv3-d16 -fdata-sections -ffunction-sections \
      -Wa,--noexecstack -mthumb -fomit-frame-pointer \
      -fno-strict-aliasing -O3 -DNDEBUG -Wl,--fix-cortex-a8 \
      -Wl,--no-undefined -Wl,-allow-shlib-undefined -Wl,--gc-sections \
      -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now \
      -Wl,-z,nocopyreloc my-object-file.o -o my-executable \
      -L/libs/armeabi-v7a -rdynamic \
      "/opt/android-toolchain/arm-linux-androideabi/lib/armv7-a/thumb/libstdc++.a" \
      "/opt/android-toolchain/arm-linux-androideabi/lib/armv7-a/thumb/libsupc++.a" \
      -lm

On Samsung Galaxy Note 10.1 2014 Edition with Exynos Octa 5420 @ 1.9 GHz running the Samsung stock 4.4.2 ROM, the code takes 2.0 seconds.
On Samsung Galaxy Note II with Exynos 4412 @ 1.6 GHz running CyanogenMod 11 based on Android 4.4.4, the code takes 0.75 seconds.
On Samsung Galaxy S3 with Exynos 4412 @ 1.4 GHz running CyanogenMod 11 based on Android 4.4.4, the code takes 1.1 seconds.

Configuration 2:
Nearly all of the flags from before are removed.
Compiler configuration:
    /opt/android-toolchain/bin/arm-linux-androideabi-g++ \
      -DANDROID --sysroot=/opt/android-toolchain/sysroot \
      -O3 -DNDEBUG \
      -isystem /opt/android-toolchain/sysroot/usr/include \
      -isystem /opt/android-toolchain/include/c++/4.8 \
      -isystem /opt/android-toolchain/include/c++/4.8/arm-linux-androideabi/armv7-a \
      -o my-object-file.o -c my-source-file.cpp

Linker configuration:

    /opt/android-toolchain/bin/arm-linux-androideabi-gcc \
      --sysroot=/opt/android-toolchain/sysroot -O3 -DNDEBUG \
      -Wl,-z,nocopyreloc my-object-file.o -o my-executable \
      -L/libs/armeabi-v7a -rdynamic \
      "/opt/android-toolchain/arm-linux-androideabi/lib/armv7-a/thumb/libstdc++.a" \
      "/opt/android-toolchain/arm-linux-androideabi/lib/armv7-a/thumb/libsupc++.a" \
      -lm

On Samsung Galaxy Note 10.1 2014 Edition with Exynos Octa 5420 @ 1.9 GHz running the Samsung stock 4.4.2 ROM, the code takes 2.2 seconds.
On Samsung Galaxy Note II with Exynos 4412 @ 1.6 GHz running CyanogenMod 11 based on Android 4.4.4, the code takes 0.94 seconds.
On Samsung Galaxy S3 with Exynos 4412 @ 1.4 GHz running CyanogenMod 11 based on Android 4.4.4, the code takes 1.1 seconds.

Notes for both configurations:
I set the minimum CPU clock frequency to the highest possible value, i.e. 1.9 GHz, using a CPU tuning app.
I made sure there are no background processes hogging the CPU.
I tried the -mcpu=cortex-a15 flag; it doesn't alter the execution time significantly.
I also tried -mfpu=neon -marm -mtune=cortex-a15; that doesn't alter the execution time significantly either.
clock_gettime() itself is not the culprit; the code is visibly slower.
Other pieces of code I tried, including parts of OpenCV imgproc and STL calls such as std::map::find() and std::sort(), are visibly, and clock_gettime()-measurably, slower on the Exynos Octa 5420 compared to the two others listed above.
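
For scale, the reported totals can be turned into a per-call cost. The following is only a rough sketch: the function names are made up for illustration, it assumes the 1,000,000 calls dominate the measured time, and the cycle estimate assumes each core really runs at its nominal frequency, which is exactly what is in doubt here.

```cpp
#include <cassert>

// Average cost of one call in nanoseconds, given the total time in seconds.
double ns_per_call(double total_seconds, long calls) {
    assert(calls > 0);
    return total_seconds * 1e9 / calls;
}

// Approximate cycles per call. ASSUMPTION: the core runs at nominal_ghz for
// the whole measurement; with throttling or a slower core this is wrong.
double cycles_per_call(double total_seconds, long calls, double nominal_ghz) {
    return ns_per_call(total_seconds, calls) * nominal_ghz;
}
```

With the configuration 1 numbers, ns_per_call(2.0, 1000000) gives 2000 ns per call on the Exynos Octa 5420, against 750 ns on the Note II and 1100 ns on the S3. Under the nominal-frequency assumption that is roughly 3800 cycles per call at 1.9 GHz versus about 1200 at 1.6 GHz, so the gap is not explained by clock speed alone.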

My hypotheses:
My thread is somehow getting stuck on one of the Cortex-A7 cores instead of hopping onto one of the Cortex-A15 ones. If this might be the case, how can I make sure that it is, or how can I force my threads onto the Cortex-A15 cores?
I failed to set the lower limit of the CPU clock frequency, and the CPU is being throttled. If this might be the case, how can I make sure that it is?
Samsung's kernel is somehow worse compared to CM's. Can this cause this much difference in execution time?
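
One way the first hypothesis could be probed is by pinning the thread to a single CPU index with sched_setaffinity() and checking where it runs with sched_getcpu(). This is a minimal sketch, not a verified fix: the helper names are hypothetical, and whether this kernel exposes the Cortex-A15 and Cortex-A7 cores as separate CPU indices at all is an assumption that has not been checked on this device.

```cpp
#include <sched.h>   // cpu_set_t, sched_setaffinity, sched_getcpu (Linux / bionic)
#include <cassert>

// Pin the calling thread to one CPU index. Returns true on success.
// ASSUMPTION: the caller knows which index belongs to which core type;
// that mapping is device- and kernel-specific.
bool pin_current_thread_to_cpu(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // A pid of 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Index of the CPU the calling thread is running on right now, or -1 on error.
int current_cpu() {
    return sched_getcpu();
}
```

Calling pin_current_thread_to_cpu() before the measured block and repeating the run once per index would show whether the timing depends on the core. Logging current_cpu() without pinning would show where the scheduler places the thread by default.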
I'm pretty much stumped at this point. Any advice or insights on how I can get my money's worth out of this device?
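
A further diagnostic that might help separate "the thread runs slowly" from "the thread is not running" is to time the same work with both CLOCK_THREAD_CPUTIME_ID and CLOCK_MONOTONIC. The sketch below is an illustration only; Timing, elapsed_ns and time_workload are hypothetical names, and it assumes both clocks are available on the device.

```cpp
#include <ctime>
#include <cassert>

// Nanoseconds between two timespec readings (end is expected to be later).
long long elapsed_ns(const timespec& start, const timespec& end) {
    return (end.tv_sec - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);
}

struct Timing {
    long long cpu_ns;   // CPU time consumed by this thread
    long long wall_ns;  // wall-clock time that passed meanwhile
};

// Run work() once and measure it with both clocks.
Timing time_workload(void (*work)()) {
    assert(work != 0);
    timespec c0, c1, w0, w1;
    clock_gettime(CLOCK_MONOTONIC, &w0);
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &c0);
    work();
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &c1);
    clock_gettime(CLOCK_MONOTONIC, &w1);
    Timing t = { elapsed_ns(c0, c1), elapsed_ns(w0, w1) };
    return t;
}
```

If wall_ns is much larger than cpu_ns, the thread spent time descheduled or waiting. If the two are close and both are large, the core itself is executing the work slowly, which would point more towards core selection or clock frequency.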
Edit: I flashed a custom tweaked kernel (http://forum.xda-developers.com/showthread.php?t=2725193) and set the governor to performance, and the execution time went down to 1.3 seconds, so I think the third hypothesis is a bit stronger now. It is still slower than the older CPUs though...
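
To check what the governor and clock frequency actually are during a run, the standard Linux cpufreq sysfs files can be read from the program itself. This is a hedged sketch: the helper names are invented here, and it assumes the kernel exposes the usual attributes (scaling_governor, scaling_cur_freq, scaling_min_freq, scaling_max_freq) and that the process is allowed to read them.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <cassert>

// Build the cpufreq sysfs path for one CPU index and one attribute name.
std::string cpufreq_path(int cpu, const std::string& attribute) {
    assert(cpu >= 0);
    std::ostringstream os;  // ostringstream instead of std::to_string for older toolchains
    os << "/sys/devices/system/cpu/cpu" << cpu << "/cpufreq/" << attribute;
    return os.str();
}

// First line of a text file, or an empty string if it cannot be opened.
std::string read_first_line(const std::string& path) {
    std::ifstream in(path.c_str());
    std::string line;
    if (in) std::getline(in, line);
    return line;
}
```

For example, read_first_line(cpufreq_path(0, "scaling_cur_freq")) returns the current frequency of CPU 0 in kHz as text, and the same call with "scaling_governor" returns the active governor name. Sampling these right before and after the measured block would show whether the frequency floor set in the tuning app is really in effect.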
android c++ performance android-ndk arm