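Why does my Java program in docker see so few CPUs?

1. Problem:

Elasticsearch 6.6.1 running in docker performed very poorly under load testing. It turned out that the es thread pool size was 1 and available processor was also 1, and even after changing this setting the load still could not be pushed up.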

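2. Analysis:

2.1 Where does the processor count in the es threadpool come from?

es sets available processor from the CPU count of the OS: however many CPUs the OS sees is the number es uses.
Seen this way, the es thread pool cannot scale up because the JVM sees the wrong number of CPUs, so that is where the investigation landed.

According to the blog post http://www.opscoder.info/es_threadpool.html,
the source that computes the threadpool sizes is: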
int availableProcessors = EsExecutors.boundedNumberOfProcessors(settings);
int halfProcMaxAt5 = Math.min(((availableProcessors + 1) / 2), 5);
int halfProcMaxAt10 = Math.min(((availableProcessors + 1) / 2), 10);
In other words, es's available processor is really the number of processors the JVM is bound to.
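
To see how those two expressions react to the processor count, here is a minimal standalone sketch. The class ThreadPoolSizing and its method names are hypothetical and not part of Elasticsearch; it only reproduces the arithmetic from the snippet above and feeds it the value the JVM itself reports.

```java
// Hypothetical helper, not Elasticsearch code: it only mirrors the two
// expressions quoted above so they can be tried with different CPU counts.
final class ThreadPoolSizing {

    // Half of the processors (rounded up), capped at 5.
    static int halfProcMaxAt5(int availableProcessors) {
        return Math.min((availableProcessors + 1) / 2, 5);
    }

    // Half of the processors (rounded up), capped at 10.
    static int halfProcMaxAt10(int availableProcessors) {
        return Math.min((availableProcessors + 1) / 2, 10);
    }

    public static void main(String[] args) {
        // The number of processors the JVM believes it has.
        int n = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors = " + n);
        System.out.println("halfProcMaxAt5      = " + halfProcMaxAt5(n));
        System.out.println("halfProcMaxAt10     = " + halfProcMaxAt10(n));
    }
}
```

With 1 processor both values are 1, which matches the thread pool size of 1 seen in the problem description; with 16 processors they are 5 and 8.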

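2.2 Pinning the problem on JVM configuration under docker

es 6.6 runs on JVM 10, and I had a vague feeling that newer JVMs carry docker-specific optimizations. Sure enough, a search turned up a Java bug entry describing the docker optimizations in JVM 10 and later.

From JVM 10 onward, the JVM is optimized for docker environments: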

CPU: the CPU count = cpu_quota() / cpu_period(). If cpu shares are set for the container, the count is calculated as cpu_shares() / 1024, for example 20480 / 1024 = 20 CPUs.

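Memory: if Xmx is not set, the JVM derives a default Xmx from the cgroup limit, so in principle Xmx can be left unset (after reading around, this still does not seem very reliable; it is best to set Xmx by hand).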

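-XX:-UseContainerSupport: the JVM enables container support by default, which gives the two behaviors above. Putting a minus sign in front disables container support, so the JVM behaves the same as earlier versions.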
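
The CPU rule can be tried out as plain arithmetic. This is an illustration under assumptions, not the JVM's actual code: the class ContainerCpuEstimate and its method names are hypothetical, rounding fractions up and clamping to the host CPU count are assumptions of this sketch, and it does not model how the JVM treats default or unset values. The period of 100000 used in main is docker's default for --cpu-period.

```java
// Hypothetical sketch of the quota/period and shares/1024 rules.
// Rounding up and clamping to the host CPU count are assumptions.
final class ContainerCpuEstimate {

    static final long PER_CPU_SHARES = 1024;

    // CPUs implied by a CFS quota; falls back to hostCpus when no quota is given.
    static int fromQuota(long quotaMicros, long periodMicros, int hostCpus) {
        if (quotaMicros <= 0 || periodMicros <= 0) {
            return hostCpus;
        }
        long cpus = (quotaMicros + periodMicros - 1) / periodMicros; // round up
        return (int) Math.max(1, Math.min(cpus, hostCpus));
    }

    // CPUs implied by cpu shares, using 1024 shares per CPU.
    static int fromShares(long shares, int hostCpus) {
        if (shares <= 0) {
            return hostCpus;
        }
        long cpus = (shares + PER_CPU_SHARES - 1) / PER_CPU_SHARES; // round up
        return (int) Math.max(1, Math.min(cpus, hostCpus));
    }

    public static void main(String[] args) {
        int hostCpus = 32; // assumed host size, for illustration only
        System.out.println(fromQuota(16384, 100000, hostCpus));   // quota well below one period
        System.out.println(fromQuota(1600000, 100000, hostCpus)); // quota of 16 periods
        System.out.println(fromShares(16384, hostCpus));          // 16384 / 1024
    }
}
```

Under these assumptions the same number gives very different results depending on whether it is read as a quota or as shares, which is worth keeping in mind when choosing the docker option.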

3. Solution:


Add cpu_quota to the container

$ docker run --cpu-quota=16384 xxx

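After adding it, available processor in es became 16, which can be checked with the request below. (16384 / 1024 = 16 is the cpu_shares calculation in the reference; a quota is divided by cpu_period.)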

GET _nodes/os
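
The same figure can also be checked from inside the container without es, since available processor is whatever the JVM reports. A minimal sketch; the class name ShowJvmResources is hypothetical.

```java
// Hypothetical diagnostic: prints what the running JVM believes it has.
final class ShowJvmResources {

    // Processor count as seen by this JVM.
    static int processors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Maximum heap the JVM will attempt to use, in bytes.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors = " + processors());
        System.out.println("maxMemory (bytes)   = " + maxHeapBytes());
    }
}
```

Running it in the container with and without -XX:-UseContainerSupport, or with -XX:ActiveProcessorCount=16, should show how each flag changes the reported count.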

4. References

http://www.opscoder.info/es_threadpool.html
https://bugs.openjdk.java.net/browse/JDK-8146115

For reference:

Number of CPUs 
----------------------- 
Use a combination of number_of_cpus() and cpu_sets() in order to determine how many processors are available to the process and adjust the JVMs os::active_processor_count appropriately. The number_of_cpus() will be calculated based on the cpu_quota() and cpu_period() using this formula: number_of_cpus() = cpu_quota() / cpu_period(). If cpu_shares has been setup for the container, the number_of_cpus() will be calculated based on cpu_shares()/1024. 1024 is the default and standard unit for calculating relative cpu usage in cloud based container management software. 

Also add a new VM flag (-XX:ActiveProcessorCount=xx) that allows the number of CPUs to be overridden. This flag will be honored even if UseContainerSupport is not enabled. 

Total available memory 
------------------------------- 
Use the memory_limit() value from the cgroup file system to initialize the os::physical_memory() value in the VM. This value will propagate to all other parts of the Java runtime. 

Memory usage 
-------------------- 
Use memory_usage_in_bytes() for providing os::available_memory() by subtracting the usage from the total available memory allocated to the container. 

As a troubleshooting aid, we will dump any available container statistics to the hotspot error log and add container specific information to the JVM logging system. Unified Logging will be added to help to diagnose issue related to this support. Use -Xlog:os+container=trace for maximum logging of container information. 

A new option -XX:-UseContainerSupport will be added to allow the container support to be disabled. The default for this flag will be true. Container support will be enabled by default.