Jay Taylor's notes

Need simple kubectl command to see cluster resource usage · Issue #17512 · kubernetes/kubernetes · GitHub

Original source (github.com)
Tags: howto memory kubernetes cpu
Clipped on: 2020-07-16

Need simple kubectl command to see cluster resource usage #17512

Open
goltermann opened this issue on Nov 19, 2015 · 79 comments

Comments


goltermann commented on Nov 19, 2015

Users are getting tripped up by pods not being able to schedule due to resource deficiencies. It can be hard to know when a pod is pending because it just hasn't started up yet, or because the cluster doesn't have room to schedule it. http://kubernetes.io/v1.1/docs/user-guide/compute-resources.html#monitoring-compute-resource-usage helps, but isn't that discoverable (I tend to try a 'get' on a pod in pending first, and only after waiting a while and seeing it 'stuck' in pending, do I use 'describe' to realize it's a scheduling problem).

This is also complicated by system pods being in a namespace that is hidden. Users forget that those pods exist, and 'count against' cluster resources.

There are several possible fixes offhand; I don't know which would be ideal:

  1. Develop a new pod state other than Pending to represent "tried to schedule and failed for lack of resources".

  2. Have kubectl get po or kubectl get po -o=wide display a column to detail why something is pending (perhaps the container.state that is Waiting in this case, or the most recent event.message).

  3. Create a new kubectl command to more easily describe resources. I'm imagining a "kubectl usage" that gives an overview of total cluster CPU and Mem, per node CPU and Mem and each pod/container's usage. Here we would include all pods, including system ones. This might be useful long term alongside more complex schedulers, or when your cluster has enough resources but no single node does (diagnosing the 'no holes large enough' problem).
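
To make the 'no holes large enough' case in item 3 concrete, here is a minimal bash sketch. The function name can_schedule and the free-capacity figures are made up for illustration, and it only compares CPU millicores; the real scheduler weighs far more than this.

```shell
# Hypothetical helper: can_schedule REQUEST_M FREE_M_NODE1 [FREE_M_NODE2 ...]
# All values are CPU millicores. Succeeds if a single node can hold the request.
can_schedule() {
  local request=$1 free total=0
  shift
  for free in "$@"; do
    if [ "$free" -ge "$request" ]; then
      echo "fits on a node with ${free}m free"
      return 0
    fi
    total=$((total + free))
  done
  # No single node was large enough; report whether the cluster total would have been.
  if [ "$total" -ge "$request" ]; then
    echo "cluster has ${total}m free in total, but no single node has ${request}m"
  else
    echo "cluster has only ${total}m free in total"
  fi
  return 1
}
```

For example, can_schedule 1500 1000 1000 800 reports 2800m free in total but no single node with 1500m, which is the situation a pod stuck in Pending can hide.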


davidopp commented on Nov 19, 2015

Something along the lines of (2) seems reasonable, though the UX folks would know better than me.

(3) seems vaguely related to #15743 but I'm not sure they're close enough to combine.


chrishiestand commented on Sep 15, 2016
edited

In addition to the case above, it would be nice to see what resource utilization we're getting.

kubectl utilization requests might show (maybe kubectl util or kubectl usage are better/shorter):

cores: 4.455/5 cores (89%)
memory: 20.1/30 GiB (67%)
...

In this example, the aggregate container requests are 4.455 cores and 20.1 GiB and there are 5 cores and 30GiB total in the cluster.
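
The arithmetic behind such a line is just requested divided by total. A small sketch, assuming the two figures have already been collected (the helper name utilization_line is hypothetical, not a kubectl feature):

```shell
# Hypothetical helper: utilization_line LABEL USED TOTAL UNIT
# Prints "LABEL: USED/TOTAL UNIT (PCT%)" with the percentage rounded to an integer.
utilization_line() {
  awk -v label="$1" -v used="$2" -v total="$3" -v unit="$4" 'BEGIN {
    if (total + 0 == 0) { print label ": no capacity reported"; exit 1 }
    printf "%s: %s/%s %s (%.0f%%)\n", label, used, total, unit, (used / total) * 100
  }'
}
```

Calling utilization_line cores 4.455 5 cores and utilization_line memory 20.1 30 GiB reproduces the two lines of the mock-up.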


xmik commented on Dec 19, 2016

There is:

$ kubectl top nodes
NAME                    CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
cluster1-k8s-master-1   312m         15%       1362Mi          68%       
cluster1-k8s-node-1     124m         12%       233Mi           11% 

ozbillwang commented on Jan 10, 2017
edited

I use the command below to get a quick view of the resource usage. It is the simplest way I found.

kubectl describe nodes

tonglil commented on Jan 20, 2017

If there was a way to "format" the output of kubectl describe nodes, I wouldn't mind scripting my way to summarize all nodes' resource requests/limits.


from-nibly commented on Feb 1, 2017

here is my hack kubectl describe nodes | grep -A 2 -e "^\\s*CPU Requests"


jredl-va commented on May 25, 2017

@from-nibly thanks, just what i was looking for


tonglil commented on May 25, 2017

Yup, this is mine:

$ cat bin/node-resources.sh 
#!/bin/bash
set -euo pipefail

echo -e "Iterating...\n"

nodes=$(kubectl get node --no-headers -o custom-columns=NAME:.metadata.name)

for node in $nodes; do
  echo "Node: $node"
  kubectl describe node "$node" | sed '1,/Non-terminated Pods/d'
  echo
done

k8s-github-robot commented on May 31, 2017

@goltermann There are no sig labels on this issue. Please add a sig label by:
(1) mentioning a sig: @kubernetes/sig-<team-name>-misc
(2) specifying the label manually: /sig <label>

Note: method (1) will trigger a notification to the team. You can find the team list here.


kargakis commented on Jun 10, 2017

@kubernetes/sig-cli-misc


alok87 commented on Jul 5, 2017
edited

You can use the command below to find the percentage CPU utilisation of your nodes

alias util='kubectl get nodes | grep node | awk '\''{print $1}'\'' | xargs -I {} sh -c '\''echo   {} ; kubectl describe node {} | grep Allocated -A 5 | grep -ve Event -ve Allocated -ve percent -ve -- ; echo '\'''
Note: 4000m is the total CPU in one node
alias cpualloc="util | grep % | awk '{print \$1}' | awk '{ sum += \$1 } END { if (NR > 0) { printf \"%g%%\n\", (sum*100)/(NR*4000) } }'"

$ cpualloc
3.89358%

Note: 1600MB is the total memory in one node
alias memalloc='util | grep % | awk '\''{print $3}'\'' | awk '\''{ sum += $1 } END { if (NR > 0) { printf "%g%%\n", (sum*100)/(NR*1600) } }'\'''

$ memalloc
24.6832%

alok87 commented on Jul 21, 2017

@tomfotherby alias util='kubectl get nodes | grep node | awk '\''{print $1}'\'' | xargs -I {} sh -c '\''echo {} ; kubectl describe node {} | grep Allocated -A 5 | grep -ve Event -ve Allocated -ve percent -ve -- ; echo '\'''


tomfotherby commented on Jul 25, 2017

@alok87 - Thanks for your aliases. In my case, this is what worked for me given that we use bash and m3.large instance types (2 cpu , 7.5G memory).

alias util='kubectl get nodes --no-headers | awk '\''{print $1}'\'' | xargs -I {} sh -c '\''echo {} ; kubectl describe node {} | grep Allocated -A 5 | grep -ve Event -ve Allocated -ve percent -ve -- ; echo '\'''

# Get CPU request total (we use NR*20 because each m3.large has 2 vCPUs, i.e. 2000m)
alias cpualloc='util | grep % | awk '\''{print $1}'\'' | awk '\''{ sum += $1 } END { if (NR > 0) { print sum/(NR*20), "%\n" } }'\'''

# Get mem request total (we use NR*75 because each m3.large has 7.5G of RAM)
alias memalloc='util | grep % | awk '\''{print $5}'\'' | awk '\''{ sum += $1 } END { if (NR > 0) { print sum/(NR*75), "%\n" } }'\'''
$ util
ip-10-56-0-178.ec2.internal
  CPU Requests	CPU Limits	Memory Requests	Memory Limits
  960m (48%)	2700m (135%)	630Mi (8%)	2034Mi (27%)

ip-10-56-0-22.ec2.internal
  CPU Requests	CPU Limits	Memory Requests	Memory Limits
  920m (46%)	1400m (70%)	560Mi (7%)	550Mi (7%)

ip-10-56-0-56.ec2.internal
  CPU Requests	CPU Limits	Memory Requests	Memory Limits
  1160m (57%)	2800m (140%)	972Mi (13%)	3976Mi (53%)

ip-10-56-0-99.ec2.internal
  CPU Requests	CPU Limits	Memory Requests	Memory Limits
  804m (40%)	794m (39%)	824Mi (11%)	1300Mi (17%)

$ cpualloc
48.05 %

$ memalloc 
9.95333 %

nfirvine commented on Aug 30, 2017

#17512 (comment) kubectl top shows usage, not allocation. Allocation is what causes the insufficient CPU problem. There's a ton of confusion in this issue about the difference.

AFAICT, there's no easy way to get a report of node CPU allocation by pod, since requests are per container in the spec. And even then, it's difficult since .spec.containers[*].resources may or may not have the limits/requests fields (in my experience).


misterikkit commented on Jan 2, 2018

/cc @misterikkit


negz commented on Feb 20, 2018

Getting in on this shell scripting party. I have an older cluster running the CA with scale down disabled. I wrote this script to determine roughly how much I can scale down the cluster when it starts to bump up on its AWS route limits:

#!/bin/bash

set -e

KUBECTL="kubectl"
NODES=$($KUBECTL get nodes --no-headers -o custom-columns=NAME:.metadata.name)

function usage() {
	local node_count=0
	local total_percent_cpu=0
	local total_percent_mem=0
	# "local readonly x" would declare a variable named "readonly"; -r is the read-only flag.
	local -r nodes="$*"

	for n in $nodes; do
		# grep -A2 keeps the header plus the two lines after it; tail -n1 keeps the last one, which holds the values.
		local requests=$($KUBECTL describe node "$n" | grep -A2 -E "^\\s*CPU Requests" | tail -n1)
		# Split on "(", ")" and "%": field 2 is the CPU request percentage, field 8 the memory request percentage.
		local percent_cpu=$(echo $requests | awk -F "[()%]" '{print $2}')
		local percent_mem=$(echo $requests | awk -F "[()%]" '{print $8}')
		echo "$n: ${percent_cpu}% CPU, ${percent_mem}% memory"

		node_count=$((node_count + 1))
		total_percent_cpu=$((total_percent_cpu + percent_cpu))
		total_percent_mem=$((total_percent_mem + percent_mem))
	done

	local -r avg_percent_cpu=$((total_percent_cpu / node_count))
	local -r avg_percent_mem=$((total_percent_mem / node_count))

	echo "Average usage: ${avg_percent_cpu}% CPU, ${avg_percent_mem}% memory."
}

usage $NODES

Produces output like:

ip-REDACTED.us-west-2.compute.internal: 38% CPU, 9% memory
...many redacted lines...
ip-REDACTED.us-west-2.compute.internal: 41% CPU, 8% memory
ip-REDACTED.us-west-2.compute.internal: 61% CPU, 7% memory
Average usage: 45% CPU, 15% memory.

ylogx commented on Feb 21, 2018

There is also a pod option in the top command:

kubectl top pod

nfirvine commented on Feb 21, 2018

@ylogx #17512 (comment)


shtouff commented on Mar 4, 2018

My way to obtain the allocation, cluster-wide:

$ kubectl get po --all-namespaces -o=jsonpath="{range .items[*]}{.metadata.namespace}:{.metadata.name}{'\n'}{range .spec.containers[*]}  {.name}:{.resources.requests.cpu}{'\n'}{end}{'\n'}{end}"

It produces something like:

kube-system:heapster-v1.5.0-dc8df7cc9-7fqx6
  heapster:88m
  heapster-nanny:50m
kube-system:kube-dns-6cdf767cb8-cjjdr
  kubedns:100m
  dnsmasq:150m
  sidecar:10m
  prometheus-to-sd:
kube-system:kube-dns-6cdf767cb8-pnx2g
  kubedns:100m
  dnsmasq:150m
  sidecar:10m
  prometheus-to-sd:
kube-system:kube-dns-autoscaler-69c5cbdcdd-wwjtg
  autoscaler:20m
kube-system:kube-proxy-gke-cluster1-default-pool-cd7058d6-3tt9
  kube-proxy:100m
kube-system:kube-proxy-gke-cluster1-preempt-pool-57d7ff41-jplf
  kube-proxy:100m
kube-system:kubernetes-dashboard-7b9c4bf75c-f7zrl
  kubernetes-dashboard:50m
kube-system:l7-default-backend-57856c5f55-68s5g
  default-http-backend:10m
kube-system:metrics-server-v0.2.0-86585d9749-kkrzl
  metrics-server:48m
  metrics-server-nanny:5m
kube-system:tiller-deploy-7794bfb756-8kxh5
  tiller:10m
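
Output in this shape can be totalled with a few lines of awk. The following is a sketch only: the function name sum_cpu_requests is hypothetical, and it assumes every value is either millicores ("100m"), whole or fractional cores ("1", "0.5"), or empty as for prometheus-to-sd above.

```shell
# Sum per-container CPU requests from lines shaped like the output above.
# Reads stdin. Indented "  name:value" lines are containers; other lines are skipped.
sum_cpu_requests() {
  awk -F: '
    /^ / {
      q = $2
      if (q == "")       missing++                                  # no CPU request declared
      else if (q ~ /m$/) total += substr(q, 1, length(q) - 1)       # millicores
      else               total += q * 1000                          # cores to millicores
    }
    END {
      printf "requested: %dm, containers without a CPU request: %d\n", total, missing
    }
  '
}
```

Piping the kubectl get po command above into sum_cpu_requests gives one cluster-wide figure, plus a count of the containers that the scheduler treats as requesting nothing.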

kierenj commented on Mar 13, 2018

This is weird. I want to know when I'm at or nearing allocation capacity. It seems a pretty basic function of a cluster. Whether it's a statistic that shows a high % or textual error... how do other people know this? Just always use autoscaling on a cloud platform?


dpetzold commented on May 1, 2018
edited

I authored https://github.com/dpetzold/kube-resource-explorer/ to address #3. Here is some sample output:

$ ./resource-explorer -namespace kube-system -reverse -sort MemReq
Namespace    Name                                               CpuReq  CpuReq%  CpuLimit  CpuLimit%  MemReq    MemReq%  MemLimit  MemLimit%
---------    ----                                               ------  -------  --------  ---------  ------    -------  --------  ---------
kube-system  event-exporter-v0.1.7-5c4d9556cf-kf4tf             0       0%       0         0%         0         0%       0         0%
kube-system  kube-proxy-gke-project-default-pool-175a4a05-mshh  100m    10%      0         0%         0         0%       0         0%
kube-system  kube-proxy-gke-project-default-pool-175a4a05-bv59  100m    10%      0         0%         0         0%       0         0%
kube-system  kube-proxy-gke-project-default-pool-175a4a05-ntfw  100m    10%      0         0%         0         0%       0         0%
kube-system  kube-dns-autoscaler-244676396-xzgs4                20m     2%       0         0%         10Mi      0%       0         0%
kube-system  l7-default-backend-1044750973-kqh98                10m     1%       10m       1%         20Mi      0%       20Mi      0%
kube-system  kubernetes-dashboard-768854d6dc-jh292              100m    10%      100m      10%        100Mi     3%       300Mi     11%
kube-system  kube-dns-323615064-8nxfl                           260m    27%      0         0%         110Mi     4%       170Mi     6%
kube-system  fluentd-gcp-v2.0.9-4qkwk                           100m    10%      0         0%         200Mi     7%       300Mi     11%
kube-system  fluentd-gcp-v2.0.9-jmtpw                           100m    10%      0         0%         200Mi     7%       300Mi     11%
kube-system  fluentd-gcp-v2.0.9-tw9vk                           100m    10%      0         0%         200Mi     7%       300Mi     11%
kube-system  heapster-v1.4.3-74b5bd94bb-fz8hd                   138m    14%      138m      14%        301856Ki  11%      301856Ki  11%

Spaceman1861 commented on Sep 2, 2019

Oooo shiny @hjacobs I like that.


amelbakry commented on Sep 2, 2019

This is a script (deployment-health.sh) to get the utilization of the pods in deployment based on the usage and configured limits
https://github.com/amelbakry/kubernetes-scripts


alikhil commented on Sep 25, 2019
edited

Inspired by the answers of @lentzi90 and @ylogx, I have created my own big script which shows actual resource usage (kubectl top pods) and resource requests and limits:

join -a1 -a2 -o 0,1.2,1.3,2.2,2.3,2.4,2.5, -e '<none>' <(kubectl top pods) <(kubectl get pods -o custom-columns=NAME:.metadata.name,"CPU_REQ(cores)":.spec.containers[*].resources.requests.cpu,"MEMORY_REQ(bytes)":.spec.containers[*].resources.requests.memory,"CPU_LIM(cores)":.spec.containers[*].resources.limits.cpu,"MEMORY_LIM(bytes)":.spec.containers[*].resources.limits.memory) | column -t -s' ' 

output example:

NAME                                                             CPU(cores)  MEMORY(bytes)  CPU_REQ(cores)  MEMORY_REQ(bytes)  CPU_LIM(cores)  MEMORY_LIM(bytes)
xxxxx-847dbbc4c-c6twt                                            20m         110Mi          50m             150Mi              150m            250Mi
xxx-service-7b6b9558fc-9cq5b                                     19m         1304Mi         1               <none>             1               <none>
xxxxxxxxxxxxxxx-hook-5d585b449b-zfxmh                            0m          46Mi           200m            155M               200m            155M

Here is the alias for you to just use kstats in your terminal:

alias kstats='join -a1 -a2 -o 0,1.2,1.3,2.2,2.3,2.4,2.5, -e '"'"'<none>'"'"' <(kubectl top pods) <(kubectl get pods -o custom-columns=NAME:.metadata.name,"CPU_REQ(cores)":.spec.containers[*].resources.requests.cpu,"MEMORY_REQ(bytes)":.spec.containers[*].resources.requests.memory,"CPU_LIM(cores)":.spec.containers[*].resources.limits.cpu,"MEMORY_LIM(bytes)":.spec.containers[*].resources.limits.memory) | column -t -s'"'"' '"'" 

P.S. I've tested the scripts only on my Mac; on Linux and Windows they may require some changes.


demisx commented on Sep 25, 2019

Quoting @amelbakry: This is a script (deployment-health.sh) to get the utilization of the pods in deployment based on the usage and configured limits
https://github.com/amelbakry/kubernetes-scripts

@amelbakry I am getting the following error trying to execute it on a Mac:

Failed to execute process './deployment-health.sh'. Reason:
exec: Exec format error
The file './deployment-health.sh' is marked as an executable but could not be run by the operating system.

cgthayer commented on Sep 25, 2019

Woops, "#!" needs to be the very first line. Instead try "bash ./deployment-health.sh" to work around the issue.

/charles

PS. PR opened to fix the issue.
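
The kernel only honours an interpreter line when "#!" are the first two bytes of the file, so even a blank line above it is enough to cause this failure. A quick way to check a script, as a sketch (the function name has_leading_shebang is made up here):

```shell
# Succeeds only if the first two bytes of the file are "#!".
has_leading_shebang() {
  [ "$(head -c 2 "$1")" = '#!' ]
}
```

For example, has_leading_shebang ./deployment-health.sh || echo "shebang is not on the first line" flags the problem before you try to execute the file.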

demisx commented on Oct 2, 2019

@cgthayer You might want to apply that PR fix globally. Also, when I ran the scripts on macOS Mojave, a bunch of errors showed up, including EU-specific zone names, which I don't use. It looks like these scripts were written for a specific project.


slmingol commented on Oct 21, 2019

Here's a modified version of the join example that also totals the columns.

oc_ns_pod_usage () {
    # show pod usage for cpu/mem
    ns="$1"
    # usage_chk3 is the author's own helper; it is not defined in this snippet
    usage_chk3 "$ns" || return 1
    printf '%s\n' "$ns"
    separator=$(printf '=%.0s' {1..50})
    printf '%s\n' "$separator"
    output=$(join -a1 -a2 -o 0,1.2,1.3,2.2,2.3,2.4,2.5, -e '<none>' \
        <(kubectl top pods -n $ns) \
        <(kubectl get -n $ns pods -o custom-columns=NAME:.metadata.name,"CPU_REQ(cores)":.spec.containers[*].resources.requests.cpu,"MEMORY_REQ(bytes)":.spec.containers[*].resources.requests.memory,"CPU_LIM(cores)":.spec.containers[*].resources.limits.cpu,"MEMORY_LIM(bytes)":.spec.containers[*].resources.limits.memory))
    totals=$(printf "%s" "$output" | awk '{s+=$2; t+=$3; u+=$4; v+=$5; w+=$6; x+=$7} END {print s" "t" "u" "v" "w" "x}')
    printf "%s\n%s\nTotals: %s\n" "$output" "$separator" "$totals" | column -t -s' '
    printf '%s\n' "$separator"
}

Example

$ oc_ns_pod_usage ls-indexer
ls-indexer
==================================================
NAME                                                CPU(cores)  MEMORY(bytes)  CPU_REQ(cores)  MEMORY_REQ(bytes)  CPU_LIM(cores)  MEMORY_LIM(bytes)
ls-indexer-f5-7cd5859997-qsfrp                      15m         741Mi          1               1000Mi             2               2000Mi
ls-indexer-f5-7cd5859997-sclvg                      15m         735Mi          1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-4b7j2                 92m         1103Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-5xj5l                 88m         1124Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-6vvl2                 92m         1132Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-85f66                 85m         1151Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-924jz                 96m         1124Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-g6gx8                 119m        1119Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-hkhnt                 52m         819Mi          1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-hrsrs                 51m         1122Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-j4qxm                 53m         885Mi          1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-lxlrb                 83m         1215Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-mw6rt                 86m         1131Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-pbdf8                 95m         1115Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-qk9bm                 91m         1141Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-sdv9r                 54m         1194Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-t67v6                 75m         1234Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-tkxs2                 88m         1364Mi         1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-v6jl2                 53m         747Mi          1               1000Mi             2               2000Mi
ls-indexer-filebeat-7858f56c9-wkqr7                 53m         838Mi          1               1000Mi             2               2000Mi
ls-indexer-metricbeat-74d89d7d85-jp8qc              190m        1191Mi         1               1000Mi             2               2000Mi
ls-indexer-metricbeat-74d89d7d85-jv4bv              192m        1162Mi         1               1000Mi             2               2000Mi
ls-indexer-metricbeat-74d89d7d85-k4dcd              194m        1144Mi         1               1000Mi             2               2000Mi
ls-indexer-metricbeat-74d89d7d85-n46tz              192m        1155Mi         1               1000Mi             2               2000Mi
ls-indexer-packetbeat-db98f6fdf-8x446               35m         1198Mi         1               1000Mi             2               2000Mi
ls-indexer-packetbeat-db98f6fdf-gmxxd               22m         1203Mi         1               1000Mi             2               2000Mi
ls-indexer-syslog-5466bc4d4f-gzxw8                  27m         1125Mi         1               1000Mi             2               2000Mi
ls-indexer-syslog-5466bc4d4f-zh7st                  29m         1153Mi         1               1000Mi             2               2000Mi
==================================================
Totals:                                             2317        30365          28              28000              56              56000
==================================================
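
Note that the Totals row adds only the leading number of each cell, so a column is meaningful only when every row uses the same unit, as the Mi columns do here. Earlier output in this thread mixes 150Mi with 155M, and kubectl describe node can print plain bytes. A sketch for normalising memory quantities before summing (mem_to_bytes is a hypothetical name, and only the common suffixes are handled):

```shell
# Convert a Kubernetes memory quantity such as 150Mi, 155M or 2981486848 to bytes.
mem_to_bytes() {
  awk -v q="$1" 'BEGIN {
    n = q + 0                          # leading numeric part
    u = q; sub(/^[0-9.]+/, "", u)      # whatever follows it is the unit suffix
    if      (u == "")   f = 1
    else if (u == "Ki") f = 1024
    else if (u == "Mi") f = 1024 * 1024
    else if (u == "Gi") f = 1024 * 1024 * 1024
    else if (u == "k")  f = 1000
    else if (u == "M")  f = 1000 * 1000
    else if (u == "G")  f = 1000 * 1000 * 1000
    else { print "unsupported unit: " u > "/dev/stderr"; exit 1 }
    printf "%.0f\n", n * f
  }'
}
```

With every cell converted this way, a sum over a memory column no longer depends on how each pod spec happened to write its quantity.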

cristifalcas commented on Oct 25, 2019

And what is usage_chk3?


davidB commented on Oct 25, 2019
edited

I would like to also share my tool ;-) kubectl-view-allocations: a kubectl plugin to list allocations (cpu, memory, gpu, ... X requested, limit, allocatable, ...). Requests are welcome.

I made it because I wanted to give my (internal) users a way to see "who allocates what". By default all resources are displayed, but in the following sample I only request resources with "gpu" in the name.

> kubectl-view-allocations -r gpu

 Resource                                   Requested  %Requested  Limit  %Limit  Allocatable  Free
  nvidia.com/gpu                                    7         58%      7     58%           12     5
  ├─ node-gpu1                                      1         50%      1     50%            2     1
  │  └─ xxxx-784dd998f4-zt9dh                       1                  1
  ├─ node-gpu2                                      0          0%      0      0%            2     2
  ├─ node-gpu3                                      0          0%      0      0%            2     2
  ├─ node-gpu4                                      1         50%      1     50%            2     1
  │  └─ aaaa-1571819245-5ql82                       1                  1
  ├─ node-gpu5                                      2        100%      2    100%            2     0
  │  ├─ bbbb-1571738839-dfkhn                       1                  1
  │  └─ bbbb-1571738888-52c4w                       1                  1
  └─ node-gpu6                                      2        100%      2    100%            2     0
     ├─ bbbb-1571738688-vlxng                       1                  1
     └─ cccc-1571745684-7k6bn                       1                  1

coming version(s):

  • will allow hiding the (node, pod) levels or choosing how to group (e.g. to provide an overview with only resources)
  • installation via curl, krew, brew, ... (currently binaries are available under the releases section on GitHub)

Thanks to kubectl-view-utilization for the inspiration, but adding support for other resources was too much copy/paste or too hard for me to do in bash (in a generic way).


libudas commented on Nov 28, 2019

Quoting @from-nibly: here is my hack kubectl describe nodes | grep -A 2 -e "^\\s*CPU Requests"

This doesn't work anymore :(


MostafaGazar commented on Nov 28, 2019

Give kubectl describe node | grep -A5 "Allocated" a try


alexkreidler commented on Dec 1, 2019

This is currently the 4th highest requested issue by thumbs up, but still is priority/backlog.

I'd be happy to take a stab at this if someone could point me in the right direction or if we could finalize a proposal. I think the UX of @davidB's tool is awesome, but this really belongs in the core kubectl.


smpar commented on Dec 18, 2019

Using the following commands, kubectl top nodes and kubectl describe node, we do not get consistent results.

For example, the first one reports CPU(cores) as 1064m, but this value cannot be found in the output of the second one (which shows 1480m):

kubectl top nodes
NAME                                                CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
abcd-p174e23ea5qa4g279446c803f82-abc-node-0         1064m        53%    6783Mi          88%
kubectl describe node abcd-p174e23ea5qa4g279446c803f82-abc-node-0
...
  Resource  Requests          Limits
  --------  --------          ------
  cpu       1480m (74%)       1300m (65%)
  memory    2981486848 (37%)  1588314624 (19%)

Any idea how to get the CPU(cores) value without using kubectl top nodes?


omerfsen commented on Jan 12

(Quoting @davidB's kubectl-view-allocations comment of Oct 25, 2019 above in full.)

Hello David, it would be nice if you provided more compiled binaries for new distributions. On Ubuntu 16.04 we get:

kubectl-view-allocations: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by kubectl-view-allocations)

dpkg -l |grep glib

ii libglib2.0-0:amd64 2.48.2-0ubuntu4.4


davidB commented on Jan 15
edited

@omerfsen can you try the new version of kubectl-view-allocations and comment in the ticket "version `GLIBC_2.25' not found" (#14)?


abelal83 commented on Jan 31

(Quoting @shtouff's comment of Mar 4, 2018 above in full: the kubectl get po --all-namespaces jsonpath command and its output.)

by far the best answer here.


stefanjacobs commented on Feb 10
edited

Inspired by the scripts above I created the following script to view the usage, requests and limits:

join -1 2 -2 2 -a 1 -a 2 -o "2.1 0 1.3 2.3 2.5 1.4 2.4 2.6" -e '<wait>' \
  <( kubectl top pods --all-namespaces | sort --key 2 -b ) \
  <( kubectl get pods --all-namespaces -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,"CPU_REQ(cores)":.spec.containers[*].resources.requests.cpu,"MEMORY_REQ(bytes)":.spec.containers[*].resources.requests.memory,"CPU_LIM(cores)":.spec.containers[*].resources.limits.cpu,"MEMORY_LIM(bytes)":.spec.containers[*].resources.limits.memory | sort --key 2 -b ) \
  | column -t -s' '

Because join expects its inputs to be sorted on the join field, the scripts given above failed for me.
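
The sorting requirement is easy to see on a small case. This sketch uses two made-up "name value" files rather than live kubectl output, and the helper name join_sorted is hypothetical; it sorts both sides on the name with the same collation that join then uses.

```shell
# Hypothetical helper: join_sorted FILE1 FILE2
# Each file holds "name value" rows in any order; rows are matched on the name.
join_sorted() {
  local left right
  left=$(mktemp) && right=$(mktemp) || return 1
  LC_ALL=C sort -k1,1 "$1" > "$left"
  LC_ALL=C sort -k1,1 "$2" > "$right"
  # -a 1 -a 2 keeps unmatched rows from both sides; -e fills their missing fields.
  LC_ALL=C join -a 1 -a 2 -e '<none>' -o 0,1.2,2.2 "$left" "$right"
  rm -f "$left" "$right"
}
```

A pod that appears in only one of the two files still gets a row, with <none> in the column it has no value for, much like the <wait> entries below.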

The result shows the current usage (from top) next to the requests and limits (from the deployment), here for all namespaces:

NAMESPACE                 NAME                                                        CPU(cores)  CPU_REQ(cores)  CPU_LIM(cores)  MEMORY(bytes)  MEMORY_REQ(bytes)   MEMORY_LIM(bytes)
kube-system               aws-node-2jzxr                                              18m         10m             <none>          41Mi           <none>              <none>
kube-system               aws-node-5zn6w                                              <wait>      10m             <none>          <wait>         <none>              <none>
kube-system               aws-node-h8cc5                                              20m         10m             <none>          42Mi           <none>              <none>
kube-system               aws-node-h9n4f                                              0m          10m             <none>          0Mi            <none>              <none>
kube-system               aws-node-lz5fn                                              17m         10m             <none>          41Mi           <none>              <none>
kube-system               aws-node-tpmxr                                              20m         10m             <none>          39Mi           <none>              <none>
kube-system               aws-node-zbkkh                                              23m         10m             <none>          47Mi           <none>              <none>
cluster-autoscaler        cluster-autoscaler-aws-cluster-autoscaler-5db55fbcf8-mdzkd  1m          100m            500m            9Mi            300Mi               500Mi
cluster-autoscaler        cluster-autoscaler-aws-cluster-autoscaler-5db55fbcf8-q9xs8  39m         100m            500m            75Mi           300Mi               500Mi
kube-system               coredns-56b56b56cd-bb26t                                    6m          100m            <none>          11Mi           70Mi                170Mi
kube-system               coredns-56b56b56cd-nhp58                                    6m          100m            <none>          11Mi           70Mi                170Mi
kube-system               coredns-56b56b56cd-wrmxv                                    7m          100m            <none>          12Mi           70Mi                170Mi
gitlab-runner-l           gitlab-runner-l-gitlab-runner-6b8b85f87f-9knnx              3m          100m            200m            10Mi           128Mi               256Mi
gitlab-runner-m           gitlab-runner-m-gitlab-runner-6bfd5d6c84-t5nrd              7m          100m            200m            13Mi           128Mi               256Mi
gitlab-runner-mda         gitlab-runner-mda-gitlab-runner-59bb66c8dd-bd9xw            4m          100m            200m            17Mi           128Mi               256Mi
gitlab-runner-ops         gitlab-runner-ops-gitlab-runner-7c5b85dc97-zkb4c            3m          100m            200m            12Mi           128Mi               256Mi
gitlab-runner-pst         gitlab-runner-pst-gitlab-runner-6b8f9bf56b-sszlr            6m          100m            200m            20Mi           128Mi               256Mi
gitlab-runner-s           gitlab-runner-s-gitlab-runner-6bbccb9b7b-dmwgl              50m         100m            200m            27Mi           128Mi               512Mi
gitlab-runner-shared      gitlab-runner-shared-gitlab-runner-688d57477f-qgs2z         3m          <none>          <none>          15Mi           <none>              <none>
kube-system               kube-proxy-5b65t                                            15m         100m            <none>          19Mi           <none>              <none>
kube-system               kube-proxy-7qsgh                                            12m         100m            <none>          24Mi           <none>              <none>
kube-system               kube-proxy-gn2qg                                            13m         100m            <none>          23Mi           <none>              <none>
kube-system               kube-proxy-pz7fp                                            15m         100m            <none>          18Mi           <none>              <none>
kube-system               kube-proxy-vdjqt                                            15m         100m            <none>          23Mi           <none>              <none>
kube-system               kube-proxy-x4xtp                                            19m         100m            <none>          15Mi           <none>              <none>
kube-system               kube-proxy-xlpn7                                            0m          100m            <none>          0Mi            <none>              <none>
metrics-server            metrics-server-5875c7d795-bj7cq                             5m          200m            500m            29Mi           200Mi               500Mi
metrics-server            metrics-server-5875c7d795-jpjjn                             7m          200m            500m            29Mi           200Mi               500Mi
gitlab-runner-s           runner-heq8ujaj-project-10386-concurrent-06t94f             <wait>      200m,100m       200m,200m       <wait>         200Mi,128Mi         500Mi,500Mi
gitlab-runner-s           runner-heq8ujaj-project-10386-concurrent-10lpn9j            1m          200m,100m       200m,200m       12Mi           200Mi,128Mi         500Mi,500Mi
gitlab-runner-s           runner-heq8ujaj-project-10386-concurrent-11jrxfh            <wait>      200m,100m       200m,200m       <wait>         200Mi,128Mi         500Mi,500Mi
gitlab-runner-s           runner-heq8ujaj-project-10386-concurrent-129hpvl            1m          200m,100m       200m,200m       12Mi           200Mi,128Mi         500Mi,500Mi
gitlab-runner-s           runner-heq8ujaj-project-10386-concurrent-13kswg8            1m          200m,100m       200m,200m       12Mi           200Mi,128Mi         500Mi,500Mi
gitlab-runner-s           runner-heq8ujaj-project-10386-concurrent-15qhp5w            <wait>      200m,100m       200m,200m       <wait>         200Mi,128Mi         500Mi,500Mi

Note: you can sort by CPU consumption by appending, for example:

| awk 'NR<2{print $0;next}{print $0| "sort --key 3 --numeric -b --reverse"}'

This works on Mac. I am not sure whether it also works on Linux (because of differences in join, sort, etc.).
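For anyone unsure about portability, here is a sketch of the same "keep the header, sort the rest" idea using only the shell's read builtin and short sort options. The function name sort_by_cpu and the sample rows are hypothetical, and it assumes every CPU value uses the same unit (millicores), since the numeric sort only reads the leading digits.

```shell
# Reads a table on stdin, prints the header line unchanged, then prints the
# remaining rows sorted by column 3 (CPU usage), highest first.
sort_by_cpu() {
  IFS= read -r header
  printf '%s\n' "$header"
  # 3,3 restricts the key to column 3; n = numeric, r = reverse.
  LC_ALL=C sort -k 3,3nr
}

# Hypothetical sample rows in the same column order as the table above.
printf '%s\n' \
  'NAMESPACE NAME CPU(cores)' \
  'kube-system aws-node-2jzxr 18m' \
  'kube-system aws-node-zbkkh 23m' \
  'kube-system coredns-56b56b56cd-bb26t 6m' \
  | sort_by_cpu
```

Values such as '<wait>' have no leading digits, so they sort as zero and end up at the bottom.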

Hopefully someone can use this until kubectl gets a good view for that.

eyalev commented on Feb 18

I have had a good experience with kube-capacity.

Example:

kube-capacity --util

NODE              CPU REQUESTS    CPU LIMITS    CPU UTIL    MEMORY REQUESTS    MEMORY LIMITS   MEMORY UTIL
*                 560m (28%)      130m (7%)     40m (2%)    572Mi (9%)         770Mi (13%)     470Mi (8%)
example-node-1    220m (22%)      10m (1%)      10m (1%)    192Mi (6%)         360Mi (12%)     210Mi (7%)
example-node-2    340m (34%)      120m (12%)    30m (3%)    380Mi (13%)        410Mi (14%)     260Mi (9%)

boniek83 commented on Apr 9
edited

For this tool to be truly useful, it should detect all Kubernetes device plugins deployed on the cluster and show usage for all of them. CPU/Mem is definitely not enough. There are also GPUs, TPUs (for machine learning), Intel QAT and probably more that I don't know about. And what about storage? I should be able to easily see what was requested and what is used (ideally in terms of IOPS as well).

davidB commented on Apr 9

@boniek83, that's why I created kubectl-view-allocations: I need to list GPUs and so on. Any feedback (on the GitHub project) is welcome. I'm curious to know whether it detects TPUs (it should if they are listed as a node's resources).

boniek83 commented on Apr 9

@boniek83, that's why I created kubectl-view-allocations: I need to list GPUs and so on. Any feedback (on the GitHub project) is welcome. I'm curious to know whether it detects TPUs (it should if they are listed as a node's resources).

I'm aware of your tool and, for my purpose, it is the best that is currently available. Thanks for making it!
I will try to get TPUs tested after Easter. It would be helpful if this data were available as a web app with pretty graphs, so that I wouldn't have to give data scientists any access to Kubernetes. They only want to know who is eating up resources and nothing more :)

eht16 commented on Apr 12

Since none of the tools and scripts above fit my needs (and this issue is still open :( ), I hacked my own variant:
https://github.com/eht16/kube-cargo-load

It provides a quick overview of the pods in a cluster and shows their configured memory requests and limits alongside the actual memory usage. The idea is to get a picture of the ratio between configured memory limits and actual usage.
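To make the ratio concrete, here is a small sketch of the arithmetic. It is not how kube-cargo-load is implemented; the function name memory_usage_ratio and the sample rows are hypothetical, and it assumes both usage and limit are given in Mi.

```shell
# Reads "name usage limit" rows on stdin and prints usage as a percentage of
# the limit. Assumes both values carry an Mi suffix; rows whose limit is
# "<none>" are reported as n/a.
memory_usage_ratio() {
  awk '{
    name = $1; usage = $2; limit = $3
    if (limit == "<none>") { printf "%s n/a\n", name; next }
    sub(/Mi$/, "", usage)   # strip the unit so awk treats the value as a number
    sub(/Mi$/, "", limit)
    printf "%s %.1f%%\n", name, (usage * 100) / limit
  }'
}

# Hypothetical sample rows: pod name, actual usage, configured limit.
printf '%s\n' \
  'autoscaler 75Mi 500Mi' \
  'coredns 11Mi 170Mi' \
  'runner-shared 15Mi <none>' \
  | memory_usage_ratio
```

A pod whose ratio stays very low over time may have a limit that is larger than it needs.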

RahulRatan07 commented on Apr 25

How can we get memory dumps and logs of the pods?
Our pods often hang.

hmsvigle commented on Apr 27

  • kubectl describe nodes or kubectl top nodes: which one should be used to calculate cluster resource utilization?
  • Also, why is there a difference between these two results?
    Is there a logical explanation for this yet?

Contributor

brianpursley commented on Apr 29

/kind feature

prathameshd9 commented on Apr 30

All the comments and hacks for nodes worked well for me. I also need a higher-level view to keep track of, such as the sum of resources per node pool!

arunsah added a commit to arunsah/arunsah.github.io that referenced this issue on May 5