Kubernetes contexts 101
This guide explains how to set up your access to the k8s contexts, or environments (dev, stage, production).
Jump Server
To access k8s you first need to connect to a jump server. You can use the following:
Azure
20.172.208.50
There are two main ways to log in:
Simple
ssh -o ServerAliveInterval=3 -i PATH_TO_YOUR_PRIVATE_KEY USERNAME@JUMPSERVER_IP
Local port tunnel
ssh -o ServerAliveInterval=3 \
-L 40000:localhost:40000 \
-i PATH_TO_YOUR_PRIVATE_KEY \
USERNAME@JUMPSERVER_IP
This command opens a tunnel between port 40000 on your computer and port 40000 on the jump server.
This will be helpful later, but please note the following:
The tunnel connects to whatever is listening on that port of the server, even if you did not start it, meaning that if another developer is using the port you will connect to their session.
To avoid that, I recommend using a separate port range for each developer.
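The port ranges themselves are not listed here. As a purely hypothetical sketch of one way to assign them, assuming each developer gets a team-agreed index and a block of 100 ports starting at 40000 (both numbers are assumptions, not an existing convention):

```shell
# Hypothetical scheme: developer index N owns ports 40000+N*100 .. 40000+N*100+99.
# The base port and block size are illustrative assumptions.
dev_port_range() {
  local index="$1"
  local start=$((40000 + index * 100))
  local end=$((start + 99))
  echo "$start-$end"
}

# Usage: dev_port_range 2
```

Any scheme works as long as two developers never pick the same port on the jump server.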
Set up a k8s context
Once on the jump server, you need to log in to a k8s context. Credentials usually last a long time, so you do not need to log in regularly; do it once, and repeat it only if you get messages about being logged out.
To log in, run:
az aks get-credentials --resource-group RESOURCE_GROUP --name CONTEXT_NAME
This logs you in with your Azure credentials: the command shows a login URL with a device token. Open the URL in a browser, log in with your @touchcast.com account, and allow the device to access it.
If everything is fine, it will create or update a file called .kube/config, which is a YAML file. It has a clusters section with at least one cluster object, and under that object there is a server entry.
You need to replace this string (named Original server string in the tables) with the proper replacement string (named Replacement server string in the tables).
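The replacement can be done in any editor. As a minimal sketch of scripting it, assuming the entry appears in the file as `server: ORIGINAL` and that a backup next to the file is acceptable (the function name and the `.bak` suffix are illustrative choices):

```shell
# Replace the server string of a cluster entry in a kubeconfig file.
# $1 = kubeconfig path, $2 = original server string, $3 = replacement server string
replace_server() {
  # Keep a backup so the original credentials file can be restored.
  cp "$1" "$1.bak"
  # Rewrite the file from the backup, swapping only the server value.
  sed "s|server: $2|server: $3|" "$1.bak" > "$1"
}

# Usage:
#   replace_server ~/.kube/config ORIGINAL_SERVER_STRING REPLACEMENT_SERVER_STRING
```

`kubectl config set-cluster CLUSTER_NAME --server=REPLACEMENT_SERVER_STRING` should achieve the same result without editing the file by hand.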
The following tables list the available resource groups (you should have access to at least dev).
dev
Resource Group | cluster-1 |
---|---|
Context Name | cluster-1-aks |
Original server string | |
Replacement server string | |
stage
Resource Group | staging-cluster-1 |
---|---|
Context Name | staging-cluster-1-aks |
Original server string | |
Replacement server string | |
production
Resource Group | prod-cluster-1 |
---|---|
Context Name | prod-cluster-1-aks |
Original server string | |
Replacement server string | |
Common commands for k8s
For a reference covering most k8s operations, see the kubectl Quick Reference.
The following are the most common operations used daily.
Listing all containers in a namespace
For example, you can list everything running in the gpt namespace.
With kubectl get pods, the output has one row per pod, with the columns NAME, READY, STATUS, RESTARTS and AGE.
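A minimal sketch of the listing command, wrapped in a helper so the namespace is the only thing to change (the helper name is illustrative; `gpt` is the namespace used as the example here):

```shell
# List the pods in one namespace.
# $1 = namespace
list_pods() {
  kubectl get pods --namespace "$1"
}

# Usage: list_pods gpt
```

The READY column shows how many containers of each pod are ready, which is usually enough to spot a container that is not starting.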
Getting logs of a container
For example
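A minimal sketch, assuming you already know the pod name from the pod listing (the helper name and `POD_NAME` are placeholders):

```shell
# Print the logs of a pod; extra arguments are passed straight to kubectl logs.
# $1 = namespace, $2 = pod name, remaining arguments = optional flags
pod_logs() {
  local namespace="$1" pod="$2"
  shift 2
  kubectl logs --namespace "$namespace" "$pod" "$@"
}

# Usage:
#   pod_logs gpt POD_NAME
#   pod_logs gpt POD_NAME --tail=100 --follow
```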
The following flags are useful:
--tail=N only shows the last N lines
--follow keeps the command running, so any new log lines are printed in the console
Accessing a running container
The most common CMD is /bin/sh, but some containers have /bin/bash, which is better and more user friendly.
Note that several projects also have a /var/log/monolog.log file, which contains the regular logs of the running application.
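A minimal sketch of opening a shell in a running pod, defaulting to /bin/sh and letting you pass /bin/bash when the image has it (the helper name and `POD_NAME` are placeholders):

```shell
# Open an interactive shell inside a running pod.
# $1 = namespace, $2 = pod name, $3 = shell to run (defaults to /bin/sh)
pod_shell() {
  kubectl exec --stdin --tty --namespace "$1" "$2" -- "${3:-/bin/sh}"
}

# Usage:
#   pod_shell gpt POD_NAME
#   pod_shell gpt POD_NAME /bin/bash
```

Once inside, something like `tail -f /var/log/monolog.log` shows the application log for the projects that write one.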
Delete a pod
For now, only use this in these two scenarios:
when it is a container you have created
when a superior has allowed it
Most containers are recreated after being deleted (the pod name will change), so deleting a pod is sometimes needed to reset state (a cache, for example) or when the pod is failing due to some unexpected condition.
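A minimal sketch of the delete command, subject to the two scenarios above (the helper name and `POD_NAME` are placeholders):

```shell
# Delete a single pod by name.
# $1 = namespace, $2 = pod name
delete_pod() {
  kubectl delete pod --namespace "$1" "$2"
}

# Usage: delete_pod gpt POD_NAME
```

Listing the pods again afterwards shows whether a replacement pod with a new name has been created.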
Copying a file to a container
For example
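A minimal sketch, assuming the file is on the jump server and the target container has `tar` available, which `kubectl cp` relies on (the helper name and paths are placeholders):

```shell
# Copy a local file into a running pod.
# $1 = namespace, $2 = pod name, $3 = local path, $4 = destination path in the container
copy_to_pod() {
  kubectl cp "$3" "$1/$2:$4"
}

# Usage: copy_to_pod gpt POD_NAME ./local-file.txt /tmp/local-file.txt
```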
This will copy a local file into a running pod
Starting a container
Creating a container is helpful for running some commands (such as db / redis / cache access) or testing some features.
Note that containers must be removed once the work is finished. For now only the dev environment can be used; in the other environments, create containers only when a superior asks you to run some operations (which will always be tested on dev first).
The container description file is a YAML file; a template can be used to run an image.
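The team's template is not reproduced here. As a hypothetical stand-in, the sketch below prints a minimal pod description from the three variables; the `sleep infinity` command (which assumes the image ships a `sleep` that accepts it) and the resource values are illustrative assumptions, not the real template:

```shell
# Print a minimal, hypothetical pod description.
# $1 = PODNAME, $2 = NAMESPACE, $3 = DOCKER_IMAGE
render_pod_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
  namespace: $2
spec:
  restartPolicy: Never
  containers:
    - name: $1
      image: $3
      # Keep the container alive so you can exec into it.
      command: ["sleep", "infinity"]
      resources:
        limits:
          # Illustrative values only.
          memory: "512Mi"
          ephemeral-storage: "1Gi"
EOF
}

# Usage (dev only):
#   render_pod_manifest ciscape-pipeline-daniel gpt ubuntu:20.04 | kubectl apply -f -
```

Piping into `kubectl apply -f -` avoids leaving a manifest file behind; saving the output to a file and running `kubectl apply -f FILE` works just as well.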
You need to provide the following variables:
PODNAME: the name of the pod. Usually add your name so it is clear who it belongs to, for example: ciscape-pipeline-daniel
NAMESPACE: the namespace where the container should be placed, for example gpt
DOCKER_IMAGE: the image name. You can use the Docker ACR, whose template is touchcastfabric.azurecr.io/IMAGE:TAG; for example, for the cogcache-prompt-proxy-api project the dev tag is touchcastfabric.azurecr.io/cogcache/prompt-proxy-api:latest. You can also run standard Docker images like ubuntu:20.04 or python:3.11-slim
You can check the ephemeral-storage and memory values under spec.containers[0].resources, but if you need more resources it is always good to ask whether k8s can handle it.
After you run the command, wait until the container is in the Running state (check with kubectl get pods); you can then enter it (kubectl exec).
Port forwarding
If you created an ssh tunnel, you can forward a port from a running pod to your local machine. For example, the argo server exposes its UI on port 2746.
As there is no direct way to reach the pod port, you need two steps:
open the ssh tunnel from your machine to the jump server (the syntax of the ssh tunnel is -L LOCAL_PORT:localhost:JUMP_PORT)
forward the port JUMP_PORT from the jump server to the pod port POD_PORT
JUMP_PORT must be the same port used in the ssh connection.
For example, developer daniel can use local machine port 12345 to connect to the argo UI.
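A minimal sketch of the two steps, assuming the argo server pod is identified by a namespace and pod name you look up first (`NAMESPACE` and `ARGO_SERVER_POD` are placeholders, and the helper names are illustrative):

```shell
# Step 1, on your local machine: tunnel LOCAL_PORT to JUMP_PORT on the jump server.
# $1 = LOCAL_PORT, $2 = JUMP_PORT, $3 = private key path, $4 = USERNAME@JUMPSERVER_IP
open_tunnel() {
  ssh -o ServerAliveInterval=3 -L "$1:localhost:$2" -i "$3" "$4"
}

# Step 2, on the jump server: forward JUMP_PORT to POD_PORT of the pod.
# $1 = namespace, $2 = pod name, $3 = JUMP_PORT, $4 = POD_PORT
forward_pod_port() {
  kubectl port-forward --namespace "$1" "$2" "$3:$4"
}

# Usage:
#   open_tunnel 12345 12345 PATH_TO_YOUR_PRIVATE_KEY daniel@JUMPSERVER_IP
#   forward_pod_port NAMESPACE ARGO_SERVER_POD 12345 2746
```

`kubectl port-forward` listens on localhost by default, which matches the `localhost` side of the ssh tunnel.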
Pick a port from your developer port range (see above). It is also suggested to use the same LOCAL_PORT as JUMP_PORT, which makes the steps easier to follow.
After the command is run, you should be able to open the argo UI in a local browser at http://localhost:12345
Switching contexts
To change the current context, run:
kubectl config use-context CONTEXT_NAME
So the commands will be:
Switch to dev
kubectl config use-context cluster-1-aks
Switch to stage
kubectl config use-context staging-cluster-1-aks
Switch to prod
kubectl config use-context prod-cluster-1-aks
To print the current context being used
kubectl config current-context
It is suggested to add a snippet at the end of ~/.profile on the jump server so that the prompt shows the current context.
The command line will then show the current context name as part of the prompt.
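The original snippet is not reproduced here. A minimal sketch of one way to do it, assuming a bash login shell whose prompt is already set by the time the end of ~/.profile is reached (the function name and the bracket format are illustrative):

```shell
# Print the current kubectl context, or nothing if none is set.
kube_context() {
  kubectl config current-context 2>/dev/null
}

# Prefix the existing prompt with the context; the single quotes delay
# the substitution so it is re-evaluated every time the prompt is drawn.
PS1='[$(kube_context)] '"${PS1:-}"
```

With this in place the prompt starts with the context name in brackets, for example the dev context name while working on dev.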
Cluster Subscription & Resources
Fixing Issues
Listing commands not showing any item / Commands failing to get resources data
By default you will see all the resources in your account’s subscription, and most accounts are on the same subscription as the resources. For some accounts, however, the resources’ subscription differs from the account’s, and you won’t see any resources listed (as if you had no permissions or the resources did not exist).
The default subscription for most of the resources is: ab767a8f-4a33-4cf8-8dc2-8a0e2e2e6b0c
But there are resources in other subscriptions, for example PTU resources.
If you don’t see any resources you can try to set the subscription to the default:
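A minimal sketch, using the default subscription ID given above (the variable and helper names are illustrative):

```shell
# Default subscription for most of the resources, as listed in this guide.
DEFAULT_SUBSCRIPTION="ab767a8f-4a33-4cf8-8dc2-8a0e2e2e6b0c"

# Make the default subscription the active one for later az commands.
use_default_subscription() {
  az account set --subscription "$DEFAULT_SUBSCRIPTION"
}

# Usage: use_default_subscription
```

`az account show` prints the subscription that is currently active, which is a quick way to confirm the change.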
Then try again to check whether it is a permission issue.
Additionally you can try to list all your subscriptions to see if the default is available to you:
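The exact command is not shown here. Since it is described as needing an extension, it is assumed to be `az account subscription list`; treat that as a best guess (the helper name is illustrative):

```shell
# List the subscriptions available to your account (assumed command,
# provided by the Azure CLI "account" extension).
list_subscriptions() {
  az account subscription list --output table
}

# Usage: list_subscriptions
```

`az account list --output table` is part of the core CLI and shows the subscriptions already known to your current login, which may be enough for a quick check.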
Note that this command will require you to install an extension:
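Assuming the extension in question is the Azure CLI `account` extension (an assumption based on the command above, not confirmed here), it can be installed ahead of time with:

```shell
# Install the Azure CLI "account" extension (assumed to be the one required).
install_account_extension() {
  az extension add --name account
}

# Usage: install_account_extension
```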
Afterwards it will ask you to log in for this scope.