I completed the SUSE Cloud Native Foundations scholarship through Udacity. These are my lesson notes, structured for reference, not for reading top to bottom.
I kept the parts that felt operationally useful and skipped most of the certification fluff.
Lesson 2: Cloud Native Application Design
Context discovery
Before writing a line of code, it's worth doing a proper context discovery pass. Two things to nail down:
Functional requirements — what the application actually needs to do:
- Who are the stakeholders?
- What are the core functionalities?
- Who are the end users (customer-facing vs. internal tool)?
- What are the inputs and outputs?
- Which engineering teams are involved?
Available resources — what you actually have to work with:
- Engineering headcount and skill set
- Budget and financial constraints
- Timelines
- Existing internal knowledge
If you skip this step, you usually pay for it later in rework.
Monoliths and microservices
Most business applications still resolve into the same three tiers regardless of architecture:
- UI — handles HTTP requests and returns a response
- Business logic — the code that provides the actual service
- Data layer — access and storage of data
The difference is how those tiers are packaged and deployed.
Monolith: all tiers are part of the same unit — one repository, shared resources (CPU/memory), one programming language, one release binary.
Microservice: each tier (or sub-component) is an independent unit — separate repositories, isolated resource allocation, a well-defined API surface, language of choice, its own release binary.
Trade-offs
| Dimension | Monolith | Microservices |
|---|---|---|
| Development complexity | One language, one repo, sequential | Multiple languages/repos, concurrent |
| Scalability | Replicate the entire stack | Replicate only the hot unit |
| Time to deploy | One pipeline, higher risk per release | Many pipelines, lower risk per release |
| Flexibility | Restructuring required for new features | Change an independent unit |
| Operational cost | Low initially, exponential at scale | High initially, proportional at scale |
| Reliability | Whole stack fails together; low observability | Isolated failures; high per-unit visibility |
Neither is universally better. The right choice depends on team size, traffic patterns, and how the application will be maintained at scale. The architecture will also change over time: services get split, merged, replaced, or retired as the product matures.
Maintenance operations
Once in production, architectures change. The common operations:
- Split — a service has grown too large and complex; break it into smaller, manageable units
- Merge — two closely coupled services make more sense as one
- Replace — a more efficient implementation is available (e.g., rewriting a Java service in Go for latency gains)
- Stale — a service no longer provides business value; archive or deprecate it
Same job, different shape: keep the system useful without making it miserable to run.
Application best practices
Regardless of architecture, apply these practices across every service to improve resilience, reduce time to recovery, and enable observability.
Health checks
Expose an HTTP endpoint (typically /healthz or /status) that returns the current health state. Kubernetes uses readiness checks to decide whether to send traffic to a Pod, and liveness checks to decide when to restart it.
@app.route('/status')
def status():
    response = app.response_class(
        response=json.dumps({"result": "OK - healthy"}),
        status=200,
        mimetype='application/json'
    )
    return response
Metrics
Expose a /stats endpoint reporting runtime statistics if you want a simple application view, or a Prometheus-compatible /metrics endpoint if you want standard scraping. What matters is that your platform can scrape it and your team can act on it.
@app.route('/stats')
def stats():
    response = app.response_class(
        response=json.dumps({
            "status": "success",
            "code": 0,
            "data": {"UserCount": 140, "UserCountActive": 23}
        }),
        status=200,
        mimetype='application/json'
    )
    return response
Logs
Log to STDOUT and STDERR. A logging tool or node-level agent can collect from there without coupling the application to local files. Standard log levels:
- DEBUG — fine-grained process events
- INFO — coarse-grained operational info
- WARN — potential issue, not yet an error
- ERROR — error encountered, application still running
- FATAL — critical failure, application not operational
Always include a timestamp on every log line.
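With Python's stdlib logging, a timestamp comes from the format string — a minimal sketch (the `%(asctime)s` and friends are standard LogRecord attributes):

```python
import logging

# %(asctime)s prepends a timestamp to every log line
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

logging.info("Status request successful")
# emits something like: 2021-06-19 10:31:22,401 INFO Status request successful
```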
import logging

logging.basicConfig(level=logging.DEBUG)

@app.route('/status')
def healthcheck():
    app.logger.info('Status request successful')
    ...
Tracing
Tracing builds a full picture of how a request flows through multiple services. Individual service records are spans; a collection of spans forms a trace. Jaeger is the common implementation in Kubernetes environments.
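The span/trace relationship can be shown with a toy model (illustrative only — real tracing uses an SDK such as the OpenTelemetry or Jaeger client libraries, and these field names are assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    service: str        # which service produced this record
    operation: str      # what that service did
    duration_ms: float
    children: list = field(default_factory=list)

# A trace is the tree of spans produced by one request
trace = Span("frontend", "GET /checkout", 120.0, children=[
    Span("cart-service", "load items", 40.0),
    Span("payment-service", "authorize card", 60.0),
])

# Walking the trace shows where request time was spent
slowest = max(trace.children, key=lambda s: s.duration_ms)
print(f"{slowest.service}: {slowest.duration_ms}ms")  # payment-service: 60.0ms
```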
Resource consumption
Know your CPU and memory baselines. Benchmark network throughput. Without resource awareness, you can't set meaningful Kubernetes requests and limits and the scheduler is flying blind.
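For a first memory number, the stdlib alone can report the process's peak resident set size — a sketch (note: `resource` is Unix-only, and `ru_maxrss` is kibibytes on Linux but bytes on macOS):

```python
import resource

# Do some representative work, then read the peak RSS
payload = [b"x" * 1024 for _ in range(10_000)]  # roughly 10 MiB retained

peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"peak RSS: {peak}")  # KiB on Linux, bytes on macOS

# Numbers like this feed Kubernetes requests/limits:
# request ~= steady-state usage, limit ~= observed peak plus headroom
```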
Lesson 3: Container Orchestration with Kubernetes
Docker for application packaging
Three moving parts: Dockerfile, Docker image, Docker registry.
Dockerfile
A set of instructions that produces a layered image. Each instruction creates a layer; layers are cached. Change a layer early in the file and everything after it rebuilds.
Core instructions:
FROM    # set the base image
RUN     # execute a command during build
COPY    # copy files from host to container filesystem
ADD     # like COPY, but also handles URLs and tar extraction
CMD     # default command to run when the container starts
EXPOSE  # document the port the application listens on
Example — packaging a Go application:
FROM golang:alpine
WORKDIR /go/src/app
ADD . .
RUN go build -o helloworld
EXPOSE 6111
CMD ["./helloworld"]
Docker image
Build and run:
# build from current directory, tag as go-helloworld
docker build -t go-helloworld .
# run in detached mode, map host port 5111 to container port 6111
docker run -d -p 5111:6111 go-helloworld
# retrieve container logs
docker logs <CONTAINER_ID>
Docker registry
Tag before pushing. An untagged image gets a non-human-readable ID; a tag provides registry/repo/name:version.
# tag for DockerHub
docker tag go-helloworld pixelpotato/go-helloworld:v1.0.0
# push
docker push pixelpotato/go-helloworld:v1.0.0
Public registries: DockerHub, GitHub Container Registry. Private: GCR, ECR, Harbor, Artifact Registry.
Docker command reference
docker build [OPTIONS] PATH              # build an image
docker run [OPTIONS] IMAGE               # run a container
docker logs CONTAINER_ID                 # get container logs
docker images                            # list available images
docker ps                                # list running containers
docker tag SOURCE_IMAGE TARGET_IMAGE     # tag an image
docker login                             # authenticate to DockerHub
docker push NAME[:TAG]                   # push to registry
docker pull NAME[:TAG]                   # pull from registry
Kubernetes architecture
Kubernetes is a container orchestrator. You declare desired state; Kubernetes works continuously to achieve and maintain it.
A cluster is made up of nodes — physical or virtual servers. Nodes split into two planes:
- Control plane (master nodes) — makes cluster-wide decisions
- Data plane (worker nodes) — hosts workloads
Control plane components
- kube-apiserver — the nucleus. Exposes the Kubernetes API; all operations flow through it. Validates and persists state to etcd.
- etcd — distributed key-value store. The source of truth for the entire cluster. Back it up.
- kube-scheduler — watches for unscheduled Pods and assigns them to nodes based on resource availability, affinity, taints, and tolerations.
- kube-controller-manager — runs the control loops (Deployment, ReplicaSet, Node controllers, etc.). Each loop reconciles desired vs. actual state.
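The reconcile idea behind every control loop can be sketched in a few lines (a toy illustration, not the actual controller-manager code):

```python
def reconcile(desired_replicas: int, running: list) -> list:
    """One reconciliation pass: observe actual state, move it toward desired."""
    actual = len(running)
    if actual < desired_replicas:
        # scale up: create the missing pods
        running = running + [f"pod-{i}" for i in range(actual, desired_replicas)]
    elif actual > desired_replicas:
        # scale down: remove the surplus
        running = running[:desired_replicas]
    return running

state = ["pod-0"]            # actual: 1 replica
state = reconcile(3, state)  # desired: 3
print(state)                 # ['pod-0', 'pod-1', 'pod-2']
```

Real controllers run this loop continuously, which is why deleting a managed Pod just results in a replacement appearing.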
Data plane components
- kubelet — runs on every node. Receives PodSpecs from the API server and ensures described containers are running and healthy.
- kube-proxy — manages network rules on each node; routes traffic to the correct Pod for a given Service.
kubelet and kube-proxy are installed on all nodes — master and worker alike.
Bootstrapping a cluster
Provisioning manually is error-prone. Tooling handles this automatically.
Production-grade: kubeadm, Kubespray, Kops, k3s
Development-grade: kind, minikube, k3d
For local dev with k3s via Vagrant:
vagrant status  # inspect available boxes
vagrant up      # spin up the box
vagrant ssh     # SSH in
Kubeconfig
Grants access to a cluster. Default location: ~/.kube/config. k3s places it at /etc/rancher/k3s/k3s.yaml.
Three sections:
- Cluster — cluster name, API server endpoint, CA certificate
- User — credentials (username/password, token, or client certificates)
- Context — links a user to a cluster; current-context sets the active one
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: {{ CA }}
    server: https://127.0.0.1:63668
  name: udacity-cluster
users:
- name: udacity-user
  user:
    client-certificate-data: {{ CERT }}
    client-key-data: {{ KEY }}
- name: green-user
  user:
    token: {{ TOKEN }}
contexts:
- context:
    cluster: udacity-cluster
    user: udacity-user
  name: udacity-context
current-context: udacity-context
kubectl cluster-info                 # control plane and add-on endpoints
kubectl get nodes                    # list all nodes
kubectl get nodes -o wide            # with internal IPs and container runtime
kubectl describe node <NODE_NAME>    # full node config including pod CIDR
Kubernetes resources
Pods
The atomic unit. A Pod wraps one or more containers that share a network namespace and storage. The 1:1 Pod-to-container ratio is the recommended default. Don't run bare Pods in production: they won't be rescheduled if the node dies.
# headless pod for testing
kubectl run -it busybox-test --image=busybox --restart=Never
Deployments and ReplicaSets
A Deployment describes the desired state of an application. It manages a ReplicaSet, which ensures the specified number of replicas are running at all times.
Rolling update strategies:
- RollingUpdate — replaces pods incrementally; supports maxSurge and maxUnavailable
- Recreate — kills all existing pods before creating new ones
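To make the percentages concrete: Kubernetes converts maxSurge to an absolute count by rounding up, and maxUnavailable by rounding down. A quick check:

```python
import math

def rolling_update_bounds(replicas: int, max_surge_pct: int, max_unavail_pct: int):
    """Return (max pods during rollout, min ready pods during rollout)."""
    surge = math.ceil(replicas * max_surge_pct / 100)            # rounded up
    unavailable = math.floor(replicas * max_unavail_pct / 100)   # rounded down
    return replicas + surge, replicas - unavailable

# 3 replicas with the 25% defaults:
high, low = rolling_update_bounds(3, 25, 25)
print(high, low)  # 4 3 -> at most 4 pods running, never fewer than 3 ready
```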
kubectl create deploy go-helloworld --image=pixelpotato/go-helloworld:v1.0.0 -n test
Full Deployment manifest with probes and resource limits:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: go-helloworld
  name: go-helloworld
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: go-helloworld
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: go-helloworld
    spec:
      containers:
      - image: pixelpotato/go-helloworld:v2.0.0
        imagePullPolicy: IfNotPresent
        name: go-helloworld
        ports:
        - containerPort: 6112
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /
            port: 6112
        readinessProbe:
          httpGet:
            path: /
            port: 6112
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
readinessProbe gates traffic. livenessProbe restarts stuck pods. For long-running production services, both should be treated as standard, not optional extras.
Services
A Service provides a stable virtual IP and DNS name for a set of Pods. Pod IPs are ephemeral; the Service abstracts over that churn.
| Type | Scope | Use case |
|---|---|---|
ClusterIP | Internal only | Service-to-service (default) |
NodePort | Node IP + static port | Direct external access for dev/testing |
LoadBalancer | Cloud LB | Production external ingress |
kubectl expose deploy go-helloworld --port=8111 --target-port=6112
Full Service manifest:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: go-helloworld
  name: go-helloworld
  namespace: default
spec:
  ports:
  - port: 8111
    protocol: TCP
    targetPort: 6112
  selector:
    app: go-helloworld
  type: ClusterIP
Ingress
Manages external HTTP/HTTPS access to services. An Ingress Controller reads the rules and configures the load balancer.
Request flow: external user → LoadBalancer → Ingress Controller (applying the Ingress rules) → Service → Pod
ConfigMaps and Secrets
ConfigMaps store non-confidential key-value pairs. Secrets store sensitive data, but the values are only base64-encoded by default. If etcd encryption at rest is not enabled, that data is not meaningfully protected. In production, enable encryption at rest and preferably use an external secrets system.
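The base64 point is easy to demonstrate — it is encoding, not encryption, and anyone who can read the Secret object can reverse it:

```python
import base64

# What a Secret stores for --from-literal=password=hunter2
encoded = base64.b64encode(b"hunter2").decode()
print(encoded)  # aHVudGVyMg==

# Decoding requires no key whatsoever
print(base64.b64decode(encoded).decode())  # hunter2
```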
kubectl create configmap test-cm --from-literal=color=yellow
kubectl create secret generic test-secret --from-literal=color=blue
Namespaces
Logical separation within a cluster by team, environment, or tenant. Resource quotas and network policies apply per namespace, limiting noisy-neighbour resource contention.
kubectl create ns test-udacity
kubectl get po -n test-udacity
Imperative vs. declarative management
Imperative — kubectl create, kubectl run, kubectl expose directly against the live cluster. Fast for development; not version-controlled, not repeatable.
Declarative — YAML manifests applied with kubectl apply -f. Recommended for production. Manifests live in Git; changes are auditable.
Every YAML manifest has four required sections:
apiVersion: # API version for the resource type
kind: # resource type (Deployment, Service, ConfigMap, etc.)
metadata: # name, namespace, labels
spec: # desired configuration state
# apply all manifests in a directory
kubectl apply -f exercises/manifests/
# delete resources defined in a manifest
kubectl delete -f manifest.yaml
# generate a manifest template without creating the resource
kubectl create deploy demo --image=nginx --dry-run=client -o yaml
kubectl command reference
kubectl create RESOURCE NAME [FLAGS]          # create a resource
kubectl describe RESOURCE NAME                # detailed resource info
kubectl get RESOURCE NAME [-o yaml]           # get resource (optionally as YAML)
kubectl edit RESOURCE NAME                    # edit resource in-place
kubectl label RESOURCE NAME [PARAMS]          # add or update labels
kubectl port-forward RESOURCE/NAME [PARAMS]   # forward a local port to a pod
kubectl logs RESOURCE/NAME [FLAGS]            # stream or retrieve logs
kubectl delete RESOURCE NAME                  # delete a resource
Failure modes
Kubernetes handles low-level failures automatically:
- ReplicaSets — maintain the desired replica count
- Liveness probes — restart pods in an errored state
- Readiness probes — remove unhealthy pods from load balancer rotation
- Services — single stable entry point across pod churn
Control plane failure is a separate category. Applications continue running and handling traffic, but no new workloads can be scheduled and no configuration changes can be applied. Recovering the control plane is a critical priority but it doesn't take down live traffic.
Lesson 4: Open Source PaaS
The problem PaaS solves
Running Kubernetes across multiple environments (sandbox, staging, production) and multiple regions compounds quickly. Three environments × three regions = nine clusters to upgrade, patch, and maintain. If you do not have a platform team, that is a fast way to manufacture operational overhead.
Cloud Foundry
Cloud Foundry is an application platform. Push source code; CF handles buildpacks, containerisation, routing, and scaling.
# target org and space
cf login -a https://api.example.com
cf target -o my-org -s production
# push an application
cf push my-app -b go_buildpack -m 256M -i 2
# scale horizontally
cf scale my-app -i 5
# tail logs
cf logs my-app --recent
# set environment variables
cf set-env my-app DB_HOST postgres.example.com
cf restage my-app
CF is opinionated: standard buildpacks, managed routing, one pipeline model. That's the value for standard web applications. The ceiling appears when you need fine-grained resource control, custom networking, or workloads that don't map cleanly to an HTTP process model.
Function as a Service
FaaS (AWS Lambda, GCP Cloud Functions) is the far end of the managed spectrum. You provide a function; the platform handles everything else.
Best suited for: event-driven, stateless, short-lived tasks. Not suited for: long-running processes, persistent connections, or complex warm-up requirements.
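The programming model is just a handler function. A minimal AWS Lambda-style sketch (the event shape and message contents here are illustrative assumptions; only the `handler(event, context)` signature follows the Lambda convention):

```python
import json

def handler(event, context):
    # The platform invokes this once per event; there is no
    # long-lived server process for you to manage.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Local invocation for testing (context is unused here)
print(handler({"name": "udacity"}, None))
```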
Glossary
| Term | Definition |
|---|---|
| Monolith | Application design where all tiers are managed as a single unit |
| Microservice | Application design where tiers are independent, separately deployed units |
| Dockerfile | Set of instructions used to build a Docker image |
| Docker image | Read-only template for creating a runnable container |
| Docker registry | Central mechanism to store and distribute Docker images |
| Node | A physical or virtual server in a Kubernetes cluster |
| Cluster | A collection of distributed nodes for managing and hosting workloads |
| Master node | Control plane node — makes global cluster decisions |
| Worker node | Data plane node — hosts application workloads |
| Bootstrap | Process of provisioning a cluster so each node is fully operational |
| Kubeconfig | Metadata file that grants access to a Kubernetes cluster |
| Pod | Smallest deployable unit; provides the execution environment for a container |
| ReplicaSet | Ensures a desired number of Pod replicas are running at all times |
| Deployment | Describes and manages the desired state of an application |
| Service | Stable network abstraction over a collection of Pods |
| Ingress | Manages external HTTP/HTTPS access to cluster services |
| ConfigMap | Stores non-confidential configuration data as key-value pairs |
| Secret | Stores sensitive data as key-value pairs (base64-encoded) |
| Namespace | Logical separation between applications and their resources |
| CRD | Custom Resource Definition — extends the Kubernetes API |
| Imperative config | Managing resources via direct kubectl commands against the live cluster |
| Declarative config | Managing resources via YAML manifests stored and version-controlled locally |