{"posts":[{"id":"cmov05f6h0000u1r4irglt7qg","title":"How Amazon S3 Files Changes Kubernetes Storage on EKS","slug":"how-amazon-s3-files-changes-kubernetes-storage-on-eks-9ab5fd8e0436","publishedAt":"2026-04-12T12:16:01.000Z","readingTime":null,"thumbnail":"https://cdn-images-1.medium.com/max/1024/0*cgqbpW6nkSyHzSJe.png","excerpt":"How S3 Files enables shared storage for Kubernetes workloads on AWSYour data lives in S3. Your application needs a file system. And between those two things, yo...","content":"<blockquote>How S3 Files enables shared storage for Kubernetes workloads on AWS</blockquote><p>Your data lives in S3. Your application needs a file system. And between those two things, you end up writing sync scripts, duplicating data into EFS, paying double the storage cost, and still wondering why your pipeline is slow or your pods are failing on startup because the data was not ready yet.</p><p>This is not a small inconvenience. For teams running ML pipelines, multi-pod workloads, or anything that needs shared access to large datasets, this is a real architectural problem that wastes both time and money every single day.</p><p>Amazon S3 Files changes that. And if you are running workloads on EKS, this is one of those releases you actually need to pay attention to.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/0*cgqbpW6nkSyHzSJe.png\" /></figure><h3>What is Amazon S3 Files?</h3><p>Amazon S3 Files is a feature that provides file system–like access to data stored in S3. Instead of interacting with S3 only through APIs, applications can access data using standard file operations.</p><p>In practice, this means an S3 bucket can be mounted and accessed much like a network file system. 
Applications can read, write, and manage files using familiar file system interfaces and existing libraries, without requiring changes to how they handle data.</p><p>Multiple compute services — including EC2, ECS, and EKS — can access the same data concurrently through this interface, enabling shared access patterns that were difficult to implement with traditional S3 access methods.</p><p>An important aspect of this model is that the data remains in S3. There is no requirement to copy or synchronize data into a separate file system before it can be used. The same underlying data can be accessed both through standard S3 APIs and through the file system interface.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/0*auWR6GGoyueO2_nl.png\" /></figure><h3>How Kubernetes Handles Storage</h3><p>To understand how S3 Files integrates with EKS, it is useful to look at how Kubernetes handles storage at a high level.</p><p>Kubernetes uses three core objects for storage.</p><p>A <strong>PersistentVolume (PV)</strong> represents the actual storage resource, such as an EBS volume, an EFS file system, or an external storage system exposed through a CSI driver.</p><p>A <strong>PersistentVolumeClaim (PVC)</strong> is how a pod requests storage. It defines the required size and access mode, and Kubernetes binds it to a matching PersistentVolume.</p><p>A <strong>StorageClass</strong> defines how storage should be provisioned. It is associated with a specific CSI driver and includes configuration parameters for dynamic provisioning.</p><p>The <strong>CSI driver</strong> (Container Storage Interface) is the integration layer that allows external storage systems to work with Kubernetes. 
AWS provides CSI drivers for services like EBS, EFS, and S3-based access.</p><p>S3 Files integrates into this model through a CSI-based approach, which enables it to be mounted and used by pods like other external storage systems.</p><h3>How S3 and EKS Used to Work Together</h3><p>Before S3 Files, using S3 storage on EKS typically involved the Mountpoint for Amazon S3 CSI driver. Mountpoint for S3 started as an open-source FUSE-based client and was later integrated into Kubernetes through a CSI driver.</p><p>It worked well for read-heavy workloads where pods needed to access data directly from S3 without copying it first. However, it had limitations that made it unsuitable for general-purpose or write-heavy use cases.</p><p>S3 objects are immutable, so modifying part of a file required replacing the entire object. There was no support for file locking, which made concurrent writes across multiple pods unreliable. Common file system operations such as directory renames were also not supported. As a result, workloads requiring shared read-write access across pods could not rely on this approach.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/0*vnpy9jIcUNpzj0l1.png\" /></figure><p>In practice, this led to a two-layer storage pattern: S3 for durable storage and EFS for file system access. Data was often copied or synchronized between the two, increasing cost and adding operational overhead.</p><h3>How S3 Files Works with EKS</h3><p>S3 Files integrates with EKS through a CSI-based approach. 
Instead of using the Mountpoint for S3 CSI driver, it relies on the Amazon EFS CSI driver (version 3.0.0 or above) to enable mounting through a file system interface.</p><p>The setup involves a few key steps.</p><p>First, create an S3 file system associated with your bucket:</p><pre>aws s3files create-file-system \\<br>  --bucket your-bucket-name \\<br>  --file-system-name my-s3-filesystem</pre><p>This returns a file system identifier, which is used when defining the PersistentVolume in Kubernetes.</p><p>Next, configure IAM permissions for your pods. This is typically done using EKS Pod Identities by attaching the required policy to the pod execution role:</p><pre>aws iam attach-role-policy \\<br>  --role-name your-eks-pod-role \\<br>  --policy-arn arn:aws:iam::aws:policy/AmazonS3FilesCSIDriverPolicy</pre><p>Then install or upgrade the EFS CSI driver in your cluster:</p><pre>helm repo add aws-efs-csi-driver \\<br>  https://kubernetes-sigs.github.io/aws-efs-csi-driver/</pre><pre>helm repo update<br>helm upgrade --install aws-efs-csi-driver \\<br>  aws-efs-csi-driver/aws-efs-csi-driver \\<br>  --namespace kube-system \\<br>  --version 3.0.0</pre><p>After this, define a PersistentVolume using the S3 file system identifier, bind it to a PersistentVolumeClaim, and mount it into your pods. This allows multiple pods to access the same underlying data through a shared file system interface.</p><p>One important requirement is that the EKS cluster and the S3 Files mount target must be within the same VPC. Ensure that network and security group configurations allow NFS traffic (port 2049) between worker nodes and the mount target.</p><h3>Where S3 Files Actually Helps on EKS</h3><p>S3 Files is not the right solution for every storage requirement on EKS. However, for specific workloads, it addresses long-standing limitations in how data is accessed and shared.</p><p>Machine learning training pipelines are a primary example. 
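</p><p>Before going further into specific workloads, it helps to make the setup steps above concrete. The static PersistentVolume and PersistentVolumeClaim might look roughly like the sketch below. Treat it as illustrative only: the driver name, the use of the file system identifier as the volume handle, and the access modes are assumptions based on how static CSI volumes are usually declared, not confirmed S3 Files values.</p><pre>apiVersion: v1<br>kind: PersistentVolume<br>metadata:<br>  name: s3-files-pv<br>spec:<br>  capacity:<br>    storage: 100Gi<br>  accessModes:<br>    - ReadWriteMany<br>  csi:<br>    driver: efs.csi.aws.com<br>    volumeHandle: fs-12345678  # illustrative; use the identifier returned by create-file-system<br>---<br>apiVersion: v1<br>kind: PersistentVolumeClaim<br>metadata:<br>  name: s3-files-pvc<br>spec:<br>  accessModes:<br>    - ReadWriteMany<br>  storageClassName: \"\"  # empty string disables dynamic provisioning<br>  volumeName: s3-files-pv<br>  resources:<br>    requests:<br>      storage: 100Gi</pre><p>A pod then mounts the claim like any other volume, which is what allows multiple pods to share the same data.</p><p>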
Large datasets are typically stored in S3, often ranging from tens to hundreds of gigabytes. Traditionally, this required staging data into a file system or copying it to nodes before processing could begin. With S3 Files, training pods can access the same dataset through a shared interface without requiring additional data movement.</p><p>AI agent workloads benefit in a similar way. These systems often need to persist state, write checkpoints, and share data across multiple pods running in parallel. A shared file system interface simplifies coordination, allowing different pods to read and write data in a more consistent manner.</p><p>Shared configuration and certain log-processing scenarios are also relevant. Instead of relying on intermediate systems to move data into S3, workloads can interact with shared files directly through the mounted interface. This can simplify how data is produced and consumed across pods.</p><p>Data lake workloads are another strong fit. When analytics data is already stored in S3 and processing tools expect a file system interface, S3 Files can reduce the need for intermediate data transfer steps before processing.</p><h3>S3 Files vs EFS on EKS</h3><p>S3 Files and EFS serve different purposes, even though both provide file system–style access for workloads on EKS.</p><p>EFS is designed for low-latency, fully managed file storage where data is frequently accessed. It is well suited for workloads that require consistent performance and traditional file system behavior across all data.</p><p>S3 Files is better aligned with scenarios where data already resides in S3 and needs to be accessed through a file system interface. Instead of maintaining a separate file system, it allows workloads to interact with the same underlying data using file-based access patterns.</p><p>Another key difference is access flexibility. With S3 Files, data remains in S3 and can still be accessed through standard S3 APIs, CLI tools, and SDKs. 
This allows the same dataset to be used across both object-based and file-based workflows. With EFS, data is managed separately and does not natively integrate with S3 without additional data movement.</p><p>In terms of workload fit, EFS is more suitable for latency-sensitive applications and environments that require full compatibility with file system operations. S3 Files is more appropriate for large datasets, shared access scenarios, and workflows where avoiding data duplication is important.</p><p>There are also platform considerations. EFS supports EKS workloads running on both EC2 and Fargate, while S3 Files is currently limited to EC2-backed nodes.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/0*4bimLib1Odn9bZuP.png\" /></figure><h3>Limitations of Amazon S3 Files on EKS</h3><p>S3 Files is a new capability and comes with limitations that are important to consider before using it in production.</p><p><strong>Fargate is not supported.</strong> S3 Files currently works only with EC2-backed EKS nodes. Workloads running on Fargate cannot use this feature.</p><p><strong>Static provisioning only.</strong> Dynamic provisioning is not available. File systems must be created separately, and <em>PersistentVolumes</em> need to be defined manually, which can add operational overhead.</p><p><strong>S3 key length limits affect directory depth.</strong> S3 enforces a maximum object key length of 1,024 bytes. Applications that generate deeply nested paths or long file names may encounter this limit.</p><p><strong>File-level operations are not available in CloudTrail.</strong> CloudTrail captures control plane actions such as creating or modifying file systems, but individual file operations are not logged at that level. Monitoring is typically available through aggregated metrics.</p><p><strong>EFS CSI driver version requirement.</strong> S3 Files requires version 3.0.0 or later of the EFS CSI driver. 
Clusters running older versions must be upgraded before using this feature.</p><pre>kubectl get deployment efs-csi-controller \\<br>  -n kube-system \\<br>  -o jsonpath='{.spec.template.spec.containers[?(@.name==\"efs-plugin\")].image}'</pre><h3>Conclusion</h3><p>S3 Files addresses a long-standing gap in how storage is handled for EKS workloads. It introduces a way to access S3 data through a file system interface, reducing the need for separate storage layers in certain scenarios.</p><p>For EKS environments, this can simplify architectures that previously relied on combining S3 with a separate file system for application access. In cases where data already resides in S3, workloads can interact with it more directly without requiring additional data movement.</p><p>However, it is not a complete replacement for existing storage solutions. Limitations such as lack of Fargate support, static provisioning, and its early-stage nature mean it should be evaluated carefully before production use.</p><p>For workloads such as machine learning pipelines, shared data processing, and data lake access, S3 Files provides a practical alternative where file-based access to S3 data is required.</p><p>A reasonable approach is to start with controlled workloads, evaluate behavior, and determine where it fits within your existing architecture.</p><hr><p><a href=\"https://blog.devops.dev/how-amazon-s3-files-changes-kubernetes-storage-on-eks-9ab5fd8e0436\">How Amazon S3 Files Changes Kubernetes Storage on EKS</a> was originally published in <a href=\"https://blog.devops.dev\">DevOps.dev</a> on Medium, where people are continuing the conversation by highlighting and responding to this 
story.</p>","url":"https://blog.devops.dev/how-amazon-s3-files-changes-kubernetes-storage-on-eks-9ab5fd8e0436?source=rss-3302bbae903c------2","hash":"sync-hash","mediumUsername":"bhagirath00","createdAt":"2026-05-07T04:44:50.582Z","updatedAt":"2026-05-07T04:45:06.381Z","tags":[]},{"id":"cmov05h6j0001u1r4lcw34n3q","title":"Using Docker Hub to Store, Share, and Manage Docker Images","slug":"using-docker-hub-to-store-share-and-manage-docker-images-6bf747994937","publishedAt":"2026-02-01T16:00:04.000Z","readingTime":null,"thumbnail":"https://cdn-images-1.medium.com/max/873/1*MW2KoAWhtEd-o_i5SUCfnA.png","excerpt":"Learn how Docker Hub is used to store, share, and manage Docker images efficiently for development, CI/CD pipelines, and production deployments.Docker Hub is th...","content":"<p><em>Learn how Docker Hub is used to store, share, and manage Docker images efficiently for development, CI/CD pipelines, and production deployments.</em></p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/873/1*MW2KoAWhtEd-o_i5SUCfnA.png\" /></figure><p>Docker Hub is the <strong>most widely used container registry</strong> for storing, sharing, and managing <strong>Docker images</strong>. It serves as a <strong>central platform</strong> where <strong>developers and DevOps teams</strong> can publish images, collaborate efficiently, and integrate container workflows into modern <strong>CI/CD pipelines</strong>.</p><p>As a core part of the <strong>Docker ecosystem</strong>, Docker Hub enables teams to maintain a <strong>consistent and reliable source of container images</strong> across <strong>development, testing, and production environments</strong>. 
With features such as <strong>public and private repositories</strong>, <strong>access control</strong>, and <strong>automated image builds</strong>, it simplifies how <em>containerized applications</em> are distributed and deployed.</p><p>In this blog post, we’ll explore how Docker Hub works in practice — covering <strong>image push and pull operations</strong>, <strong>repository management</strong>, <strong>access control</strong>, and <strong>best practices</strong> for maintaining <em>secure and reliable container images</em>. Whether you’re just starting with containers or managing <strong>large-scale deployments</strong>, Docker Hub remains a <strong>critical tool</strong> for efficient container image management.</p><h3>1. What Is Docker Hub?</h3><p>Docker Hub is a <strong>cloud-based container registry</strong> that allows users to <strong>store, manage, and distribute Docker images</strong> efficiently. It functions much like <em>GitHub for Docker images</em>, providing a <strong>centralized platform</strong> where developers can publish <em>containerized applications</em> and share them with teams or the broader community.</p><p>Docker Hub hosts both <strong>public and private repositories</strong>, enabling teams to distribute <strong>open-source images</strong> or securely manage <strong>internal applications</strong>. 
By using Docker Hub, developers ensure that Docker images are <strong>versioned</strong>, <strong>accessible</strong>, and <strong>consistent</strong> across different environments.</p><p>At its core, Docker Hub focuses on <strong>two key responsibilities</strong>:</p><h4>1.1 Image Storage</h4><p>Docker Hub acts as a <strong>central storage layer</strong> for Docker images, ensuring they are <em>safely stored</em> and <strong>accessible from any system</strong> running Docker.</p><h4>1.2 Image Distribution</h4><p>Docker Hub simplifies <strong>image delivery</strong> by allowing images to be <strong>uploaded once</strong> and <strong>pulled multiple times</strong> across different environments, supporting <strong>faster deployments</strong> and <strong>consistent application behavior</strong>.</p><h3>2. Key Features of Docker Hub</h3><p>Docker Hub is more than just a storage service — it’s a <strong>complete platform</strong> for managing, distributing, and automating <strong>container images</strong>. Its features are designed to support <strong>teams</strong>, <strong>pipelines</strong>, and <strong>production-ready deployments</strong>.</p><h4>2.1 Store &amp; Access Docker Images</h4><p>Securely store Docker images in <strong>public or private repositories</strong> and share them across <strong>development</strong>, <strong>testing</strong>, and <strong>production environments</strong>.</p><h4>2.2 Continuous Builds</h4><p>Automatically build images from <strong>GitHub or GitLab</strong> whenever code is updated, keeping containers <strong>up to date</strong> with minimal manual effort.</p><h4>2.3 Team Collaboration &amp; Access</h4><p>Manage who can <strong>view</strong>, <strong>push</strong>, or <strong>pull</strong> images. 
Organize users into <strong>teams and roles</strong> for secure collaboration at scale.</p><h4>2.4 Docker Hub Images</h4><p>Use <strong>official images</strong> from trusted vendors like <strong>Nginx</strong>, <strong>MySQL</strong>, <strong>Redis</strong>, or <strong>verified publisher images</strong> for improved security and reliability.</p><h4>2.5 Webhooks &amp; CI/CD Automation</h4><p>Trigger builds, deployments, or workflows automatically using <strong>webhooks</strong>, keeping <strong>production environments synchronized</strong> with the latest code.</p><h4>2.6 Scaling Docker Hub</h4><p>For large organizations, manage multiple <strong>teams</strong>, <strong>repositories</strong>, and <strong>permissions</strong> efficiently using <strong>Docker Hub organization features</strong>.</p><h3>3. Docker Hub vs Other Container Registries</h3><p>Docker Hub is a <strong>general-purpose container registry</strong>, but it is not the only option. Teams choose container registries based on <strong>scale</strong>, <strong>security requirements</strong>, and <strong>cloud integration</strong>.</p><ul><li><strong>Docker Hub: </strong>Best suited for <strong>public images</strong>, simplicity, and <strong>broad ecosystem support</strong></li><li><strong>Amazon ECR: </strong>Preferred for <strong>AWS-native workloads</strong> and <strong>private enterprise images</strong></li><li><strong>Azure Container Registry: </strong>Optimized for <strong>Azure</strong> and <strong>AKS environments</strong></li><li><strong>Google Artifact Registry: </strong>Designed specifically for <strong>GCP workloads</strong></li><li><strong>GitHub Container Registry: </strong>Provides tight integration with <strong>GitHub repositories and workflows</strong></li></ul><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*IBbLh5DjD43Me_YGK3dWVg.png\" /></figure><p>Docker Hub is well-suited for <strong>open-source projects</strong> and <strong>small to medium teams</strong>. 
For <strong>large-scale</strong> or <strong>cloud-specific production systems</strong>, organizations often rely on <strong>cloud-native</strong> or <strong>private container registries</strong>.</p><h3>4. Setting Up a Docker Hub Account</h3><p>Getting started with Docker Hub is <strong>quick and easy</strong>. Creating an account allows you to <strong>store, manage, and share Docker images</strong> with your team or the wider community.</p><h4>4.1 Create Your Account</h4><ul><li><strong>Visit Docker Hub:</strong> Go to <a href=\"https://hub.docker.com/\"><strong>https://hub.docker.com</strong></a></li><li><strong>Sign Up:</strong> Click <strong>Sign Up</strong> and register using your <strong>email</strong> or <strong>GitHub account</strong></li><li><strong>Verify Email:</strong> Docker Hub will send a confirmation email — click the link to <strong>verify your account</strong></li></ul><h4>4.2 Log In to Docker Hub</h4><p>Once verified, log in to Docker Hub via the <strong>web interface</strong> or the <strong>command line interface (CLI)</strong> using:</p><pre>docker login</pre><p>Enter your <strong>Docker Hub username and password</strong> when prompted. This authenticates your machine to <strong>push</strong> and <strong>pull</strong> images.</p><h4>4.3 Explore Your Dashboard</h4><p>After logging in, you can:</p><ul><li><strong>Create new repositories</strong> for your images</li><li><strong>Organize existing images</strong></li><li><strong>Manage access</strong> for collaborators or teams</li></ul><p>With your account set up, you’re ready to <strong>push Docker images</strong> and <strong>pull existing images</strong>.</p><h3>5. Docker Hub Repository Structure</h3><p>Docker Hub isn’t just a storage place — it’s a <strong>smart, organized platform</strong> for managing Docker images at scale. 
Understanding its repository structure helps teams <strong>collaborate</strong>, <strong>automate workflows</strong>, and <strong>deploy reliably</strong>.</p><h4>5.1 Docker Image Identifier</h4><p>Each image on Docker Hub follows a clear naming format:</p><pre>&lt;username&gt;/&lt;repository&gt;:&lt;tag&gt;</pre><p><strong>Example:</strong></p><pre>bhagirath00/webapp:latest</pre><ul><li><strong>bhagirath00</strong> → Docker Hub username</li><li><strong>webapp</strong> → Repository name</li><li><strong>latest</strong> → Tag representing a specific version of the image</li></ul><p>This structure makes it easy to <strong>track versions</strong>, <strong>roll back updates</strong>, and <strong>share images consistently</strong> across teams.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*YoJ-YY-dx-1qbKEOLDkLWw.png\" /></figure><h4>5.2 Public vs. Private Repositories</h4><p>Docker Hub supports flexible visibility options:</p><ul><li><strong>Public Repositories: </strong>Open to everyone. Ideal for <strong>open-source projects</strong> or <strong>community sharing</strong>.</li><li><strong>Private Repositories: </strong>Access-controlled. Only <strong>authorized team members</strong> can push or pull images — suitable for <strong>internal or sensitive projects</strong>.</li></ul><h4>5.3 Tags: Versioning and Organization</h4><p>Tags are more than just labels — they are <strong>powerful tools</strong> for version control:</p><ul><li>Use <strong>semantic versioning</strong> (v1.0, v1.1, v2.0)</li><li>Reserve <strong>latest</strong> for stable releases, not production-critical deployments</li><li>Maintain <strong>consistent naming conventions</strong> across multiple environments</li></ul><h3>6. Docker Hub Deployment Architectures</h3><p>In real-world systems, Docker Hub functions as a <strong>central image distribution layer</strong> between <strong>build systems</strong> and <strong>runtime environments</strong>. 
It enables <strong>consistent</strong> and <strong>repeatable deployments</strong> across <strong>development</strong>, <strong>staging</strong>, and <strong>production</strong> environments.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*aVq1wkIaBcii8pSR5sH9hA.png\" /></figure><h4>Common Usage Patterns:</h4><ul><li><strong>Local Development to Production</strong>: Developers build images <strong>locally</strong> and push them to <strong>Docker Hub</strong>. Servers or virtual machines pull the <em>same images</em> for deployment, ensuring <strong>identical runtime environments</strong>.</li><li><strong>CI/CD-Driven Image Distribution: </strong>Build pipelines compile application code, create <strong>Docker images</strong>, and publish them to Docker Hub. Deployment platforms then pull <strong>versioned images</strong> directly from the registry.</li><li><strong>Container Orchestration Platforms: Kubernetes</strong> and other orchestrators retrieve images from Docker Hub during <strong>pod creation</strong>, <strong>scaling</strong>, and <strong>rolling updates</strong>.</li></ul><h3>7. Pushing an Image to Docker Hub</h3><p>Once your Docker image is ready locally, you can push it to <strong>Docker Hub</strong> so it’s accessible from <strong>any environment</strong>.</p><p><strong>Step 1</strong>: Authenticate with Docker Hub</p><p>Log in from your terminal:</p><pre>docker login</pre><p>Enter your <strong>Docker Hub username and password</strong> when prompted. 
This connects your local machine to Docker Hub.</p><p><strong>Step 2</strong>: Tag Your Image</p><p>Docker requires images to be <strong>tagged</strong> before pushing:</p><pre>docker tag &lt;local-image&gt; &lt;username&gt;/&lt;repository&gt;:&lt;tag&gt;</pre><p><strong>Example:</strong></p><pre>docker tag myapp bhagirath00/myapp:v1.0</pre><ul><li><strong>myapp</strong> → Local image name</li><li><strong>bhagirath00/myapp:v1.0</strong> → Docker Hub repository and tag</li></ul><p><strong>Step 3</strong>: Push the Image</p><pre>docker push bhagirath00/myapp:v1.0</pre><p>Once complete, your image is available on <strong>Docker Hub</strong>, ready to be <strong>pulled by others</strong> or <strong>deployed in production</strong>.</p><h3>8. Pulling Docker Images from Docker Hub</h3><p>Pulling images lets you download Docker images from <strong>Docker Hub</strong> to your <strong>local system</strong> so they can be run as containers.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*nalUyTVfgqSwYZhLPZVwjw.png\" /></figure><p><strong>Step 1:</strong> Pull the Image</p><p>Use the docker pull command:</p><pre>docker pull &lt;username&gt;/&lt;repository&gt;:&lt;tag&gt;</pre><p><strong>Example:</strong></p><pre>docker pull bhagirath00/myapp:v1.0</pre><p>If you omit the tag, Docker defaults to <strong>latest</strong>:</p><pre>docker pull bhagirath00/myapp</pre><p><strong>Step 2</strong>: Run the Pulled Image</p><p>After pulling, run the image as a container:</p><pre>docker run bhagirath00/myapp:v1.0</pre><p>This launches the container with the <strong>environment defined by the image</strong>.</p><h3>9. Controlling Docker Images Across Environments</h3><p>Managing Docker images extends beyond <strong>building</strong> and <strong>pulling</strong> them. 
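</p><p>The building and pulling themselves, covered in the two sections above, boil down to a short, repeatable sequence (repository name is illustrative, as before):</p><pre># build from a local Dockerfile, then tag for Docker Hub<br>docker build -t myapp .<br>docker tag myapp bhagirath00/myapp:v1.0<br><br># publish, then consume from any machine<br>docker push bhagirath00/myapp:v1.0<br>docker pull bhagirath00/myapp:v1.0<br>docker run bhagirath00/myapp:v1.0</pre><p>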
In <strong>production environments</strong>, images must be <strong>controlled</strong>, <strong>audited</strong>, and <strong>retired systematically</strong> to maintain <strong>stability</strong> and <strong>operational reliability</strong>.</p><h4>9.1 Build and Distribution Control</h4><p>Images should be built in <strong>controlled environments</strong>, typically <strong>CI systems</strong>, and distributed through <strong>Docker Hub as the single source of truth</strong>. Rebuilding images directly in <strong>production environments</strong> introduces <strong>inconsistency</strong> and <strong>operational risk</strong>.</p><h4>9.2 Environment Consistency</h4><p>The <strong>same image artifact</strong> should be used across <strong>development</strong>, <strong>staging</strong>, and <strong>production</strong> environments. Rebuilding images per environment increases <strong>configuration drift</strong> and makes issues <strong>difficult to reproduce</strong>.</p><h4>9.3 Retention and Decommissioning</h4><p>As applications evolve, <strong>outdated images</strong> must be removed to:</p><ul><li><strong>Reduce storage clutter</strong></li><li><strong>Limit accidental deployments of deprecated versions</strong></li><li><strong>Maintain operational clarity</strong></li></ul><p>Only images required for <strong>active deployments</strong> and <strong>rollback scenarios</strong> should be retained.</p><h4>9.4 Operational Recovery</h4><p>A managed <strong>image lifecycle</strong> enables <strong>fast recovery</strong> during incidents by redeploying <strong>previously validated images</strong> rather than rebuilding under pressure. This approach <strong>minimizes downtime</strong> and <strong>reduces failure risk</strong>.</p><p>Effective lifecycle management ensures <strong>Docker Hub</strong> remains a <strong>controlled distribution system</strong> rather than an <strong>unstructured image store</strong>.</p><h3>10. 
Docker Hub Rate Limits, Plans, and Usage</h3><p>Docker Hub applies <strong>usage limits</strong> that primarily affect <strong>image pulls</strong>, especially in <strong>automated</strong> and <strong>production environments</strong>.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*f9dmUhbWBzC23iU64WFbwQ.png\" /></figure><p>Docker Hub provides different <strong>subscription tiers</strong> to support varying usage levels:</p><ul><li><strong>Free tier: </strong>Intended for <strong>personal use</strong> and <strong>public images</strong>, with <strong>limited pull capacity</strong></li><li><strong>Paid tiers: </strong>Designed for <strong>team workflows</strong>, <strong>CI/CD pipelines</strong>, and <strong>production deployments</strong>, offering <strong>higher or unrestricted usage</strong></li><li><strong>Enterprise tier: </strong>Built for <strong>large organizations</strong> requiring <strong>governance</strong> and <strong>operational scale</strong></li></ul><h4>For reliable deployments</h4><ul><li><strong>Always authenticate Docker clients</strong></li><li><strong>Avoid anonymous pulls</strong></li><li><strong>Use appropriate subscription tiers for CI/CD and production workloads</strong></li></ul><h3>11. Managing Repositories and Access on Docker Hub</h3><p>Docker Hub allows you to <strong>organize</strong>, <strong>secure</strong>, and <strong>control access</strong> to your Docker images efficiently. Proper repository and access management ensures that your images are available to the <strong>right people</strong> and protected from <strong>unauthorized access</strong>.</p><h4>11.1 Creating a Repository</h4><p>To organize your Docker images, start by creating a repository:</p><ul><li>Log in to <strong>Docker Hub</strong> and navigate to <strong>Repositories</strong></li><li>Click <strong>Create Repository</strong></li><li>Enter a <strong>Repository Name</strong> and select visibility: <strong>Public</strong> → Open to everyone. 
<strong>Private</strong> → Restricted to authorized users</li><li>Click <strong>Create</strong></li></ul><p>Your repository is now ready to receive images.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*6wPUn6q2AV4GwfHLt0Sb3w.png\" /></figure><h3>11.2 Managing Access</h3><p>For <strong>private repositories</strong>, Docker Hub lets you control who can <strong>push</strong>, <strong>pull</strong>, or <strong>manage</strong> images:</p><ul><li>Navigate to <strong>Repository Settings → Collaborators</strong></li><li>Add team members or individual users using their <strong>Docker Hub username</strong></li><li>Assign appropriate roles: <strong>Read, Write, Admin</strong></li></ul><h4>11.3 Organizations and Teams</h4><p>Docker Hub supports <strong>organizations</strong>, allowing businesses to:</p><ul><li><strong>Group multiple repositories</strong> under a single organization</li><li><strong>Create teams</strong> with role-based access control</li><li><strong>Control access</strong> for development, testing, or production environments</li><li><strong>Simplify management</strong> across multiple projects</li></ul><h3>12. Automating Docker Builds on Docker Hub</h3><p>Docker Hub can automatically build Docker images whenever changes are pushed to a connected <strong>Git repository</strong>. 
Automation streamlines workflows, reduces <strong>manual effort</strong>, and ensures images are always <strong>up to date</strong> with the latest code.</p><h4>12.1 Connect Your Git Repository</h4><p>To enable automated builds, link your <strong>GitHub</strong> or <strong>GitLab</strong> account:</p><ul><li>Go to <strong>Docker Hub → Account Settings → Linked Accounts</strong></li><li>Click <strong>Connect</strong> next to GitHub or GitLab</li><li>Authorize Docker Hub to access your repositories</li></ul><p>This allows Docker Hub to <strong>monitor code changes</strong> and trigger builds automatically.</p><h4>12.2 Create an Automated Build Repository</h4><ol><li>Navigate to <strong>Repositories → Create Repository</strong></li><li>Enter a <strong>Repository Name</strong></li><li>Choose <strong>Automated Build</strong> instead of a standard repository</li><li>Link it to the relevant <strong>Git repository</strong></li><li>Configure <strong>branch or tag triggers</strong><br>(e.g., build images when code is pushed to main)</li></ol><h4>12.3 Configure Build Rules</h4><ul><li>Map <strong>Git branches or tags</strong> to Docker image tags<br>(e.g., main → latest, v1.0 → v1.0)</li><li>Set <strong>build contexts</strong> if the Dockerfile is not in the root directory</li><li>Enable <strong>build notifications</strong> or <strong>webhooks</strong> for CI/CD integration</li></ul><p>Automation ensures every code change produces a <strong>fresh, versioned Docker image</strong>, keeping deployments consistent.</p><h4>12.4 Benefits of Automated Builds</h4><ul><li><strong>Time-saving</strong>: No manual docker build commands required</li><li><strong>Consistency</strong>: Images always reflect the latest code</li><li><strong>CI/CD ready</strong>: Seamless pipeline integration</li><li><strong>Version control</strong>: Tags applied automatically from Git branches or releases</li></ul><h3>Conclusion</h3><p>Docker Hub is a <strong>powerful platform</strong> for 
<strong>storing</strong>, <strong>sharing</strong>, and <strong>managing Docker images</strong>. Whether you’re working on a <strong>small personal project</strong> or a <strong>large enterprise application</strong>, Docker Hub provides the <strong>flexibility</strong> and <strong>scalability</strong> required to manage container images effectively.</p><p>In this post, we covered how to <strong>push and pull images</strong>, <strong>manage repositories</strong>, <strong>automate builds</strong>, and apply <strong>best practices</strong> for using Docker Hub efficiently. Mastering Docker Hub helps <strong>streamline workflows</strong>, <strong>enhance collaboration</strong>, and ensure Docker images are <strong>consistently available for deployment</strong>.</p><hr><p><a href=\"https://blog.devops.dev/using-docker-hub-to-store-share-and-manage-docker-images-6bf747994937\">Using Docker Hub to Store, Share, and Manage Docker Images</a> was originally published in <a href=\"https://blog.devops.dev\">DevOps.dev</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>","url":"https://blog.devops.dev/using-docker-hub-to-store-share-and-manage-docker-images-6bf747994937?source=rss-3302bbae903c------2","hash":"sync-hash","mediumUsername":"bhagirath00","createdAt":"2026-05-07T04:44:53.179Z","updatedAt":"2026-05-07T04:45:08.097Z","tags":[]},{"id":"cmov05hpn0002u1r42wyfz906","title":"End-to-End CI/CD Pipeline Using Jenkins and Kubernetes","slug":"end-to-end-ci-cd-pipeline-using-jenkins-and-kubernetes-99be6542a07f","publishedAt":"2026-01-19T12:40:12.000Z","readingTime":null,"thumbnail":"https://cdn-images-1.medium.com/max/1024/1*wdnX5U75JIcltmXmuCz_XA.png","excerpt":"Building Scalable, Cloud-Native CI/CD Pipelines with Jenkins and KubernetesIn modern DevOps workflows, running Jenkins on
static or long-lived build agents ofte...","content":"<blockquote>Building Scalable, Cloud-Native CI/CD Pipelines with Jenkins and Kubernetes</blockquote><p>In modern <strong>DevOps workflows</strong>, running <strong>Jenkins </strong>on static or long-lived build agents often leads to scalability issues, inefficient resource usage, and maintenance overhead. As applications grow and deployment frequency increases, <strong>CI/CD systems</strong> must be dynamic, resilient, and <strong><em>cloud-native.</em></strong></p><p><strong>Kubernetes</strong> solves these challenges by providing on-demand, isolated, and auto-scalable environments for Jenkins workloads. By integrating Jenkins with Kubernetes, teams can dynamically provision build agents as pods, optimize resource utilization, and build highly scalable<strong> CI/CD pipelines.</strong></p><p>In this blog, you’ll learn how Jenkins integrates with Kubernetes for CI/CD, understand the pipeline architecture, set up Jenkins on Kubernetes, and build a production-ready <strong>CI/CD pipeline </strong>using containerized workloads and Kubernetes deployments.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*wdnX5U75JIcltmXmuCz_XA.png\" /></figure><h3>1. Why Integrate Jenkins with Kubernetes for CI/CD?</h3><p><strong>Kubernetes</strong> provides a robust and scalable platform for running containerized applications, and Jenkins is a powerful tool for automating the<strong> CI/CD pipeline</strong>. When integrated, these two tools can provide significant benefits:</p><ul><li><strong>Dynamic Agent Provisioning</strong>: <strong>Jenkins</strong> dynamically creates <strong>Kubernetes pods</strong> as build agents for each <strong>pipeline run</strong>. 
Agents are provisioned only when needed and automatically destroyed after job completion, eliminating idle infrastructure.</li><li><strong>Scalability</strong>: <strong>Kubernetes</strong> scales <strong>Jenkins</strong> agents based on workload demand. Multiple <strong>pipelines</strong> can run in parallel, allowing for faster builds and testing cycles.</li><li><strong>Isolation</strong>: Each Jenkins job runs inside its own Kubernetes <strong>pod</strong>, ensuring clean, reproducible, and conflict-free build environments across pipelines.</li><li><strong>Cloud-Native Deployment</strong>: Applications can be built, containerized, and deployed directly to Kubernetes <strong>clusters</strong>, enabling seamless end-to-end CI/CD workflows in cloud-native environments.</li><li><strong>Resource Efficiency</strong>: Because agents are short-lived and container-based, system resources are consumed only during active pipeline execution, significantly reducing infrastructure costs.</li></ul><h3>2. Prerequisites for Jenkins and Kubernetes CI/CD Integration</h3><p>Before integrating Jenkins with Kubernetes, ensure you have the following prerequisites in place. They form the foundation for a stable and production-ready CI/CD setup.</p><ul><li><strong>Kubernetes Cluster</strong>: A running Kubernetes cluster is required to host Jenkins agents and deploy applications. This can be a managed Kubernetes service such as <strong>Amazon EKS, Google GKE, or Azure AKS</strong>, or a self-managed on-premise <strong>cluster</strong>.</li><li><strong>Jenkins Installed</strong>: Jenkins must be installed and accessible. It can run either inside a Kubernetes cluster (recommended for cloud-native setups) or on a standalone virtual machine or server.</li><li><strong>Kubernetes Plugin for Jenkins</strong>: The <strong>Kubernetes Plugin</strong> enables Jenkins to dynamically provision Kubernetes pods as build agents. 
This plugin is essential for running CI/CD pipelines using Kubernetes-based agents.</li><li><strong>Cluster Access and Permissions</strong>: Jenkins must have <strong>permission to communicate</strong> with the <strong>Kubernetes API</strong> server. This is typically achieved using a Kubernetes Service Account with the required RBAC roles.</li><li><strong>kubectl</strong>: The kubectl CLI tool is useful for managing Kubernetes resources, debugging deployments, and running deployment steps inside <strong>Jenkins pipelines</strong>.</li></ul><h3>3. Jenkins Kubernetes Integration Architecture</h3><p>Jenkins integrates with Kubernetes using the <strong>Kubernetes Plugin</strong>, which allows Jenkins to run CI/CD jobs inside Kubernetes pods instead of on static build agents.</p><p>In this setup, Jenkins focuses on <strong>orchestrating the pipeline</strong>, while Kubernetes handles <strong>executing jobs and managing resources</strong>. Whenever a pipeline starts, Jenkins asks Kubernetes to spin up a temporary pod to run the job. Once the job finishes, the pod is automatically removed.</p><p>This makes the entire CI/CD system dynamic, scalable, and cloud-native.</p><h3>How Jenkins and Kubernetes Work Together</h3><ol><li><strong>Jenkins Controller</strong>: The Jenkins controller manages pipelines, jobs, and credentials. It does not run builds directly. Instead, it coordinates with Kubernetes to run jobs on demand.</li><li><strong>Kubernetes Plugin</strong>: The plugin connects Jenkins to the Kubernetes cluster and handles the creation and cleanup of agent pods whenever a pipeline is triggered.</li><li><strong>Kubernetes Agent Pods</strong>: Each CI/CD job runs inside its own Kubernetes pod. 
These pods are created only when needed, isolated from each other, and automatically destroyed after the job completes.</li><li><strong>Jenkins Pipeline</strong>: A Jenkinsfile defines the CI/CD steps, including build, test, and deployment stages.</li><li><strong>Kubernetes Cluster</strong>: The Kubernetes cluster provides the infrastructure where agent pods run and where applications are ultimately deployed.</li></ol><h3>4. CI/CD Pipeline Architecture with Jenkins and Kubernetes</h3><p>This CI/CD architecture uses Jenkins as the pipeline orchestrator and Kubernetes as the execution and deployment platform. Instead of relying on static Jenkins agents, Kubernetes dynamically provisions build agents as pods, making the pipeline scalable and resource-efficient.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/0*_ytHNYS4c7d4lfEC.png\" /></figure><h3>4.1. Git</h3><p>The pipeline begins with a code change pushed to a Git repository (GitHub, GitLab, or Bitbucket).<br>A webhook triggers Jenkins automatically on every commit or pull request, ensuring that no manual intervention is required.</p><p><strong>Role of Git:</strong></p><ul><li>Stores application source code and Dockerfile</li><li>Triggers Jenkins pipelines via webhooks</li><li>Acts as the single source of truth for builds</li></ul><h3>4.2. Jenkins Controller</h3><p>The Jenkins controller manages the CI/CD pipeline logic defined in the Jenkinsfile.<br>When a build is triggered, Jenkins does <strong>not</strong> execute jobs on itself. Instead, it requests Kubernetes to create an ephemeral agent pod.</p><p><strong>Key Responsibilities:</strong></p><ul><li>Parses the Jenkinsfile</li><li>Orchestrates pipeline stages (build, test, deploy)</li><li>Requests Kubernetes to provision agent pods</li><li>Tracks pipeline execution and logs</li></ul><h3>4.3. 
Kubernetes Agent Pods (Dynamic Build Agents)</h3><p>Using the Jenkins Kubernetes Plugin, Jenkins dynamically spins up <strong>agent pods</strong> inside the Kubernetes cluster. Each pipeline run gets its own isolated pod, which is destroyed after completion.</p><p><strong>Why this matters:</strong></p><ul><li>No long-running or idle agents</li><li>Clean environment for every build</li><li>Parallel pipelines without conflicts</li><li>Automatic scaling based on workload</li></ul><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/0*Ju4WkHXjTDwaZKx0.png\" /></figure><p>Each agent pod can include multiple containers (for example: Maven, Docker CLI, kubectl ), allowing different stages to run in the right environment.</p><h3>4.4. Docker Image Build &amp; Push</h3><p>Inside the Kubernetes agent pod, Jenkins builds the application and creates a Docker image using the project’s Dockerfile.<br>The image is then pushed to a container registry such as <strong>Docker Hub, Amazon ECR, or GCR</strong>.</p><p><strong>What happens here:</strong></p><ul><li>Application is compiled and tested</li><li>Docker image is built inside the agent pod</li><li>Image is tagged with version or commit hash</li><li>Image is pushed to a container registry</li></ul><p>This ensures the same image is used across all environments.</p><h3>4.5. Kubernetes Deployment</h3><p>Once the Docker image is available in the registry, Jenkins deploys the application to Kubernetes using kubectl or Helm.</p><p><strong>Deployment flow:</strong></p><ul><li>Jenkins applies Kubernetes manifests or Helm charts</li><li>Kubernetes pulls the image from the registry</li><li>Pods are created or updated using rolling deployments</li><li>Application becomes available via Service or Ingress</li></ul><p>This completes the <strong>end-to-end CI/CD loop</strong> from code commit to a running application in Kubernetes.</p><h3>5. 
How to Install and Run Jenkins on Kubernetes</h3><p>Getting Jenkins up and running on Kubernetes is easier than you might think, especially with <strong>Helm</strong>, the package manager for Kubernetes. Helm simplifies complex deployments and ensures you can get a production-ready Jenkins instance quickly.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/0*_cL08duqC6hEyNfQ.png\" /></figure><h3>5.1 Installing Jenkins with Helm</h3><p>The easiest way to install Jenkins on Kubernetes is using Helm.</p><h4>Step 1: Create a Namespace for Jenkins</h4><p>It’s a good practice to isolate Jenkins in its own namespace:</p><pre>kubectl create namespace jenkins</pre><h4>Step 2: Install Jenkins</h4><p>Add the Jenkins chart repository, update it, and install the chart:</p><pre>helm repo add jenkins https://charts.jenkins.io<br>helm repo update<br>helm install jenkins jenkins/jenkins --namespace jenkins</pre><h4>Step 3: Access Jenkins</h4><p>Once installed, locate the Jenkins service, then retrieve the generated admin password:</p><pre>kubectl get svc --namespace jenkins</pre><pre>kubectl exec --namespace jenkins -it $(kubectl get pods --namespace jenkins -l &quot;app.kubernetes.io/component=jenkins-master&quot; -o jsonpath=&quot;{.items[0].metadata.name}&quot;) -- cat /run/secrets/chart-admin-password</pre><p>Open Jenkins in your browser using the service IP and port, then log in using the retrieved admin password.</p><h3>5.2 Configuring the Cloud</h3><p>Once Jenkins is installed, configure it to use Kubernetes for dynamic agent provisioning:</p><ol><li><strong>Install the Kubernetes Plugin</strong>: Go to <strong>Manage Jenkins</strong> &gt; <strong>Manage Plugins</strong> and install the <strong>Kubernetes Plugin</strong>. 
This plugin allows Jenkins to communicate with your cluster and provision agents on-demand.</li><li><strong>Configure Kubernetes Cloud</strong>:</li></ol><ul><li>Navigate to <strong>Manage Jenkins</strong> &gt; <strong>Configure System</strong>.</li><li>Scroll down to <strong>Cloud</strong> and click <strong>Add a new cloud</strong> &gt; <strong>Kubernetes</strong>.</li><li>Provide the <strong>Kubernetes API URL</strong>, <strong>Jenkins URL</strong>, and configure the <strong>Kubernetes Service Account</strong> so Jenkins can manage pods.</li></ul><p><strong>3. Create Pod Templates</strong>: Pod templates define what containers are included in each Jenkins agent pod. You can create different templates for different types of jobs, for example:</p><ul><li>Maven builds</li><li>Docker image builds</li><li>Helm deployments</li></ul><h3>6. Jenkinsfile-Based CI/CD Pipeline Implementation</h3><p>With Jenkins configured to use Kubernetes, the next step is to set up CI/CD pipelines that build and deploy applications to Kubernetes.</p><p>A <strong>Jenkinsfil</strong>e allows you to describe your entire pipeline <em>— build, test, and deployment </em>as code, making it version-controlled, repeatable, and easy to maintain.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/0*dRSOPfqg481_5UD0.png\" /></figure><h3>6.1 Configuring Jenkins Pipeline for Kubernetes</h3><p>A<strong> Jenkinsfile</strong> defines <strong>what steps your pipeline runs</strong> and <strong>where they run</strong>.<br>When using Kubernetes integration, Jenkins dynamically creates a <strong>pod-based agent</strong> for each pipeline execution.</p><p>Here’s an example of a <strong>Jenkinsfile</strong> that uses Kubernetes agents and deploys an application to a Kubernetes cluster:</p><pre>pipeline {<br>    agent {<br>        kubernetes {<br>            label &#39;my-k8s-agent&#39;<br>            defaultContainer &#39;jnlp&#39;<br>            yaml &#39;&#39;&#39;<br>            
apiVersion: v1<br>            kind: Pod<br>            spec:<br>              containers:<br>              - name: maven<br>                image: maven:3.9.6-eclipse-temurin-17<br>                command:<br>                - cat<br>                tty: true<br>              - name: kubectl<br>                image: bitnami/kubectl:latest<br>                command:<br>                - cat<br>                tty: true<br>            &#39;&#39;&#39;<br>        }<br>    }<br>    stages {<br>        stage(&#39;Build&#39;) {<br>            steps {<br>                container(&#39;maven&#39;) {<br>                    sh &#39;mvn clean install&#39;<br>                }<br>            }<br>        }<br>        stage(&#39;Test&#39;) {<br>            steps {<br>                container(&#39;maven&#39;) {<br>                    sh &#39;mvn test&#39;<br>                }<br>            }<br>        }<br>        stage(&#39;Deploy to Kubernetes&#39;) {<br>            steps {<br>                container(&#39;kubectl&#39;) {<br>                    sh &#39;kubectl apply -f deployment.yaml&#39;<br>                }<br>            }<br>        }<br>    }<br>}</pre><p><strong>What’s happening here?</strong></p><ul><li>Jenkins creates a <strong>temporary Kubernetes pod</strong> for this pipeline run</li><li>The pod includes multiple containers (Maven for build/test,<strong> kubectl</strong> for deployment)</li><li>Each stage runs in the most appropriate container</li><li>After the pipeline finishes, the pod is automatically destroyed</li></ul><p>This approach keeps builds <strong>clean, isolated, and scalable</strong>.</p><h3>6.2 Automating Deployments to Kubernetes</h3><p>In the pipeline above, the <strong>Deploy to Kubernetes</strong> stage uses kubectl to apply Kubernetes manifests.<br>These YAML files typically define resources such as:</p><ul><li>Deployments</li><li>Services</li><li>ConfigMaps</li><li><strong>Ingress</strong></li></ul><p>Because deployment happens only after 
successful build and test stages, Jenkins ensures that <strong>only validated artifacts</strong> reach your Kubernetes cluster.</p><p>This automation removes manual deployment steps and enables fast, consistent releases.</p><h3>6.3 Deploying Applications with Helm</h3><p>While kubectl apply works well, managing multiple YAML files can become difficult as applications grow.<br>This is where <strong>Helm</strong> becomes extremely useful.</p><p>Helm allows you to:</p><ul><li>Package Kubernetes resources into reusable charts</li><li>Version deployments</li><li>Easily upgrade or roll back releases</li></ul><p>Here’s a simple<strong> Jenkinsfile </strong>example that deploys an application using Helm:</p><pre>pipeline {<br>    agent any<br>    stages {<br>        stage(&#39;Build&#39;) {<br>            steps {<br>                sh &#39;mvn clean install&#39;<br>            }<br>        }<br>        stage(&#39;Deploy to Kubernetes with Helm&#39;) {<br>            steps {<br>                sh &#39;helm upgrade --install myapp ./helm-chart/&#39;<br>            }<br>        }<br>    }<br>}</pre><p>With Helm:</p><ul><li>Application configuration becomes cleaner</li><li>Environment-specific values are easier to manage</li><li>Production deployments are more predictable</li></ul><h3>7. Best Practices for Jenkins Kubernetes CI/CD Pipelines</h3><p>To get the most out of Jenkins and Kubernetes, it’s important to follow a few proven best practices. 
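</p><p>As a concrete sketch of the secret-management practice covered below, registry credentials can be stored as a Kubernetes secret instead of being hard-coded in a Jenkinsfile (the secret name regcred and the credential values are placeholders):</p><pre>kubectl create secret docker-registry regcred --docker-server=https://index.docker.io/v1/ --docker-username=&lt;user&gt; --docker-password=&lt;password&gt; --namespace jenkins</pre><p>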
These help keep your pipelines scalable, secure, and easy to maintain as workloads grow.</p><ul><li><strong>Use Pod Templates</strong>: Define reusable pod templates for different job types to avoid duplication.</li><li><strong>Run Each Job in an Isolated Pod</strong>: Each Jenkins job should run in an isolated pod to ensure that builds are clean and independent.</li><li><strong>Leverage Auto-scaling</strong>: Enable auto-scaling in Kubernetes to dynamically adjust the number of nodes based on Jenkins job demand.</li><li><strong>Manage Secrets Securely</strong>: Use Kubernetes secrets to securely manage credentials and sensitive information.</li><li><strong>Use Helm</strong>: Package your application as a Helm chart to simplify deployment and versioning.</li></ul><h3>8. Monitoring and Scaling Jenkins CI/CD Pipelines on Kubernetes</h3><p>As CI/CD pipelines grow in complexity and usage, monitoring and scaling become critical to maintaining performance and reliability. Kubernetes makes this much easier by providing built-in scalability and strong observability integrations.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/0*PBVgqZxIYq0AUFam.png\" /></figure><h3>Monitoring Jenkins</h3><ul><li><strong>Jenkins Dashboard</strong>: The Jenkins dashboard gives a quick, high-level view of pipeline executions, build history, and agent activity. It’s useful for tracking failed jobs, build durations, and overall pipeline health.</li><li><strong>Prometheus and Grafana: </strong>For deeper visibility, Jenkins can be integrated with Prometheus and Grafana. 
This allows teams to monitor resource usage of Jenkins controllers and agents, build and job execution metrics, and pod and node performance inside the Kubernetes cluster. Grafana dashboards make it easy to visualize trends, detect bottlenecks, and proactively address performance issues before they impact deployments.</li></ul><h3>Scaling Jenkins with Kubernetes</h3><p>Kubernetes enables Jenkins to scale automatically based on workload demand. Jenkins agents can be created or destroyed as pods, allowing the CI/CD system to handle sudden spikes in build traffic without manual intervention.</p><p>By combining Kubernetes auto-scaling with proper monitoring, teams can ensure that:</p><ul><li>Builds remain fast during peak usage</li><li>Infrastructure costs stay optimized</li><li><strong>CI/CD pipelines</strong> remain reliable and resilient</li></ul><h3>Conclusion</h3><p>Integrating <strong>Jenkins with Kubernetes</strong> creates a modern, cloud-native CI/CD platform that is scalable, efficient, and production-ready. By running Jenkins agents as Kubernetes pods, teams can dynamically provision build environments, optimize resource usage, and eliminate the limitations of static build agents.</p><p>Kubernetes features such as pod isolation, auto-scaling, and Helm-based deployments allow Jenkins pipelines to remain clean, reliable, and easy to manage as applications grow. 
This integration enables seamless automation — from code commits and builds to testing and deployment directly into Kubernetes clusters.</p><p>By combining <strong>Jenkins and Kubernetes</strong>, you can build CI/CD pipelines that are faster, more resilient, and ready for real-world production workloads — making continuous delivery a natural part of your <strong>DevOps workflow.</strong></p><hr><p><a href=\"https://blog.devops.dev/end-to-end-ci-cd-pipeline-using-jenkins-and-kubernetes-99be6542a07f\">End-to-End CI/CD Pipeline Using Jenkins and Kubernetes</a> was originally published in <a href=\"https://blog.devops.dev\">DevOps.dev</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>","url":"https://blog.devops.dev/end-to-end-ci-cd-pipeline-using-jenkins-and-kubernetes-99be6542a07f?source=rss-3302bbae903c------2","hash":"sync-hash","mediumUsername":"bhagirath00","createdAt":"2026-05-07T04:44:53.867Z","updatedAt":"2026-05-07T04:45:11.033Z","tags":[]},{"id":"cmov05i830003u1r4gbnl8p9h","title":"How Git Stores Files Internally to Saves Space in Your Repository","slug":"how-git-stores-files-internally-to-saves-space-in-your-repository-61c5f9e25d6a","publishedAt":"2026-01-15T04:16:56.000Z","readingTime":null,"thumbnail":"https://cdn-images-1.medium.com/max/1024/1*0G_JPNH0pFrZdhkHx_xM9g.png","excerpt":"Learn how Git stores files internally using snapshots, blobs, trees, and hashing to avoid duplication and save repository space efficiently.Git is the most wide
version control system in the world, and one of the key reasons for its popularity is its <strong>highly efficient storage model</strong>. At first glance, Git appears to store a complete copy of your project every time you commit. Yet repositories remain compact even after thousands of commits.</p><p>So how does Git record a full snapshot at every commit while still saving disk space?</p><p>In this article, we will explore <strong>how Git stores files internally</strong>, how it <strong>avoids unnecessary duplication</strong>, and why its storage mechanism is both <strong>fast and space-efficient</strong>. By the end, you will clearly understand how Git manages file data under the hood and why it scales so well for large projects.</p><h3>Overview: How Git Stores Data Efficiently</h3><p>Unlike traditional version control systems such as Subversion (SVN), which store <strong>file differences</strong> between versions, Git takes a fundamentally different approach.</p><p>Git stores <strong>snapshots of the entire project state</strong> at every commit.</p><p>However, Git is smart enough <strong>not to duplicate unchanged data</strong>. If a file has not changed between commits, Git simply <strong>reuses the previously stored version</strong> instead of saving a new copy. This design enables Git to deliver:</p><ul><li>Faster operations (branching, merging, checkout)</li><li>Reduced disk usage</li><li>Strong data integrity and reliability</li></ul><h3>1. How Git Stores Data Using Snapshots Instead of File Differences</h3><p>Most version control systems track <strong>line-by-line changes</strong> over time. 
Git does not.</p><p>Every time you create a commit, Git records a <strong>snapshot of the entire file structure</strong> at that moment.</p><h3>What Happens When Files Don’t Change?</h3><p>If a file remains unchanged between commits:</p><ul><li>Git does <strong>not</strong> store the file again</li><li>Git simply creates a reference to the existing stored content</li></ul><p>This means Git behaves like a <strong>content-addressable filesystem</strong>, where identical content is stored once and referenced many times.</p><h3>Why This Matters</h3><p>This snapshot model allows Git to:</p><ul><li>Instantly switch between branches</li><li>Perform fast merges</li><li>Avoid recalculating diffs repeatedly</li></ul><h3>2. Git Object Model: How Files Are Stored Internally</h3><p>Git stores all repository data as <strong>objects</strong> inside the .git/objects directory. Each object is identified by a <strong>cryptographic hash</strong> based on its content.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*W1fbzO4cWgnJZiEozg2Hig.png\" /></figure><p>There are four primary object types in Git:</p><ul><li><strong>Blob</strong> — File contents</li><li><strong>Tree</strong> — Directory structure</li><li><strong>Commit</strong> — A snapshot with metadata</li><li><strong>Tag</strong> — Named references to commits</li></ul><h3>2.1 Blob Objects: File Content Storage</h3><p>A <strong>blob (Binary Large Object)</strong> represents the <strong>raw content of a file</strong>.</p><p>Key characteristics of blobs:</p><ul><li>Store file data only (no filename or permissions)</li><li>Identical file contents result in <strong>identical blob hashes</strong></li><li>Stored only once, regardless of how many commits reference them</li></ul><h3>Why Blobs Enable De-duplication</h3><p>If two files — or the same file across commits — have identical content:</p><ul><li>Git stores <strong>one blob</strong></li><li>Multiple commits point to the same blob</li></ul><p>This is the 
foundation of Git’s space-saving mechanism.</p><p>You can list the blobs a commit references using:</p><pre>git ls-tree &lt;commit-hash&gt;</pre><h3>2.2 Tree Objects: Directory Structures</h3><p>A <strong>tree object</strong> represents a directory in your project.</p><p>It contains:</p><ul><li>File names</li><li>File permissions</li><li>References to blob objects</li><li>References to other tree objects (subdirectories)</li></ul><p>Each directory in your project maps to a tree object, allowing Git to recreate the complete filesystem structure for any commit.</p><h3>2.3 Commit Objects: Snapshots in Time</h3><p>A <strong>commit object</strong> ties everything together.</p><p>It contains:</p><ul><li>A reference to the root tree</li><li>Author and committer information</li><li>Commit message</li><li>Parent commit(s)</li></ul><h3>Commit Structure Example</h3><pre>Commit<br>└── Tree (Root Directory)<br>    ├── Blob (File 1)<br>    ├── Blob (File 2)<br>    └── Tree (Subdirectory)<br>        ├── Blob (File 3)<br>        └── Blob (File 4)</pre><p>Each commit represents a <strong>complete snapshot</strong>, but most data is reused from earlier commits.</p><h3>3. Inside the .git Directory: Git’s Internal Storage and Control System</h3><p>The .git directory is the <strong>core of every Git repository</strong>. It stores all metadata, objects, and references.</p><h3>3.1 .git/objects/</h3><p>This directory stores all Git objects (blobs, trees, commits) in compressed form. Objects are named using their hash values.</p><h3>3.2 .git/refs/</h3><p>References to branches and tags live here. Each branch is simply a pointer to a commit.</p><h3>3.3 .git/index (Staging Area)</h3><p>The index tracks what will be included in the next commit. It bridges the gap between your working directory and the repository.</p><h3>3.4 .git/HEAD</h3><p>The HEAD file points to the currently checked-out branch or commit.</p><h3>4. 
How Git Uses Hashing, Compression, and De-duplication to Save Space</h3><p>Git’s efficiency comes from three core techniques.</p><h3>4.1 Content-Addressable Hashing</h3><p>Git computes a hash (SHA-1 by default, SHA-256 supported) for every object based on its content.</p><ul><li>Same content → same hash</li><li>Different content → different hash</li></ul><p>This guarantees data integrity and prevents duplication.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*ACX2vpNc8RrF25qdc2-KWA.png\" /></figure><h3>4.2 Object Compression</h3><p>Git compresses objects using <strong>zlib</strong>, reducing disk usage while maintaining fast access.</p><h3>4.3 Automatic De-duplication</h3><p>Git never stores the same content twice. If a file hasn’t changed:</p><ul><li>No new blob is created</li><li>Existing blobs are reused</li></ul><p>This is how Git <strong>duplicates files logically without duplicating data physically</strong>.</p><h3>5. From Working Directory to Commits: How Git Builds and Stores Snapshots</h3><p>To fully understand how Git duplicates files while saving space, it is essential to understand the <strong>three logical areas</strong> through which every change flows: the <strong>working directory</strong>, the <strong>staging area</strong>, and the <strong>commit history</strong>. These are not just conceptual layers — they directly influence how Git creates objects and reuses existing data.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*3knFXIGKg0zLFYwMb8lDqQ.png\" /></figure><h3>5.1 Working Directory</h3><p>The <strong>working directory</strong> is the actual project folder on your local machine. 
It contains real files that you edit using your editor or IDE.</p><p>Key characteristics:</p><ul><li>Files here exist <strong>outside</strong> of Git’s object database</li><li>Changes are not tracked automatically</li><li>Git does not store anything permanently at this stage</li></ul><p>When you modify a file in the working directory:</p><ul><li>Git detects the change</li><li>No new blob is created yet</li><li>No disk space inside .git/objects is used</li></ul><p>This design allows Git to remain fast and lightweight while you experiment with changes.</p><h3>5.2 Staging Area (Index)</h3><p>The <strong>staging area</strong>, also called the <strong>index</strong>, is where Git begins its internal storage optimization.</p><p>When you run:</p><pre>git add &lt;file&gt;</pre><p>Git performs the following actions:</p><ul><li>Reads the file content from the working directory</li><li>Computes a hash based on the content</li><li>Checks whether an identical blob already exists</li><li>Reuses the existing blob or creates a new one if needed</li><li>Records the blob reference in .git/index</li></ul><p>Important details:</p><ul><li>The staging area stores <strong>references</strong>, not copies</li><li>Unchanged files reuse existing blob objects</li><li>Partial staging is supported, allowing fine-grained commits</li></ul><p>This is where Git’s <strong>de-duplication logic</strong> begins to take effect.</p><h3>5.3 Commit History</h3><p>When you run:</p><pre>git commit</pre><p>Git creates a <strong>commit object</strong>, which includes:</p><ul><li>A reference to a tree object</li><li>Metadata (author, timestamp, message)</li><li>A reference to the parent commit</li></ul><p>Crucially:</p><ul><li>Git does <strong>not</strong> duplicate file content</li><li>The new tree references existing blobs whenever possible</li><li>Only changed files produce new blobs</li></ul><p>Each commit represents a <strong>complete snapshot</strong>, but internally, most data is shared across commits. 
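</p><p>This de-duplication is easy to observe with Git’s plumbing commands. An object’s ID is the SHA-1 of a blob header plus the file content, so hashing the same content always yields the same ID (an illustrative session; the hash shown is the blob ID for the six-byte content &#39;hello&#39; plus a newline):</p><pre>$ echo &#39;hello&#39; | git hash-object --stdin<br>ce013625030ba8dba906f756967f9e9ca394464a<br>$ echo &#39;hello&#39; | git hash-object --stdin<br>ce013625030ba8dba906f756967f9e9ca394464a<br>$ printf &#39;blob 6\\0hello\\n&#39; | sha1sum<br>ce013625030ba8dba906f756967f9e9ca394464a  -</pre><p>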
This allows Git to maintain a full project history without ballooning repository size.</p><h3>6. Exploring Git’s Internals Using Low-Level Git Commands</h3><p>One of Git’s strengths is transparency. Git provides low-level commands that allow you to <strong>inspect its internal object database</strong>, making it easier to understand how files are stored and reused.</p><p>These commands are especially valuable for developers who want to understand Git beyond everyday workflows.</p><h3>6.1 git cat-file: Viewing Raw Git Objects</h3><p>The git cat-file command allows you to inspect any Git object directly.</p><p>To view a commit object:</p><pre>git cat-file -p &lt;object-hash&gt;</pre><p>This displays:</p><ul><li>The referenced tree</li><li>Parent commit</li><li>Author and committer details</li><li>Commit message</li></ul><p>You can also inspect blob objects to see file content exactly as Git stores it, confirming that identical content is reused across commits.</p><h3>6.2 git ls-tree: Exploring Tree Structures</h3><p>The git ls-tree command shows how a commit or tree maps to files and directories.</p><pre>git ls-tree &lt;commit-hash&gt;</pre><p>Output includes:</p><ul><li>File permissions</li><li>Object type (blob or tree)</li><li>Object hash</li><li>File or directory name</li></ul><p>This command clearly demonstrates how Git builds directory snapshots using <strong>tree objects that reference blob objects</strong>, without duplicating data.</p><h3>6.3 git rev-parse: Resolving References to Hashes</h3><p>The git rev-parse command helps resolve symbolic references into their actual object hashes.</p><pre>git rev-parse HEAD</pre><p>Use cases include:</p><ul><li>Verifying which commit a branch points to</li><li>Debugging detached HEAD states</li><li>Understanding reference resolution</li></ul><p>This reinforces the idea that <strong>branches and tags are lightweight pointers</strong>, not copies of data.</p><h3>Conclusion: Why Git’s Storage Model Is So 
Powerful</h3><p>Git’s ability to duplicate files logically without duplicating data physically is the cornerstone of its performance and scalability. By storing content as immutable, hashed objects and reusing them across commits, Git ensures that repositories remain fast and space-efficient — even with extensive histories.</p><h3>Key Takeaways</h3><ul><li>Git stores <strong>snapshots</strong>, not file diffs</li><li>Identical file content is stored <strong>only once and reused</strong></li><li>Blobs, trees, and commits form Git’s object model</li><li>The .git directory contains all internal data</li><li>Hashing and compression ensure integrity and efficiency</li></ul><p>Understanding Git’s internal storage model gives you deeper confidence when working with branches, rebases, merges, and large repositories. It also explains why Git continues to outperform traditional version control systems in both speed and reliability.</p><hr><p><a href=\"https://blog.devops.dev/how-git-stores-files-internally-to-saves-space-in-your-repository-61c5f9e25d6a\">How Git Stores Files Internally to Saves Space in Your Repository</a> was originally published in <a href=\"https://blog.devops.dev\">DevOps.dev</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>","url":"https://blog.devops.dev/how-git-stores-files-internally-to-saves-space-in-your-repository-61c5f9e25d6a?source=rss-3302bbae903c------2","hash":"sync-hash","mediumUsername":"bhagirath00","createdAt":"2026-05-07T04:44:54.531Z","updatedAt":"2026-05-07T04:45:18.430Z","tags":[]},{"id":"cmov05iki0004u1r45jy9s678","title":"OpsGuard: A Production-Ready Status Page with Modern DevOps 
Practices","slug":"opsguard-a-production-ready-status-page-with-modern-devops-practices-a37e1c14a71d","publishedAt":"2025-12-31T05:46:42.000Z","readingTime":null,"thumbnail":"https://cdn-images-1.medium.com/max/1024/1*eqa9tnFrTGuluPdxW5cLxg.png","excerpt":"From Code to Cloud: Building a Self-Healing Status Page on AWS EKSHow I built an enterprise-grade infrastructure monitoring platform using Kubernetes, Terraform...","content":"<h3>From Code to Cloud: Building a Self-Healing Status Page on AWS EKS</h3><p><strong><em>How I built an enterprise-grade infrastructure monitoring platform using Kubernetes, Terraform, and GitOps — from concept to deployment on AWS</em></strong></p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*eqa9tnFrTGuluPdxW5cLxg.png\" /></figure><h3><strong>Introduction: Why I Built OpsGuard</strong></h3><p>Every modern organization needs a way to communicate service health to their customers. When AWS, GitHub, or Cloudflare experience issues, they don’t leave users guessing — they have public status pages that provide real-time updates. But what if you could build your own?</p><p>That’s exactly what I set out to do with <strong>OpsGuard</strong> — a self-hosted, production-ready status page that combines real-time infrastructure monitoring with automated incident management. 
More importantly, I wanted to build it using industry-standard DevOps practices that enterprises actually use in production.</p><p>This blog post walks through my journey of building OpsGuard, the architectural decisions I made, the challenges I faced, and how modern DevOps practices like <strong>GitOps</strong>, <strong>Infrastructure as Code (IaC)</strong>, and <strong>DevSecOps</strong> came together to create a robust, scalable solution.</p><h3><strong>The Architecture: Designing for Scale and Reliability</strong></h3><p>Before writing a single line of code, I spent considerable time designing an architecture that would be both scalable and maintainable. The key principle was <strong>separation of concerns</strong> — each component should do one thing well.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*Ae-vBBcSTWrfiggPEgw1nA.png\" /></figure><h4><strong>The Application Layer</strong></h4><p>The application consists of three main services:</p><p>1. <strong>Frontend (React + TypeScript)</strong>: A responsive, real-time dashboard that displays service health, active incidents, and historical uptime metrics. Built with Vite for optimal performance.</p><p>2. <strong>Backend API (FastAPI + Python)</strong>: The core REST API that handles all business logic, from processing health check results to managing incidents. FastAPI was chosen for its excellent async support and automatic OpenAPI documentation.</p><p>3. <strong>Worker Service</strong>: A background service that performs automated health checks every 30 seconds. It monitors configured endpoints and updates service status in real-time.</p><h4><strong>The Data Layer</strong></h4><p>For data persistence, I chose:</p><ul><li><strong>PostgreSQL (Amazon RDS)</strong>: For storing services, incidents, and historical data. 
RDS provides automatic backups, multi-AZ failover, and managed updates.</li><li><strong>Redis (ElastiCache)</strong>: For caching frequently accessed data and managing real-time WebSocket connections.</li></ul><h4><strong>Why Kubernetes?</strong></h4><p>Kubernetes might seem like overkill for a status page, but it provides critical capabilities:</p><ul><li><strong>Self-healing</strong>: If a pod crashes, Kubernetes automatically restarts it</li><li><strong>Rolling updates</strong>: Zero-downtime deployments</li><li><strong>Horizontal Pod Autoscaler (HPA)</strong>: Automatic scaling based on load</li><li><strong>Resource limits</strong>: Prevent any single service from consuming all resources</li></ul><h3><strong>Infrastructure as Code: Reproducible Deployments with Terraform</strong></h3><p>One of my core principles was that <strong>infrastructure should be code</strong>. No clicking around in the AWS console — everything defined in Terraform.</p><pre>module &quot;eks&quot; {<br>  source       = &quot;./modules/eks&quot;<br>  cluster_name = &quot;opsguard-prod&quot;<br>  subnet_ids   = module.vpc.private_subnet_ids<br>  # ... 
configuration<br>}</pre><h4><strong>What Terraform Manages</strong></h4><p>My Terraform configuration provisions:</p><ul><li><strong>VPC</strong> with public and private subnets across 3 availability zones</li><li><strong>EKS Cluster</strong> with managed node groups</li><li><strong>RDS PostgreSQL</strong> instance with encryption at rest</li><li><strong>ECR Repositories</strong> for container images</li><li><strong>Security Groups</strong> following least-privilege principles</li><li><strong>IAM Roles</strong> with minimal required permissions</li></ul><h3><strong>The CI/CD Pipeline: From Code to Production</strong></h3><p>Every commit to the main branch triggers a comprehensive pipeline that builds, tests, scans, and deploys the application.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*6dVBTSVJAyTXB_387I6k8Q.png\" /></figure><h4><strong>Pipeline Stages</strong></h4><p>1. <strong>Build</strong>: Docker images are built using multi-stage builds to minimize image size</p><p>2. <strong>Unit Tests</strong>: Pytest runs the test suite with coverage reporting</p><p>3. <strong>Security Scan (Bandit)</strong>: Static analysis catches common Python security issues</p><p>4. <strong>Dependency Check (Safety)</strong>: Identifies vulnerable dependencies</p><p>5. <strong>Container Scan (Trivy)</strong>: Scans Docker images for CVEs</p><p>6. <strong>Push to ECR</strong>: Images are tagged and pushed to Amazon ECR</p><p>7. 
<strong>Deploy via ArgoCD</strong>: GitOps handles the actual deployment</p><h4><strong>DevSecOps: Security Built In, Not Bolted On</strong></h4><p>Security isn’t an afterthought — it’s integrated into every stage:</p><ul><li><strong>Bandit</strong> catches hardcoded credentials and SQL injection vulnerabilities</li><li><strong>Safety</strong> alerts on vulnerable Python packages</li><li><strong>Trivy</strong> scans container images for known CVEs</li><li><strong>SonarQube</strong> provides code quality and security analysis</li></ul><p>If any security check fails, the pipeline stops. No exceptions.</p><h3><strong>GitOps with ArgoCD: The Future of Deployments</strong></h3><p>Traditional CI/CD pipelines push changes to production. GitOps flips this model — ArgoCD <strong>pulls</strong> the desired state from Git and ensures the cluster matches.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*ywyvU5fa_TYKh8Wn5HSlDA.png\" /></figure><h4><strong>How It Works</strong></h4><p>1. The CI pipeline updates the image tag in the Kubernetes manifests</p><p>2. ArgoCD detects the change in the Git repository</p><p>3. ArgoCD applies the changes to the cluster</p><p>4. If someone manually changes something in the cluster, ArgoCD reverts it</p><h4>This provides:</h4><ul><li><strong>Audit trail</strong>: Every change is a Git commit</li><li><strong>Easy rollbacks</strong>: `git revert` and ArgoCD handles the rest</li><li><strong>Drift detection</strong>: No more “configuration drift” in production</li><li><strong>Self-healing</strong>: The cluster always matches the Git repository</li></ul><h3><strong>Observability: You Can’t Fix What You Can’t See</strong></h3><p>A production system without observability is flying blind. 
OpsGuard uses the industry-standard Prometheus + Grafana stack.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*W-vyWWYYUMPuFRvVeJMKdA.png\" /></figure><h4><strong>Metrics Collection</strong></h4><p>Prometheus scrapes metrics from every component:</p><ul><li><strong>Application metrics</strong>: Request rates, latency percentiles, error rates</li><li><strong>Kubernetes metrics</strong>: Pod CPU/memory, node health, deployment status</li><li><strong>Custom metrics</strong>: Health check results, incident counts, uptime percentages</li></ul><h4><strong>Dashboards and Alerts</strong></h4><p>Grafana provides pre-configured dashboards showing:</p><ul><li>Cluster resource utilization</li><li>Application performance (RED metrics: Rate, Errors, Duration)</li><li>OpsGuard-specific metrics (service status, uptime)</li></ul><p>AlertManager routes critical alerts to Slack, email, or PagerDuty based on severity.</p><h3><strong>Container Security: Defense in Depth</strong></h3><p>Security at the container level required careful attention:</p><pre>securityContext:<br>  runAsNonRoot: true<br>  readOnlyRootFilesystem: true<br>  allowPrivilegeEscalation: false<br>  capabilities:<br>    drop:<br>      - ALL</pre><h4><strong>Key Security Measures</strong></h4><ul><li><strong>Non-root containers</strong>: Containers run as unprivileged users</li><li><strong>Read-only filesystems</strong>: Prevents attackers from modifying binaries</li><li><strong>Dropped capabilities</strong>: Containers can’t perform privileged operations</li><li><strong>Resource limits</strong>: Prevents resource exhaustion attacks</li><li><strong>Network policies</strong>: Pods can only communicate with allowed services</li></ul><h3><strong>Conclusion:</strong></h3><p>OpsGuard represents what’s possible when modern DevOps practices come together. <strong>Infrastructure as Code</strong> ensures reproducibility. <strong>CI/CD pipelines</strong> automate quality gates. 
<strong>GitOps</strong> provides declarative deployments. <strong>Observability</strong> enables proactive operations.</p><p>The project is open source on <a href=\"https://github.com/Bhagirath00/OpsGuard\">GitHub</a>. Whether you’re building your own status page or learning DevOps practices, I hope OpsGuard serves as a useful reference.</p>","url":"https://medium.com/@bhagirath00/opsguard-a-production-ready-status-page-with-modern-devops-practices-a37e1c14a71d?source=rss-3302bbae903c------2","hash":"sync-hash","mediumUsername":"bhagirath00","createdAt":"2026-05-07T04:44:54.978Z","updatedAt":"2026-05-07T04:45:21.576Z","tags":[]},{"id":"cmov05ivo0005u1r4k75ijvl9","title":"Why a Good README.md Matters More Than Your Code","slug":"why-a-good-readme-md-matters-more-than-your-code-e8d6acf9f4f6","publishedAt":"2025-12-01T09:31:55.000Z","readingTime":null,"thumbnail":"https://cdn-images-1.medium.com/max/1024/1*8jQCGI_fjxlJBlTYVJh2dQ.png","excerpt":"Is your repository a ghost town? Discover why the README.md is the most critical file in your project.The “Black Box” ProblemImagine you are shopping for a new ...","content":"<blockquote>Is your repository a ghost town? Discover why the README.md is the most critical file in your project.</blockquote><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*8jQCGI_fjxlJBlTYVJh2dQ.png\" /></figure><h3>The “Black Box” Problem</h3><p>Imagine you are shopping for a new laptop online. You click on a product that looks promising, but the page has no photos, no spec sheet, and no price. It just has a button that says “Buy Now.”</p><p>Would you click it? Of course not. 
You have no idea what you are getting into.</p><p>In the world of software development, your <strong>GitHub or GitLab repository is the product page</strong>, and your <strong>README.md is the sales pitch</strong>.</p><p>Too many developers fall into the “Black Box” trap. They spend hundreds of hours writing elegant, highly optimized algorithms, pushing perfectly tested code to the src folder, and then leave the root directory empty. They assume the code speaks for itself.</p><p>Code never speaks for itself. Unless a user can understand what your project does, how to install it, and why it matters in under 30 seconds, your code effectively does not exist.</p><p>This guide moves beyond theory. We’ll look at the architecture of documentation, visualize the user journey, and cover the exact syntax you need to turn a dead repository into a thriving open-source project.</p><h3>1. The Visual Impact: Before vs. After</h3><p>Let’s look at a concrete example. We have a hypothetical library called Data-Muncher, a simple Python script that cleans CSV files.</p><h4>Scenario A: The “Ghost Town” (No README)</h4><p>When a recruiter or developer lands on this repository, this is all they see:</p><pre>📁 Data-Muncher /<br>├── 📁 src /<br>│   └── main.py<br>├── 📁 tests /<br>│   └── test_main.py<br>├── .gitignore<br>└── requirements.txt</pre><p><strong>The User Experience:</strong></p><ul><li><strong>Confusion:</strong> “What does this do? Does it munch data? 
Is it for SQL or CSV?”</li><li><strong>Frustration:</strong> “I have to read the source code to figure out how to run it.”</li><li><strong>Action:</strong> The user hits the “Back” button and finds a competitor.</li></ul><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*0RYeU8uRgcjduQMeCWvg7A.jpeg\" /></figure><h4>Scenario B: The “Professional Product” (With README)</h4><p>Now, look at the exact same code, but with a structured README.md.</p><p>The directory now looks like this, but the <em>rendering</em> on GitHub presents a beautiful interface:</p><pre># 🦁 Data-Muncher<br><br>![Build Status](https://img.shields.io/badge/build-passing-brightgreen)<br>![Version](https://img.shields.io/badge/version-1.0.2-blue)<br>![License](https://img.shields.io/badge/license-MIT-green)<br><br>&gt; A lightning-fast Python library to clean messy CSV files 10x faster than Pandas.<br><br>## 🚀 Features<br>- Removes duplicates automatically.<br>- Normalizes date formats (ISO-8601).<br>- zero-dependency architecture.<br><br>## 📦 Installation<br>```<br>pip install data-muncher<br>```</pre><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/722/1*Tqtz2yAge4JQDbisPjO0bw.png\" /></figure><p><strong>The User Experience:</strong></p><ul><li><strong>Clarity:</strong> They know exactly what it is immediately.</li><li><strong>Trust:</strong> The “build passing” badge proves it works.</li><li><strong>Ease:</strong> They can copy-paste the installation command.</li></ul><h3>2. The “5-Second Rule”</h3><p>In UX design, we often talk about the <em>Time to Hello World</em> (TT-HW). 
This is the time it takes for a new user to land on your repo and get the code running on their machine.</p><p>If your TT-HW is longer than 5 minutes, you lose 80% of your potential users.</p><h4>The User Decision Flowchart</h4><p>Below is a diagram illustrating the mental process a developer goes through when evaluating your library.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*IKXHI1UBkGw_PIzGNo_GOQ.jpeg\" /></figure><p>A good README removes the “No” branches from this flowchart. It streamlines the path to the “Star the Repo” outcome.</p><h3>3. The Technical Anatomy of a Perfect README</h3><p>A professional README isn’t just a wall of text; it is structured data using Markdown. Here are the essential components and the syntax to create them.</p><h4>A. The Header and Elevator Pitch</h4><p>Don’t start with “Introduction.” Start with the name and a hook.</p><p><strong>Code Syntax:</strong></p><pre># Project Name<br>**The one-line elevator pitch goes here.** *Example: &quot;The only React Native boilerplate you will ever need.&quot;*</pre><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/707/1*yZEIjiOmiz8M0K8DkwNQUg.png\" /></figure><h4>B. Shields (Badges)</h4><p>Badges are the “Social Proof” of open source. They tell the user that the project is alive, maintained, and licensed. You don’t need complex code for this; you use markdown image links.</p><p><strong>Code Syntax:</strong></p><pre>![License](https://img.shields.io/badge/License-MIT-green.svg)<br>![Downloads](https://img.shields.io/badge/downloads-10k%2Fmonth-blue)</pre><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/614/1*i_brla_5aVsnIuABzLlUUw.png\" /></figure><h4>C. The Visual Demo (Show, Don’t Tell)</h4><p>If you are building a UI, a GIF is mandatory. 
If you are building a CLI (Command Line Interface), a screenshot of the terminal is mandatory.</p><p><strong>Why?</strong> The human brain processes images 60,000x faster than text.</p><p><strong>Code Syntax:</strong></p><pre>![App Demo GIF](./assets/demo.gif)<br>*Caption: Seeing the app in dark mode.*</pre><h4>D. The Quick Start (Copy-Paste Ready)</h4><p>This is the most crucial technical section. Do not describe how to install it; give the command. Use “Code Fences” (triple backticks) to allow users to copy the code easily.</p><p><strong>Bad Documentation:</strong></p><blockquote><em>“To install, you need to open your terminal and run the npm install command for our package.”</em></blockquote><p><strong>Good Documentation:</strong></p><pre>npm install my-awesome-package<br># or<br>yarn add my-awesome-package</pre><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/725/1*ovZW4OBalcTDhNwQsgfDTw.png\" /></figure><h3>4. README Driven Development (RDD)</h3><p>Most developers write the code first!</p><p><strong>README Driven Development (RDD)</strong> suggests that you should write the README <em>before</em> you write a single line of code.</p><h4>How RDD Works:</h4><ol><li><strong>Draft the README:</strong> Write down the hypothetical installation command and the API functions you <em>wish</em> existed.</li><li><strong>Reality Check:</strong> As you write the README, you might realize, <em>“Wait, this function requires 5 arguments. That is too complicated to explain.”</em></li><li><strong>Refactor Design:</strong> You simplify the design <em>before</em> coding it, simply because explaining the complex version in the README was too hard.</li></ol><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*Q7Zx78B2qv39g6cOZ5MZaw.jpeg\" /></figure><h3>5. Formatting Matters: Markdown Tricks for SEO and Readability</h3><p>A wall of plain text is hard to scan. 
You need to use Markdown features to create hierarchy and “scannability.” Search engines also prefer structured content, which helps SEO.</p><h4>Using Collapsible Sections</h4><p>If you have a long list of configurations, use the HTML &lt;details&gt; tag within your Markdown to keep the page clean.</p><p><strong>Code Syntax:</strong></p><pre>&lt;details&gt;<br>&lt;summary&gt;Click to view Advanced Configuration&lt;/summary&gt;<br><br>| Option | Type | Default |<br>|--------|------|---------|<br>| --verbose | bool | false |<br>| --dry-run | bool | false |<br><br>&lt;/details&gt;</pre><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/632/1*OkrhjlukXlbOvuEX8Htyag.png\" /></figure><h4>Using Tables for Data</h4><p>Don’t list arguments in paragraphs. Use tables. They are cleaner and look professional.</p><p><strong>Code Syntax:</strong></p><pre>| Method | Description | Returns |<br>|--------|-------------|---------|<br>| `.init()` | Starts the server | `void` |<br>| `.stop()` | Kills the process | `boolean` |</pre><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/598/1*wW-1otZA-D5A7CqeAORi3Q.png\" /></figure><h3>6. The “Bus Factor” and Maintenance</h3><p>Documentation is an insurance policy against the “Bus Factor.”</p><p>The “Bus Factor” is the minimum number of team members that have to be hit by a bus (or quit) before the project stops functioning because no one knows how it works.</p><p>If only you understand how to deploy the database, your project has a Bus Factor of 1. This is dangerous.</p><p>A good README acts as an “External Brain.” It remembers the setup steps so you don’t have to.</p><h4>Essential “Maintenance” Sections to Include:</h4><ul><li><strong>Development Setup:</strong> How to clone and run the repo locally.</li><li><strong>Testing:</strong> How to run the test suite (npm run test).</li><li><strong>Deployment:</strong> How the code gets to production.</li></ul><h3>7. 
SEO: Getting Your Repo Found</h3><p>You want your project to be found on Google, not just GitHub. The README.md is the primary source of content that Google crawls.</p><p><strong>SEO Checklist for READMEs:</strong></p><ol><li><strong>Keywords in H1/H2:</strong> If your project is a “JSON Parser,” ensure those words appear in the Title and Description.</li><li><strong>Alt Text for Images:</strong> Google cannot see images.</li></ol><ul><li><em>Bad:</em> ![image](img.png)</li><li><em>Good:</em> ![Screenshot of the JSON Parser Dashboard showing real-time metrics](img.png)</li></ul><p><strong>3. Linking:</strong> Link to your other projects or your portfolio. This creates a “backlink” structure that improves your ranking.</p><h3>Conclusion: Documentation is Empathy</h3><p>Ultimately, writing a good README is an act of empathy. It signals that you care about the person on the other side of the screen.</p><p>When a hiring manager looks at your portfolio, they aren’t going to clone your repo and audit your variable names. They are going to read your README.</p><ul><li><strong>Messy README</strong> = Messy Developer.</li><li><strong>Structured, Clear README</strong> = Senior Engineer potential.</li></ul><p>Don’t let your brilliant code die in the dark. 
Light it up with a README that sells, explains, and guides.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/706/1*njJRvcIZV1Y1Ykv4J97cfw.png\" /></figure>","url":"https://medium.com/@bhagirath00/why-a-good-readme-md-matters-more-than-your-code-e8d6acf9f4f6?source=rss-3302bbae903c------2","hash":"sync-hash","mediumUsername":"bhagirath00","createdAt":"2026-05-07T04:44:55.380Z","updatedAt":"2026-05-07T04:45:24.340Z","tags":[]},{"id":"cmov05j6t0006u1r4kp28h2tz","title":"How Kubernetes detects and restarts crashing pods automatically","slug":"how-kubernetes-detects-and-restarts-crashing-pods-automatcially-9c010618d877","publishedAt":"2025-11-26T16:00:11.000Z","readingTime":null,"thumbnail":"https://cdn-images-1.medium.com/max/1024/1*LbnUHrAjpfYhfXK8Gmcleg.png","excerpt":"Understand Kubernetes self-healing by exploring how the Kubelet, PLEG, probes, and CrashLoopBackOff work together to restart failing containers.Kubernetes is of...","content":"<blockquote>Understand Kubernetes self-healing by exploring how the Kubelet, PLEG, probes, and CrashLoopBackOff work together to restart failing containers.</blockquote><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*LbnUHrAjpfYhfXK8Gmcleg.png\" /></figure><p><strong>Kubernetes</strong> is often described as a <strong>self-healing system</strong> because it can automatically recover from <strong>application failures</strong>. When a <strong>container crashes</strong>, Kubernetes detects the failure and brings the workload back to a running state without manual intervention. But how does Kubernetes restart crashing pods?</p><p>A common assumption is that Kubernetes <strong>restarts Pods</strong> when something goes wrong. In reality, <strong>Pods are rarely restarted</strong>. 
Failure recovery happens at a much lower level and is driven by continuous monitoring of <strong>container processes</strong> on each <strong>node</strong>.</p><p>Containers can fail in multiple ways: a process may exit with a <strong>non-zero exit code</strong>, the <strong>operating system</strong> may terminate it due to <strong>memory limits</strong>, or the application may become <strong>unresponsive</strong>. Kubernetes observes these failures and decides <strong>whether and when a restart should occur</strong>, applying safeguards to prevent <strong>repeated crashes</strong> from overwhelming the system.</p><p>This blog post walks through how Kubernetes <strong>detects container failures</strong> and how <strong>restart decisions</strong> are made, starting at the <strong>node level</strong> and moving up to <strong>cluster-wide recovery</strong>.</p><h3><strong>1. </strong>How Kubernetes Detects Container Failures at the Node Level</h3><h4>1.1 Why Failure Detection Happens on the Node, Not the Control Plane</h4><p>To understand detection, we must look at the <strong>Node Level</strong>. The Kubernetes Control Plane (API Server) is often too far removed to handle immediate process failures. The heavy lifting is performed locally by the <strong>Kubelet</strong>.</p><p>The Kubelet continuously reconciles two states:</p><ul><li><strong>Desired State</strong>: What should be running on the node, as defined by Pod specifications received from the API Server.</li><li><strong>Actual State</strong>: What is currently running, as reported by the container runtime.</li></ul><p>This reconciliation is performed through a continuous control loop known as the SyncLoop.</p><h4>1.2 Kubelet <strong>SyncLoop</strong></h4><p>The SyncLoop is the core control loop of the Kubelet. 
Its job is to ensure that the actual state of containers always matches the desired state.</p><p>For example:</p><ul><li><strong>Desired State:</strong> “Run Nginx version 1.2.” (From API Server)</li><li><strong>Actual State:</strong> “Nginx is running.” (From Runtime)</li></ul><p>When a container crashes or exits, the actual state no longer matches the desired state. Detecting this change quickly is critical for Kubernetes to take corrective action.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*hEO8rNBcnCMgx_l1HTHJVA.png\" /></figure><h4>1.3 Why Polling Did Not Scale</h4><p>In early Kubernetes versions, the Kubelet relied heavily on <strong>polling</strong> the container runtime — repeatedly asking whether containers were still running. As node density increased to hundreds of Pods per node, this approach caused unnecessary CPU overhead and delayed failure detection.</p><p>This limitation led to the introduction of an event-driven mechanism.</p><h4>1.4 Pod Lifecycle Event Generator (PLEG)</h4><p><strong>PLEG</strong> is an internal Kubelet component responsible for detecting container state transitions efficiently.</p><ol><li><strong>Relisting:</strong> Periodically retrieves the full list of containers from the runtime.</li><li><strong>Comparison:</strong> Compares the current container state with the previous snapshot.</li><li><strong>Event Generation:</strong> When a change is detected — such as a container transitioning from <em>Running</em> to <em>Exited</em> — PLEG generates a lifecycle event (for example, ContainerDied).</li></ol><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*NydENw_gPdRQTecvzGQjkg.png\" /></figure><p>This event immediately notifies the Kubelet, allowing it to react without waiting for the next reconciliation cycle.</p><h3>2. 
How Kubernetes Detects Container Failures</h3><p>Kubernetes determines that a container has failed by observing <strong>specific signals</strong> from the Linux kernel, the container runtime, and the Kubelet’s health probes. Understanding these signals is essential for diagnosing failures and designing resilient workloads.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*DCzPvlnvcoo5eQ4ILzhQTw.png\" /></figure><h4>2.1 Process Exit (Crash)</h4><p>Every container runs a <strong>main process (PID 1)</strong>. When this process stops, it sends an <strong>exit code</strong> to the operating system:</p><ul><li><strong>Exit Code 0</strong>: The process finished successfully. Kubernetes considers the container <strong>Completed</strong>.</li><li><strong>Exit Code 1–255</strong>: The process crashed or threw an error. Kubernetes marks the container as <strong>Error</strong> via the <strong>CRI</strong>.</li></ul><p>Detecting non-zero exit codes allows the Kubelet to differentiate between successful completions and failures, forming the basis for restart decisions.</p><h4>2.2 OOMKilled Signal (Exit Code 137)</h4><p>One of the most common failure causes is being <strong>killed due to out-of-memory (OOM)</strong>:</p><ul><li><strong>Scenario:</strong> Your application tries to allocate 512MB of RAM, but the Pod’s resources.limits.memory is set to 256MB.</li><li><strong>Kernel Reaction:</strong> The Linux kernel cgroups mechanism enforces the memory limit and invokes the <strong>OOM Killer</strong>, sending a <strong>SIGKILL</strong> to the process.</li><li><strong>Result:</strong> The container dies instantly with <strong>Exit Code 137</strong> (128 + 9 for SIGKILL).</li></ul><p>What it looks like in kubectl describe pod:</p><pre>State:          Terminated<br>  Reason:       OOMKilled<br>  Exit Code:    137<br>  Started:      Wed, 26 Jan 2025 12:00:00 GMT<br>  Finished:     Wed, 26 Jan 2025 12:05:00 GMT</pre><blockquote>If you see <strong>Exit Code 
137</strong>, simply restarting the container will not solve the problem. You must either fix the memory issue in your code or increase resources.limits.memory in your Pod specification.</blockquote><h4>2.3 Liveness Probe Failures</h4><p>Sometimes, the main process is still running, but the application is <strong>frozen, deadlocked, or stuck</strong>. The kernel alone cannot detect this scenario. Kubernetes relies on <strong>liveness probes</strong> to actively check the health of the application.</p><p>Example configuration:</p><pre>livenessProbe:<br>  httpGet:<br>    path: /healthz<br>    port: 8080<br>  initialDelaySeconds: 15<br>  periodSeconds: 20<br>  failureThreshold: 3</pre><p>If the endpoint returns an error (e.g., HTTP 500) or times out <strong>failureThreshold</strong> consecutive times, the Kubelet marks the container as <strong>unhealthy</strong> and forcefully kills it to trigger a restart.</p><h4>2.4 Lead-In to Restart Logic</h4><p>Once a failure is confirmed, the <strong>Kubelet consults the Pod’s </strong><strong>restartPolicy</strong> to determine whether and how to restart the container.</p><ul><li><strong>Always</strong> (default): Restart container regardless of exit reason.</li><li><strong>OnFailure</strong>: Restart container only if it exited with a non-zero code.</li><li><strong>Never</strong>: Do not restart; useful for debugging or one-off tasks.</li></ul><pre>spec:<br>  restartPolicy: Always</pre><h3>3. How Kubernetes Restarts Containers</h3><p>Once a container failure is detected, Kubernetes must decide <strong>whether and when to restart it</strong>. This behavior is controlled by the <strong>Pod’s </strong><strong>restartPolicy</strong> and further refined by the <strong>CrashLoopBackOff</strong> mechanism to prevent repeated rapid restarts.</p><h4>3.1 Pod Restart Policies</h4><p>The <strong>Kubelet</strong> consults the Pod specification’s restartPolicy to determine how to respond to a container failure. 
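The policy is set once per Pod and applies to all of its containers.</p><p>Putting the pieces together, here is a minimal illustrative Pod spec that combines the memory limit from 2.2, the liveness probe from 2.3, and a restartPolicy (the image and container names are placeholders):</p><pre>apiVersion: v1<br>kind: Pod<br>metadata:<br>  name: demo-app<br>spec:<br>  restartPolicy: Always<br>  containers:<br>  - name: demo-app<br>    image: demo-app:1.0   # placeholder image<br>    resources:<br>      limits:<br>        memory: 256Mi<br>    livenessProbe:<br>      httpGet:<br>        path: /healthz<br>        port: 8080<br>      initialDelaySeconds: 15<br>      periodSeconds: 20<br>      failureThreshold: 3</pre><p>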
There are three main policies:</p><ul><li><strong>Always (default): </strong>The container is restarted <strong>regardless of exit reason</strong>. This is ideal for <strong>long-running services</strong> such as web servers or APIs.</li></ul><pre>spec:<br>  restartPolicy: Always</pre><ul><li><strong>OnFailure: </strong>The container is restarted <strong>only if it exits with a non-zero exit code</strong>. If it completes successfully (Exit Code 0), it is not restarted. This is suitable for <strong>batch jobs or data-processing tasks</strong>.</li></ul><pre>spec:<br>  restartPolicy: OnFailure</pre><ul><li><strong>Never:</strong> The container is <strong>never restarted</strong>, even if it crashes. This is useful for <strong>debugging or one-off static Pods</strong>.</li></ul><pre>spec:<br>  restartPolicy: Never</pre><h4>3.2 CrashLoopBackOff: Preventing Rapid Restart Storms</h4><p>Imagine your application crashes immediately after starting. If Kubernetes restarted it <strong>instantly every time</strong>, the container would be stuck in a tight restart loop, wasting CPU and I/O on the node.</p><p>To prevent this, Kubernetes uses the <strong>CrashLoopBackOff</strong> mechanism:</p><ul><li>When a container fails repeatedly, the Kubelet <strong>delays the next restart</strong> using <strong>exponential backoff</strong>.</li><li>Each subsequent crash doubles the wait time:</li></ul><ol><li><strong>Crash 1:</strong> Immediate restart.</li><li><strong>Crash 2:</strong> Wait <strong>10s</strong>.</li><li><strong>Crash 3:</strong> Wait <strong>20s</strong>.</li><li><strong>Crash 4:</strong> Wait <strong>40s</strong>.</li><li><strong>…</strong></li><li><strong>Max Delay:</strong> <strong>300s.</strong></li></ol><blockquote>When you see CrashLoopBackOff in kubectl get pods, Kubernetes is <strong>currently waiting</strong> before attempting the next restart.</blockquote><h4>3.3 Resetting the Backoff Timer</h4><p>The backoff timer doesn’t last forever. 
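</p><p>You can confirm that a Pod is sitting in CrashLoopBackOff, and see how many restarts it has accumulated, directly from its status fields (the pod name is a placeholder):</p><pre>kubectl get pod &lt;pod-name&gt; -o jsonpath='{.status.containerStatuses[0].restartCount}'<br>kubectl get pod &lt;pod-name&gt; -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}'</pre><p>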
If a container <strong>runs successfully for a stable period</strong>, Kubernetes <strong>resets the backoff counter</strong>. By default, the kubelet resets the counter once a container has run for <strong>10 minutes</strong> without crashing.</p><p><strong>Effect:</strong></p><ul><li>Prevents penalizing containers that had transient issues</li><li>Ensures normal operation resumes quickly after recovery</li></ul><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*NAd7dHEiQPNfqDAVBmbFcA.png\" /></figure><h3>4. How Kubernetes Handles Container Recovery and Debugging</h3><p>Even with <strong>restart policies</strong> and <strong>CrashLoopBackOff</strong>, some containers may fail due to <strong>slow startups, deadlocks, or complex dependency issues</strong>. Kubernetes provides additional mechanisms to handle these scenarios safely and help engineers <strong>diagnose problems efficiently</strong>.</p><h4>4.1 Startup Probes: Handling Slow-Starting Applications</h4><p>Some applications, such as <strong>Java services or AI models loading large datasets</strong>, may take minutes to start. 
Standard <strong>liveness probes</strong> could kill these containers before they finish booting.</p><p><strong>Solution:</strong> Use a <strong>startupProbe</strong>:</p><pre>startupProbe:<br>  httpGet:<br>    path: /healthz<br>    port: 8080<br>  failureThreshold: 30<br>  periodSeconds: 10</pre><p><strong>How it works:</strong></p><ul><li>Kubernetes checks the probe every 10 seconds, up to 30 times (300 seconds total).</li><li><strong>Liveness probes are disabled</strong> until the startup probe succeeds.</li><li>This ensures that <strong>slow-starting applications are not killed prematurely</strong>.</li></ul><h4>4.2 Sidecar Containers: Isolating Critical Functions</h4><p>Kubernetes allows running <strong>sidecar containers</strong> alongside your main application:</p><ul><li>Examples: logging agents, monitoring exporters, or proxy services.</li><li>If a sidecar fails, it <strong>does not necessarily affect the main application</strong>, depending on your Pod design.</li><li>Properly configured sidecars can <strong>increase observability and recovery safety</strong>.</li></ul><h4>4.3 Debugging Crash Loops</h4><p>Sometimes, a container fails repeatedly and logs alone are insufficient. Kubernetes provides tools for <strong>live debugging</strong>:</p><ul><li><strong>Check previous logs</strong>:</li></ul><pre>kubectl logs &lt;pod-name&gt; --previous</pre><p>Shows logs from the last failed container instance.</p><ul><li><strong>Inspect Pod events</strong>:</li></ul><pre>kubectl describe pod &lt;pod-name&gt;</pre><p>Reveals the <strong>reason for Kubelet restarts</strong> (e.g., OOMKilled, Liveness Probe Failed).</p><ul><li><strong>Attach ephemeral containers</strong>:</li></ul><pre>kubectl debug -it &lt;pod-name&gt; --image=busybox --target=&lt;container-name&gt;</pre><p>Provides a shell inside a running container <strong>without restarting it</strong>, allowing inspection of the filesystem or environment.</p><h3>5. 
How Kubernetes Recovers When a Node Fails</h3><p>While container-level recovery handles <strong>individual container crashes</strong>, Kubernetes also ensures that workloads recover when <strong>an entire node fails</strong>. This is essential for maintaining <strong>high availability</strong> in a cluster.</p><h4>5.1 Detecting Node Failures</h4><p>Each node in a Kubernetes cluster continuously reports its <strong>status to the API Server</strong>. The control plane monitors these <strong>heartbeats</strong>:</p><ul><li>Each node’s Kubelet renews its <strong>heartbeat</strong> (a Lease object) roughly every 10 seconds</li><li>If the heartbeats stop for the <strong>node-monitor-grace-period</strong> (default: 40 seconds), the node is marked <strong>NotReady</strong>; its Pods are then evicted after an additional toleration window (default: 5 minutes)</li></ul><p>At this point, the <strong>Node Controller</strong> steps in.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*kU8JiCB5lAiSUmjiRxeOHQ.png\" /></figure><h4>5.2 Eviction and Pod Rescheduling</h4><p>Once a node is marked <strong>NotReady</strong>:</p><ol><li>The Node Controller applies a <strong>NoExecute taint</strong> to the node.</li><li>Pods running on the failed node are <strong>evicted</strong> according to their <strong>tolerations</strong>.</li><li><strong>ReplicaSets, Deployments, and StatefulSets</strong> detect that the number of running replicas has dropped below the desired count.</li><li>New pods are <strong>scheduled on healthy nodes</strong> to restore the intended workload.</li></ol><blockquote><em>This process ensures that your services remain available, even if one or more nodes fail.</em></blockquote><h3>6. How Kubernetes Keeps Your Applications Running</h3><p>Kubernetes’ self-healing capabilities go far beyond simply restarting containers. 
It is a <strong>sophisticated system</strong> designed to keep applications running reliably, even when things go wrong at the <strong>container, node, or cluster level</strong>.</p><p>Here’s what makes Kubernetes self-healing so powerful:</p><ol><li><strong>Instant Failure Detection:</strong> The <strong>Kubelet</strong>, together with <strong>PLEG</strong>, monitors every container and immediately detects crashes, OOMKilled signals, or frozen applications.</li><li><strong>Smart Restart Decisions:</strong> <strong>Restart policies</strong> and <strong>CrashLoopBackOff</strong> ensure that containers are restarted safely without overwhelming resources, giving your workloads time to stabilize.</li><li><strong>Advanced Recovery Tools:</strong> With <strong>startup probes, sidecars, and ephemeral containers</strong>, Kubernetes handles slow-starting apps and complex failures gracefully.</li><li><strong>Cluster-Level Resilience:</strong> Even if an entire node fails, controllers like <strong>ReplicaSets</strong> and the <strong>Node Controller</strong> reschedule pods on healthy nodes, keeping the cluster in its desired state.</li></ol><p>By understanding these mechanisms, engineers can <strong>design resilient applications, troubleshoot failures faster, and avoid downtime</strong>.</p><p>In short, Kubernetes doesn’t just restart what breaks — it <strong>orchestrates a full recovery strategy</strong>, automatically maintaining reliability across containers and nodes.</p><hr><p><a href=\"https://blog.devops.dev/how-kubernetes-detects-and-restarts-crashing-pods-automatcially-9c010618d877\">How Kubernetes detects and restarts crashing pods automatically</a> was originally published in <a href=\"https://blog.devops.dev\">DevOps.dev</a> on Medium, where people are continuing the conversation by highlighting and responding to this 
story.</p>","url":"https://blog.devops.dev/how-kubernetes-detects-and-restarts-crashing-pods-automatcially-9c010618d877?source=rss-3302bbae903c------2","hash":"sync-hash","mediumUsername":"bhagirath00","createdAt":"2026-05-07T04:44:55.782Z","updatedAt":"2026-05-07T04:45:26.491Z","tags":[]},{"id":"cmov05jjq0007u1r4mrl828nh","title":"Click→Report→Resolve: My SIH 2025 Journey to Smarter Civic Governance","slug":"click-report-resolve-my-sih-2025-journey-to-smarter-civic-governance-00d0b32955fe","publishedAt":"2025-09-19T06:02:48.000Z","readingTime":null,"thumbnail":"https://cdn-images-1.medium.com/max/1024/1*umAKJgkDZ0fsN7VaXvA0Lg.png","excerpt":"Imagine spotting a pothole or garbage pile on your street and reporting it within seconds — no complicated forms, no waiting endlessly for updates. Civic partic...","content":"<figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*umAKJgkDZ0fsN7VaXvA0Lg.png\" /></figure><p>Imagine spotting a pothole or garbage pile on your street and reporting it within seconds — no complicated forms, no waiting endlessly for updates. Civic participation in India has long faced challenges like <strong>inefficient reporting, lack of transparency, and low accessibility.</strong></p><p>At <strong>Smart India Hackathon 2025</strong>, my team <strong>Fedora</strong> tackled this problem head-on with an innovative idea: an <strong>AI-powered crowdsourced civic issue reporting and resolution system</strong>. This solution makes reporting problems as simple as clicking a photo and empowers citizens and governments alike with real-time insights.</p><h3>The Problem with Current Systems</h3><p>1.<strong> Inefficient Reporting </strong>— Most systems are slow, manual, and lack automation.<br>2. <strong>Lack of Transparency</strong> — Citizens rarely know if their complaint is received or resolved.<br>3. 
<strong>Low Accessibility </strong>— Existing portals are complex, especially for semi-literate or first-time digital users.</p><p>These issues widen the gap between citizens and government bodies, ultimately slowing down civic improvements.</p><h3>My Solution: CityFix</h3><p>I built a <strong>mobile-first, AI-powered platform</strong> designed to make civic reporting simple, transparent, and accessible to all.</p><h4>Key Features:</h4><ul><li><strong>One-click photo</strong> reporting with auto-location tagging</li><li><strong>AI auto-classification </strong>of issues (potholes, garbage, etc.) with urgency detection</li><li><strong>Real-time tracking</strong> of complaints with status updates</li><li><strong>WhatsApp chatbot support </strong>for instant updates without app installation</li><li><strong>Multi-language accessibility</strong> (English, Hindi, and local dialects)</li><li><strong>Centralized dashboard for officials</strong> with analytics and heat maps</li></ul><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*KCx8M45MxNV-4eItcq8pgA.png\" /></figure><h4>Technical Approach</h4><p>1. Citizen (Web-App)</p><p><strong>Frontend</strong>: React Native (for web) <br><strong>UI/UX</strong>: TailwindCSS + Material UI (for fast, accessible, multilingual design)<br><strong>Multilingual Support</strong>: i18next library</p><p>2. Reporting Features</p><p><strong>Camera + Location</strong>: React Native Camera/GPS (mobile)<br><strong>Notifications</strong>: Firebase Cloud Messaging (push notifications across devices)</p><p>3. Backend Infrastructure</p><p><strong>Serverless Processing</strong>: Google Cloud Functions (auto-scale, low latency)<br><strong>Data Management</strong>: Google Firestore (real-time sync, scalable), Firebase Storage (images, video)<br><strong>Authentication</strong>: Firebase Auth (secure login via email/phone/social)</p><p>4. 
Admin Dashboard</p><p><strong>Frontend</strong>: React.js + Chart.js / Recharts (data visualization)<br><strong>Maps</strong>: Leaflet.js (interactive city maps with issue heatmaps)<br><strong>Task Routing Engine</strong>: Custom ML model (Python/FastAPI) + rule-based priority system</p><p>5. Advanced Features</p><p><strong>AI/ML Layer</strong>: TensorFlow Lite for image classification (pothole, garbage, etc.)<br><strong>Analytics &amp; Insights</strong>: Google Data Studio / Looker for admin reports / Heatmaps for hotspots and trend analysis</p><p>6. Scalability &amp; Integration</p><p><strong>APIs</strong>: REST + GraphQL APIs for future extensions<br><strong>Offline Mode:</strong> Local storage with background sync (PWA capabilities)</p><p>This modular, scalable stack ensures smooth performance and accessibility.</p><h4>Impact &amp; Benefits</h4><p><strong>→ For Citizens</strong>:</p><ul><li>Faster, easier reporting</li><li>Real-time status tracking</li><li>Builds <strong>trust</strong> in government systems</li></ul><p><strong>→ For Government:</strong></p><ul><li>Efficient allocation of resources and workforce</li><li>Data-driven decisions with hotspot analytics</li><li>Increased accountability and transparency</li></ul><p><strong>→ For Communities:</strong></p><ul><li>Cleaner, safer, and more eco-friendly cities</li><li>Better collaboration between people and government</li><li>A step towards<strong> Smarter, Sustainable India</strong></li></ul><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*dFEZKAi5dq5k_GVuH_kvhg.png\" /></figure><h4>Why It Matters</h4><p>This solution aligns perfectly with the <strong>Clean &amp; Green Technology</strong> theme of SIH 2025. 
By merging <strong>AI, crowdsourcing, and civic governance</strong>, it bridges the gap between citizens and authorities — making cities not just smarter, but more inclusive and transparent.</p><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*9fqYnel0W4sgFWjn-QLRSw.png\" /></figure><p><strong>References &amp; Inspiration</strong></p><p>My research drew inspiration from existing civic platforms like:</p><ul><li><a href=\"https://www.fixmystreet.com/\">FixMyStreet</a></li><li><a href=\"https://pgportal.gov.in\">CPGRAMS-Home</a></li><li><a href=\"https://www.mygov.in\">MyGov.in | MyGov: A Platform for Citizen Engagement towards Good Governance in India</a></li></ul><h4>Conclusion</h4><blockquote>Civic engagement shouldn’t be a struggle — it should be empowering. With <strong>Fedora CityFix</strong>, we aim to <strong>simplify reporting, strengthen transparency, and accelerate issue resolution</strong>. Together, citizens and governments can co-create a cleaner, greener, and smarter India.</blockquote>","url":"https://medium.com/@bhagirath00/click-report-resolve-my-sih-2025-journey-to-smarter-civic-governance-00d0b32955fe?source=rss-3302bbae903c------2","hash":"sync-hash","mediumUsername":"bhagirath00","createdAt":"2026-05-07T04:44:56.246Z","updatedAt":"2026-05-07T04:45:32.687Z","tags":[]},{"id":"cmov05jux0008u1r4cvt6vst1","title":"Beyond Text — The Rise of Multimodal AI and Its Impact","slug":"beyond-text-the-rise-of-multimodal-ai-and-its-impact-51a21af6b6d7","publishedAt":"2025-09-01T07:24:19.000Z","readingTime":null,"thumbnail":"https://cdn-images-1.medium.com/max/1024/1*9tVguFcHL8kV7gRLb2dTtA.png","excerpt":"Beyond Text — The Rise of Multimodal AI and Its ImpactLarge Language Models (LLMs) have transformed how we interact with technology, but for a long time, their ...","content":"<h3>Beyond Text — The 
Rise of Multimodal AI and Its Impact</h3><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*9tVguFcHL8kV7gRLb2dTtA.png\" /></figure><p>Large Language Models (LLMs) have transformed how we interact with technology, but for a long time, their power was limited to a single domain: text. You could ask a chatbot a question, and it would give you a text response. But what if you could show it a picture and ask it to write a poem about it? Or show it a video and have it describe the events in a single paragraph?</p><p>This is the promise of multimodal AI, the next frontier in artificial intelligence. Instead of just “reading” words, these models can see, hear, and understand the world through multiple data formats, or “modalities,” just like humans do. This shift from single-sense to multi-sense AI is already reshaping industries and creating a new wave of applications.</p><h3>What is Multimodal AI?</h3><p>At its core, multimodal AI refers to a system that can process, understand, and generate content from more than one data type simultaneously. While a traditional LLM (like early versions of GPT) was “unimodal” (text-in, text-out), a multimodal model can handle a mix of inputs, such as:</p><ul><li>Text (written language)</li><li>Images (photos, graphics)</li><li>Audio (speech, sound effects)</li><li>Video (a combination of images and audio over time)</li></ul><p>This allows for more complex and context-rich interactions. For example, a doctor could input an X-ray, a patient’s medical history (text), and a recorded description of their symptoms (audio) to get a comprehensive diagnostic summary.</p><h3>How Multimodal Models Work</h3><p>The magic behind multimodal AI lies in its ability to fuse different data types into a single, unified representation. Here’s a simplified breakdown:</p><ol><li>Input Modules: The model uses specialized “encoders” to process each data type. 
A separate encoder handles each modality: a Convolutional Neural Network might process images, while a Transformer-based model processes text.</li><li>Fusion Module: This is the brain of the operation. The model takes the encoded data from each modality and combines them in a shared space. It learns the relationships between them — for instance, how a picture of a dog relates to the word “dog.”</li><li>Output Module: Once the data is fused, the model can generate a response in a single or multiple formats. This could be a text description, a new image, or a synthesized voice.</li></ol><p>By learning these deep connections, models like Google’s Gemini and OpenAI’s GPT-4o can reason across different types of information, leading to more accurate and coherent results with fewer “hallucinations.”</p><h3>Real-World Applications and Use Cases</h3><p>Multimodal AI isn’t just a research topic; it’s already powering groundbreaking applications across various fields.</p><ul><li>Healthcare: Analyzing medical scans (images) alongside patient records and notes (text) to assist with diagnostics.</li></ul><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*3cfi9nHN1RVCQHw4aSop7w.png\" /></figure><ul><li>Retail &amp; E-commerce: Providing personalized shopping recommendations by analyzing a customer’s search query (text) and past purchases (transaction data) as well as the images of products they’ve browsed.</li><li>Autonomous Driving: Integrating real-time data from multiple sensors — cameras (video), radar, and LiDAR (sensor data) — to perceive the environment and make immediate decisions.</li></ul><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*IJdgSt4ySZYPFiDUMKZBlA.png\" /></figure><ul><li>Content Creation: Generating a video script (text) from a series of images, or creating a new image from a combination of text and an existing photo.</li><li>Customer Service: Analyzing a customer’s tone of voice (audio) and chat log (text) to better 
understand their sentiment and provide a more empathetic response.</li></ul><figure><img alt=\"\" src=\"https://cdn-images-1.medium.com/max/1024/1*tAzGfctu7UIi5eudL6KS-Q.png\" /></figure><h3>The Future of Human-Computer Interaction</h3><p>The shift to multimodal AI marks a fundamental change in how we interact with technology. It’s moving us closer to a future where AI systems are not just tools but true collaborators that can perceive the world in a more holistic, human-like way.</p><p>As these models become more sophisticated, we can expect them to become even more integrated into our daily lives. From smart home assistants that can “see” a broken appliance and guide you through the repair, to educational tools that can “watch” you solve a problem and offer personalized feedback, the possibilities are nearly limitless.</p><p>By understanding the power of multimodal AI, you’re not just keeping up with the latest trends — you’re preparing for a future where the digital world is as sensory and interconnected as our own.</p>","url":"https://medium.com/@bhagirath00/beyond-text-the-rise-of-multimodal-ai-and-its-impact-51a21af6b6d7?source=rss-3302bbae903c------2","hash":"sync-hash","mediumUsername":"bhagirath00","createdAt":"2026-05-07T04:44:56.649Z","updatedAt":"2026-05-07T04:45:33.707Z","tags":[]}],"total":9,"page":1,"totalPages":1}