Ace Your Interviews 🎯
What is Docker and what problem does it solve?
Docker is a platform for building, shipping, and running applications in containers. It solves the 'works on my machine' problem — the chronic mismatch between development, testing, and production environments. A Docker container packages the application, its runtime, libraries, and dependencies into a single portable artifact that runs identically on any machine with Docker installed, eliminating environment-related deployment failures.
What is the difference between a Docker image and a Docker container?
An image is an immutable, built artifact — like a class definition or a blueprint. It contains the filesystem, code, runtime, and configuration needed to run an application. A container is a running instance of an image — like an object instantiated from a class. Multiple containers can run from the same image simultaneously. Containers are ephemeral — when stopped and removed, their state is lost (unless stored in a volume).
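A minimal CLI illustration (container names are arbitrary):

```bash
docker pull nginx:alpine                 # fetch the immutable image once
docker run -d --name web1 nginx:alpine   # first container from the image
docker run -d --name web2 nginx:alpine   # second, independent container
docker rm -f web1                        # the container is gone...
docker images nginx                      # ...but the image is untouched
```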
What is a Dockerfile?
A Dockerfile is a text file containing the instructions Docker executes to build an image. Filesystem-changing instructions (RUN, COPY, ADD) each create a layer in the image; the others record metadata. Key instructions: FROM (base image), RUN (execute commands during build), COPY (add files from the build context), WORKDIR (set the working directory), ENV (set environment variables), EXPOSE (document the listening port), USER (switch to a non-root user), and CMD (default command when the container starts).
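A minimal sketch pulling these instructions together for a hypothetical Python service (file and module names like server:app are illustrative, and gunicorn is assumed to be listed in requirements.txt):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
ENV PYTHONUNBUFFERED=1
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
USER nobody
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "server:app"]
```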
What is Docker Compose and why is it used?
Docker Compose is a tool for defining and running multi-container applications. A single docker-compose.yml file defines all services (web server, database, cache, background worker), their configuration, networking, and volumes. docker compose up starts all services together. It enables teams to run a complete application stack locally with a single command, eliminating manual service setup and configuration.
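A sketch of such a stack (image tags and credentials are illustrative):

```yaml
# docker-compose.yml
services:
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgres://app:secret@db:5432/app
    depends_on:
      - db
      - cache
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: app
    volumes:
      - postgres_data:/var/lib/postgresql/data
  cache:
    image: redis:7
volumes:
  postgres_data:
```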
What is the difference between CMD and ENTRYPOINT in a Dockerfile?
ENTRYPOINT defines the executable that runs when the container starts; arguments passed to docker run are appended to it rather than replacing it, and it can only be swapped out with the --entrypoint flag. CMD provides default arguments that any docker run arguments completely override. Common pattern: ENTRYPOINT ["gunicorn"] fixes the executable, CMD ["server:app"] supplies the overridable default arguments. If both are defined in exec form (JSON arrays), CMD is appended to ENTRYPOINT.
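Illustrating that pattern (module name is hypothetical):

```dockerfile
ENTRYPOINT ["gunicorn"]                       # fixed executable
CMD ["server:app", "--bind", "0.0.0.0:8000"]  # overridable defaults
```

docker run myimage runs gunicorn server:app --bind 0.0.0.0:8000; docker run myimage server:app --workers 4 keeps gunicorn but replaces everything after it.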
What is a Docker registry?
A Docker registry is a storage and distribution system for Docker images. Docker Hub is the default public registry. Private registries include AWS ECR (Elastic Container Registry), Google Artifact Registry, GitHub Container Registry (GHCR), and self-hosted registries. docker push uploads an image to a registry. docker pull downloads it. CI/CD pipelines push images after successful tests; deployment platforms pull images for deployment.
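The typical round trip, with a hypothetical GHCR path:

```bash
docker build -t myapp:latest .
docker tag myapp:latest ghcr.io/myorg/myapp:abc1234   # registry/namespace/name:tag
docker push ghcr.io/myorg/myapp:abc1234               # from CI, after tests pass
docker pull ghcr.io/myorg/myapp:abc1234               # on the deployment target
```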
What is a Docker volume and why is it needed?
A Docker volume is persistent storage that exists outside the container filesystem. Container filesystems are ephemeral — data written inside a container is lost when the container is removed. Named volumes (postgres_data:/var/lib/postgresql/data) persist database data independently of the container lifecycle. Bind mounts map a host directory into the container — used in development so code changes on the host immediately appear inside the container without rebuilding.
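A quick CLI demonstration of persistence (names illustrative):

```bash
docker volume create app_data
docker run -d --name db -v app_data:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=secret postgres:16
docker rm -f db              # the container is removed...
docker volume ls             # ...but app_data and its contents remain
```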
What is the purpose of .dockerignore?
.dockerignore specifies files and directories to exclude from the Docker build context. Without it, docker build sends the entire project directory — including node_modules (100MB+), .git history, .env secrets, and test files — to the Docker daemon before processing begins, slowing builds significantly. Always exclude: .git, .env, node_modules/, venv/, __pycache__, tests/, and any sensitive credential files.
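A representative .dockerignore for a mixed Node/Python project (adjust to your stack):

```
.git
.env
node_modules/
venv/
__pycache__/
tests/
docs/
```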
What does docker compose up -d do and how does it differ from docker compose up?
docker compose up starts all services in the foreground — you see all container logs in the terminal and Ctrl+C stops all services. docker compose up -d (detached mode) starts all services in the background, returning the terminal immediately. Logs can be followed with docker compose logs -f. Use -d in production and when running multiple services you don't want to block the terminal.
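The day-to-day commands:

```bash
docker compose up             # foreground: aggregated logs, Ctrl+C stops everything
docker compose up -d          # detached: terminal returns immediately
docker compose logs -f api    # follow one service's logs
docker compose down           # stop and remove the stack
```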
How do containers communicate with each other in Docker Compose?
Docker Compose creates an internal bridge network for all services in the same docker-compose.yml. Each service is reachable from other services using the service name as a hostname. If you have services named 'api' and 'db', the api service connects to PostgreSQL at host 'db', port 5432 — no IP addresses needed. This works because Docker's internal DNS resolves service names to container IP addresses automatically.
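You can verify the resolution from inside a container (the output address is illustrative; getent ships with Debian/Ubuntu-based images, so minimal images may need another tool):

```bash
docker compose exec api getent hosts db
# 172.18.0.3      db
```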
Explain Docker's layer caching mechanism and how to optimize for it.
Every Dockerfile instruction creates a layer with a SHA256 hash. Docker checks if the instruction and all preceding layers are unchanged — if so, it reuses the cached layer (cache hit). The cache is invalidated for an instruction if: the instruction itself changed, a preceding layer changed, or a file copied with COPY/ADD changed. Optimization: put rarely-changing instructions (FROM, system package installs, dependency installs) before frequently-changing ones (COPY source code). This makes dependency installation hit cache on most code-only builds.
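A cache-friendly ordering, sketched for a Python image (package names illustrative):

```dockerfile
FROM python:3.12-slim                    # changes rarely: almost always cached
RUN apt-get update && apt-get install -y --no-install-recommends libpq5 \
    && rm -rf /var/lib/apt/lists/*       # system deps: change rarely
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt   # re-runs only when requirements.txt changes
COPY . .                                 # source code changes every build, so it goes last
```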
What is a multi-stage Docker build? Give a real example.
Multi-stage builds use multiple FROM instructions, each starting a new stage. The final image can COPY files from intermediate stages, discarding everything else. Example: Stage 1 (node:20 builder) installs all dependencies and runs npm run build to compile React. Stage 2 (nginx:alpine) copies only the /dist output from Stage 1. Result: 25MB production image instead of 350MB — no Node.js, no build tools, no source code in production.
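A sketch of that exact build (assumes a Vite-style build that outputs to /dist):

```dockerfile
# Stage 1: build with the full Node toolchain
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: ship only the static output
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
```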
How do you handle database migrations in a containerized environment?
Never run migrations as part of the main application container startup (CMD). This causes problems with horizontal scaling (multiple API containers running migrations simultaneously causes conflicts) and with health checks (the container isn't healthy until the migration finishes). Options: run migrations as a separate one-off command after deployment (docker compose run --rm), use an init container that runs migrations to completion before the API starts, or use migration tools that take locks safely (Alembic, Django migrations).
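For example, with Compose and Alembic (service and command names are illustrative):

```bash
docker compose run --rm api alembic upgrade head   # one-off migration container
docker compose up -d api                           # then start (or roll) the API
```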
What is the difference between a bind mount and a named volume?
Bind mount (./host/path:/container/path): maps a specific host filesystem path into the container. Changes on host immediately visible inside container — ideal for development live-reload. Platform-specific path issues on Windows. Named volume (volume_name:/container/path): Docker manages the storage location on the host. Portable across platforms, correct permissions, better performance on macOS. Use bind mounts for source code in development; named volumes for database data and persistent application state.
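Both side by side in Compose:

```yaml
services:
  api:
    build: .
    volumes:
      - ./src:/app/src                       # bind mount: live-reload in development
  db:
    image: postgres:16
    volumes:
      - db_data:/var/lib/postgresql/data     # named volume: Docker-managed, portable
volumes:
  db_data:
```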
How do you pass secrets to containers securely in production?
Never hardcode secrets in Dockerfile ENV instructions; never commit .env files. Options: pass via docker run -e from CI/CD environment variables (encrypted in the CI platform). Docker secrets (Swarm-native, with file-based support in Compose) mount secrets as files in /run/secrets/. Kubernetes Secrets (base64-encoded, stored in etcd). Cloud-native: AWS Secrets Manager / GCP Secret Manager, where the application fetches secrets at startup via the SDK. HashiCorp Vault: centralized secrets management with dynamic credentials, fine-grained access control, and audit logs.
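A minimal Compose sketch of file-based secrets (the path is illustrative and must stay out of Git):

```yaml
services:
  api:
    build: .
    secrets:
      - db_password        # appears in the container at /run/secrets/db_password
secrets:
  db_password:
    file: ./secrets/db_password.txt
```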
What is Docker's internal DNS and how does service discovery work?
Docker's embedded DNS server resolves container/service names to IP addresses within the same network. In Docker Compose, every service name becomes a DNS hostname: the 'api' service connects to PostgreSQL at host 'db' on port 5432, and Docker resolves 'db' to the PostgreSQL container's internal IP. This works because Compose puts all services on a shared project network (not the legacy docker0 default bridge) unless configured otherwise. Custom networks provide additional isolation: services on different networks cannot reach each other by name.
How do you debug a container that exits immediately after starting?
1) docker ps -a to see exited containers. 2) docker logs container_name to read stdout/stderr from the crashed container. 3) Override the entrypoint to keep it alive: docker run -it --entrypoint sh myapp:1.0 drops into a shell instead of running the application, letting you investigate the filesystem and run commands manually (use --entrypoint, since a plain trailing argument only replaces CMD and gets appended to any ENTRYPOINT). 4) docker inspect container_name for the exit code and OOMKilled flag. 5) Check that required environment variables are set; many apps exit immediately on missing config.
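The corresponding commands (container and image names illustrative):

```bash
docker ps -a --filter status=exited
docker logs myapp_container
docker inspect myapp_container --format '{{.State.ExitCode}} {{.State.OOMKilled}}'
docker run -it --entrypoint sh myapp:1.0
```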
How do you implement a CI/CD pipeline with Docker?
Standard pipeline: 1) Checkout code. 2) Build Docker image (with layer cache). 3) Run tests inside a container from that image (with service containers for database and cache). 4) Run security scan (Trivy) on the image. 5) If tests pass and it's a merge to main: push image to registry with git SHA tag. 6) Deploy: update ECS task definition / Cloud Run revision / docker compose on server to use new SHA tag. The image that passed tests is exactly what runs in production.
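A hedged sketch of that pipeline as a GitHub Actions job (image names, registry path, and test command are illustrative; assumes Trivy is installed on the runner and registry login is already configured):

```yaml
name: ci
on:
  push:
    branches: [main]
jobs:
  build-test-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t myapp:${{ github.sha }} .
      - run: docker run --rm myapp:${{ github.sha }} pytest
      - run: trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:${{ github.sha }}
      - run: |
          docker tag myapp:${{ github.sha }} ghcr.io/myorg/myapp:${{ github.sha }}
          docker push ghcr.io/myorg/myapp:${{ github.sha }}
```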
What are Docker health checks and how do they affect deployments?
HEALTHCHECK in a Dockerfile defines a command Docker runs periodically to verify the container is working correctly. Docker marks the container as healthy, unhealthy, or starting. Compose uses condition: service_healthy in depends_on to wait for a service to be healthy before starting dependent services. In production, orchestration platforms (ECS, Kubernetes) use health checks to: determine when a new deployment is ready to receive traffic, remove unhealthy containers from the load balancer, and restart containers that become unhealthy.
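Both halves in practice (endpoint and intervals are illustrative; curl and pg_isready must exist in the respective images):

```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1
```

and in Compose:

```yaml
services:
  api:
    build: .
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 5
```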
What is Docker BuildKit and why is it important?
BuildKit is the modern Docker build backend (default since Docker 23.0). Key improvements over legacy builder: parallel stage building (multi-stage builds run in parallel, not sequentially), cache mounts (RUN --mount=type=cache,target=/root/.cache/pip pip install caches between builds without baking into image), secrets mounts (RUN --mount=type=secret,id=mysecret passes secrets to build without creating layers), better output with progress display, and SBOM/provenance attestations for supply chain security.
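A sketch combining a cache mount and a secret mount (package names are hypothetical; the secret is supplied at build time with docker build --secret id=pip_extra,src=pip_extra.txt):

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
# pip's download cache persists across builds without entering any image layer
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
# the secret is readable only during this RUN and leaves no trace in a layer
RUN --mount=type=secret,id=pip_extra \
    PIP_EXTRA_INDEX_URL=$(cat /run/secrets/pip_extra) pip install private-pkg
```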
How do you optimize Docker image size?
Use minimal base images (python:3.12-slim instead of python:3.12 saves roughly 850MB). Multi-stage builds, so build tools never reach the runtime image. Chain RUN commands: RUN apt-get update && apt-get install -y pkg && rm -rf /var/lib/apt/lists/* produces a single layer with no leftover apt cache. Use pip install --no-cache-dir. Copy only the files you need, with a thorough .dockerignore excluding tests, docs, and .git. Consider distroless images (no shell, no package manager) for maximum security and minimal size.
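Two commands worth knowing when hunting size regressions:

```bash
docker images myapp           # compare sizes across tags
docker history myapp:latest   # see how much each layer contributes
```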
How would you architect a zero-downtime deployment system for a containerized application?
Blue-green deployment: run old (blue) and new (green) containers simultaneously. The new container goes through health checks; once healthy, the load balancer switches traffic from blue to green, and the blue container is removed. If health checks fail, keep blue running and alert. Rolling deployment (the Kubernetes default): replace containers one by one, maintaining n-1 healthy containers throughout. Canary: route 5% of traffic to the new version, monitor error rates, and gradually increase to 100% if healthy. Requires: migrations that are backward-compatible with the old version (run before the deployment), stateless containers, and sticky sessions handled at the load balancer.
Explain the relationship between Docker and Kubernetes.
Docker packages applications as container images and runs individual containers. Kubernetes is an orchestration platform that manages containers at scale across a cluster of machines. Kubernetes doesn't run containers directly — it calls a container runtime (containerd, which is Docker's core runtime) to start containers. Kubernetes adds: declarative deployments (desired state), horizontal auto-scaling, rolling updates, self-healing (restart failed pods), service discovery, load balancing, and configuration management. Docker is the packaging standard. Kubernetes is the deployment and operations platform built on that standard.
What is a Docker build context and how does it affect build performance?
The build context is the set of files sent to the Docker daemon when docker build runs. By default it's the entire directory specified (usually .). The daemon must receive the entire context before starting any instruction — large contexts (node_modules, .git, build artifacts) cause multi-minute delays before the first FROM even executes. Optimization: comprehensive .dockerignore, build from a minimal directory containing only needed files, use --build-context in BuildKit to provide named additional contexts. docker build -f Dockerfile .. sends context from the parent directory but uses the Dockerfile in the current directory.
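The context is always the last argument; -f selects the Dockerfile independently:

```bash
docker build -t myapp .                        # context: current directory
docker build -t myapp -f docker/Dockerfile .   # Dockerfile elsewhere, same context
docker build -t myapp -f Dockerfile ..         # context: parent directory
```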
How do you handle persistent storage for stateful applications in container environments?
Stateful applications (databases, file storage) in containers require careful volume management. Local named volumes work for single-host deployments but don't work across multiple hosts. For multi-host: use cloud storage volumes (AWS EBS/EFS, GCP Persistent Disk) mounted to the specific node running the database container, or run databases outside the container cluster entirely (managed databases: RDS, Cloud SQL, Atlas) — the recommended approach for production. Object storage (S3, GCS) for uploaded files. Redis or DynamoDB for session state. Design for stateless application containers wherever possible.
What are container security scanning tools and how do you integrate them into CI?
Trivy (free, fast, comprehensive): scans OS packages and application dependencies for known CVEs. Grype (Anchore, free): similar scope to Trivy. Docker Scout (Docker-native, paid tiers). Integration in CI: run the scanner on the built image before pushing, exiting non-zero on HIGH/CRITICAL vulnerabilities to fail the build. Set up base image update automation (Dependabot, Renovate) to pick up new base image versions when vulnerabilities are patched. Maintain an SBOM (Software Bill of Materials), e.g. via docker buildx build --sbom=true, for supply chain auditability.
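The corresponding CI gate (image tag illustrative):

```bash
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:abc1234
# or, equivalently with Grype:
grype myapp:abc1234 --fail-on high
```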
How do you implement container resource management in a multi-tenant environment?
In Docker Compose: deploy.resources.limits (memory, cpus) and reservations. In Kubernetes: resources.limits and resources.requests per container — limits prevent runaway consumption, requests determine scheduling decisions. For true multi-tenancy: separate Kubernetes namespaces per tenant with ResourceQuota objects (total CPU/memory per namespace) and LimitRange objects (per-pod defaults and maxima). Network policies to prevent cross-tenant communication. cgroups v2 (default in modern Linux) provides more precise resource accounting than v1.
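The Compose form (limits are illustrative; tune to the workload):

```yaml
services:
  api:
    build: .
    deploy:
      resources:
        limits:
          cpus: "1.0"        # hard ceiling on CPU
          memory: 512M       # exceeding this gets the container OOM-killed
        reservations:
          cpus: "0.25"
          memory: 128M
```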
How does container networking work at the Linux kernel level?
Docker uses Linux kernel features: network namespaces create isolated network stacks (each container has its own eth0 interface with its own IP). Virtual ethernet pairs (veth) connect the container namespace to the host namespace. A bridge (docker0 for default network, custom bridges for user-defined networks) connects all veth interfaces for containers on the same network. iptables rules handle NAT (port publishing: host:8000 → container:8000) and network policy. Overlay networks (Docker Swarm, Kubernetes) use VXLAN tunneling to extend layer-2 connectivity across physical hosts.
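On a Linux Docker host you can inspect these pieces directly (output will vary by host):

```bash
ip link show type bridge            # docker0 plus user-defined bridges
ip link show type veth              # host-side ends of container veth pairs
docker network inspect bridge       # attached containers and their IPs
sudo iptables -t nat -L DOCKER -n   # NAT rules behind published ports
```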
What is GitOps and how does it relate to Docker deployments?
GitOps is a deployment methodology where the desired state of all infrastructure and applications is declared in Git repositories. Changes to deployed systems happen only through Git commits — no manual kubectl apply or SSH deployments. Implementation: a Git repository contains Kubernetes manifests or Helm chart values with specific image tags (SHA-based). ArgoCD or Flux watches the repository and automatically syncs the cluster to match. CI pipeline updates the image tag in the deployment repository after a successful build. The audit trail for any deployment change is a Git commit — who changed what, when, reviewed by whom.
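A minimal ArgoCD Application sketch (repository URL and paths are hypothetical):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/deploy-repo.git
    targetRevision: main
    path: k8s/production        # manifests with SHA-pinned image tags
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```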
How do you manage configuration and secrets across development, staging, and production container environments?
Configuration per environment: environment-specific .env files (never committed), injected via CI/CD platform secrets (GitHub Actions secrets, GitLab CI variables). Kubernetes ConfigMaps for non-sensitive config, Secrets for sensitive values (base64-encoded in etcd, ideally encrypted at rest). Sealed Secrets for encrypting Kubernetes Secrets in Git. Centralized secrets management: HashiCorp Vault with the Kubernetes auth backend (pods authenticate via service account), AWS Secrets Manager with IAM roles (no static credentials in the container), or GCP Secret Manager with Workload Identity. 12-factor app principle: all config comes from environment variables, never from code or images.
What are the key differences between Docker Swarm and Kubernetes?
Docker Swarm: simple, built into Docker, low learning curve, adequate for smaller deployments (10–50 containers). Deploys from docker-compose.yml files natively (via the deploy section). Limited ecosystem: no Helm, no CRDs, no auto-scaling beyond manual replica counts. Kubernetes: complex, industry standard, enormous ecosystem (Helm, Operators, ArgoCD, Prometheus). Auto-scaling (HPA, KEDA), advanced scheduling, multi-cloud portability, comprehensive RBAC, and extensibility via Custom Resource Definitions. Kubernetes is the right choice at production scale; Swarm is the right choice for teams that need simple multi-container orchestration without Kubernetes complexity.