Cloud Infrastructure and Architecture
Expert-defined terms from the Certificate in Cloud Transformation Management course at LearnUNI. Free to read, free to share, paired with a professional course.
Access Control List (ACL) Related terms #
Identity and Access Management, Security Group An ACL is a set of rules that define inbound or outbound traffic permissions for network interfaces or storage resources. For example, an ACL may allow HTTP traffic from a specific IP range while denying all other protocols. Practical application includes fine‑grained network segmentation in multi‑tenant environments. Challenges involve maintaining rule consistency across distributed firewalls and avoiding rule conflicts that can cause unintended service disruptions.
Availability Zone (AZ) Related terms #
Region, Fault Domain An AZ is an isolated location within a cloud region, equipped with independent power, cooling, and networking. Deploying workloads across multiple AZs enhances resilience; for instance, a web tier can be spread across three AZs to survive a single‑site outage. The main challenge is managing data replication latency and ensuring stateful services remain synchronized across zones.
Baseline Architecture Related terms #
Reference Architecture, Design Pattern Baseline Architecture describes a minimal, repeatable set of components and configurations that serve as a starting point for cloud projects. It typically includes core networking, identity, and monitoring services. Using a baseline accelerates onboarding but may limit flexibility if the baseline does not align with specific compliance or performance requirements.
Batch Processing Related terms #
Stream Processing, ETL Batch processing handles large volumes of data in discrete chunks, often scheduled during off‑peak hours. A typical use case is nightly payroll calculation using a cloud‑based data warehouse. The main challenge is ensuring timely completion and handling job failures without impacting downstream analytics.
Blue‑Green Deployment Related terms #
Canary Release, Rolling Update Blue‑green deployment creates two identical production environments; traffic is switched from the “blue” (current) to the “green” (new) version after validation. This technique reduces downtime and enables rapid rollback. Challenges include synchronizing databases and managing cost overhead of maintaining duplicate environments.
Cache‑Aside Pattern Related terms #
Write‑Through Cache, Read‑Through Cache In the cache‑aside pattern, an application reads from the cache and falls back to the primary data store on a miss, then populates the cache. For example, a microservice may retrieve product details from Redis, loading the database only when needed. The difficulty lies in cache invalidation and ensuring consistency between cache and source data.
Capacity Planning Related terms #
Scaling, Load Forecasting Capacity planning predicts resource requirements based on projected workloads, ensuring sufficient compute, storage, and network capacity. Cloud tools can auto‑scale but accurate forecasts prevent over‑provisioning and cost overruns. Challenges include handling unpredictable traffic spikes and aligning forecasts with contractual billing periods.
Certificate Authority (CA) Related terms #
PKI, TLS A CA issues digital certificates that bind cryptographic keys to identities, enabling secure communication. In cloud environments, a private CA may issue certificates for internal services, while a public CA secures external endpoints. Managing certificate lifecycles and preventing expired or compromised certificates are common operational challenges.
Change Management Related terms #
Incident Management, Release Management Change management governs how modifications to cloud infrastructure are proposed, reviewed, approved, and implemented. Using automated pipelines and approval gates reduces human error. The primary challenge is balancing speed of delivery with risk mitigation, especially in highly regulated industries.
Cloud Access Security Broker (CASB) Related terms #
DLP, IAM A CASB provides visibility and control over cloud service usage, enforcing policies such as data loss prevention and encryption. For instance, a CASB can block uploads of sensitive files to unauthorized SaaS apps. Integration complexity and latency introduced by policy enforcement are frequent obstacles.
Cloud Bursting Related terms #
Hybrid Cloud, Elasticity Cloud bursting extends an on‑premises workload to the public cloud during peak demand, leveraging excess cloud capacity. A retail site may burst to the cloud during holiday traffic spikes. Challenges include data synchronization, latency, and cost predictability when workloads fluctuate.
Cloud Formation Template Related terms #
IaC, Terraform A CloudFormation template is a JSON or YAML file that defines AWS resources and their relationships, enabling declarative infrastructure provisioning. Templates can be version‑controlled and reused across environments. Complex templates can become difficult to maintain, and drift between the template and actual resources requires regular reconciliation.
Cloud Native Related terms #
Microservices, Containerization Cloud‑native describes applications designed to fully exploit cloud capabilities such as elasticity, resilience, and managed services. A cloud‑native app may consist of containerized microservices orchestrated by Kubernetes. Migrating legacy monoliths to a cloud‑native architecture often encounters challenges around data consistency and skill gaps.
Cloud Service Provider (CSP) Related terms #
IaaS, PaaS A CSP delivers computing resources, storage, networking, and managed services over the internet. Major CSPs include AWS, Azure, and Google Cloud. Selecting a CSP requires evaluating service breadth, compliance certifications, and pricing models. Vendor lock‑in and data sovereignty are primary concerns.
Cluster Autoscaler Related terms #
Horizontal Pod Autoscaler, Scaling Policy The Cluster Autoscaler automatically adjusts the number of worker nodes in a Kubernetes cluster based on pending pod resource requests. It helps maintain performance while minimizing cost. Misconfiguration can lead to rapid node churn, causing instability and increased expenses.
Cold Start Related terms #
Warm Start, Serverless A cold start occurs when a serverless function or container is invoked for the first time, requiring provisioning of compute resources, which adds latency. For latency‑sensitive APIs, cold starts can degrade user experience. Mitigation techniques include provisioned concurrency or keeping a minimum number of instances warm.
Configuration Drift Related terms #
IaC, Reconciliation Configuration drift happens when the actual state of resources diverges from the declared state in infrastructure‑as‑code files. Over time, unmanaged changes can cause security gaps or performance issues. Regular drift detection scans and automated remediation are essential to maintain compliance.
Container Registry Related terms #
Docker Hub, Artifact Repository A container registry stores and distributes container images, supporting versioning and vulnerability scanning. Examples include Amazon ECR, Azure Container Registry, and Docker Hub. Securing registries, managing image lifecycle, and preventing malicious images from being deployed are ongoing operational challenges.
Continuous Integration (CI) Related terms #
CI/CD, Build Pipeline CI automates the building and testing of code changes as they are committed, ensuring early detection of defects. In cloud projects, CI pipelines often trigger infrastructure provisioning tests. Scaling CI pipelines for large codebases and reducing flaky test failures are common pain points.
Continuous Delivery (CD) Related terms #
CI, Release Automation CD extends CI by automatically deploying validated changes to staging or production environments. It enables rapid, reliable releases. Implementing CD demands robust automated testing, feature flag management, and rollback mechanisms to mitigate production risk.
Control Plane Related terms #
Data Plane, Management Plane The control plane manages the configuration, orchestration, and policy enforcement of cloud resources. In Kubernetes, the API server and scheduler comprise the control plane. Ensuring high availability of the control plane is critical; failures can render the entire cluster unmanageable.
Cost Allocation Tag Related terms #
Tagging, Chargeback Cost allocation tags are key‑value pairs applied to resources to attribute spend to departments, projects, or cost centers. Accurate tagging enables transparent chargeback models. Inconsistent tag enforcement leads to orphaned costs and reporting inaccuracies.
Data Lake Related terms #
Data Warehouse, Object Storage A data lake stores raw, unstructured, and structured data at scale, typically using object storage like S3. It enables analytics and machine learning on diverse data sets. Governance, security, and data cataloging are major challenges to prevent the lake from becoming a “data swamp.”
Data Residency Related terms #
Sovereignty, Compliance Data residency refers to the physical location where data is stored, often mandated by regulations. Cloud providers offer region‑specific services to meet residency requirements. Architects must design data replication and backup strategies that respect jurisdictional constraints while maintaining performance.
Data Warehouse Related terms #
OLAP, ETL A data warehouse is a relational database optimized for analytical queries, providing structured, transformed data for reporting. Services like Amazon Redshift and Snowflake exemplify cloud data warehouses. Challenges include schema evolution, query performance tuning, and managing cost‑effective storage tiers.
Dedicated Host Related terms #
Bare Metal, Instance A dedicated host provides physical servers reserved for a single tenant, offering isolation and compliance benefits. It is useful for workloads requiring specific CPU architectures or licensing constraints. Utilization optimization is critical, as under‑used hosts increase expenses.
Disaster Recovery (DR) Related terms #
RTO, RPO DR defines processes and architectures to restore services after catastrophic failures. Cloud DR options include multi‑region replication and automated failover. Balancing recovery time objectives (RTO) with recovery point objectives (RPO) against cost is a frequent trade‑off.
Edge Computing Related terms #
CDN, Fog Computing Edge computing processes data near its source, reducing latency and bandwidth usage. Use cases include IoT sensor analytics and real‑time video processing. Deploying and managing edge nodes introduces challenges in security patching, monitoring, and consistent configuration across distributed sites.
Elastic Load Balancer (ELB) Related terms #
Application Load Balancer, Network Load Balancer An ELB distributes incoming traffic across multiple compute instances, providing high availability and fault tolerance. For example, an ELB can route HTTP requests to a pool of web servers in different AZs. Improper health‑check settings can cause traffic to be sent to unhealthy instances, degrading user experience.
Encryption‑at‑Rest Related terms #
KMS, Transparent Data Encryption Encryption‑at‑rest protects stored data using cryptographic keys, preventing unauthorized access if storage media is compromised. Cloud KMS services manage keys and integrate with storage services. Key rotation policies and access control to the KMS itself are critical to avoid key exposure.
Encryption‑in‑Transit Related terms #
TLS, VPN Encryption‑in‑transit secures data moving between clients, services, and storage using protocols like TLS. For instance, API calls between microservices should be encrypted to prevent eavesdropping. Managing certificate lifecycles and ensuring mutual authentication can be operationally complex.
Endpoint Security Related terms #
Zero Trust, CASB Endpoint security protects devices that access cloud resources, enforcing policies such as device posture checks and token‑based authentication. Integrating endpoint solutions with identity providers enables unified access control. The challenge lies in scaling enforcement across diverse device types and operating systems.
Fault Tolerant Architecture Related terms #
High Availability, Redundancy Fault tolerance ensures a system continues operating despite component failures, typically through redundancy and automatic failover. Designing a multi‑AZ, multi‑region architecture exemplifies fault tolerance. Complexity grows with each added redundancy layer, increasing operational overhead.
Feature Flag Related terms #
Canary Release, Dark Launch Feature flags enable runtime toggling of functionality without redeploying code. They support gradual rollouts and quick rollbacks. Overuse can lead to flag debt, where stale flags accumulate and create maintenance burdens.
File Storage Service Related terms #
Object Storage, Block Storage File storage provides a shared file system interface (e.G., NFS) for workloads needing hierarchical file organization. Services like Amazon FSx or Azure Files are common. Performance tuning, scalability limits, and ensuring consistent latency across regions are typical concerns.
Flat Network Related terms #
Segmented Network, VPC A flat network places all resources on a single broadcast domain, simplifying connectivity but reducing isolation. It is often used in small test environments. Security risks increase because lateral movement is easier; segmentation via subnets or security groups is recommended for production.
FaaS (Function as a Service) Related terms #
Serverless, Event‑Driven FaaS allows developers to deploy individual functions that execute in response to events, with the platform handling scaling and provisioning. Example: A Lambda function that processes S3 upload events. Cold start latency and limited execution time are common constraints.
Federated Identity Related terms #
SAML, OIDC Federated identity enables users to authenticate using an external identity provider, allowing single sign‑on across cloud services. An organization may use Azure AD to grant access to AWS resources via SAML. Mapping attribute claims and maintaining trust relationships can be intricate.
Fleet Management Related terms #
Auto‑Scaling Group, Instance Group Fleet management automates lifecycle operations for a collection of compute instances, handling provisioning, health monitoring, and decommissioning. Tools such as AWS Auto Scaling or Google Instance Groups simplify this. Challenges include defining appropriate scaling policies and avoiding rapid scaling oscillations.
Firecracker MicroVM Related terms #
MicroVM, Lightweight VM Firecracker is an open‑source virtualization technology that runs minimal microVMs for serverless workloads, offering near‑container startup speed with VM isolation. It powers services like AWS Lambda. Integrating Firecracker into existing orchestration pipelines requires custom tooling and security hardening.
Flat File Database Related terms #
NoSQL, Key‑Value Store Flat file databases store data in simple files without complex indexing, suitable for lightweight configuration storage. While easy to use, they lack scalability and transactional guarantees required for production workloads. Migration to a managed database service is often necessary as data volume grows.
Geographic Redundancy Related terms #
Multi‑Region, Disaster Recovery Geographic redundancy replicates data and services across distinct regions to protect against regional outages. For example, a multi‑region Aurora cluster replicates writes across continents. Network latency and data sovereignty rules can complicate design decisions.
GitOps Related terms #
IaC, CI/CD GitOps treats Git repositories as the single source of truth for both application code and infrastructure configuration, automating deployments via pull‑request workflows. Tools like Argo CD or Flux implement GitOps. The approach demands strict repository governance and can be hindered by large binary artifacts that do not version well in Git.
Hybrid Cloud Related terms #
Multi‑Cloud, Cloud Bursting Hybrid cloud combines on‑premises infrastructure with public cloud resources, enabling workload portability and flexibility. A common pattern is extending a private data center with a public cloud for burst capacity. Managing consistent security policies and data synchronization across environments is a key challenge.
Identity Federation Related terms #
SAML, OIDC Identity federation enables multiple identity providers to share authentication data, allowing users to access resources across domains without separate credentials. For instance, an employee can use corporate AD credentials to log into a SaaS CRM. Configuring trust relationships and handling attribute mapping are frequent sources of error.
Infrastructure as Code (IaC) Related terms #
Terraform, CloudFormation IaC defines infrastructure using declarative code, enabling version control, repeatable deployments, and automated testing. A Terraform script can provision a VPC, subnets, and EC2 instances in a single run. Challenges include state management, handling drift, and ensuring that code changes do not unintentionally disrupt production workloads.
Instance Metadata Service (IMDS) Related terms #
IAM Role, Instance Profile IMDS provides runtime information about an instance, such as IAM role credentials, without embedding secrets in the OS. Applications can query IMDS to obtain temporary tokens for AWS API calls. Misconfiguring IMDS access can expose credentials to malicious processes; using IMDSv2 with session tokens mitigates this risk.
Inter‑Region VPC Peering Related terms #
Transit Gateway, VPN Inter‑region VPC peering connects virtual networks across different cloud regions, enabling low‑latency traffic without traversing the public internet. Use cases include replicating databases across regions for DR. Peering does not support transitive routing, so designing a hub‑spoke topology may require a transit gateway.
Internet of Things (IoT) Hub Related terms #
Device Twins, MQTT An IoT hub centrally manages device connections, telemetry ingestion, and command delivery. Azure IoT Hub and AWS IoT Core are examples. Scaling to millions of devices demands careful partitioning and handling of message throttling. Security of device credentials remains a primary concern.
Just‑In‑Time (JIT) Access Related terms #
Privileged Access Management, MFA JIT access grants elevated permissions only for a limited time, reducing attack surface. Cloud IAM can issue temporary admin roles that expire after an hour. Implementing JIT requires integration with ticketing systems and robust audit logging to ensure accountability.
Kubernetes Cluster Related terms #
Control Plane, Node Pool A Kubernetes cluster consists of a control plane that manages the desired state and a set of worker nodes that run containers. Deploying a production‑grade cluster often involves multiple node pools for workloads with differing resource profiles. Operational challenges include securing the API server, managing upgrades, and monitoring cluster health.
Kubernetes Operator Related terms #
Custom Resource Definition, Controller An Operator extends Kubernetes with domain‑specific logic, automating the lifecycle of complex stateful applications like databases. Operators watch custom resources and act to provision, scale, or backup the underlying service. Writing a robust operator demands deep knowledge of both the application and the Kubernetes API.
Latency‑Sensitive Application Related terms #
Edge Computing, CDN Latency‑sensitive applications require sub‑millisecond response times, such as high‑frequency trading platforms. Deploying these workloads close to users via edge nodes or specialized low‑latency regions mitigates network delay. Achieving required latency often conflicts with cost constraints and necessitates careful performance testing.
Least Privilege Related terms #
RBAC, IAM The principle of least privilege restricts users and services to only the permissions needed for their tasks. Implementing least privilege in cloud IAM reduces the blast radius of compromised credentials. Ongoing challenge is balancing security with operational efficiency, as overly restrictive policies can impede legitimate workflows.
Load Balancer as a Service (LBaaS) Related terms #
ELB, Ingress LBaaS provides managed load balancing without requiring users to provision or maintain hardware. Cloud providers offer LBaaS that supports layer‑4 (TCP) and layer‑7 (HTTP) traffic. Configuring health checks, session persistence, and SSL termination correctly is essential to avoid service disruptions.
Log Analytics Related terms #
SIEM, Observability Log analytics aggregates, indexes, and queries log data to derive operational insights. Services like AWS CloudWatch Logs Insights or Azure Log Analytics enable real‑time troubleshooting. Managing log volume, retention policies, and ensuring compliance with data protection regulations are common hurdles.
Managed Service Related terms #
SaaS, PaaS A managed service abstracts underlying infrastructure, allowing customers to focus on application logic. Examples include managed databases, caches, and messaging queues. While reducing operational overhead, customers must trust the provider’s security controls and understand service‑level agreements (SLAs).
Multi‑AZ Deployment Related terms #
High Availability, Fault Domain Deploying resources across multiple Availability Zones provides redundancy against zone‑level failures. A typical pattern places database replicas in separate AZs and routes traffic through an ELB. Complexity rises when handling cross‑AZ data replication latency and ensuring consistent configuration across zones.
Multi‑Cloud Strategy Related terms #
Hybrid Cloud, Vendor Lock‑In A multi‑cloud strategy uses services from two or more cloud providers to avoid dependence on a single CSP and to leverage best‑of‑breed services. For example, using Google BigQuery for analytics while hosting workloads on AWS. Challenges include unified governance, cross‑provider networking, and cost management.
Network Security Group (NSG) Related terms #
Firewall, Security Group An NSG defines inbound and outbound traffic rules for subnets or network interfaces, acting as a virtual firewall. In Azure, NSGs are commonly attached to subnets. Overly permissive rules can expose resources, while overly restrictive rules may block legitimate traffic; regular audits are required.
Network Time Protocol (NTP) Related terms #
Time Synchronization, Clock Drift NTP synchronizes system clocks across distributed components, essential for certificate validation and log correlation. Cloud VMs typically use provider‑hosted NTP servers. Misconfiguration can cause authentication failures and data inconsistency across services.
Object Storage Related terms #
S3, Blob Storage Object storage stores unstructured data as objects identified by keys, offering virtually unlimited scalability. It is ideal for backups, media archives, and data lakes. Challenges include managing lifecycle policies, ensuring consistent performance for high‑throughput workloads, and protecting against accidental deletions.
On‑Demand Instance Related terms #
Spot Instance, Reserved Instance On‑demand instances are billed per hour or second with no long‑term commitment, providing flexibility for unpredictable workloads. They are useful for short‑lived batch jobs. Higher cost compared to reserved or spot pricing makes them less economical for sustained workloads.
Orchestration Related terms #
Scheduler, Workflow Engine Orchestration automates the coordination of multiple services, handling dependencies and scaling. Kubernetes is a container orchestration platform; Apache Airflow orchestrates data pipelines. Designing resilient orchestration flows requires handling retries, idempotency, and monitoring for failures.
Performance Baseline Related terms #
Benchmark, SLA A performance baseline defines expected metrics such as latency, throughput, and error rates for a service under normal conditions. Establishing a baseline enables detection of regressions. Baselines must be regularly updated to reflect changes in workload or infrastructure.
Policy as Code Related terms #
IaC, OPA Policy as Code expresses security and compliance rules in a programmable format, enabling automated enforcement. Open Policy Agent (OPA) can evaluate policies during CI pipelines. Maintaining policy repositories and ensuring they stay in sync with evolving regulatory requirements can be demanding.
Private Link Related terms #
VPC Endpoint, Service Endpoint Private Link provides a secure, private connection between a VPC and a cloud service without traversing the public internet. It is used to access managed databases or SaaS APIs privately. Configuring DNS resolution and managing endpoint policies are typical operational steps.
Public Cloud Related terms #
Private Cloud, Hybrid Cloud Public cloud delivers resources over the internet on a shared, multi‑tenant infrastructure owned by a CSP. It offers elasticity, pay‑as‑you‑go pricing, and a broad service catalog. Organizations must address concerns about data privacy, compliance, and potential vendor lock‑in.
Quantum‑Resistant Encryption Related terms #
Post‑Quantum Cryptography, KMS Quantum‑resistant encryption algorithms are designed to remain secure against attacks from quantum computers. Cloud providers are beginning to offer key management services that support post‑quantum algorithms. Migration paths and performance impacts are still being evaluated.
Queueing Service Related terms #
Message Bus, Pub/Sub A queueing service decouples producers and consumers, enabling asynchronous processing and load leveling. Examples include Amazon SQS and Azure Queue Storage. Proper visibility timeout and dead‑letter handling are essential to avoid message loss or duplication.
Rate Limiting Related terms #
Throttling, API Gateway Rate limiting restricts the number of requests a client can make within a time window, protecting services from overload. API gateways often enforce rate limits based on API keys. Setting limits too low can degrade legitimate user experience; too high can expose services to abuse.
Region Related terms #
Availability Zone, Edge Location A region is a geographic area containing multiple AZs, providing a logical boundary for resource placement and data residency. Deploying resources within the same region reduces latency. Cross‑region traffic incurs higher costs and may be subject to data transfer regulations.
Resource Quota Related terms #
Limits, Namespace Resource quotas cap the amount of compute, storage, or other resources a project or namespace can consume. They prevent accidental over‑provisioning and enforce cost controls. Misconfigured quotas can cause deployment failures and disrupt development pipelines.
Reverse Proxy Related terms #
Ingress, API Gateway A reverse proxy forwards client requests to backend services, often handling TLS termination, load balancing, and caching. Nginx and Envoy are common reverse proxies. Configuration errors can expose internal services or cause routing loops.
Role‑Based Access Control (RBAC) Related terms #
IAM, Least Privilege RBAC assigns permissions to roles rather than individual users, simplifying permission management. In Kubernetes, RBAC governs API access for service accounts. Over‑assignment of broad roles can undermine security; regular role reviews are necessary.
Scalability Related terms #
Elasticity, Horizontal Scaling Scalability is the ability of a system to handle increased load by adding resources. Horizontal scaling adds more instances, while vertical scaling enlarges existing instances. Designing for scalability requires stateless services, idempotent operations, and automated provisioning.
Serverless Architecture Related terms #
FaaS, Event‑Driven Serverless abstracts server management, allowing developers to focus on code that runs in response to events. Benefits include automatic scaling and reduced operational overhead. Limitations involve vendor‑specific runtime constraints, cold start latency, and difficulty in debugging.
Service Mesh Related terms #
Sidecar Proxy, Istio A service mesh provides a dedicated infrastructure layer for handling service‑to‑service communication, offering features like traffic shaping, observability, and security. Istio and Linkerd are popular implementations. Deploying a mesh adds complexity and resource overhead, requiring careful performance testing.
Service Level Agreement (SLA) Related terms #
SLO, SLIs An SLA defines the expected uptime, performance, and support guarantees a provider commits to. Cloud services often publish SLA percentages (e.G., 99.9% Uptime). Understanding SLA terms helps design compensation strategies for downtime; however, SLAs rarely cover application‑level failures.
Service Level Objective (SLO) Related terms #
SLA, SLI An SLO is a target metric for a specific service indicator, such as 99.9% Request latency under 200 ms. SLOs guide alerting thresholds and capacity planning. Setting realistic SLOs requires historical data and clear stakeholder expectations.
Service Level Indicator (SLI) Related terms #
SLO, Metric An SLI measures a specific aspect of service performance, such as error rate or response time. SLIs feed into SLO calculations. Selecting meaningful SLIs is critical; overly granular metrics can cause alert fatigue, while coarse metrics may hide problems.
Shared Responsibility Model Related terms #
CSP, Customer The shared responsibility model delineates security duties between the CSP (e.G., Physical security, hypervisor) and the customer (e.G., Data encryption, IAM). Understanding this division prevents gaps in compliance. Misinterpretation can lead to unprotected data or unnecessary effort.
Sidecar Pattern Related terms #
Service Mesh, Proxy The sidecar pattern runs auxiliary processes alongside a primary application container, handling concerns like logging, monitoring, or security. Envoy sidecars are typical in service meshes. Managing sidecar lifecycle and resource consumption adds operational overhead.
SLA Breach Related terms #
Compensation, Service Credit An SLA breach occurs when a provider fails to meet the agreed performance or availability targets, often triggering service credits. Monitoring SLA compliance requires precise measurement of uptime and latency. Disputes may arise over metric definitions and measurement windows.
Snapshot Related terms #
Backup, Restore A snapshot captures the state of a volume or database at a point in time, enabling quick restores. Cloud snapshots are incremental, storing only changed blocks. Over‑reliance on snapshots without testing restores can lead to unverified recovery processes.
Software‑Defined Networking (SDN) Related terms #
VLAN, Overlay Network SDN abstracts network control from hardware, allowing programmable configuration via APIs. In cloud, SDN enables dynamic routing, security policies, and network segmentation. Complexity arises from integrating SDN controllers with existing legacy networks.
Spillover Capacity Related terms #
Over‑Provisioning, Elasticity Spillover capacity refers to excess resource capacity that can absorb unexpected demand spikes without immediate scaling. Cloud providers often allocate buffer capacity to meet bursty workloads. Relying on spillover without proper monitoring can result in throttling when buffer is exhausted.
Stateful Service Related terms #
Stateless, Persistence A stateful service retains data across requests, such as databases or session stores. Deploying stateful services in containers requires persistent storage solutions like CSI drivers. Challenges include ensuring data durability, handling node failures, and performing graceful scaling.
Static IP Address Related terms #
Elastic IP, DNS A static IP remains constant over time, useful for services that require a fixed endpoint. In cloud, Elastic IPs or reserved IPs provide static addressing. Managing IP address allocation and avoiding exhaustion are operational concerns.
Static Site Hosting Related terms #
CDN, Object Storage Static site hosting serves pre‑rendered HTML, CSS, and JavaScript files directly from object storage, often fronted by a CDN. Services like AWS Amplify or Azure Static Web Apps simplify deployment. Limitations include lack of server‑side processing; dynamic functionality must be added via serverless APIs.
Storage Class Related terms #
Tiering, Lifecycle Policy Storage classes define performance and cost characteristics for objects (e.G., Standard, Infrequent Access, Glacier). Selecting appropriate classes reduces cost while meeting access latency needs. Misclassification can lead to higher-than‑expected storage bills or delayed retrieval times.
Subnet Related terms #
VPC, CIDR Block A subnet is a segmented IP address range within a VPC, used to group resources with similar networking requirements. Public subnets host internet‑facing resources; private subnets host databases. Proper subnet sizing and routing table configuration are essential to avoid IP exhaustion and routing loops.
Supply Chain Attack Related terms #
Dependency Risk, SBOM A supply chain attack compromises software components during build or distribution, injecting malicious code. Cloud workloads can be affected via compromised container images. Mitigation includes using signed images, scanning for vulnerabilities, and maintaining a Software Bill of Materials (SBOM).
Synchronous Replication Related terms #
Asynchronous Replication, RPO Synchronous replication writes data to primary and secondary locations simultaneously, guaranteeing zero data loss (RPO = 0). It is used for high‑availability databases. The trade‑off is increased latency and higher bandwidth consumption.
Tag Governance Related terms #
Cost Allocation, Policy as Code Tag governance enforces consistent tagging across resources to enable accurate cost reporting and automation. Tools can validate tags during resource creation. Poor governance leads to orphaned resources and inaccurate chargeback.
Technical Debt Related terms #
Refactoring, Legacy System Technical debt accumulates when shortcuts are taken in architecture or code, creating future maintenance burdens. In cloud projects, debt may arise from using ad‑hoc scripts instead of IaC. Regular refactoring and debt tracking help maintain agility.
Telemetry Related terms #
Metrics, Tracing Telemetry collects data about system performance, health, and usage, feeding monitoring dashboards and alerting systems. Cloud-native telemetry often uses OpenTelemetry standards. Ensuring low overhead and securing telemetry data are key considerations.
Threat Modeling Related terms #
Risk Assessment, Attack Surface Threat modeling identifies potential security threats, attack vectors, and mitigations for a system. Techniques like STRIDE help structure analysis. Conducting threat modeling early in design reduces costly retrofits.
Time‑to‑Market Related terms #
CI/CD, Agility Time‑to‑market measures how quickly a product moves from concept to production release. Cloud automation, IaC, and serverless reduce development cycles. Pressure to accelerate can compromise testing depth and security reviews.
Transient Failure Related terms #
Retry Logic, Circuit Breaker A transient failure is a temporary error that may succeed upon retry, such as a brief network timeout. Implementing exponential backoff and circuit breakers improves resilience. Misidentifying permanent failures as transient can cause endless retries.
Traffic Mirroring Related terms #
Packet Capture, Intrusion Detection Traffic mirroring duplicates network packets to a monitoring appliance for analysis. In cloud, VPC traffic mirroring helps detect anomalies. Mirrored traffic can increase bandwidth usage and raise privacy concerns if sensitive data is captured.
Trusted Execution Environment (TEE) Related terms #
Secure Enclave, Confidential Computing A TEE isolates code execution to protect data in use, providing hardware‑based security guarantees. Cloud providers offer confidential VMs with TEEs. Integration complexities and performance overhead are challenges to adoption.
Unified Threat Management (UTM) Related terms #
Firewall, IDS/IPS UTM consolidates multiple security functions (firewall, intrusion detection, anti‑malware) into a single appliance or service. Cloud UTM solutions simplify policy management but may become a single point of failure if not architected redundantly.
Virtual Private Cloud (VPC) Related terms #
Subnet, NAT Gateway A VPC is an isolated virtual network within a public cloud, allowing granular control over IP addressing, routing, and security. It enables hybrid connectivity via VPN or Direct Connect. Misconfigured route tables or security groups can unintentionally expose resources.
Virtual Machine (VM) Related terms #
Instance, Hypervisor A VM emulates a physical computer, running its own OS and applications. Cloud VMs are provisioned on-demand, supporting a wide range of workloads. Over‑provisioning VM size leads to wasted spend; under‑provisioning causes performance bottlenecks.