Getting started
1 - Introduction to Kubernetes operators
What are Kubernetes Operators?
Kubernetes operators are software extensions that manage both cluster and non-cluster resources on behalf of Kubernetes. The Java Operator SDK (JOSDK) makes it easy to implement Kubernetes operators in Java, with APIs designed to feel natural to Java developers and framework handling of common problems so you can focus on your business logic.
Why Use Java Operator SDK?
JOSDK provides several key advantages:
- Java-native APIs that feel familiar to Java developers
- Automatic handling of common operator challenges (caching, event handling, retries)
- Production-ready features like observability, metrics, and error handling
- Simplified development so you can focus on business logic instead of Kubernetes complexities
Learning Resources
Getting Started
- Introduction to Kubernetes operators - Core concepts explained
- Implementing Kubernetes Operators in Java - Introduction talk
- Kubernetes operator pattern documentation - Official Kubernetes docs
Deep Dives
- Problems JOSDK solves - Technical deep dive
- Why Java operators make sense - Java in cloud-native infrastructure
- Building a Kubernetes operator SDK for Java - Framework design principles
Tutorials
- Writing Kubernetes operators using JOSDK - Step-by-step blog series
2 - Bootstrapping and samples
Creating a New Operator Project
Using the Maven Plugin
The simplest way to start a new operator project is using the provided Maven plugin, which generates a complete project skeleton:
mvn io.javaoperatorsdk:bootstrapper:[version]:create \
-DprojectGroupId=org.acme \
-DprojectArtifactId=getting-started
This command creates a new Maven project with:
- A basic operator implementation
- Maven configuration with required dependencies
- Generated CustomResourceDefinition (CRD)
Building Your Project
Build the generated project with Maven:
mvn clean install
The build process automatically generates the CustomResourceDefinition YAML file that you'll need to apply to your Kubernetes cluster.
Exploring Sample Operators
The sample-operators directory contains real-world examples demonstrating different JOSDK features and patterns:
Available Samples
- Purpose: Creates NGINX webservers from Custom Resources containing HTML code
- Key Features: Multiple implementation approaches using both low-level APIs and higher-level abstractions
- Good for: Understanding basic operator concepts and API usage patterns
- Purpose: Manages database schemas in MySQL instances
- Key Features: Demonstrates managing non-Kubernetes resources (external systems)
- Good for: Learning how to integrate with external services and manage state outside Kubernetes
- Purpose: Manages Tomcat instances and web applications
- Key Features: Multiple controllers managing related custom resources
- Good for: Understanding complex operators with multiple resource types and relationships
Running the Samples
Prerequisites
The easiest way to try samples is using a local Kubernetes cluster:
Step-by-Step Instructions
Apply the CustomResourceDefinition:
kubectl apply -f target/classes/META-INF/fabric8/[resource-name]-v1.yml
Run the operator:
mvn exec:java -Dexec.mainClass="your.main.ClassName"
Or run your main class directly from your IDE.
Create custom resources: The operator will automatically detect and reconcile custom resources when you create them:
kubectl apply -f examples/sample-resource.yaml
Detailed Examples
For comprehensive setup instructions and examples, see:
- MySQL Schema sample README
- Individual sample directories for specific setup requirements
Next Steps
After exploring the samples:
- Review the patterns and best practices guide
- Learn about implementing reconcilers
- Explore dependent resources and workflows for advanced use cases
3 - Patterns and best practices
This document describes patterns and best practices for building and running operators, and how to implement them using the Java Operator SDK (JOSDK).
See also best practices in the Operator SDK.
Implementing a Reconciler
Always Reconcile All Resources
Reconciliation can be triggered by events from multiple sources. It might be tempting to check the events and only reconcile the related resource or subset of resources that the controller manages. However, this is considered an anti-pattern for operators.
Why this is problematic:
- Kubernetes' distributed nature makes it difficult to ensure all events are received
- If your operator misses some events and doesn't reconcile the complete state, it might operate with incorrect assumptions about the cluster state
The solution: Always reconcile all resources, regardless of the triggering event.
JOSDK makes this efficient by providing smart caches to avoid unnecessary Kubernetes API server access and ensuring your reconciler is triggered only when needed.
Since there's industry consensus on this topic, JOSDK no longer provides event access from Reconciler implementations starting with version 2.
Event Sources and Caching
During reconciliation, best practice is to reconcile all dependent resources managed by the controller. This means comparing the desired state with the actual cluster state.
The Challenge: Reading the actual state directly from the Kubernetes API Server every time would create significant load.
The Solution: Create a watch for dependent resources and cache their latest state using the Informer pattern. In JOSDK, informers are wrapped into EventSource to integrate with the framework's eventing system via the InformerEventSource class.
How it works:
- New events trigger reconciliation only when the resource is already cached
- Reconciler implementations compare desired state with cached observed state
- If a resource isn't in cache, it needs to be created
- If actual state doesn't match desired state, the resource needs updating
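The comparison described above can be sketched in plain Java. This is a simplified model rather than the JOSDK API: the class `CachedReconcileSketch`, its `Action` enum, and the string-valued specs are hypothetical names chosen for illustration, and a `ConcurrentHashMap` stands in for the informer cache.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Plain-Java model of cache-based reconciliation; not the JOSDK API. */
final class CachedReconcileSketch {

    /** Possible outcomes of comparing desired state with cached state. */
    enum Action { CREATE, UPDATE, NONE }

    // Stand-in for an informer cache: resource name -> last observed spec.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    /** Called by the (hypothetical) watch whenever a resource changes. */
    void onWatchEvent(String name, String observedSpec) {
        cache.put(name, observedSpec);
    }

    /** Decides what to do using only the cache, without calling the API server. */
    Action decide(String name, String desiredSpec) {
        Optional<String> cached = Optional.ofNullable(cache.get(name));
        if (cached.isEmpty()) {
            return Action.CREATE;      // not in cache: needs to be created
        }
        return Objects.equals(cached.get(), desiredSpec)
                ? Action.NONE          // actual state matches desired state
                : Action.UPDATE;       // drift detected: needs updating
    }

    public static void main(String[] args) {
        var sketch = new CachedReconcileSketch();
        System.out.println(sketch.decide("web", "replicas=2"));
        sketch.onWatchEvent("web", "replicas=1");
        System.out.println(sketch.decide("web", "replicas=2"));
        sketch.onWatchEvent("web", "replicas=2");
        System.out.println(sketch.decide("web", "replicas=2"));
    }
}
```

In a real operator the cache would be populated by an informer-backed event source rather than by manual calls to `onWatchEvent`.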
Idempotency
Since all resources should be reconciled when your Reconciler is triggered, and reconciliations can be triggered multiple times for any given resource (especially with retry policies), it's crucial that Reconciler implementations be idempotent.
Idempotency means: The same observed state should always result in exactly the same outcome.
Key implications:
- Operators should generally operate in a stateless fashion
- Since operators usually manage declarative resources, ensuring idempotency is typically straightforward
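As a rough illustration of idempotency, the following plain-Java sketch converges a set of child resources toward a desired set computed purely from the input. It does not use the JOSDK API; `IdempotentReconcileSketch`, the owner name, and the `owner-index` naming scheme are assumptions made for the example, and it models a single owner only.

```java
import java.util.Set;
import java.util.TreeSet;

/** Plain-Java model of an idempotent reconcile step; not the JOSDK API. */
final class IdempotentReconcileSketch {

    // Stand-in for the cluster: names of the child resources that currently exist.
    private final Set<String> existing = new TreeSet<>();

    /** Computes the desired children purely from the input, with no hidden state. */
    static Set<String> desiredChildren(String owner, int replicas) {
        Set<String> desired = new TreeSet<>();
        for (int i = 0; i < replicas; i++) {
            desired.add(owner + "-" + i);
        }
        return desired;
    }

    /** Converges the existing set to the desired set; returns the number of changes made. */
    int reconcile(String owner, int replicas) {
        Set<String> desired = desiredChildren(owner, replicas);
        int changes = 0;
        for (String name : desired) {
            if (existing.add(name)) {
                changes++;                       // create only what is missing
            }
        }
        int before = existing.size();
        existing.retainAll(desired);             // delete only what is no longer desired
        changes += before - existing.size();
        return changes;
    }

    Set<String> existing() {
        return Set.copyOf(existing);
    }

    public static void main(String[] args) {
        var sketch = new IdempotentReconcileSketch();
        System.out.println("first run changes:  " + sketch.reconcile("web", 3));
        System.out.println("second run changes: " + sketch.reconcile("web", 3));
    }
}
```

Because the step only acts on the difference between desired and existing state, running it again with the same input (for example after a retry) makes no further changes.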
Synchronous vs Asynchronous Resource Handling
Sometimes your reconciliation logic needs to wait for resources to reach their desired state (e.g., waiting for a Pod to become ready). You can approach this either synchronously or asynchronously.
Asynchronous Approach (Recommended)
Exit the reconciliation logic as soon as the Reconciler determines it cannot complete at this point. This frees resources to process other events.
Requirements: Set up adequate event sources to monitor state changes of all resources the operator waits for. When state changes occur, the Reconciler is triggered again and can finish processing.
Synchronous Approach
Periodically poll resources' state until they reach the desired state. If done within the reconcile method, this blocks the current thread for potentially long periods.
Recommendation: Use the asynchronous approach for better resource utilization.
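The asynchronous approach can be modeled in plain Java as follows. This is a sketch under assumptions, not JOSDK code: `AsyncReconcileSketch`, its `Outcome` enum, and `onPodReadyEvent` are hypothetical names, and a boolean field stands in for the observed readiness of a Pod.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of asynchronous waiting; not the JOSDK API. */
final class AsyncReconcileSketch {

    enum Outcome { WAITING_FOR_EVENT, DONE }

    // Stand-in for the observed state of a Pod the operator depends on.
    private boolean podReady = false;
    private final List<Outcome> history = new ArrayList<>();

    /** Returns immediately instead of polling when the Pod is not ready yet. */
    Outcome reconcile() {
        Outcome outcome = podReady ? Outcome.DONE : Outcome.WAITING_FOR_EVENT;
        history.add(outcome);
        return outcome;
    }

    /** Stand-in for an event source: a state change triggers reconciliation again. */
    Outcome onPodReadyEvent() {
        podReady = true;
        return reconcile();
    }

    List<Outcome> history() {
        return List.copyOf(history);
    }

    public static void main(String[] args) {
        var sketch = new AsyncReconcileSketch();
        System.out.println(sketch.reconcile());
        System.out.println(sketch.onPodReadyEvent());
    }
}
```

The first call exits without blocking a thread; the work is finished only when the readiness event arrives and triggers a second reconciliation.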
Why Use Automatic Retries?
Automatic retries are enabled by default and configurable. While you can deactivate this feature, we advise against it.
Why retries are important:
- Transient network errors: Common in Kubernetes' distributed environment, easily resolved with retries
- Resource conflicts: When multiple actors modify resources simultaneously, conflicts can be resolved by reconciling again
- Transparency: Automatic retries make error handling completely transparent when successful
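To make the retry idea concrete, here is a minimal plain-Java sketch of retrying a task with exponentially growing delays. It is not the JOSDK retry implementation: `RetrySketch` and its methods are hypothetical, the interval values are illustrative assumptions rather than JOSDK defaults, and delays are recorded instead of slept so the example runs instantly.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Plain-Java model of retries with exponential backoff; not the JOSDK API. */
final class RetrySketch {

    // Illustrative values only; assumptions, not JOSDK defaults.
    static final Duration INITIAL = Duration.ofSeconds(2);
    static final double MULTIPLIER = 1.5;
    static final Duration MAX = Duration.ofMinutes(1);

    /** Delay before the given retry (1-based): INITIAL * MULTIPLIER^(retry - 1), capped at MAX. */
    static Duration delayFor(int retry) {
        double millis = INITIAL.toMillis() * Math.pow(MULTIPLIER, retry - 1);
        return Duration.ofMillis((long) Math.min(millis, MAX.toMillis()));
    }

    /** Runs the task, retrying up to maxRetries times; records delays instead of sleeping. */
    static <T> T runWithRetry(Supplier<T> task, int maxRetries, List<Duration> delays) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return task.get();               // success: the failure stays invisible to the caller
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxRetries) {
                    delays.add(delayFor(attempt + 1));
                }
            }
        }
        throw last;                              // retries exhausted: surface the last error
    }

    public static void main(String[] args) {
        List<Duration> delays = new ArrayList<>();
        int[] calls = {0};
        String result = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient error");
            }
            return "ok";
        }, 5, delays);
        System.out.println(result + " after " + calls[0] + " calls, delays " + delays);
    }
}
```

A task that fails transiently and then succeeds returns its result normally, which is the transparency the list above refers to.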
Managing State
Thanks to Kubernetes resources' declarative nature, operators dealing only with Kubernetes resources can operate statelessly. They don't need to maintain resource state information since it should be possible to rebuild the complete resource state from its representation.
When State Management Becomes Necessary
This stateless approach typically breaks down when dealing with external resources. You might need to track external state for future reconciliations.
Anti-pattern: Putting state in the primary resource's status sub-resource
- Becomes difficult to manage with large amounts of state
- Violates best practice: status should represent actual resource state, while spec represents desired state
Recommended approach: Store state in separate resources designed for this purpose:
- Kubernetes Secret or ConfigMap
- Dedicated Custom Resource with validated structure
Handling Informer Errors and Cache Sync Timeouts
You can configure whether the operator should stop when informer errors occur on startup.
Default Behavior
By default, if there's a startup error (e.g., the informer lacks permissions to list target resources for primary or secondary resources), the operator stops immediately.
Alternative Configuration
Set the stopOnInformerErrorDuringStartup flag to false to start the operator even when some informers fail to start. In this case:
- The operator continuously retries connection with exponential backoff
- This applies both to startup failures and runtime problems
- The operator only stops for fatal errors (currently when a resource cannot be deserialized)
Use case: When watching multiple namespaces, it's better to start the operator so it can handle other namespaces while resolving permission issues in specific namespaces.
Cache Sync Timeout Impact
The stopOnInformerErrorDuringStartup setting affects cache sync timeout behavior:
- If true: The operator stops on cache sync timeout
- If false: After the timeout, the controller starts reconciling resources even if some event source caches haven't synced yet
Graceful Shutdown
You can provide sufficient time for the reconciler to process and complete ongoing events before shutting down. Simply set an appropriate duration value for reconciliationTerminationTimeout using ConfigurationServiceOverrider.
// Give in-flight reconciliations up to 5 seconds to complete on shutdown
final var overridden = new ConfigurationServiceOverrider(config)
    .withReconciliationTerminationTimeout(Duration.ofSeconds(5));
final var operator = new Operator(overridden);