
Installing on Kubernetes

You can install GridGain 9 and run a GridGain cluster on a Kubernetes cluster. This section describes all the necessary steps and provides the configurations and manifests that you can copy and paste into your environment.

Prerequisites

GridGain is tested on Kubernetes version 1.26.

Installation Steps

Create ConfigMaps

  1. Create the GridGain configuration file and get a license. The minimum node configuration is as follows:

 gridgain-config.conf
 ignite: {
 network: {
 # GridGain 9 node port
 port = 3344
 nodeFinder = {
 netClusterNodes = [
 # Kubernetes service to access the GridGain 9 cluster on the Kubernetes network
 "gridgain-svc-headless:3344"
 ]
 }
 }
 storage: {
 profiles = [
 {
 engine = "aipersist"
 name = "default"
 replacementMode = "CLOCK"
 # Explicit storage size configuration (2147483648 bytes = 2 GiB)
 sizeBytes = 2147483648
 }
 ]
 }
 }
  2. Place your license content in the license.conf file.

  3. Create the ConfigMap object for GridGain configuration:

    kubectl create configmap gridgain-config -n <namespace> --from-file=gridgain-config.conf
  4. Create the ConfigMap object for the GridGain license:

    kubectl create configmap gridgain-license -n <namespace> --from-file=license.conf

    Replace <namespace> with the name of the namespace where you want to deploy GridGain.

 To update the GridGain node configuration, modify the existing ConfigMap and restart all GridGain pods; a rolling-restart alternative is shown after this list.

 • Modify the previously configured ConfigMap object:

      kubectl edit configmap gridgain-config -n <namespace>
 • Restart the GridGain pods, repeating for every pod:

 kubectl delete pod <GridGain pod name> -n <namespace>
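
 Alternatively, a rolling restart of the whole StatefulSet recreates all pods one by one. This sketch assumes the StatefulSet name gridgain-cluster used later in this guide:

 kubectl rollout restart statefulset gridgain-cluster -n <namespace>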

Configure Environment Variables and Logging

In Kubernetes deployments, all environment variables must be defined directly in the container specification, either in the StatefulSet manifest or in the Helm chart configuration.

JVM and Memory Configuration

To configure JVM options (such as heap size, Metaspace, and Java agent), define the GRIDGAIN9_EXTRA_JVM_ARGS environment variable in your manifests.

For example, in a StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
 name: gridgain-cluster
 namespace: <namespace>
spec:
 ...
 template:
 spec:
 terminationGracePeriodSeconds: 60000
 containers:
 - name: gridgain-node
 env:
 # Extra JVM options for the GridGain node process.
 - name: GRIDGAIN9_EXTRA_JVM_ARGS
 value: "-javaagent:/agent/jmx.jar=9404:/opt/jmx/jmx.yaml -Xms1g -Xmx3g -XX:MaxMetaspaceSize=256m -XX:MaxDirectMemorySize=4g"

When using the Helm chart, define the same variable through the extraEnvVars field:

extraEnvVars:
 - name: GRIDGAIN9_EXTRA_JVM_ARGS
 value: "-javaagent:/agent/jmx.jar=9404:/opt/jmx/jmx.yaml -Xms1g -Xmx3g -XX:MaxMetaspaceSize=256m -XX:MaxDirectMemorySize=4g"

To generate JVM diagnostic files inside Kubernetes/Docker, pass the relevant JVM options through GRIDGAIN9_EXTRA_JVM_ARGS:

env:
 - name: GRIDGAIN9_EXTRA_JVM_ARGS
 value: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/gridgain/work/diagnostics -XX:ErrorFile=/opt/gridgain/work/diagnostics/hs_err_pid%p.log -Xlog:gc*:file=/opt/gridgain/work/log/gc.log:time,level,tags"

Ensure that the output directories exist and are writable inside the pod. Declare persistent volumes for diagnostic outputs and mount them into the container:

volumeMounts:
 - name: diagnostics
 mountPath: /opt/gridgain/work/diagnostics
 - name: logs
 mountPath: /opt/gridgain/work/log
volumes:
 - name: diagnostics
 persistentVolumeClaim:
 claimName: gg9-diagnostics-pvc
 - name: logs
 persistentVolumeClaim:
 claimName: gg9-logs-pvc
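
The claim names gg9-diagnostics-pvc and gg9-logs-pvc above must refer to existing PersistentVolumeClaims. A minimal sketch for creating them follows; the sizes and the reliance on the default storage class are assumptions, so adjust them for your environment:

kubectl apply -n <namespace> -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
 name: gg9-diagnostics-pvc
spec:
 accessModes:
 - ReadWriteOnce
 resources:
 requests:
 storage: 5Gi # Assumed size; heap dumps can be large.
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
 name: gg9-logs-pvc
spec:
 accessModes:
 - ReadWriteOnce
 resources:
 requests:
 storage: 2Gi # Assumed size.
EOF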

Logging Configuration

By default, the Docker image ships with a gridgain.java.util.logging.properties configuration that enables only console logging.

To modify the default logging behavior (e.g., enabling file-based logging or using log4j2), create a ConfigMap with a custom gridgain.java.util.logging.properties file and mount it into each pod.

  • Create the ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
 name: gg9-gridgain9-logging
 namespace: "yournamespace"
data:
 gridgain.java.util.logging.properties: |-
 handlers=java.util.logging.FileHandler
 java.util.logging.SimpleFormatter.format = [%1$tF %1$tT] [%4$s] %5$s%6$s%n
 java.util.logging.FileHandler.pattern = /opt/gridgain/etc/gridgain9-%u-%g.log
 java.util.logging.FileHandler.formatter = org.apache.ignite.internal.lang.JavaLoggerFormatter
 java.util.logging.FileHandler.level = WARNING
 java.util.logging.FileHandler.encoding = UTF-8
  • Mount it in the StatefulSet:

volumeMounts:
 - name: logging
 mountPath: /opt/gridgain/etc/gridgain.java.util.logging.properties
 subPath: gridgain.java.util.logging.properties
...
volumes:
 - name: logging
 configMap:
 defaultMode: 420
 name: gg9-gridgain9-logging
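
To confirm that the custom configuration is mounted, you can print it from a running pod (assuming the pod name gridgain-cluster-0 from the StatefulSet example later in this guide):

kubectl exec gridgain-cluster-0 -n <namespace> -- cat /opt/gridgain/etc/gridgain.java.util.logging.properties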

If you use Helm, the same can be defined via configMaps in values.yaml:

configMaps:
 logging:
 name: gridgain.java.util.logging.properties
 path: /opt/gridgain/etc/gridgain.java.util.logging.properties
 subpath: gridgain.java.util.logging.properties
 content: |
 handlers=java.util.logging.FileHandler
 java.util.logging.SimpleFormatter.format = [%1$tF %1$tT] [%4$s] %5$s%6$s%n
 java.util.logging.FileHandler.pattern = /opt/gridgain/etc/gridgain9-%u-%g.log
 java.util.logging.FileHandler.formatter = org.apache.ignite.internal.lang.JavaLoggerFormatter
 java.util.logging.FileHandler.level = WARNING
 java.util.logging.FileHandler.encoding = UTF-8

Once both the ConfigMap and the volume mount are configured, file-based logging becomes active on startup.

Default Storage Configuration

In Kubernetes deployments, the default storage profile is defined in the ConfigMap created during installation.

  • You can use this configuration:

 storage: {
 profiles = [
 {
 engine = "aipersist"
 name = "default"
 replacementMode = "CLOCK"
 sizeBytes = 2147483648
 }
 ]
 }
 • Or see the example of how to redefine it for the Helm chart.
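
The sizeBytes value is specified in bytes; 2147483648 is 2 GiB. As a sketch, the same profile sized at 4 GiB would look like this (only the size changes):

 storage: {
 profiles = [
 {
 engine = "aipersist"
 name = "default"
 replacementMode = "CLOCK"
 # 4 GiB = 4 * 1024 * 1024 * 1024 bytes
 sizeBytes = 4294967296
 }
 ]
 }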

Create and Deploy the Service

Depending on your requirements, define and deploy a Kubernetes service. GridGain 9 uses two types of services: one for internal cluster discovery and one for external client access.

 1. First, choose the type of service you need and prepare the service.yaml file.

 • For communication inside the Kubernetes cluster, use a headless service by setting the clusterIP parameter to None. This exposes each pod's IP address, enabling GridGain to be partition-aware: clients discover every node's address, determine which partition resides on which node, and send requests directly to the node where the data is located.

 service.yaml
 apiVersion: v1
 kind: Service
 metadata:
 # The name must be equal to netClusterNodes.
 name: gridgain-svc-headless
 # Place your namespace name here.
 namespace: <namespace>
 spec:
 clusterIP: None
 internalTrafficPolicy: Cluster
 ipFamilies:
 - IPv4
 ipFamilyPolicy: SingleStack
 ports:
 - name: management
 port: 10300
 protocol: TCP
 targetPort: 10300
 - name: rest
 port: 10800
 protocol: TCP
 targetPort: 10800
 - name: cluster
 port: 3344
 protocol: TCP
 targetPort: 3344
 selector:
 # Must be equal to the label set for pods.
 app: gridgain
 # Include not-yet-ready nodes.
 publishNotReadyAddresses: True
 sessionAffinity: None
 type: ClusterIP
 • Use a LoadBalancer service to allow external clients to connect. Keep in mind that with this option you give up partition awareness.

 If your environment does not support LoadBalancer, you can use type: NodePort instead. Refer to the Kubernetes documentation for details.

 apiVersion: v1
 kind: Service
 metadata:
 name: gridgain-loadbalancer
 labels:
 app: gridgain
 spec:
 type: LoadBalancer
 selector:
 app: gridgain
 ports:
 - name: rest
 protocol: TCP
 port: 10800
 targetPort: 10800
 - name: client
 port: 10300
 protocol: TCP
 targetPort: 10300
  2. Then apply the service.yaml file to set up this service:

kubectl apply -f service.yaml
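
You can then confirm that the service was created; for the headless variant, the CLUSTER-IP column shows None:

kubectl get svc -n <namespace>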

Deploy the StatefulSet

  1. Prepare the statefulset.yaml file for StatefulSet deployment:

 statefulset.yaml
 apiVersion: apps/v1
 kind: StatefulSet
 metadata:
 # The cluster name.
 name: gridgain-cluster
 # Place your namespace name here.
 namespace: <namespace>
 spec:
 # The initial number of pods to be started by Kubernetes.
 replicas: 2
 # Kubernetes service to access the GridGain 9 cluster on the Kubernetes network.
 serviceName: gridgain-svc-headless
 selector:
 matchLabels:
 app: gridgain
 template:
 metadata:
 labels:
 app: gridgain
 spec:
 terminationGracePeriodSeconds: 60000
 containers:
 # Custom pod name.
 - name: gridgain-node
 # Limits and requests for the GridGain container.
 resources:
 limits:
 cpu: "4"
 memory: 4Gi
 requests:
 cpu: "4"
 memory: 4Gi
 env:
 # Must be specified to ensure that GridGain 9 cluster replicas are visible to each other.
 - name: GRIDGAIN_NODE_NAME
 valueFrom:
 fieldRef:
 fieldPath: metadata.name
 # GridGain 9 working directory.
 - name: GRIDGAIN_WORK_DIR
 value: /gg9-work
 # The GridGain Docker image and its version.
 image: gridgain/gridgain9:9.1.16
 ports:
 - containerPort: 10300
 - containerPort: 10800
 - containerPort: 3344
 volumeMounts:
 # The config will be placed at this path in the container.
 - mountPath: /opt/gridgain/etc/gridgain-config.conf
 name: config-vol
 subPath: gridgain-config.conf
 # The license will be placed at this path in the container.
 - mountPath: /opt/gridgain/etc/license.conf
 name: license-vol
 subPath: license.conf
 # GridGain 9 working directory.
 - mountPath: /gg9-work
 name: persistence
 volumes:
 - name: config-vol
 configMap:
 name: gridgain-config
 - name: license-vol
 configMap:
 name: gridgain-license
 volumeClaimTemplates:
 - apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
 name: persistence
 spec:
 accessModes:
 - ReadWriteOnce
 resources:
 requests:
 storage: 10Gi # Provide enough space for your application data.
 volumeMode: Filesystem
  2. Apply the statefulset.yaml file to deploy the main components of GridGain 9:

    kubectl apply -f statefulset.yaml
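
 You can confirm that the StatefulSet and its volume claims were created:

 kubectl get statefulset,pvc -n <namespace>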

Wait for Pods to Start

  1. Monitor the status of the pods:

    kubectl get pods -n <namespace> -w
  2. Ensure that all pods' STATUS is Running before proceeding.
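
 For a scripted check, you can block until all GridGain pods report Ready; this sketch assumes the app: gridgain label from the StatefulSet above:

 kubectl wait --for=condition=Ready pod -l app=gridgain -n <namespace> --timeout=300s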

Deploy the Job

  1. Prepare the job.yaml file for deploying the job:

 job.yaml
 apiVersion: batch/v1
 kind: Job
 metadata:
 name: cluster-init
 # Place your namespace name here.
 namespace: <namespace>
 spec:
 template:
 spec:
 containers:
 # Command to initialize the cluster. The URL host must be the name of the service you created earlier; 10300 is the management port.
 - args:
 - -ec
 - |
 apt update && apt-get install -y bind9-host
 # Collect the node names from the DNS SRV records of the headless service.
 GG_NODES=$(host -tsrv _cluster._tcp.gridgain-svc-headless | grep 'SRV record' | awk '{print 8ドル}' | awk -F. '{print 1ドル}' | paste -sd ',')
 # Pass the discovered node names as the metastorage group.
 /opt/gridgain9cli/bin/gridgain9 cluster init --name=gridgain --url=http://gridgain-svc-headless:10300 --metastorage-group=$GG_NODES --license=/opt/gridgain/etc/license.conf
 command:
 - /bin/sh
 # Specify the Docker image with the GridGain 9 CLI and its version.
 image: gridgain/gridgain9:9.1.16
 imagePullPolicy: IfNotPresent
 name: cluster-init
 resources: {}
 volumeMounts:
 # The license must be mounted into the cluster-init job.
 - mountPath: /opt/gridgain/etc/license.conf
 name: license-vol
 subPath: license.conf
 restartPolicy: Never
 terminationGracePeriodSeconds: 120
 volumes:
 - name: license-vol
 configMap:
 name: gridgain-license
 2. Apply the job.yaml file to complete the installation.

    kubectl apply -f job.yaml
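
 You can wait for the job to finish and inspect its output:

 kubectl wait --for=condition=complete job/cluster-init -n <namespace> --timeout=120s
 kubectl logs job/cluster-init -n <namespace>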

Installation Verification

  1. Check the status of all resources in your namespace:

    kubectl get all -n <namespace>
  2. Ensure that all components are running as expected, without errors, and that the initialization job is in the Completed status.

  3. Verify that your cluster is initialized and running.

 kubectl exec -it gridgain-cluster-0 -n <namespace> -- bash
    /opt/gridgain9cli/bin/gridgain9 cluster status

    The command output must include the name of your cluster and the number of nodes. The status must be ACTIVE.

Optional: KEDA Configuration

You can configure KEDA to automatically scale the cluster based on your needs, ensuring optimal resource utilization and performance. This implementation uses Prometheus to monitor cluster load.

To enable KEDA scaling for your cluster:

 • Add the necessary Helm repositories, then install KEDA and Prometheus:

 helm repo add kedacore https://kedacore.github.io/charts
 helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
 helm repo update
 helm install keda kedacore/keda --namespace keda --create-namespace
 helm install prometheus prometheus-community/prometheus --namespace keda -f prometheus-values.yaml
  • Deploy the KEDA configurations:

    • The keda-scaled-object.yaml configuration defines the scaling rules for the GridGain cluster:

 keda-scaled-object.yaml
 apiVersion: keda.sh/v1alpha1
 kind: ScaledObject
 metadata:
 name: gridgain-autoscale
 namespace: gridgain
 spec:
 scaleTargetRef:
 kind: StatefulSet
 name: gridgain-cluster
 pollingInterval: 30
 cooldownPeriod: 120
 minReplicaCount: 2 # Set the initial number of replicas.
 maxReplicaCount: 5
 advanced:
 horizontalPodAutoscalerConfig:
 behavior:
 scaleDown:
 selectPolicy: Disabled
 scaleUp:
 stabilizationWindowSeconds: 60 # Increase if needed.
 selectPolicy: Max
 policies:
 - type: Pods
 value: 1
 periodSeconds: 120
 triggers:
 # Uncomment and modify the following section to enable CPU usage based autoscaling.
 # - type: prometheus
 # metadata:
 # name: cpu-usage
 # serverAddress: http://prometheus-server.keda.svc.cluster.local:80
 # query: sum(os_system_load_average{job="gridgain"})
 # threshold: "0.8"
 # activationThreshold: "0.6"
 - type: prometheus
 name: heap-memory-usage
 metadata:
 serverAddress: http://prometheus-server.keda.svc.cluster.local:80
 query: |
 sum(jvm_memory_committed_bytes{area="heap", job="gridgain"} / jvm_memory_max_bytes{area="heap", job="gridgain"})
 threshold: "0.7"
 - type: prometheus
 name: nonheap-memory-usage
 metadata:
 serverAddress: http://prometheus-server.keda.svc.cluster.local:80
 query: |
 sum(jvm_memory_committed_bytes{area="nonheap", job="gridgain"} / jvm_memory_max_bytes{area="nonheap", job="gridgain"})
 threshold: "0.7"
 • The keda-recovery-scaled-job.yaml configuration handles rebuilding CMG (cluster management group) nodes in the GridGain cluster.

 keda-recovery-scaled-job.yaml
 apiVersion: keda.sh/v1alpha1
 kind: ScaledJob
 metadata:
 name: gridgain-recovery
 namespace: gridgain
 spec:
 jobTargetRef:
 template:
 spec:
 # securityContext:
 # runAsUser: 0
 # runAsGroup: 0
 # fsGroup: 0
 containers:
 - name: recovery
 image: gridgain/gridgain9:9.1.1
 command: ["/bin/bash", "/scripts/recovery.sh"]
 volumeMounts:
 - name: script-vol
 mountPath: /scripts
 restartPolicy: Never
 volumes:
 - name: script-vol
 configMap:
 name: gridgain-recovery-script
 defaultMode: 0777
 backoffLimit: 1
 pollingInterval: 30
 successfulJobsHistoryLimit: 2
 failedJobsHistoryLimit: 3
 maxReplicaCount: 1
 triggers:
 # - type: prometheus
 # metadata:
 # serverAddress: http://prometheus-server.keda.svc.cluster.local:80
 # query: avg(os_system_load_average{job="gridgain"})
 # threshold: "0.8"
 # activationThreshold: "0.6"
 - type: prometheus
 name: heap-memory-usage
 metadata:
 serverAddress: http://prometheus-server.keda.svc.cluster.local:80
 query: |
 avg(jvm_memory_committed_bytes{area="heap", job="gridgain"} / jvm_memory_max_bytes{area="heap", job="gridgain"})
 threshold: "0.7"
 - type: prometheus
 name: nonheap-memory-usage
 metadata:
 serverAddress: http://prometheus-server.keda.svc.cluster.local:80
 query: |
 avg(jvm_memory_committed_bytes{area="nonheap", job="gridgain"} / jvm_memory_max_bytes{area="nonheap", job="gridgain"})
 threshold: "0.7"
  • You can deploy the above configurations with the following commands:

    kubectl apply -n gridgain -f keda-scaled-object.yaml
    kubectl apply -n gridgain -f keda-recovery-scaled-job.yaml
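
 To verify that KEDA picked up the configuration, list the scaler objects and the HorizontalPodAutoscaler that KEDA creates for the ScaledObject:

 kubectl get scaledobject,scaledjob -n gridgain
 kubectl get hpa -n gridgain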

Installation Troubleshooting

If any issues occur during the installation:

  • Check the logs of specific pods:

    kubectl logs <pod-name> -n <namespace>
  • Review events in the namespace:

    kubectl get events -n <namespace>

Troubleshooting via REST

This approach is intended for short-lived debugging and testing sessions, not for long-term or external access.

  1. Forward the REST port to a specific GridGain 9 replica:

    kubectl port-forward <pod-name> 10300:10300 -n <namespace>

    Example:

    kubectl port-forward gridgain-cluster-0 10300:10300 -n gridgain
 2. Open a new terminal and send REST requests to http://localhost:10300, for example:

    curl 'http://localhost:10300/management/v1/cluster/state'
    {"title":"Cluster is not initialized","status":409,"detail":"Cluster is not initialized. Call /management/v1/cluster/init in order to initialize cluster."}
    curl 'http://localhost:10300/management/v1/node/info'
    {"name":"gridgain-cluster-0","jdbcPort":10800}

For more information, refer to the documentation.

Limitations and Considerations

When running GridGain 9 in a Kubernetes environment, the node configuration becomes read-only and cannot be modified by using the gridgain9 node config update CLI command. This is by design, as node configuration is managed via Kubernetes resources. To change your configuration:

  1. Manually update the corresponding ConfigMap;

  2. Restart all cluster pods by executing kubectl delete pod for each replica.

The updated configuration will take effect after the pods are recreated by the Kubernetes controller.
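
For example, assuming the app: gridgain label from the StatefulSet example above, the whole update cycle looks like this:

kubectl edit configmap gridgain-config -n <namespace>
kubectl delete pod -l app=gridgain -n <namespace>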
