Set up Elastic Stack on GKE
This tutorial shows you how to run Elastic Stack on GKE using the Elastic Cloud on Kubernetes (ECK) operator.
Elastic Stack is a popular open source solution used for logging, monitoring, and analyzing data in real-time. Using Elastic Stack on GKE, you can benefit from the scalability and reliability provided by GKE Autopilot and the powerful Elastic Stack features.
This tutorial is intended for Kubernetes administrators or site reliability engineers.
Objectives
- Create a GKE cluster.
- Deploy the ECK operator.
- Configure Elasticsearch clusters and Kibana using the ECK operator.
- Deploy a complete Elastic Stack using the ECK operator.
- Autoscale Elasticsearch clusters and upgrade the Elastic Stack deployment.
- Use Elastic Stack to monitor Kubernetes environments.
Costs
In this document, you use the following billable components of Google Cloud:
To generate a cost estimate based on your projected usage, use the pricing calculator.
When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
-
Install the Google Cloud CLI.
-
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
-
To initialize the gcloud CLI, run the following command:
gcloud init
-
Create or select a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
-
Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
-
Create a Google Cloud project:
gcloud projects create PROJECT_ID
Replace PROJECT_ID with a name for the Google Cloud project you are creating.
-
Select the Google Cloud project that you created:
gcloud config set project PROJECT_ID
Replace PROJECT_ID with your Google Cloud project name.
-
Verify that billing is enabled for your Google Cloud project.
-
Enable the GKE API:
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
gcloud services enable container.googleapis.com
-
Grant roles to your user account. Run the following command once for each of the following IAM roles:
roles/container.clusterAdmin
gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE
Replace the following:
- PROJECT_ID: Your project ID.
- USER_IDENTIFIER: The identifier for your user account. For example, myemail@example.com.
- ROLE: The IAM role that you grant to your user account.
- You must own a domain name. The domain name must be no longer than 63 characters. You can use Cloud Domains or another registrar.
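The length requirement can be checked before you start. The following is a minimal sketch: `check_domain_length` is a hypothetical helper, not part of any Google Cloud tooling, and it tests only the 63-character limit stated above.

```shell
# Hypothetical helper: succeeds when the domain name is non-empty and
# no longer than 63 characters, the limit stated in this tutorial.
check_domain_length() {
  domain="$1"
  length=${#domain}
  if [ "$length" -ge 1 ] && [ "$length" -le 63 ]; then
    return 0
  fi
  echo "domain name must be 1-63 characters, got ${length}" >&2
  return 1
}

# Example with a placeholder domain.
check_domain_length "example.com" && echo "example.com: length OK"
```

The helper does not verify ownership or DNS validity; it only catches a name that is too long before it is used in later commands.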
Prepare the environment
In this tutorial, you use Cloud Shell to manage resources hosted on Google Cloud. Cloud Shell is preinstalled with the software you need for this tutorial, including kubectl, Helm, and the gcloud CLI.
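If you work outside Cloud Shell, a quick check can confirm that the same tools are installed. This is a sketch under that assumption; `missing_tools` is a hypothetical helper name, and `git` is added to the list only because a later step clones a repository.

```shell
# Print the name of each listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools this tutorial relies on.
missing="$(missing_tools kubectl helm gcloud git)"
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All required tools are available."
fi
```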
To set up your environment with Cloud Shell, follow these steps:
Launch a Cloud Shell session by clicking Activate Cloud Shell in the Google Cloud console. This launches a session in the bottom pane of the Google Cloud console.
Add a Helm chart repository and update it:
helm repo add elastic https://helm.elastic.co
helm repo update
Clone the GitHub repository:
git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples.git
Change to the working directory:
cd kubernetes-engine-samples/observability/elastic-stack-tutorial
Create a GKE cluster
Create a GKE cluster with control plane metrics collection enabled:
gcloud container clusters create-auto elk-stack \
    --location="us-central1" \
    --monitoring="SYSTEM,WORKLOAD,API_SERVER,SCHEDULER,CONTROLLER_MANAGER"
Deploy the ECK operator
Elastic Cloud on Kubernetes (ECK) is a platform for deploying and managing the Elastic Stack on Kubernetes clusters.
ECK automates the deployment and management of Elastic Stack clusters, simplifying the process of setting up and maintaining Elastic Stack on Kubernetes. It provides a set of Kubernetes custom resources that you can use to create and configure Elasticsearch, Kibana, Application Performance Management Server, and other Elastic Stack components in Kubernetes. This lets developers and DevOps teams configure and manage Elastic Stack clusters at scale.
ECK supports multiple Elasticsearch nodes, automatic application failover, seamless upgrades, and SSL encryption. ECK also includes features that let you monitor and troubleshoot Elasticsearch performance.
Install the ECK Helm chart:
helm upgrade --install "elastic-operator" "elastic/eck-operator" \
    --version="2.8.0" \
    --create-namespace \
    --namespace="elastic-system" \
    --set="resources.limits.cpu=250m" \
    --set="resources.limits.memory=512Mi" \
    --set="resources.limits.ephemeral-storage=1Gi" \
    --set="resources.requests.cpu=250m" \
    --set="resources.requests.memory=512Mi" \
    --set="resources.requests.ephemeral-storage=1Gi"
Wait for the operator to be ready:
watch kubectl get pods -n elastic-system
The output is similar to the following:
NAME                 READY   STATUS    RESTARTS   AGE
elastic-operator-0   1/1     Running   0          31s
When the operator STATUS is Running, return to the command line by pressing Ctrl+C.
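In a script, where an interactive `watch` is not practical, the same wait can be expressed as a polling loop. The following is a sketch, not part of the tutorial's repository: `wait_until` and `operator_running` are hypothetical helper names, and the Pod name `elastic-operator-0` is taken from the sample output above.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds or the timeout elapses.
wait_until() {
  timeout_s="$1"
  shift
  deadline=$(( $(date +%s) + timeout_s ))
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout_s}s: $*" >&2
      return 1
    fi
    sleep 1
  done
}

# Hypothetical check: succeeds once the operator Pod reports the Running phase.
operator_running() {
  [ "$(kubectl get pod elastic-operator-0 -n elastic-system -o jsonpath='{.status.phase}')" = "Running" ]
}

# Usage, left commented out so the block can be sourced without a cluster:
# wait_until 300 operator_running && echo "ECK operator is running"
```

The 300-second timeout is an arbitrary choice for illustration; adjust it to how long your cluster typically takes to schedule the operator Pod.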
Configure Elastic Stack with ECK
By using Elastic Stack with Elasticsearch, Kibana, and Elastic Agent working in Fleet mode, you can set up a powerful, scalable, and fully-managed solution for managing and visualizing data using Kibana.
Kibana is an open source data analytics and visualization tool that lets you search, analyze and visualize data in Elasticsearch.
Elastic Agent is a lightweight data shipper that collects data from different sources, such as logs or metrics, and automatically sends it to Elasticsearch.
Elastic Fleet is a mode of operation in which Elastic agents report to a central fleet server, which handles their configuration and management. The fleet server simplifies the deployment, configuration, and scaling of Elastic agents, making it easier to manage large and complex deployments.
Elasticsearch autoscaling is a self-monitoring feature that can report when additional resources are needed based on an operator-defined policy. For example, a policy might specify that a certain tier should scale based on available disk space. Elasticsearch can monitor the disk space and suggest scaling if it predicts a shortage, although it is still up to the operator to add the necessary resources. For more information about Elasticsearch autoscaling see Autoscaling in the Elasticsearch documentation.
Configure an Elasticsearch cluster
Elasticsearch provides a distributed, RESTful search and analytics engine designed to store and search large volumes of data quickly and efficiently.
When deploying Elastic Stack on Kubernetes, you should manage the VM settings, specifically the vm.max_map_count setting, which is required by Elasticsearch. vm.max_map_count specifies the maximum number of memory map areas that a process can have. Elasticsearch must have this value set to at least 262144 to run optimally. For more information, see Virtual memory in the ECK documentation.
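To make the threshold concrete, the following sketch compares a value against the minimum of 262144. `max_map_count_ok` is a hypothetical helper; the `/proc/sys/vm/max_map_count` path is the one the tutorial's manifests read and write, and the check runs only where that file exists (a Linux host).

```shell
# Minimum value that this tutorial's manifests write and wait for.
REQUIRED_MAX_MAP_COUNT=262144

# Succeeds when the given value meets the minimum.
max_map_count_ok() {
  [ "$1" -ge "$REQUIRED_MAX_MAP_COUNT" ]
}

# On a Linux host, the current value is exposed under /proc.
if [ -r /proc/sys/vm/max_map_count ]; then
  current="$(cat /proc/sys/vm/max_map_count)"
  if max_map_count_ok "$current"; then
    echo "vm.max_map_count=${current} meets the minimum"
  else
    echo "vm.max_map_count=${current} is below ${REQUIRED_MAX_MAP_COUNT}"
  fi
fi
```

Note that this helper accepts any value at or above the minimum, whereas the init container in the Elasticsearch manifest waits for the value to equal 262144 exactly.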
Review the following manifest:
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: user-daemonset-priority
value: 999999999
preemptionPolicy: PreemptLowerPriority
globalDefault: false
description: "User DaemonSet priority"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: max-map-count-setter
  namespace: elastic-system
  labels:
    k8s-app: max-map-count-setter
spec:
  selector:
    matchLabels:
      name: max-map-count-setter
  template:
    metadata:
      labels:
        name: max-map-count-setter
    spec:
      priorityClassName: user-daemonset-priority
      nodeSelector:
        cloud.google.com/compute-class: "Balanced"
      initContainers:
        - name: max-map-count-setter
          image: docker.io/bash:5.2.15
          resources:
            requests:
              cpu: 10m
              memory: 10Mi
              ephemeral-storage: 10Mi
            limits:
              cpu: 50m
              memory: 32Mi
              ephemeral-storage: 10Mi
          securityContext:
            privileged: true
            runAsUser: 0
          command: ["/usr/local/bin/bash", "-e", "-c", "echo 262144 > /proc/sys/vm/max_map_count"]
      containers:
        - name: sleep
          image: docker.io/bash:5.2.15
          command: ["sleep", "infinity"]
          resources:
            requests:
              cpu: 10m
              memory: 10Mi
              ephemeral-storage: 10Mi
            limits:
              cpu: 10m
              memory: 10Mi
              ephemeral-storage: 10Mi
This manifest describes a DaemonSet that configures the kernel setting on the host directly. A DaemonSet is a Kubernetes controller that ensures that a copy of a Pod runs on each node in a cluster.
The preceding manifest is on an allowlist to run on Autopilot. Don't modify this manifest, including the container images.
Apply this manifest to your cluster:
kubectl apply -f max-map-count-setter-ds.yaml
Review the following manifest:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
  namespace: elastic-system
spec:
  version: "8.9.0"
  volumeClaimDeletePolicy: DeleteOnScaledownOnly
  podDisruptionBudget:
    spec:
      minAvailable: 2
      selector:
        matchLabels:
          elasticsearch.k8s.elastic.co/cluster-name: elasticsearch
  nodeSets:
    - name: default
      config:
        node.roles: ["master", "data", "ingest", "ml", "remote_cluster_client"]
      podTemplate:
        metadata:
          labels:
            app.kubernetes.io/name: elasticsearch
            app.kubernetes.io/version: "8.9.0"
            app.kubernetes.io/component: "elasticsearch"
            app.kubernetes.io/part-of: "elk"
        spec:
          nodeSelector:
            cloud.google.com/compute-class: "Balanced"
          initContainers:
            - name: max-map-count-check
              command:
                - sh
                - -c
                - while true; do mmc=$(cat /proc/sys/vm/max_map_count); if test ${mmc} -eq 262144; then exit 0; fi; sleep 1; done
              resources:
                requests:
                  cpu: 10m
                  memory: 16Mi
                  ephemeral-storage: 16Mi
                limits:
                  cpu: 10m
                  memory: 16Mi
                  ephemeral-storage: 16Mi
          containers:
            - name: elasticsearch
              resources:
                requests:
                  cpu: 990m
                  memory: 4080Mi
                  ephemeral-storage: 1008Mi
                limits:
                  cpu: 1000m
                  memory: 4080Mi
                  ephemeral-storage: 1008Mi
              env:
                - name: ES_JAVA_OPTS
                  value: "-Xms2g -Xmx2g"
      count: 3
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data # Do not change this name unless you set up a volume mount for the data path.
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 2Gi
            storageClassName: standard-rwo
This manifest defines an Elasticsearch cluster with the following fields:
- initContainers: waits for the host's virtual memory kernel setting to change.
- podDisruptionBudget: specifies that the cluster won't be destroyed during the Pods' defragmentation process.
- config.node.roles: Elasticsearch node roles configuration. For more information about node roles, see Node in the Elasticsearch documentation.
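As a worked example of the podDisruptionBudget setting: with `count: 3` and `minAvailable: 2` in the manifest, at most one Pod can be voluntarily evicted at a time, assuming all three Pods are healthy. The sketch below does that subtraction; `max_voluntary_evictions` is a hypothetical helper used only for illustration.

```shell
# Number of Pods that can be voluntarily evicted at once:
# replicas minus minAvailable, never below zero.
max_voluntary_evictions() {
  replicas="$1"
  min_available="$2"
  allowed=$(( replicas - min_available ))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

# Values from the Elasticsearch manifest: count 3, minAvailable 2.
max_voluntary_evictions 3 2
```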
Apply this manifest to your cluster:
kubectl apply -f elasticsearch.yaml
Wait for the Elasticsearch cluster to be ready:
watch kubectl --namespace elastic-system get elasticsearches.elasticsearch.k8s.elastic.co
The output is similar to the following:
NAME            HEALTH   NODES   VERSION   PHASE   AGE
elasticsearch   green    3       8.8.0     Ready   5m3s
When the Elasticsearch cluster HEALTH is green and PHASE is Ready, return to the command line by pressing Ctrl+C.
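The readiness condition described above (HEALTH green and PHASE Ready) can also be tested non-interactively by parsing the table that `kubectl get` prints. This is a sketch only: `es_is_ready` is a hypothetical helper, and it assumes the column headers shown in the sample output and that every column has a value.

```shell
# Reads the `kubectl get elasticsearches...` table on stdin and succeeds
# only if the named cluster has HEALTH green and PHASE Ready.
es_is_ready() {
  awk -v name="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }  # map header names to column numbers
    $1 == name { found = 1; ok = ($(col["HEALTH"]) == "green" && $(col["PHASE"]) == "Ready") }
    END { exit !(found && ok) }
  '
}

# Sample output copied from this tutorial.
printf '%s\n' \
  'NAME HEALTH NODES VERSION PHASE AGE' \
  'elasticsearch green 3 8.8.0 Ready 5m3s' | es_is_ready elasticsearch && echo "cluster is ready"
```

Against a live cluster, the input would come from the command in this section, for example `kubectl --namespace elastic-system get elasticsearches.elasticsearch.k8s.elastic.co | es_is_ready elasticsearch`.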
Configure Kibana
Review the following manifest:
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
  namespace: elastic-system
spec:
  version: "8.9.0"
  count: 1
  elasticsearchRef:
    name: elasticsearch
    namespace: elastic-system
  http:
    tls:
      selfSignedCertificate:
        disabled: true
  config:
    server.publicBaseUrl: https://elk.BASE_DOMAIN
    xpack.reporting.kibanaServer.port: 5601
    xpack.reporting.kibanaServer.protocol: http
    xpack.reporting.kibanaServer.hostname: kibana-kb-http.elastic-system.svc
    xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-es-http.elastic-system.svc:9200"]
    xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-agent-http.elastic-system.svc:8220"]
    xpack.fleet.packages:
      - name: system
        version: latest
      - name: elastic_agent
        version: latest
      - name: fleet_server
        version: latest
      - name: kubernetes
        version: latest
    xpack.fleet.agentPolicies:
      - name: Fleet Server on ECK policy
        id: eck-fleet-server
        namespace: default
        monitoring_enabled:
          - logs
          - metrics
        unenroll_timeout: 900
        package_policies:
          - name: fleet_server-1
            id: fleet_server-1
            package:
              name: fleet_server
      - name: Elastic Agent on ECK policy
        id: eck-agent
        namespace: default
        monitoring_enabled:
          - logs
          - metrics
        unenroll_timeout: 900
        package_policies:
          - package:
              name: system
            name: system-1
          - package:
              name: kubernetes
            name: kubernetes-1
  podTemplate:
    metadata:
      labels:
        app.kubernetes.io/name: kibana
        app.kubernetes.io/version: "8.9.0"
        app.kubernetes.io/component: "ui"
        app.kubernetes.io/part-of: "elk"
    spec:
      containers:
        - name: kibana
          resources:
            requests:
              memory: 1Gi
              cpu: 500m
              ephemeral-storage: 1Gi
            limits:
              memory: 1Gi
              cpu: 500m
              ephemeral-storage: 1Gi
This manifest describes a Kibana custom resource that configures agent policies for the fleet server and agents.
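The manifest contains the placeholder BASE_DOMAIN in `server.publicBaseUrl`. One possible way to substitute your own domain, assuming the file on disk contains the literal string BASE_DOMAIN, is a stream edit. `render_base_domain` is a hypothetical helper and `example.com` is a placeholder value.

```shell
# Hypothetical helper: reads a manifest on stdin and writes it to stdout
# with every BASE_DOMAIN placeholder replaced by the given domain.
# The domain must not contain "/" or "&", which are special to sed.
render_base_domain() {
  sed "s/BASE_DOMAIN/$1/g"
}

# Example with a placeholder domain; the input line is taken from the manifest.
echo 'server.publicBaseUrl: https://elk.BASE_DOMAIN' | render_base_domain "example.com"
```

The rendered output could then be piped to `kubectl apply -f -` instead of applying the file directly; editing the file by hand works equally well.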
Apply this manifest to your cluster:
kubectl apply -f kibana.yaml
Wait for the Pods to be ready:
watch kubectl --namespace elastic-system get kibanas.kibana.k8s.elastic.co
The output is similar to the following:
NAME     HEALTH   NODES   VERSION   AGE
kibana   green    1       8.8.0     6m47s
When the Pods' HEALTH is green, return to the command line by pressing Ctrl+C.
Configure a load balancer to access Kibana
To access Kibana, create a Kubernetes Ingress object, a Google-managed certificate, a global IP address, and a DNS Zone.
Create a global external IP address:
gcloud compute addresses create "elastic-stack" --global
Create a managed zone and record set in Cloud DNS:
gcloud dns managed-zones create "elk" \
    --description="DNS Zone for Elastic Stack" \
    --dns-name="elk.BASE_DOMAIN" \
    --visibility="public"

gcloud dns record-sets create "elk.BASE_DOMAIN" \
    --rrdatas="$(gcloud compute addresses describe "elastic-stack" --global --format="value(address)")" \
    --ttl="300" \
    --type="A" \
    --zone="elk"
Delegate the DNS zone as a subdomain of the base domain by creating an NS record set with a name servers list. You can get a list of name servers using the following command:
gcloud dns record-sets describe elk.BASE_DOMAIN \
    --type="NS" \
    --zone="elk" \
    --format="value(DATA)"
Review the following manifest:
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: networking.gke.io/v1beta1
kind: FrontendConfig
metadata:
  name: elastic-stack
  namespace: elastic-system
spec:
  redirectToHttps:
    enabled: true
    responseCodeName: MOVED_PERMANENTLY_DEFAULT
---
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: elastic-stack
  namespace: elastic-system
spec:
  domains:
    - elk.BASE_DOMAIN
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana
  namespace: elastic-system
  annotations:
    networking.gke.io/managed-certificates: elastic-stack
    networking.gke.io/v1beta1.FrontendConfig: elastic-stack
    kubernetes.io/ingress.global-static-ip-name: elastic-stack
    kubernetes.io/ingress.class: gce
spec:
  defaultBackend:
    service:
      name: kibana-kb-http
      port:
        number: 5601
This manifest describes a ManagedCertificate that provisions an SSL certificate to establish the TLS connection.
Apply the manifest to your cluster:
kubectl apply -f ingress.yaml
Configure Elastic Agents
Review the following manifest:
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: fleet-server
  namespace: elastic-system
spec:
  version: 8.9.0
  kibanaRef:
    name: kibana
    namespace: elastic-system
  elasticsearchRefs:
    - name: elasticsearch
      namespace: elastic-system
  mode: fleet
  fleetServerEnabled: true
  policyID: eck-fleet-server
  deployment:
    replicas: 1
    podTemplate:
      metadata:
        labels:
          app.kubernetes.io/name: fleet-server
          app.kubernetes.io/version: "8.9.0"
          app.kubernetes.io/component: "agent"
          app.kubernetes.io/part-of: "elk"
      spec:
        containers:
          - name: agent
            resources:
              requests:
                memory: 512Mi
                cpu: 250m
                ephemeral-storage: 10Gi
              limits:
                memory: 512Mi
                cpu: 250m
                ephemeral-storage: 10Gi
        volumes:
          - name: "agent-data"
            ephemeral:
              volumeClaimTemplate:
                spec:
                  accessModes: ["ReadWriteOnce"]
                  storageClassName: "standard-rwo"
                  resources:
                    requests:
                      storage: 10Gi
        serviceAccountName: fleet-server
        automountServiceAccountToken: true
        securityContext:
          runAsUser: 0
This manifest describes an Elastic Agent that configures a fleet server with ECK.
Apply this manifest to your cluster:
kubectl apply -f fleet-server-and-agents.yaml
Wait for the Pods to be ready:
watch kubectl --namespace elastic-system get agents.agent.k8s.elastic.co
The output is similar to the following:
NAME            HEALTH   AVAILABLE   EXPECTED   VERSION   AGE
elastic-agent   green    5           5          8.8.0     14m
fleet-server    green    1           1          8.8.0     16m
When the Pods' HEALTH is green, return to the command line by pressing Ctrl+C.
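For the agents table, a scripted check can additionally compare the AVAILABLE and EXPECTED columns on every row. The following sketch assumes the column headers shown in the sample output; `all_agents_ready` is a hypothetical helper name.

```shell
# Reads the `kubectl get agents...` table on stdin and succeeds only if there is
# at least one row and every row has HEALTH green and AVAILABLE equal to EXPECTED.
all_agents_ready() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }  # map header names to column numbers
    NF == 0 { next }                                         # skip blank lines
    { rows++ }
    $(col["HEALTH"]) != "green" || $(col["AVAILABLE"]) != $(col["EXPECTED"]) { bad++ }
    END { exit !(rows > 0 && bad == 0) }
  '
}

# Sample output copied from this tutorial.
printf '%s\n' \
  'NAME HEALTH AVAILABLE EXPECTED VERSION AGE' \
  'elastic-agent green 5 5 8.8.0 14m' \
  'fleet-server green 1 1 8.8.0 16m' | all_agents_ready && echo "all agents are ready"
```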
Configure logging and monitoring
Elastic Stack can use the kube-state-metrics exporter to collect cluster-level metrics.
Install kube-state-metrics:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-state-metrics prometheus-community/kube-state-metrics --namespace elastic-system
Get the default Kibana elastic user credentials:
kubectl get secret elasticsearch-es-elastic-user -n elastic-system -o jsonpath='{.data.elastic}' | base64 -d
Open https://elk.BASE_DOMAIN in your browser and log in to Kibana with the credentials.
From the menu, select Analytics, then Dashboards.
In the search text field, enter Kubernetes overview and select Overview dashboard to see base metrics.
Some of the dashboard panels might show no data or error messages because GKE limits access to some of the control plane endpoints that Kibana uses to get cluster metrics.
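The credentials step in this section pipes the Secret value through `base64 -d` because Kubernetes stores the values under a Secret's `data` field base64-encoded. The following sketch shows the round trip with a stand-in string rather than a real password, so it runs without a cluster.

```shell
# Stand-in for the encoded value that the jsonpath query returns.
encoded="$(printf '%s' 'not-a-real-password' | base64)"

# Decoding recovers the original text, as in the kubectl pipeline.
decoded="$(printf '%s' "$encoded" | base64 -d)"
echo "$decoded"
```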
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
Delete the project
Delete a Google Cloud project:
gcloud projects delete PROJECT_ID
Delete the individual resources
If you used an existing project and you don't want to delete it, delete the individual resources.
Delete the Elastic Stack components, ECK operator, and kube-state-metrics:
kubectl --namespace elastic-system delete ingresses.networking.k8s.io kibana
kubectl --namespace elastic-system delete managedcertificates.networking.gke.io elastic-stack
kubectl --namespace elastic-system delete frontendconfigs.networking.gke.io elastic-stack
kubectl --namespace elastic-system delete agents.agent.k8s.elastic.co elastic-agent
kubectl --namespace elastic-system delete agents.agent.k8s.elastic.co fleet-server
kubectl --namespace elastic-system delete kibanas.kibana.k8s.elastic.co kibana
kubectl --namespace elastic-system delete elasticsearches.elasticsearch.k8s.elastic.co elasticsearch
kubectl --namespace elastic-system delete daemonsets.apps max-map-count-setter
kubectl --namespace elastic-system delete pvc --selector='elasticsearch.k8s.elastic.co/cluster-name=elasticsearch'
helm --namespace elastic-system uninstall kube-state-metrics
helm --namespace elastic-system uninstall elastic-operator
Delete the DNS record set, IP address, DNS managed zone, and GKE cluster:
gcloud dns record-sets delete "elk.BASE_DOMAIN" \
    --type="A" \
    --zone="elk" \
    --quiet

gcloud compute addresses delete "elastic-stack" \
    --global \
    --quiet

gcloud dns managed-zones delete "elk" --quiet

gcloud container clusters delete "elk-stack" \
    --location="us-central1" \
    --quiet
What's next
- Explore reference architectures, diagrams, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.