Creating OpenSearch clusters is crucial for organizations aiming to harness the power of distributed search and analytics. These clusters allow businesses to efficiently store, index, and examine extensive amounts of data in real time, offering valuable insights for decision-making and operational efficiency.
A significant advantage of creating OpenSearch clusters is that they support replication and shard allocation, which ensures high availability and fault tolerance. In the event of node failures or network disruptions, data remains accessible, and cluster operations continue uninterrupted, minimizing downtime and ensuring business continuity.
Terraform can ease the process of creating OpenSearch clusters. Using it streamlines provisioning and managing infrastructure, enhances consistency and reliability, and improves collaboration and automation within your organization.
In this article, we will outline what Terraform is before showcasing a configuration guide that highlights how to create an OpenSearch cluster with Terraform.
What is Terraform?
Terraform is an open-source infrastructure as code (IaC) tool developed by HashiCorp. It enables users to outline and provision infrastructure resources using declarative configuration files, instead of manual processes or scripting. With Terraform, you can manage and automate the deployment of infrastructure components such as virtual machines, containers, networks, storage, and more across multiple cloud providers and on-premises environments.
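As a minimal sketch of what "declarative" means in practice, the configuration below describes a single S3 bucket and leaves it to Terraform to work out how to create it. The region and bucket name are placeholders chosen for illustration, not values used elsewhere in this article.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# Placeholder region; use the region your infrastructure lives in.
provider "aws" {
  region = "eu-west-1"
}

# You describe the desired end state; Terraform plans the API calls
# needed to reach it.
resource "aws_s3_bucket" "example" {
  bucket = "example-terraform-demo-bucket" # placeholder name
}
```

Running terraform plan against a file like this shows the changes Terraform would make, and terraform apply carries them out.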
Benefits of Using Terraform for OpenSearch Deployment
Using Terraform for OpenSearch deployment can streamline your configuration and offers several advantages. First, Terraform lets you define your OpenSearch deployment infrastructure as code using a declarative configuration language. This allows you to version control your infrastructure, monitor changes, and easily reproduce or scale your deployment across various environments (e.g., development, staging, production) with consistency and reliability.
Also, with Terraform, you can simply scale your OpenSearch deployment up or down depending on changing workload demands. By altering the desired capacity and configuration parameters in your Terraform code, you can dynamically provision or de-provision resources to accommodate changing traffic or data volumes.
Lastly, Terraform enables collaboration among team members by allowing them to work together on infrastructure configurations using version control systems like Git. It also supports auditability and compliance by offering visibility into infrastructure changes, approvals, and history through detailed logs and documentation. This ensures that your organization maximizes efficiency whilst adhering to compliance requirements.
How To Create OpenSearch Clusters with Terraform
Prerequisites
Before starting this configuration, make sure the following resources already exist, because the module reads them as data sources: a VPC, subnets, a Route53 hosted zone, and an ACM certificate. You will usually have the VPC and subnets configured already, but you might need to create the Route53 hosted zone and ACM certificate. Both can be added to your Terraform codebase with the aws_route53_zone and aws_acm_certificate resources.
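If the hosted zone and certificate do not exist yet, they could be created along the lines of the sketch below. It assumes a public zone for the placeholder domain example.com that is already delegated to Route53 (DNS validation will not complete otherwise), and it adds a wildcard name so the certificate also covers the custom OpenSearch subdomain created later. All names here are illustrative assumptions.

```hcl
# Placeholder domain; replace with your own hosted zone name.
resource "aws_route53_zone" "opensearch" {
  name = "example.com"
}

resource "aws_acm_certificate" "opensearch" {
  domain_name               = "example.com"
  subject_alternative_names = ["*.example.com"] # covers <service>-engine.example.com
  validation_method         = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# One validation record per domain name on the certificate.
resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.opensearch.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = aws_route53_zone.opensearch.zone_id
}

# Waits until ACM has seen the validation records and issued the certificate.
resource "aws_acm_certificate_validation" "opensearch" {
  certificate_arn         = aws_acm_certificate.opensearch.arn
  validation_record_fqdns = [for record in aws_route53_record.cert_validation : record.fqdn]
}
```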
Because this module is meant to be reusable, most attributes are defined as variables. We also created a separate file to keep all data resources.
# Define variables
variable "vpc" {
  type = string
}

variable "hosted_zone_name" {
  type = string
}

# Define data resources
data "aws_vpc" "selected" {
  tags = {
    Name = var.vpc
  }
}

data "aws_subnets" "private" {
  filter {
    name   = "tag:Name"
    values = ["private"]
  }

  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.selected.id]
  }
}

data "aws_route53_zone" "opensearch" {
  name = var.hosted_zone_name
}

data "aws_acm_certificate" "opensearch" {
  domain = var.hosted_zone_name
}

# Define AWS caller identity and region data sources
data "aws_caller_identity" "current" {}

data "aws_region" "current" {}
Resources
The resources we need to create are an OpenSearch domain, a Route53 record, a security group, CloudWatch log groups, and an SSM parameter. All the configurations will be kept in a file named main.tf.
Locals and Variables
As we're crafting an OpenSearch module, flexibility for future reuse is paramount. Therefore, the majority of specifications are parameterized as variables.
# Define Terraform version
terraform {
  required_version = "~> 1.3.3"
}

# Define variables
variable "security_options_enabled" {
  type    = bool
  default = true
}

variable "volume_type" {
  type    = string
  default = "gp2"
}

# Only meaningful for gp3 volumes, where 125 MiB/s is the baseline
variable "throughput" {
  type    = number
  default = 125
}

variable "ebs_enabled" {
  type    = bool
  default = true
}

variable "ebs_volume_size" {
  type    = number
  default = 50
}

variable "service" {
  type = string
}

# OpenSearch Service instance types carry the ".search" suffix.
# T2 types do not support encryption at rest, which this module enables.
variable "instance_type" {
  type    = string
  default = "t3.small.search"
}

variable "instance_count" {
  type    = number
  default = 3
}

variable "dedicated_master_enabled" {
  type    = bool
  default = false
}

variable "dedicated_master_count" {
  type    = number
  default = 3
}

variable "dedicated_master_type" {
  type    = string
  default = "t3.small.search"
}

variable "zone_awareness_enabled" {
  type    = bool
  default = true
}

# Prefixed with "OpenSearch_" in the domain resource, so this must be an
# OpenSearch version rather than an Elasticsearch one such as 7.10
variable "engine_version" {
  type    = string
  default = "2.3"
}

# Define locals
locals {
  domain        = "${var.service}-engine"
  custom_domain = "${local.domain}.${data.aws_route53_zone.opensearch.name}"
  # Assumes the VPC has at least var.instance_count private subnets
  subnet_ids    = slice(data.aws_subnets.private.ids, 0, var.instance_count)
  master_user   = "${var.service}-masteruser"
}
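Because the module is meant to be reused, it can help to reject obviously invalid inputs before anything is provisioned. The sketch below shows how two of the variables might be declared with validation blocks; it is an optional variation, and these declarations would replace the plain declarations of the same variables rather than sit alongside them. The list of volume types reflects the EBS types OpenSearch Service accepts at the time of writing and should be treated as an assumption to verify.

```hcl
variable "volume_type" {
  type    = string
  default = "gp2"

  validation {
    condition     = contains(["standard", "gp2", "gp3", "io1"], var.volume_type)
    error_message = "The volume_type value must be one of standard, gp2, gp3, or io1."
  }
}

variable "instance_count" {
  type    = number
  default = 3

  validation {
    # Must be a whole number of at least 1.
    condition     = var.instance_count >= 1 && var.instance_count == floor(var.instance_count)
    error_message = "The instance_count value must be a whole number of at least 1."
  }
}
```

With these in place, terraform plan fails early with the given message instead of surfacing an API error halfway through an apply.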
AWS VPN and Security Groups
We must establish a security group for the OpenSearch cluster, permitting inbound traffic solely through port 443 on TCP protocol within the VPC.
In our scenario, utilizing the OpenSearch API via Postman or accessing its dashboard via a browser was necessary. However, exposing OpenSearch to the public internet was not desirable. Hence, we've configured an AWS VPN.
An AWS VPN acts as a bridge between AWS VPCs and external networks. Therefore, being connected to the AWS VPN implies connectivity to a VPC, granting access to internal resources within that VPC.
To restrict access to our cluster exclusively via VPN, we must define an ingress rule for the OpenSearch security group, enabling inbound traffic solely from the VPN.
Add this resource to main.tf:
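# ID of the security group attached to the AWS VPN
variable "vpn_security_group_id" {
  type = string
}

resource "aws_security_group" "opensearch_security_group" {
  name        = "${local.domain}-sg"
  vpc_id      = data.aws_vpc.selected.id
  description = "Allow inbound HTTPS traffic"
}

# Rules are kept as standalone resources: mixing inline ingress blocks with
# aws_security_group_rule on the same group leads to conflicting rule settings.
resource "aws_security_group_rule" "allow_opensearch_ingress_vpc" {
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = [data.aws_vpc.selected.cidr_block]
  security_group_id = aws_security_group.opensearch_security_group.id
  description       = "HTTPS from VPC"
}

resource "aws_security_group_rule" "allow_opensearch_ingress_vpn" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  source_security_group_id = var.vpn_security_group_id
  security_group_id        = aws_security_group.opensearch_security_group.id
  description              = "Allow connections from AWS VPN"
}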
CloudWatch Logs
We're adding log_publishing_options to the OpenSearch cluster configuration, which requires one CloudWatch log group for each log type we publish: INDEX_SLOW_LOGS, SEARCH_SLOW_LOGS, and ES_APPLICATION_LOGS.
Add the log group resources to main.tf:

locals {
  log_groups = {
    index_slow_logs     = "/aws/opensearch/${local.domain}/index-slow"
    search_slow_logs    = "/aws/opensearch/${local.domain}/search-slow"
    es_application_logs = "/aws/opensearch/${local.domain}/es-application"
  }
}

resource "aws_cloudwatch_log_group" "opensearch_log_groups" {
  for_each = local.log_groups

  name              = each.value
  retention_in_days = 14
}

Add the CloudWatch log resource policy:

resource "aws_cloudwatch_log_resource_policy" "opensearch_log_resource_policy" {
  policy_name = "${local.domain}-domain-log-resource-policy"

  policy_document = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Service = "es.amazonaws.com"
        }
        Action = [
          "logs:PutLogEvents",
          "logs:PutLogEventsBatch",
          "logs:CreateLogStream"
        ]
        Resource = [
          for name, _ in local.log_groups :
          "${aws_cloudwatch_log_group.opensearch_log_groups[name].arn}:*"
        ]
        Condition = {
          StringEquals = {
            "aws:SourceAccount" = data.aws_caller_identity.current.account_id
          }
          ArnLike = {
            "aws:SourceArn" = "arn:aws:es:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:domain/${local.domain}"
          }
        }
      }
    ]
  })
}
OpenSearch Master User
We generate an OpenSearch master user name and password and store them together as an SSM parameter. The password is randomly generated during infrastructure creation and is not hardcoded. Once Terraform finishes creating the resources, you can find the password in the Parameter Store section of Systems Manager in the AWS console.
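resource "random_password" "opensearch_password" {
  length  = 32
  special = true

  # The master password must contain at least one character of each class
  min_lower   = 1
  min_upper   = 1
  min_numeric = 1
  min_special = 1
}

resource "aws_ssm_parameter" "opensearch_master_user" {
  name        = "/service/${var.service}/MASTER_USER"
  description = "OpenSearch master user name and password for ${var.service} domain"
  type        = "SecureString"
  # Stored as "<user name>,<password>"
  value       = "${local.master_user},${random_password.opensearch_password.result}"
}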
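Because the parameter holds the user name and password joined by a comma, anything that consumes it has to split the value again. The sketch below shows how a separate Terraform configuration might read the credentials back. The service name my-service and the local names are hypothetical, and the split assumes the user name itself contains no comma.

```hcl
# Hypothetical service name; the path must match the one the module wrote.
data "aws_ssm_parameter" "opensearch_master_user" {
  name = "/service/my-service/MASTER_USER"
}

locals {
  opensearch_credentials = split(",", data.aws_ssm_parameter.opensearch_master_user.value)

  # First element is the user name.
  opensearch_user = local.opensearch_credentials[0]

  # Everything after the first comma is the password; rejoining keeps it
  # intact even if the password were to contain a comma.
  opensearch_password = join(",", slice(
    local.opensearch_credentials, 1, length(local.opensearch_credentials)
  ))
}
```

Values derived from the data source stay marked as sensitive, so Terraform redacts them in plan output.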
OpenSearch Domain
resource "aws_opensearch_domain" "opensearch" {
  domain_name    = local.domain
  engine_version = "OpenSearch_${var.engine_version}"

  cluster_config {
    dedicated_master_count   = var.dedicated_master_count
    dedicated_master_type    = var.dedicated_master_type
    dedicated_master_enabled = var.dedicated_master_enabled
    instance_type            = var.instance_type
    instance_count           = var.instance_count
    zone_awareness_enabled   = var.zone_awareness_enabled

    zone_awareness_config {
      availability_zone_count = var.zone_awareness_enabled ? length(local.subnet_ids) : null
    }
  }

  advanced_security_options {
    enabled = var.security_options_enabled
    # The provider documents anonymous auth as a setting for enabling
    # fine-grained access control on an existing domain; set this to false
    # if domain creation rejects it.
    anonymous_auth_enabled         = true
    internal_user_database_enabled = true

    master_user_options {
      master_user_name     = local.master_user
      master_user_password = random_password.opensearch_password.result
    }
  }

  encrypt_at_rest {
    enabled = true
  }

  domain_endpoint_options {
    enforce_https                   = true
    tls_security_policy             = "Policy-Min-TLS-1-2-2019-07"
    custom_endpoint_enabled         = true
    custom_endpoint                 = local.custom_domain
    custom_endpoint_certificate_arn = data.aws_acm_certificate.opensearch.arn
  }

  ebs_options {
    ebs_enabled = var.ebs_enabled
    volume_size = var.ebs_volume_size
    volume_type = var.volume_type
    # Throughput only applies to gp3 volumes
    throughput  = var.volume_type == "gp3" ? var.throughput : null
  }

  dynamic "log_publishing_options" {
    for_each = local.log_groups

    content {
      cloudwatch_log_group_arn = aws_cloudwatch_log_group.opensearch_log_groups[log_publishing_options.key].arn
      # The map key, upper-cased, is the log type (for example INDEX_SLOW_LOGS);
      # the map value is the log group name, not the log type
      log_type = upper(log_publishing_options.key)
    }
  }

  node_to_node_encryption {
    enabled = true
  }

  vpc_options {
    subnet_ids         = local.subnet_ids
    security_group_ids = [aws_security_group.opensearch_security_group.id]
  }

  # Open resource policy; access is controlled by the VPC security group
  # and fine-grained access control
  access_policies = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action    = "es:*"
        Principal = "*"
        Effect    = "Allow"
        Resource  = "arn:aws:es:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:domain/${local.domain}/*"
      }
    ]
  })
}
OpenSearch Custom Subdomain with Route53
AWS generates a random endpoint hostname for the domain, which serves both the OpenSearch API and the Dashboards UI. You need to create a CNAME Route53 record that points the custom domain at that endpoint.
resource "aws_route53_record" "opensearch_domain_record" {
  zone_id = data.aws_route53_zone.opensearch.zone_id
  name    = local.custom_domain
  type    = "CNAME"
  ttl     = 300

  records = [aws_opensearch_domain.opensearch.endpoint]
}
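To make the resulting URLs easy to find after an apply, the module could expose a few outputs. This is an optional sketch: the output names are our own choice, and the Dashboards URL assumes the standard /_dashboards path that OpenSearch Service serves under the domain endpoint.

```hcl
output "opensearch_api_url" {
  description = "Custom HTTPS endpoint for the OpenSearch API"
  value       = "https://${local.custom_domain}"
}

output "opensearch_dashboards_url" {
  description = "OpenSearch Dashboards URL on the custom endpoint"
  value       = "https://${local.custom_domain}/_dashboards"
}

output "opensearch_domain_arn" {
  description = "ARN of the OpenSearch domain"
  value       = aws_opensearch_domain.opensearch.arn
}
```

Since the domain lives inside the VPC, these URLs only resolve to something reachable while you are connected through the VPN.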
Define Terraform Module Block
To use your module, define a Terraform module block and fill in the variables. Below is an example of how to do this.
module "opensearch" {
  source                   = "./path/to/your/module"
  vpc                      = "vpc_name"
  hosted_zone_name         = "your_hostedzone"
  vpn_security_group_id    = "your_vpn_security_group_id"
  engine_version           = "2.3"
  security_options_enabled = true
  volume_type              = "gp3"
  throughput               = 250
  ebs_enabled              = true
  ebs_volume_size          = 45
  service                  = local.service
  instance_type            = "m6g.large.search"
  instance_count           = 3
  dedicated_master_enabled = true
  dedicated_master_count   = 3
  dedicated_master_type    = "m6g.large.search"
  zone_awareness_enabled   = true
}
After completing the setup, you can proceed to plan and apply your Terraform configuration. Once applied, you can navigate to the AWS console to verify the status of your OpenSearch engine. The cluster should be operational and ready for use.