User guide

Getting started with AWS

In this getting started guide, we walk through how to initialise Tarmak with a new Provider (AWS) and a new Environment, and then provision a Kubernetes cluster. The cluster will comprise Kubernetes master and worker nodes, etcd clusters, Vault and a bastion node with a public IP address (see Architecture overview for details of cluster components).

Prerequisites

Initialise configuration

Simply run tarmak init to initialise configuration for the first time. You will be prompted for the necessary configuration to set up a new Provider (AWS) and Environment. The list below describes the questions you will be asked.
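
For example (output omitted, as with the other commands in this guide):

% tarmak init
<output omitted>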

Note

If you are not using Vault’s AWS secret backend, you can authenticate with AWS in the same way as the AWS CLI. More details can be found at Configuring the AWS CLI.

  • Configuring a new Provider
    • Provider name: must be unique
    • Cloud: Amazon (AWS) is the default and only option for now (more clouds to come)
    • Credentials: Amazon CLI auth (i.e. env variables/profile) or Vault (optional)
    • Name prefix: for state buckets and DynamoDB tables
    • Public DNS zone: will be created if not already existing, must be delegated from the root
  • Configuring a new Environment
    • Environment name: must be unique
    • Project name: used for AWS resource labels
    • Project administrator email address
    • Cloud region: pick a region fetched from AWS (using Provider credentials)
  • Configuring new Cluster(s)
    • Single or multi-cluster environment
    • Cloud availability zone(s): pick zone(s) fetched from AWS

Once initialised, the configuration will be created at $HOME/.tarmak/tarmak.yaml (default).

Create an AMI

Next we create an AMI for this environment by running tarmak clusters images build (this is the step that requires Docker to be installed locally).

% tarmak clusters images build
<output omitted>

Create the cluster

To create the cluster, run tarmak clusters apply.

% tarmak clusters apply
<output omitted>

Warning

The first time this command is run, Tarmak will create a hosted zone and then fail with the following error.

* failed verifying delegation of public zone 5 times, make sure the zone k8s.jetstack.io is delegated to nameservers [ns-100.awsdns-12.com ns-1283.awsdns-32.org ns-1638.awsdns-12.co.uk ns-842.awsdns-41.net]

When creating a multi-cluster environment, the hub cluster must be applied first. To change the current cluster, use the --current-cluster flag. See tarmak cluster help for more information.

You should now change the nameservers of your domain to the four listed in the error. If you only wish to delegate a subdomain containing your zone to AWS without delegating the parent domain, see Creating a Subdomain That Uses Amazon Route 53 as the DNS Service without Migrating the Parent Domain.
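
To check that the delegation has propagated, you can query the zone's NS records with dig (the zone name here is taken from the example error above):

% dig +short NS k8s.jetstack.io
<output omitted>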

To complete the cluster provisioning, run tarmak clusters apply once again.

Note

This process may take 30-60 minutes to complete. You can stop it by sending the signal SIGTERM or SIGINT (Ctrl-C) to the process. Tarmak will not exit immediately. It will wait for the currently running step to finish and then exit. You can complete the process by re-running the command.

Destroy the cluster

To destroy the cluster, run tarmak clusters destroy.

% tarmak clusters destroy
<output omitted>

Note

This process may take 30-60 minutes to complete. You can stop it by sending the signal SIGTERM or SIGINT (Ctrl-C) to the process. Tarmak will not exit immediately. It will wait for the currently running step to finish and then exit. You can complete the process by re-running the command.

Configuration Options

After generating your tarmak.yaml configuration file there are a number of options you can set that are not exposed via tarmak init.

Pod Security Policy

Note: For cluster versions greater than 1.8.0 this is applied by default. For cluster versions before 1.6.0 it is not applied.

To enable Pod Security Policy for an environment, include the following in the configuration file under the kubernetes field of that environment:

kubernetes:
    podSecurityPolicy:
        enabled: true

The configuration file can be found at $HOME/.tarmak/tarmak.yaml (default). The Pod Security Policy manifests can be found within the tarmak directory at puppet/modules/kubernetes/templates/pod-security-policy.yaml.erb

Cluster Autoscaler

Tarmak supports deploying Cluster Autoscaler when spinning up a Kubernetes cluster to autoscale worker instance pools. The following tarmak.yaml snippet shows how you would enable Cluster Autoscaler.

kubernetes:
  clusterAutoscaler:
    enabled: true
...

The above configuration would deploy Cluster Autoscaler with an image of gcr.io/google_containers/cluster-autoscaler using the recommended version based on the version of your Kubernetes cluster. The configuration block accepts two optional fields, image and version, allowing you to change these defaults. Note that the final image tag used when deploying Cluster Autoscaler will be the configured version prepended with the letter v.
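
As a sketch of overriding those defaults (the image repository is the default mentioned above; the version shown is purely illustrative):

kubernetes:
  clusterAutoscaler:
    enabled: true
    # both fields are optional; values below are illustrative
    image: gcr.io/google_containers/cluster-autoscaler
    version: 1.3.9
...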

The current implementation will configure the first instance pool of type worker in your cluster configuration to scale between minCount and maxCount. We plan to add support for an arbitrary number of worker instance pools.
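
For reference, an illustrative worker instance pool with its scaling bounds (field names follow the instance pool examples later in this guide; values are examples):

- image: centos-puppet-agent
  maxCount: 9
  metadata:
    name: worker
  minCount: 3
  size: medium
  type: worker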

Logging

Each Kubernetes cluster can be configured with a number of logging sinks. The only sink currently supported is Elasticsearch. An example configuration is shown below:

apiVersion: api.tarmak.io/v1alpha1
kind: Config
clusters:
- name: cluster
  loggingSinks:
  - types:
    - application
    - platform
    elasticsearch:
      host: example.amazonaws.com
      port: 443
      logstashPrefix: test
      tls: true
      tlsVerify: false
      httpBasicAuth:
        username: administrator
        password: mypassword
  - types:
    - all
    elasticsearch:
      host: example2.amazonaws.com
      port: 443
      tls: true
      amazonESProxy:
        port: 9200
...

A full list of the configuration parameters is shown below:

  • General configuration parameters

    • types - the types of logs to ship. The accepted values are:

      • platform (kernel, systemd and platform namespace logs)
      • application (all other namespaces)
      • audit (apiserver audit logs)
      • all
  • Elasticsearch configuration parameters
    • host - IP address or hostname of the target Elasticsearch instance

    • port - TCP port of the target Elasticsearch instance

    • logstashPrefix - Shipped logs are in a Logstash compatible format. This field specifies the Logstash index prefix

    • tls - enable or disable TLS support

    • tlsVerify - force certificate validation (only valid when not using the AWS ES Proxy)

    • tlsCA - Custom CA certificate for Elasticsearch instance (only valid when not using the AWS ES Proxy)

    • httpBasicAuth - configure basic auth (only valid when not using the AWS ES Proxy)

      • username
      • password
    • amazonESProxy - configure AWS ES Proxy

      • port - Port to listen on (a free port will be chosen for you if omitted)

Setting up an AWS hosted Elasticsearch Cluster

AWS provides a hosted Elasticsearch cluster that can be used for log aggregation. This snippet will set up an Elasticsearch domain in your account and create a policy along with it that allows shipping of logs into the cluster:

variable "name" {
  default = "tarmak-logs"
}

variable "region" {
  default = "eu-west-1"
}

provider "aws" {
  region = "${var.region}"
}

data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "es" {
  statement {
    actions = [
      "es:*",
    ]

    principals {
      type = "AWS"

      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root",
      ]
    }
  }
}

resource "aws_elasticsearch_domain" "es" {
  domain_name           = "${var.name}"
  elasticsearch_version = "6.2"

  cluster_config {
    instance_type = "t2.medium.elasticsearch"
  }

  ebs_options {
    ebs_enabled = true
    volume_type = "gp2"
    volume_size = 30
  }

  access_policies = "${data.aws_iam_policy_document.es.json}"
}

data "aws_iam_policy_document" "es_shipping" {
  statement {
    actions = [
      "es:ESHttpHead",
      "es:ESHttpPost",
      "es:ESHttpGet",
    ]

    resources = [
      "arn:aws:es:${var.region}:${data.aws_caller_identity.current.account_id}:domain/${var.name}/*",
    ]
  }
}

resource "aws_iam_policy" "es_shipping" {
  name        = "${var.name}-shipping"
  description = "Allows shipping of logs to elasticsearch"

  policy = "${data.aws_iam_policy_document.es_shipping.json}"
}

output "elasticsearch_endpoint" {
  value = "${aws_elasticsearch_domain.es.endpoint}"
}

output "elasticsearch_shipping_policy_arn" {
  value = "${aws_iam_policy.es_shipping.arn}"
}
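
To apply this snippet, run Terraform in the directory containing it (assuming Terraform is installed and your AWS credentials are available in the environment):

% terraform init
<output omitted>
% terraform apply
<output omitted>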

Once Terraform has run successfully, it will output the resulting AWS Elasticsearch endpoint and the ARN of the policy that allows shipping to it:

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Outputs:

elasticsearch_endpoint = search-tarmak-logs-xyz.eu-west-1.es.amazonaws.com
elasticsearch_shipping_policy_arn = arn:aws:iam::1234:policy/tarmak-logs-shipping

Both of those outputs can then be used in the tarmak configuration:

apiVersion: api.tarmak.io/v1alpha1
clusters:
- name: cluster
  loggingSinks:
  - types: ["all"]
    elasticsearch:
      host: ${elasticsearch_endpoint}
      tls: true
      amazonESProxy: {}
  amazon:
    additionalIAMPolicies:
    - ${elasticsearch_shipping_policy_arn}

OIDC Authentication

Tarmak supports authentication using OIDC. The following snippet demonstrates how you would configure OIDC authentication in tarmak.yaml. For details on the configuration options, see the Kubernetes documentation. Note that if the version of your cluster is less than 1.10.0, the signingAlgs parameter is ignored.

kubernetes:
    apiServer:
        oidc:
            clientID: 1a2b3c4d5e6f7g8h
            groupsClaim: groups
            groupsPrefix: "oidc:"
            issuerURL: https://domain/application-server
            signingAlgs:
            - RS256
            usernameClaim: preferred_username
            usernamePrefix: "oidc:"
...

For the above setup, ID tokens presented to the apiserver will need to contain claims called preferred_username and groups representing the username and groups associated with the client. These values will then be prepended with oidc: before authorisation rules are applied, so it is important that this is taken into account when configuring cluster authorisation.
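
For example, a ClusterRoleBinding that grants the built-in view role to an OIDC group must use the prefixed group name (the group name and binding below are illustrative, not part of the Tarmak configuration):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: oidc-developers-view
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
# the "oidc:" prefix configured above is applied before authorisation
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: "oidc:developers"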

Jenkins

You can install Jenkins as part of your hub by adding an extra instance pool to it. This instance pool can be extended with the annotation tarmak.io/jenkins-certificate-arn, whose value is an ARN pointing to an Amazon certificate. When you set this annotation, Jenkins will be secured with HTTPS. Make sure your SSL certificate is valid for jenkins.<environment>.<zone>.

- image: centos-puppet-agent
  maxCount: 1
  metadata:
    annotations:
      tarmak.io/jenkins-certificate-arn: "arn:aws:acm:eu-west-1:228615251467:certificate/81e0c595-f5ad-40b2-8062-683b215bedcf"
    creationTimestamp: null
    name: jenkins
  minCount: 1
  size: large
  type: jenkins
  volumes:
  - metadata:
      creationTimestamp: null
      name: root
    size: 16Gi
    type: ssd
  - metadata:
      creationTimestamp: null
      name: data
    size: 16Gi
    type: ssd
...

Dashboard

Tarmak supports deploying Kubernetes Dashboard when spinning up a Kubernetes cluster. The following tarmak.yaml snippet shows how you would enable Kubernetes Dashboard.

kubernetes:
  dashboard:
    enabled: true
...

The above configuration would deploy Kubernetes Dashboard with an image of gcr.io/google_containers/kubernetes-dashboard-amd64 using the recommended version based on the version of your Kubernetes cluster. The configuration block accepts two optional fields, image and version, allowing you to change these defaults. Note that the final image tag used when deploying Kubernetes Dashboard will be the configured version prepended with the letter v.

Warning

Before Dashboard version 1.7, when RBAC is enabled (from Kubernetes version 1.6) cluster-wide cluster-admin privileges are granted to Dashboard. From Dashboard version 1.7, only minimal privileges are granted that allow Dashboard to work. See Dashboard’s access control documentation for more details.

Tiller

Tarmak supports deploying Tiller, the server-side component of Helm, when spinning up a Kubernetes cluster. Tiller is configured to listen on localhost only which prevents arbitrary Pods in the cluster connecting to its unauthenticated endpoint. Helm clients can still talk to Tiller by port forwarding through the Kubernetes API Server. The following tarmak.yaml snippet shows how you would enable Tiller.

kubernetes:
  tiller:
    enabled: true
    version: 2.9.1
...

The above configuration would deploy version 2.9.1 of Tiller with an image of gcr.io/kubernetes-helm/tiller. The configuration block accepts two optional fields, image and version, allowing you to change these defaults. Note that the final image tag used when deploying Tiller will be the configured version prepended with the letter v. The version is particularly important when deploying Tiller since its minor version must match the minor version of any Helm clients.
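
For example, assuming the helm client is installed locally and your kubeconfig points at the cluster, you can check that the client and Tiller minor versions match:

% helm version
<output omitted>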

Warning

Tiller is deployed with the cluster-admin ClusterRole bound to its service account and therefore has far reaching privileges. Helm’s security best practices should also be considered.

Prometheus

By default, Tarmak deploys Prometheus and some exporters into the monitoring namespace. Prometheus can be disabled altogether with the following config:

kubernetes:
  prometheus:
    enabled: false

Another possibility is to use the Tarmak-provisioned Prometheus only for scraping exporters on instances that are not part of the Kubernetes cluster. Using federation, those metrics can then be integrated into an existing Prometheus deployment. To get this behaviour, set the configuration as follows:

kubernetes:
  prometheus:
    enabled: true
    externalScrapeTargetsOnly: true
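
A minimal federation job for the existing Prometheus might look like the following sketch; the target address and match expression are illustrative and depend on how you expose the Tarmak-provisioned Prometheus:

scrape_configs:
# pull all federated series from the Tarmak-provisioned Prometheus
- job_name: 'federate-tarmak'
  honor_labels: true
  metrics_path: '/federate'
  params:
    'match[]':
    - '{job!=""}'
  static_configs:
  - targets:
    - 'tarmak-prometheus.example.com:9090'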

API Server

It is possible to let Tarmak create a public endpoint for your API server. This can be used together with Secure public endpoints.

kubernetes:
  apiServer:
    public: true

Additional IAM policies

Additional IAM policies can be added by specifying their ARNs in the tarmak.yaml config. You can add additional IAM policies at both the cluster and instance pool level. When you define additional IAM policies at both levels, they are merged when applied to a specific instance pool.

Cluster

You can add additional IAM policies that will be applied to all the instance pools of the whole cluster.

apiVersion: api.tarmak.io/v1alpha1
clusters:
- amazon:
    additionalIAMPolicies:
    - "arn:aws:iam::xxxxxxx:policy/policy_name"

Instance pool

It is possible to add extra policies to only a specific instance pool.

- image: centos-puppet-agent
  amazon:
    additionalIAMPolicies:
    - "arn:aws:iam::xxxxxxx:policy/policy_name"
  maxCount: 3
  metadata:
    name: worker
  minCount: 3
  size: medium
  subnets:
  - metadata:
    zone: eu-west-1a
  - metadata:
    zone: eu-west-1b
  - metadata:
    zone: eu-west-1c
  type: worker

Node Taints & Labels

You might have added additional instance pools for a specific workload. In these cases it can be useful to label and/or taint the nodes in that instance pool.

You can add labels and taints in the tarmak.yaml like this:

- image: centos-puppet-agent
  maxCount: 3
  metadata:
    name: worker
  minCount: 3
  size: medium
  type: worker
  labels:
  - key: "ssd"
    value: "true"
  taints:
  - key: "gpu"
    value: "gtx1170"
    effect: "NoSchedule"

Note that these are only applied when the node is first registered. Changes to these values will not remove taints and labels from nodes that are already registered.
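
For example, a Pod that targets the nodes above would need a matching nodeSelector and toleration. The following is a sketch using the label and taint keys from the snippet above; the Pod itself is illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  # schedule only onto nodes labelled ssd=true
  nodeSelector:
    ssd: "true"
  # tolerate the gpu=gtx1170:NoSchedule taint set above
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "gtx1170"
    effect: "NoSchedule"
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]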

Cluster Services

Grafana

Grafana is deployed as part of Tarmak. You can access Grafana through a Kubernetes cluster service. Follow these steps to access it:

  1. Create a proxy:

     tarmak kubectl proxy

  2. In the browser, go to:

     http://127.0.0.1:8001/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy/