Site Reliability Engineer Blog

Recursive search through tree of files with golang

This is day 3 out of 100 days of code in go lang.

This code snippet does recursive search through directory tree. One improvement which can be made is to replace custom recursive function readFiles with filepath.Walk

Full code:

package main

import (
	"flag"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"strings"
)

var path = flag.String("path", "", "file path to search in")
var search = flag.String("search", "", "search string to look for")

func main() {
	flag.Parse()
	fi, err := os.Stat(*path)
	if err != nil {
		log.Fatal(err)
	}
	//fix path if directory
	if fi.Mode().IsDir() {
		*path = strings.TrimRight(*path, "/") + "/"
		readFiles(*path, *search)
	} else {
		log.Fatal("path must be a directory, but file was provided: ", *path)
	}
}

func readFiles(path, search string) {
	files, err := ioutil.ReadDir(path)
	if err != nil {
		log.Fatal(err)
	}
	for _, file := range files {
		fullpath := path + file.Name()
		if file.Mode().IsDir() {
			readFiles(fullpath+"/", search)
		} else if file.Mode().IsRegular() {
			searchInFile(fullpath, search)
		}
	}
}

func searchInFile(fullpath, search string) {
	data, err := ioutil.ReadFile(fullpath)
	if err != nil {
		log.Fatal(err)
	}
	//need to check for file type to detect filter off non-text files
	fileType := http.DetectContentType(data)
	if strings.Index(fileType, "text") == -1 {
		//skip all non text files
		return
	}
	for _, line := range strings.Split(string(data), "\n") {
		if strings.Index(line, search) > -1 {
			fmt.Printf("%s: %s\n", fullpath, line)
		}
	}
}

Source code at Github

very simple grep tool in GO – search substring in files

Hi guys,
here is day 2 of 100 days of code.

This time it is very simple implementation of grep tool. What it does it search string in specific directory and report matching lines.
What have I learned is that strings.Index is very useful to work with text and http.DetectContentType was the way to detect binary files.

Full code:

package main

import (
	"flag"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"strings"
)

func main() {
	path := flag.String("path", "", "file path to search in")
	search := flag.String("search", "", "search string to look for")
	flag.Parse()

	files, err := ioutil.ReadDir(*path)
	if err != nil {
		log.Fatal(err)
	}
	for _, file := range files {
		if file.Mode().IsDir() {
			//to do: traverse directories recursively
			continue
		}
		data, err := ioutil.ReadFile(file.Name())
		if err != nil {
			log.Fatal(err)
		}
		//need to check for file type to detect binary content
		fileType := http.DetectContentType(data)
		for _, line := range strings.Split(string(data), "\n") {
			if strings.Index(line, *search) > -1 {
				if strings.Index(fileType, "text/plain") > -1 {
					fmt.Printf("%s: %s\n", file.Name(), line)
				} else {
					//best guess it is binary file
					//no need to go through all lines in binary file
					fmt.Printf("Binary file %s matches\n", file.Name())
					break
				}
			}
		}
	}
}

Source code at Github

Simple example of golang os.Args and flag module

Hi all,
this is day 1 of 100 days I am coding one go program a day. Wish me good luck!

In this example I will show how to parse command line arguments with flag module and os.Args array.

Flag module is very powerful and can be as simple as this:

//...
import "flag"
//...
name := flag.String("user", "", "user name")
flag.Parse()
fmt.Printf("user name is %s\n", *name)

To have the “same” functionality with os.Args you have to do some work. In following example I added support for –user, -user and user=<value> arguments.

//...
import "os"
//...
var userArg string
for index, arg := range os.Args {
	pattern := "-user="
	x := strings.Index(arg, pattern)
	if x > -1 {
		userArg = arg[x+len(pattern):]
		continue
	}
	if arg == "-user" || arg == "--user" {
		userArg = os.Args[index+1]
		continue
	}
}
fmt.Printf("user name is %s", userArg)

Full code:

package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

func main() {
	name := flag.String("user", "", "user name")
	flag.Parse()
	fmt.Printf("user name is %s\n", *name)

	var userArg string
	for index, arg := range os.Args {
		pattern := "-user="
		x := strings.Index(arg, pattern)
		if x > -1 {
			userArg = arg[x+len(pattern):]
			continue
		}
		if arg == "-user" || arg == "--user" {
			userArg = os.Args[index+1]
			continue
		}
	}
	fmt.Printf("user name is %s", userArg)
}

Latest code examples see on GitHub

Continuous integration and deployment with Google Cloud Builder and Kubernetes

Pipeline of Continuous Integration(CI) for containers has several basic steps. Lets see what they are:

Setup a trigger

Listen to a change in repositories(github, bitbucket) such like pull request, new tag or new branch.

It is basic step for any CI/CD tool and for google cloud builder it is pretty trivial task to setup. Check out Container Registry – Build Triggers tool in google cloud console.

Build an image

When change to repository occur we want to start build of new Docker container image for a change. Good practice is to tag new image with branch name and git reference hash. E.g. master-00covfefe

With cloud builder you face two choices: use a Dockerfile or cloudbuild.yaml file. With Dockerfile option steps are predetermined and don’t give you too much flexibility.
With cloudbuild.yaml you can customise every step of your pipeline.
In the following example first command is doing a build step using Dockerfile and second command tag new image with branch-revision pattern(remember master-00covfefe):

steps:
- name: 'gcr.io/cloud-builders/docker'
  args: [ 'build', '-t', 'eu.gcr.io/$PROJECT_ID/my-nodejs-app', '.' ]

- name: 'gcr.io/cloud-builders/docker'
  args: [ 'tag', 'eu.gcr.io/$PROJECT_ID/my-nodejs-app', 'eu.gcr.io/$PROJECT_ID/my-nodejs-app:$BRANCH_NAME-$REVISION_ID']

Push new image to Container Registry

One important note that cloudbuild.yaml file has special directive “image” which publish image to registry, but that directive only executed at the end of all steps. So, in order to perform deployment step you need to publish image as a separate step.

- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'eu.gcr.io/$PROJECT_ID/my-nodejs-app:$BRANCH_NAME-$REVISION_ID']

Deploy new image to Kubernetes

When new image is in registry it’s time to trigger deployment step. In this example it is deployment to Kubernetes cluster.
This step require Google Cloud Builder user to have Edit permissions to kubernetes cluster. In google cloud it is a user with “@cloudbuild.gserviceaccount.com” domain. You need to give that user Edit access to kubernetes using IAM console.
Second requirement is to specify zone and cluster cloudbuild.yaml using env variables. That will tell kubectl command to which cluster to deploy.

- name: 'gcr.io/cloud-builders/kubectl'
  args: ['set', 'image', 'deployment/my-nodejs-app-deployment', 'my-nodejs-app=eu.gcr.io/$PROJECT_ID/my-nodejs-app:$BRANCH_NAME-$REVISION_ID']
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=europe-west1-d'
  - 'CLOUDSDK_CONTAINER_CLUSTER=staging-cluster'

What next

At this point the CI/CD job is done. Possible next steps to improve your pipeline can be:

Send notification to Slack or Hipchat to let everyone know about new version deployment.
Run user acceptance tests to check that all functions perform well.
Run load tests and stress tests to check that new version has no degradation in performance.

Full cloudbuild.yaml file example

steps:
#build steps
- name: 'gcr.io/cloud-builders/docker'
  args: [ 'build', '-t', 'eu.gcr.io/$PROJECT_ID/my-nodejs-app', '.' ]

- name: 'gcr.io/cloud-builders/docker'
  args: [ 'tag', 'eu.gcr.io/$PROJECT_ID/my-nodejs-app', 'eu.gcr.io/$PROJECT_ID/my-nodejs-app:$BRANCH_NAME-$REVISION_ID']

- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'eu.gcr.io/$PROJECT_ID/my-nodejs-app:$BRANCH_NAME-$REVISION_ID']

#deployment step
- name: 'gcr.io/cloud-builders/kubectl'
  args: ['set', 'image', 'deployment/my-nodejs-app-deployment', 'my-nodejs-app=eu.gcr.io/$PROJECT_ID/my-nodejs-app:$BRANCH_NAME-$REVISION_ID']
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=europe-west1-d'
  - 'CLOUDSDK_CONTAINER_CLUSTER=staging-cluster'

#image update steps(two tags: latest and branch-revision)
images:
- 'eu.gcr.io/$PROJECT_ID/my-nodejs-app'
- 'eu.gcr.io/$PROJECT_ID/my-nodejs-app:$BRANCH_NAME-$REVISION_ID'

#tags for container builder
tags:
  - "frontend"
  - "nodejs"
  - "dev-team-1"

Undocumented locust load API calls

Hey folks,

today I found that locust load do support API calls, so you can grab statistics and manage it’s execution.

If you like me hear about it first time here is few examples:

Start locust task

curl -XPOST $LOCUST_URL/swarm -d"locust_count=100&hatch_rate=10"

Get stats

curl $LOCUST_URL/stats/requests

Stop locust task

curl $LOCUST_URL/stop

All requests API endpoints can be found in locust source code on github

Another useful feature which is worth to mention is locust extension ability.
Using it you can add your own routes to API

from locust import web

@web.app.route("/added_page")
def my_added_page():
    return "Another page"

Unlike API calls extensions is documented.

Happy coding everyone!

Prepare Application Launch Checklist

Introduction

Application Launch checklist is aimed for Devops, Sysops and anyone whois job to make website available and reliable.
The checklist better works for applications which are going to be Live in near feature, but also useful to validate your Devops processes for already running applications.

This checklist is a complied notes from Launch Checklist for Google Cloud Platform. It is mostly targeted on Devops work routines and in a nutshell explain first and necessary Devops steps into launching applications.

Software Architecture Documentation

Create an Architectural Summary. Include an overall architectural diagram, a summary of the process flows, detail the service interaction points.
List and describe how each service is used. Include use of any 3rd-party APIs.
Make it easy accessible and available – the best as wiki pages.

Builds and Releases

Document your build and release, configuration, and security management processes.
Automate build process. Include automated testing and packaging.
Automate release process to provision package between environments. Include rollback functionality.
Version your configuration and put it into Configuration Management system like Saltstack, Puppet or Ansible.
Simulate build and release failures. Are you able to roll back effectively? Is the process documented?

Disaster recovery

Document your routine backup, regular maintenance, and disaster recovery processes.
Test your restore process with real data. Determine time required for a full restore and reflect this in the disaster recovery processes.
Automate as much as possible.
Simulate major outages and test your Disaster Recovery processes
Simulate individual services failure to test your incidents recovery process

Monitoring

Document and define your system monitoring and alerting processes.
Validate that your system monitoring and alerting are sufficient and effective.

Final thoughts

I can not overstate how much final outcome depend on the level of interaction between Developers, Sysops and Devops teams in your organisation.

After application will go live convert it to a training program for every new devop before she will start site support.

Making simple Splunk Nginx dashboard

As a DevOps guy I often do incident analysis, post deployment monitoring and usual logs checks. If you also is using Splunk as me when let me show for you few effective Splunk commands for Nginx logs monitoring.

Extract fileds

To make commands works Nginx log fields have to be extracted into variables.
Where are 2 ways to extract fields:

By default Splunk recognise “access_combined” log format which is default format for Nginx. If it is your case congratulations nothing to do for you!
For custom format of logs you will need to create regular expression. Splunk has built in user interface to extract fields or you can provide regular expression manually.

Website traffic over time and error rate

Unexpected spike in traffic or in error rate are always first thing to look for. Following command build a time chart with response codes. Codes 200/300 is your normal traffic and 400/500 is errors.

timechart count(status) span=1m by status

Response time

How do you know if your website running slowly?
For response time I suggest to use 20, 85 and 95 percentile as metrics.
You also can think of average response time metric, but low average response time doesn’t show that website is OK, so I am not using that metric in the query.

timechart perc20(request_time), perc85(request_time), perc95(request_time) span=1m

Traffic by IP

Checking which IPs are most popular is a good way to spot bad guys or misbehaving bot.

top limit=20 clientip

Top of error page

Looking for pages which produce most errors like 500 Internal Server Error or not found pages like 404? Following two queries give you exactly that information.
Top error pages

search status >= 500 | stats count(status) as cnt by uri, status | sort cnt desc

Top 40x error pages

search status >= 400 AND status < 500 | stats count(status) by uri, status | sort cnt desc

Number of timeouts(>30s) per upstream

When you are using Nginx as a proxy server it is very useful to see if any of upstreams are getting timeouts.
Timeouts could be a symptom for: slow application performance, not enough system resources or just upstream server is down.

search upstream_response_time >= 30 | stats count(upstream_response_time) as upstreams by upstream

Most time consuming upstreams

Most time consuming upstreams showing which of servers are already overloaded by requests and giving you a hint when application needs to be scaled

stats sum(upstream_response_time), count(upstream) by upstream

In conclusion

Splunk functions like timechart, stats and top is your best friends for data aggregation. They are like unix tools – the more tools you know the more easier is to build powerful commands.