how to sort map values in go

Hi guys,
this is day 11 out of 100 days of go coding.

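This time it is a very short demo of Go maps and a simple sort by map values. Note that the example below will not work correctly if the values contain duplicates: the map is inverted (values become keys), so a later key with the same value overwrites the earlier one.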

package main

import (
	"fmt"
	"sort"
)

func main() {
	//init a map with string keys and int values
	var m = make(map[string]int)
	m["Sun"] = 7
	m["Sat"] = 6
	m["Fri"] = 5
	m["Thu"] = 4
	m["Mon"] = 1
	m["Tue"] = 2
	m["Wed"] = 3

	//inverted copy of m: values become keys, keys become values
	var days = make(map[int]string)
	//slice of values used for sorting
	var daykeys = make([]int, len(m))

	fmt.Print("Unsorted days of week\n")
	counter := 0
	for k, v := range m {
		fmt.Printf("%s -> %d\n", k, v)
		days[v] = k
		daykeys[counter] = v
		counter++
	}
	//sort indexes array here
	sort.Ints(daykeys)

	fmt.Print("Sorted days of week\n")
	for _, v := range daykeys {
		fmt.Printf("%s\n", days[v])
	}
}

I believe this code has a lot of room for optimization, so if you know how, please write to me in the comments or ping me on Twitter with your version.
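One possible variation, as a minimal sketch rather than a definitive answer: copy the map entries into a slice of key/value pairs and sort that slice with sort.Slice. This avoids inverting the map, so duplicate values are kept. The names pair and sortByValue, and the extra "Also Tue" entry, are made up for this illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// pair holds one map entry so entries can be sorted as a slice.
type pair struct {
	Key   string
	Value int
}

// sortByValue returns the entries of m ordered by value (ascending).
// Ties are broken by key so the result is deterministic.
func sortByValue(m map[string]int) []pair {
	pairs := make([]pair, 0, len(m))
	for k, v := range m {
		pairs = append(pairs, pair{Key: k, Value: v})
	}
	sort.Slice(pairs, func(i, j int) bool {
		if pairs[i].Value != pairs[j].Value {
			return pairs[i].Value < pairs[j].Value
		}
		return pairs[i].Key < pairs[j].Key
	})
	return pairs
}

func main() {
	// "Also Tue" shares the value 2 with "Tue" on purpose.
	m := map[string]int{"Sun": 7, "Sat": 6, "Mon": 1, "Tue": 2, "Also Tue": 2}
	for _, p := range sortByValue(m) {
		fmt.Printf("%s -> %d\n", p.Key, p.Value)
	}
}
```

Both entries with value 2 are printed, which the inverted-map approach cannot do.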

Happy coding everyone!

Example of using struct in go for beginners

Hi, this is day 9 out of 100 days of code in go.

Today I am playing with the struct type. A struct is the Go way to define custom types. Use it when the standard types don’t suit you. Enough words, let’s see code examples.

Classic example: an Employee type

// Employee - new type
type Employee struct {
	ID      int
	Name    string
	Manager *Employee //reference to itself 
}

The ID and Name fields are quite obvious. The Manager field has the same type as the struct itself, so it must be a pointer (a struct cannot contain itself by value); in our business logic it will point to the manager’s data.

Let’s create the first employee:

func main() {
	// Example of initializing new Employee
	var worker Employee
	worker.ID = 1
	worker.Name = "Petia Pyatochkin"
	PrintEmployee(worker)
}

// PrintEmployee - print data in nice format
func PrintEmployee(e Employee) {
	fmt.Printf("ID = %d\nName = %s\n", e.ID, e.Name)
}

Struct fields are accessible using dot notation.

Let’s define a manager now and improve the print function, so it will print the manager if one exists:

func main() {
	// Example of initializing new struct data
	var worker Employee
	worker.ID = 1
	worker.Name = "Petia Pyatochkin"
	PrintEmployee(worker)

	// A struct can reference its own type
	// Let's define a manager for an employee
	var manager Employee
	manager.ID = 2
	manager.Name = "Middle Level"
	// using a pointer we create a reference to the manager struct and keep the hierarchy of the organization
	worker.Manager = &manager
	PrintEmployee(worker)
}

// PrintEmployee - print data in nice format
func PrintEmployee(e Employee) {
	fmt.Printf("ID = %d\nName = %s\n", e.ID, e.Name)
	//if e.Manager is defined print manager data
	if e.Manager != nil {
		fmt.Printf("Manager of %s:\n", e.Name)
		//recursively go through managers tree
		PrintEmployee(*e.Manager)
		return
	}
	fmt.Print("----------\n\n")
}

The manager is created the same way as the first worker variable. Then, by taking a reference to the new manager variable (&manager), we assign a manager to the worker.

The PrintEmployee function has some changes too. It now checks whether the Manager reference is not nil and, if so, prints the manager’s data using recursion.

Let’s add another Employee, so we will have a 3-level management organization:

func main() {
	// Example of initializing new struct data
	var worker Employee
	worker.ID = 1
	worker.Name = "Petia Pyatochkin"
	PrintEmployee(worker)

	// A struct can reference its own type
	// Let's define a manager for an employee
	var manager Employee
	manager.ID = 2
	manager.Name = "Middle Level"
	// using a pointer we create a reference to the manager struct and keep the hierarchy of the organization
	worker.Manager = &manager
	PrintEmployee(worker)

	//define fields ID and Name using struct literals
	var cto = Employee{ID: 3, Name: "cto"}
	manager.Manager = &cto
	//should print 3 level org structure
	PrintEmployee(worker)
}

Note that the new cto variable uses a struct literal to set field values at initialization.

The output of the above will be a 3-level organization structure:

$ go run struct.go
ID = 1
Name = Petia Pyatochkin
----------

ID = 1
Name = Petia Pyatochkin
Manager of Petia Pyatochkin:
ID = 2
Name = Middle Level
----------

ID = 1
Name = Petia Pyatochkin
Manager of Petia Pyatochkin:
ID = 2
Name = Middle Level
Manager of Middle Level:
ID = 3
Name = cto
----------
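
As a variation on the example, here is a minimal sketch that builds the same hierarchy with nested struct literals and walks the Manager pointers in a loop instead of using recursion. The chain helper is hypothetical and not part of the original code.

```go
package main

import (
	"fmt"
	"strings"
)

// Employee mirrors the type used in this post.
type Employee struct {
	ID      int
	Name    string
	Manager *Employee
}

// chain returns the names from e up to the top of the hierarchy.
// (chain is a hypothetical helper added for this illustration.)
func chain(e *Employee) []string {
	var names []string
	for cur := e; cur != nil; cur = cur.Manager {
		names = append(names, cur.Name)
	}
	return names
}

func main() {
	// Nested struct literals build the whole hierarchy in one expression.
	worker := Employee{
		ID:   1,
		Name: "Petia Pyatochkin",
		Manager: &Employee{
			ID:      2,
			Name:    "Middle Level",
			Manager: &Employee{ID: 3, Name: "cto"},
		},
	}
	// prints: Petia Pyatochkin -> Middle Level -> cto
	fmt.Println(strings.Join(chain(&worker), " -> "))
}
```

The loop stops at the first nil Manager, which plays the same role as the nil check in PrintEmployee.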

That’s what you need to know about the struct type to use it efficiently in your Go programs.

Happy coding!

Source code at Github.

Transparent HTTP proxy with filter by user agent using golang

This is day 10 out of 100 Days of golang coding!

The idea for today is to build a transparent HTTP proxy with the ability to filter traffic. As the filter I will use the user agent, which is a common way to filter traffic.

So, for the task I have used two Go packages: user_agent and goproxy:

go get github.com/mssola/user_agent
go get gopkg.in/elazarl/goproxy.v1

First I have to set up a proxy and decide which hosts I want to match:

proxy := goproxy.NewProxyHttpServer()
proxy.Verbose = true
proxy.OnRequest(goproxy.ReqHostMatches(regexp.MustCompile("^.*$"))).DoFunc(
// skipped function body for now
)
log.Fatal(http.ListenAndServe(":8080", proxy))

Regex “^.*$” is set to match all hosts.

Second, I set up the user agent parser and filter by bot and browser:

func(r *http.Request, ctx *goproxy.ProxyCtx) (*http.Request, *http.Response) {
  //parse user agent string
  ua := user_agent.New(r.UserAgent())
  bro_name, _ := ua.Browser()
  if ua.Bot() || bro_name == "curl" {
    return r, goproxy.NewResponse(r,
      goproxy.ContentTypeText, http.StatusForbidden,
      "Don't waste your time!")
  }
  return r, nil
}

That’s all for the coding. Now take a look at the test cases:

Use case 1 – curl command no user agent set

Use case 2 – curl with normal browser user agent

http_proxy=http://127.0.0.1:8080 curl -i -H"User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36" http://twitter.com

In that case we got a response from the Twitter server, which was a 301 redirect to https.

Use case 3 – curl with bot user agent

http_proxy=http://127.0.0.1:8080 curl -i -H"User-Agent:Googlebot" http://twitter.com

Full version of the proxy with verbose information about user agent parsing:

package main

import (
	"fmt"
	"log"
	"net/http"
	"regexp"

	"github.com/mssola/user_agent"
	"gopkg.in/elazarl/goproxy.v1"
)

func main() {
	proxy := goproxy.NewProxyHttpServer()
	proxy.Verbose = true
	proxy.OnRequest(goproxy.ReqHostMatches(regexp.MustCompile("^.*$"))).DoFunc(
		func(r *http.Request, ctx *goproxy.ProxyCtx) (*http.Request, *http.Response) {
			//parse user agent string
			ua := user_agent.New(r.UserAgent())
			fmt.Printf("Is mobile: %v\n", ua.Mobile()) // => false
			fmt.Printf("Is bot: %v\n", ua.Bot())       // => false
			fmt.Printf("Mozilla: %v\n", ua.Mozilla())  // => "5.0"

			fmt.Printf("Platform: %v\n", ua.Platform()) // => "X11"
			fmt.Printf("OS: %v\n", ua.OS())             // => "Linux x86_64"

			nameE, versionE := ua.Engine()
			fmt.Printf("Engine: %v\n", nameE)            // => "AppleWebKit"
			fmt.Printf("Engine version: %v\n", versionE) // => "537.11"

			nameB, versionB := ua.Browser()
			fmt.Printf("Browser: %v\n", nameB)            // => "Chrome"
			fmt.Printf("Browser version: %v\n", versionB) // => "23.0.1271.97"

			if ua.Bot() || nameB == "curl" {
				return r, goproxy.NewResponse(r,
					goproxy.ContentTypeText, http.StatusForbidden,
					"Don't waste your time!")
			}
			return r, nil
		})
	log.Fatal(http.ListenAndServe(":8080", proxy))
}
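
To try the filtering decision without running a proxy, here is a standard-library-only sketch of the same idea written as plain HTTP middleware. It does not use goproxy or user_agent: the blocked function is a naive substring check that I assume for illustration, and filterUA is a hypothetical name. The requests are simulated with net/http/httptest.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// blocked reports whether a User-Agent string should be rejected.
// The substring list is a naive assumption for illustration; the post
// relies on user_agent's Bot() and Browser() for this decision instead.
func blocked(userAgent string) bool {
	ua := strings.ToLower(userAgent)
	for _, marker := range []string{"bot", "curl", "spider"} {
		if strings.Contains(ua, marker) {
			return true
		}
	}
	return false
}

// filterUA wraps next and answers 403 for blocked user agents.
func filterUA(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if blocked(r.UserAgent()) {
			http.Error(w, "Don't waste your time!", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello")
	})
	handler := filterUA(ok)

	// Simulate the three use cases with in-memory requests.
	for _, ua := range []string{"curl/7.54.0", "Mozilla/5.0 Chrome/61.0", "Googlebot"} {
		req := httptest.NewRequest("GET", "http://example.com/", nil)
		req.Header.Set("User-Agent", ua)
		rec := httptest.NewRecorder()
		handler.ServeHTTP(rec, req)
		fmt.Printf("%-25s -> %d\n", ua, rec.Code)
	}
}
```

A real user agent parser is more reliable than substring matching; the sketch only shows where the allow/deny decision sits.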

Source code available at GitHub.

Count words frequency reading standard input with go

Hi guys,
today is day 8 out of 100 days of code.

The task for today is to count word frequency by reading data from stdin. Input is scanned using a bufio.Scanner with a split by words (ScanWords).
Let’s see the code:

package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	//map to store word frequency
	words := make(map[string]int)
	//we will read from standard input
	scanner := bufio.NewScanner(os.Stdin)
	//we ask scanner to split input by words for us
	scanner.Split(bufio.ScanWords)
	count := 0
	//scan the input
	for scanner.Scan() {
		//get input token - in our case a word - and update its frequency
		words[scanner.Text()]++
		count++
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading input:", err)
	}
	fmt.Printf("Total words: %d\n", count)
	fmt.Printf("Words frequency: \n")
	//todo: sort words by values for nice print
	for k, v := range words {
		fmt.Printf("%s:%d\n", k, v)
	}
}

Results are stored in the words map. A map has no defined key order, so the bonus here is to sort the words map by word frequency (by values).
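
One way to approach that bonus, as a sketch under my own assumptions: collect the keys into a slice and sort it with sort.Slice, comparing the counts stored in the map. The function names countWords and byFrequency are hypothetical, and the order (highest count first, then alphabetical) is my choice.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"sort"
)

// countWords reads whitespace-separated words from r and counts them.
func countWords(r io.Reader) (map[string]int, error) {
	words := make(map[string]int)
	scanner := bufio.NewScanner(r)
	scanner.Split(bufio.ScanWords)
	for scanner.Scan() {
		words[scanner.Text()]++
	}
	return words, scanner.Err()
}

// byFrequency returns the words ordered by count (highest first),
// with alphabetical order as the tie-breaker.
func byFrequency(words map[string]int) []string {
	keys := make([]string, 0, len(words))
	for w := range words {
		keys = append(keys, w)
	}
	sort.Slice(keys, func(i, j int) bool {
		if words[keys[i]] != words[keys[j]] {
			return words[keys[i]] > words[keys[j]]
		}
		return keys[i] < keys[j]
	})
	return keys
}

func main() {
	words, err := countWords(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading input:", err)
		os.Exit(1)
	}
	for _, w := range byFrequency(words) {
		fmt.Printf("%s:%d\n", w, words[w])
	}
}
```

Taking an io.Reader instead of reading os.Stdin directly also makes the counting step easy to test.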

See source code at GitHub

Simple example of packages and imports in golang

This is day 7 out of 100 days of golang coding.

At the request of my friends, this post is about packages, files and imports.

Packages in Go are the same as modules and libraries in other languages. To make a package you need to decide two things: the package path and the package name.

In my example the package name is helloworld and, because it is day 7, it is inside the day7 folder. The full path looks like: github.com/vorozhko/go-tutor/day7/helloworld.
The full path is resolved based on the $GOPATH variable, which is ~/go by default but can be set to any other location, for example in your .bashrc file.

All files with the same package name in the same folder belong to one package and share private and public identifiers as if they were one single file. To make a variable or function public, its name must start with a capital letter. Let’s see an example:

hello.go

// Package helloworld - test example
package helloworld

import "fmt"

// prefix - define default prefix 
const prefix = "Hello World!"

// say - print string
func say(str string) {
	fmt.Print(str)
}

world.go

// Package helloworld - test example
package helloworld

// SayName print str text with predefined prefix
// prefix and say is visible inside helloworld package 
func SayName(str string) {
	say(prefix + str)
}

Function ‘say’ and constant ‘prefix’ are only visible inside package helloworld.
Function SayName will be visible outside of the package.
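
The capital-letter rule can be checked programmatically. This small sketch uses token.IsExported from the standard go/token package (available since Go 1.13) on the three names from the helloworld package. The isPublic wrapper is a hypothetical helper added only to give the check a name.

```go
package main

import (
	"fmt"
	"go/token"
)

// isPublic reports whether name would be visible outside its package,
// i.e. whether it starts with an upper-case letter.
// (isPublic is a hypothetical wrapper around token.IsExported.)
func isPublic(name string) bool {
	return token.IsExported(name)
}

func main() {
	// Names taken from the helloworld package above.
	for _, name := range []string{"SayName", "say", "prefix"} {
		fmt.Printf("%-8s exported: %v\n", name, isPublic(name))
	}
}
```

Only SayName is reported as exported; say and prefix stay private to the package.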

You can import the new package from any place in your workspace using the following code:

main.go

package main

import "github.com/vorozhko/go-tutor/day7/helloworld"

func main() {
	helloworld.SayName(" - test")
}

To import a package you have to use the full path to the package relative to your $GOPATH folder.
From here we can call helloworld.SayName, because it is visible outside of the package.

To manage imports in your Go files it is recommended to use the goimports tool. It automatically inserts import declarations as necessary. It is also supported by many code editors.

very basic web crawler on golang

Hi guys,

this is day 6 out of 100 days of code in Go.

The following code implements recursive fetching of internal links on a given web page.

What I want to do next is to put goroutines into action, because the fetching process would be nice to run in parallel. Any suggestions on how to do it?

package main

//todo:
// - every link must be visited only once [done]
// - keep a map of visited links [done]
// - fix links and page concatenation [done]
// - extract domain from request uri to simplify crawling [done]
// - rework how internal links are selected for crawling
// - fix how crawling settings are set like depth and maxLinks

import (
	"flag"
	"fmt"
	"io"
	"log"
	"net/http"
	"time"

	"golang.org/x/net/html"
)

//hash of visited links to prevent double visit
var visitedLinks map[string]bool
var baseURL = flag.String("url", "", "start url")

func main() {
	visitedLinks = make(map[string]bool)
	flag.Parse()

	if *baseURL == "" {
		log.Fatal("--url parameter is required")
	}

	visitedLinks[*baseURL] = false

	//set parameters for crawling
	crawl("/")
}

func crawl(link string) {
	//check if link already visited
	if visitedLinks[link] {
		return
	}
	//set link as visited
	visitedLinks[link] = true
	fmt.Printf("Crawling %s ..................\n\n", *baseURL+link)
	resp, err := http.Get(*baseURL + link)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	linkCounter := 0
	for _, href := range getLinks(resp.Body) {
		//todo: rework how links are selected
		if len(href) > 0 && string(href[0]) == "/" && // only internal links
			href != link { //skip current page
			if len(href) > 1 && href[1] == '/' { //skip external links which start with //
				continue
			}
			linkCounter++
			//fmt.Printf("Found: %s\n", href)
			crawl(href)
			time.Sleep(time.Second * 1)
		}
	}
}

//Collect all links from response body and return it as an array of strings
func getLinks(body io.Reader) []string {
	var links []string
	z := html.NewTokenizer(body)
	for {
		tt := z.Next()

		switch tt {
		case html.ErrorToken:
			return links
		case html.StartTagToken, html.EndTagToken:
			token := z.Token()
			if "a" == token.Data {
				for _, attr := range token.Attr {
					if attr.Key == "href" {
						links = append(links, attr.Val)
					}

				}
			}

		}
	}
}
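
On the question of running the fetches in parallel, here is one possible shape, offered as a sketch and not as the definitive design: start one goroutine per newly discovered link, protect the visited map with a sync.Mutex, and wait for all goroutines with a sync.WaitGroup. To keep the example runnable without network access, the page fetch is passed in as a function. The fetch parameter, crawlAll, the visited type and the in-memory site map are all hypothetical. In the real crawler, fetch would wrap http.Get and getLinks.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// visited is a set of links that is safe to use from many goroutines.
type visited struct {
	mu   sync.Mutex
	seen map[string]bool
}

// markNew records link and reports whether it was seen for the first time.
func (v *visited) markNew(link string) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.seen[link] {
		return false
	}
	v.seen[link] = true
	return true
}

// crawlAll visits start and every link reachable from it, each link once,
// and returns the visited links in sorted order.
// fetch is a hypothetical stand-in for "download the page and extract links".
func crawlAll(start string, fetch func(link string) []string) []string {
	v := &visited{seen: make(map[string]bool)}
	var wg sync.WaitGroup

	var visit func(link string)
	visit = func(link string) {
		defer wg.Done()
		for _, next := range fetch(link) {
			if v.markNew(next) {
				wg.Add(1) // registered before the goroutine starts
				go visit(next)
			}
		}
	}

	v.markNew(start)
	wg.Add(1)
	go visit(start)
	wg.Wait()

	// All goroutines are done, so the map can be read without the lock.
	links := make([]string, 0, len(v.seen))
	for l := range v.seen {
		links = append(links, l)
	}
	sort.Strings(links)
	return links
}

func main() {
	// In-memory "site" used instead of real HTTP requests.
	site := map[string][]string{
		"/":  {"/a", "/b"},
		"/a": {"/b", "/"},
		"/b": {"/c"},
	}
	fetch := func(link string) []string { return site[link] }
	fmt.Println(crawlAll("/", fetch)) // [/ /a /b /c]
}
```

This sketch drops the one-second pause between requests and puts no limit on the number of goroutines, so a real crawler would probably want a worker pool or a rate limiter on top of it.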

Source code on github

Get all links from html page with go lang

Hi guys,
this is day 5 out of 100 days of code!

Today I have coded an HTML href link parser. This is part of a web crawler project which I will post about in the following days.

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"

	"golang.org/x/net/html"
)

func main() {
	resp, err := http.Get("https://golang.org/")
	if err != nil {
		log.Fatal(err)
	}
	//close the response body when main returns
	defer resp.Body.Close()
	for _, v := range getLinks(resp.Body) {
		fmt.Println(v)
	}
}

//Collect all links from response body and return it as an array of strings
func getLinks(body io.Reader) []string {
	var links []string
	z := html.NewTokenizer(body)
	for {
		tt := z.Next()

		switch tt {
		case html.ErrorToken:
			//todo: links list shouldn't contain duplicates
			return links
		case html.StartTagToken, html.EndTagToken:
			token := z.Token()
			if "a" == token.Data {
				for _, attr := range token.Attr {
					if attr.Key == "href" {
						links = append(links, attr.Val)
					}

				}
			}

		}
	}
}
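
The todo about duplicates could be handled after getLinks returns. A minimal sketch, assuming first-seen order should be kept; uniqueLinks is a hypothetical helper and the sample links are made up:

```go
package main

import "fmt"

// uniqueLinks returns links with duplicates removed, keeping first-seen order.
// (uniqueLinks is a hypothetical helper, not part of the original code.)
func uniqueLinks(links []string) []string {
	seen := make(map[string]bool, len(links))
	var out []string
	for _, l := range links {
		if seen[l] {
			continue
		}
		seen[l] = true
		out = append(out, l)
	}
	return out
}

func main() {
	links := []string{"/doc/", "/pkg/", "/doc/", "/blog/", "/pkg/"}
	fmt.Println(uniqueLinks(links)) // [/doc/ /pkg/ /blog/]
}
```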

checking current time with go lang channels

This is day 4 out of 100 days of code.
Two code snippets today. The first is a very simple Go channel example.
It gets the current system time and posts it to a channel; on the other end the message is retrieved from the channel and printed.

package main

import (
	"fmt"
	"time"
)

func main() {
	messages := make(chan string)
	go func() {
		for {
			messages <- fmt.Sprintf("Time now %s\n", time.Now())
			time.Sleep(100 * time.Millisecond)
		}
	}()
	for {
		msg := <-messages
		fmt.Println(msg)
	}
}
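
The program above runs until it is interrupted. As a variation, this sketch sends a fixed number of messages and then closes the channel, so the receiver can use range and exit on its own. The timestamps function and its parameters are hypothetical names for this illustration.

```go
package main

import (
	"fmt"
	"time"
)

// timestamps sends n formatted timestamps on the returned channel,
// pausing for interval after each one, and then closes the channel.
func timestamps(n int, interval time.Duration) <-chan string {
	messages := make(chan string)
	go func() {
		defer close(messages) // lets the receiver's range loop finish
		for i := 0; i < n; i++ {
			messages <- fmt.Sprintf("Time now %s", time.Now().Format(time.RFC3339Nano))
			time.Sleep(interval)
		}
	}()
	return messages
}

func main() {
	// range stops when the sender closes the channel.
	for msg := range timestamps(3, 100*time.Millisecond) {
		fmt.Println(msg)
	}
}
```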

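The second code snippet is a program which checks two strings with a simplified anagram test: it only reports a match when one string is the exact reverse of the other. (A general anagram is any rearrangement of the same letters.)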

package main

import (
	"flag"
	"fmt"
)

func main() {
	str1 := flag.String("first", "", "first string for anagram check")
	str2 := flag.String("second", "", "second string for anagram check")
	flag.Parse()
	checkForAnagrams(*str1, *str2)
}

func checkForAnagrams(str1, str2 string) {
	if len(str1) != len(str2) {
		fmt.Printf("%s and %s are not anagrams", str1, str2)
		return
	}

	for i := 0; i < len(str1); i++ {
		if str1[i] != str2[len(str2)-i-1] {
			fmt.Printf("%s and %s are not anagrams", str1, str2)
			return
		}
	}
	fmt.Printf("%s and %s are anagrams", str1, str2)
}
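
For the general case, where any rearrangement of the same letters counts, one common approach is to sort the letters of both strings and compare the results. This is a sketch with my own assumptions: the comparison is case-sensitive, spaces count as letters, and sortedRunes and isAnagram are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedRunes returns the runes of s in ascending order.
func sortedRunes(s string) []rune {
	r := []rune(s)
	sort.Slice(r, func(i, j int) bool { return r[i] < r[j] })
	return r
}

// isAnagram reports whether a and b consist of exactly the same letters.
// The comparison is case-sensitive and counts spaces as letters.
func isAnagram(a, b string) bool {
	ra, rb := sortedRunes(a), sortedRunes(b)
	if len(ra) != len(rb) {
		return false
	}
	for i := range ra {
		if ra[i] != rb[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isAnagram("listen", "silent")) // true
	fmt.Println(isAnagram("abc", "cba"))       // true
	fmt.Println(isAnagram("abc", "abd"))       // false
}
```

Working on runes instead of bytes keeps the check correct for non-ASCII text.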

Recursive search through tree of files with golang

This is day 3 out of 100 days of code in go lang.

This code snippet does a recursive search through a directory tree. One improvement which can be made is to replace the custom recursive function readFiles with filepath.Walk.

Full code:

package main

import (
	"flag"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"strings"
)

var path = flag.String("path", "", "file path to search in")
var search = flag.String("search", "", "search string to look for")

func main() {
	flag.Parse()
	fi, err := os.Stat(*path)
	if err != nil {
		log.Fatal(err)
	}
	//fix path if directory
	if fi.Mode().IsDir() {
		*path = strings.TrimRight(*path, "/") + "/"
		readFiles(*path, *search)
	} else {
		log.Fatal("path must be a directory, but file was provided: ", *path)
	}
}

func readFiles(path, search string) {
	files, err := ioutil.ReadDir(path)
	if err != nil {
		log.Fatal(err)
	}
	for _, file := range files {
		fullpath := path + file.Name()
		if file.Mode().IsDir() {
			readFiles(fullpath+"/", search)
		} else if file.Mode().IsRegular() {
			searchInFile(fullpath, search)
		}
	}
}

func searchInFile(fullpath, search string) {
	data, err := ioutil.ReadFile(fullpath)
	if err != nil {
		log.Fatal(err)
	}
	//check the content type to filter out non-text files
	fileType := http.DetectContentType(data)
	if strings.Index(fileType, "text") == -1 {
		//skip all non text files
		return
	}
	for _, line := range strings.Split(string(data), "\n") {
		if strings.Index(line, search) > -1 {
			fmt.Printf("%s: %s\n", fullpath, line)
		}
	}
}
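
The improvement mentioned at the top of this post could look roughly like the sketch below. It uses filepath.WalkDir, the newer sibling of filepath.Walk (Go 1.16+), together with os.ReadFile. The matchingLines helper, the "." default for -path and the early return on an empty -search are my own assumptions, not part of the original program.

```go
package main

import (
	"flag"
	"fmt"
	"io/fs"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

// matchingLines returns the lines of data that contain search.
// Non-text content (per http.DetectContentType) yields no matches.
func matchingLines(data []byte, search string) []string {
	if !strings.Contains(http.DetectContentType(data), "text") {
		return nil
	}
	var matches []string
	for _, line := range strings.Split(string(data), "\n") {
		if strings.Contains(line, search) {
			matches = append(matches, line)
		}
	}
	return matches
}

func main() {
	root := flag.String("path", ".", "directory to search in")
	search := flag.String("search", "", "search string to look for")
	flag.Parse()
	if *search == "" {
		fmt.Println("usage: -path <dir> -search <text>")
		return
	}

	// WalkDir visits every entry under root; no manual recursion is needed.
	err := filepath.WalkDir(*root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // stop on the first unreadable entry
		}
		if !d.Type().IsRegular() {
			return nil // skip directories, symlinks and other special files
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		for _, line := range matchingLines(data, *search) {
			fmt.Printf("%s: %s\n", path, line)
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```

WalkDir builds the full path for each entry, so the manual "/" handling from the original is no longer needed.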

Source code at Github

very simple grep tool in GO – search substring in files

Hi guys,
here is day 2 of 100 days of code.

This time it is a very simple implementation of a grep tool. What it does is search for a string in the files of a specific directory and report matching lines.
What I have learned is that strings.Index is very useful for working with text and that http.DetectContentType is one way to detect binary files.

Full code:

package main

import (
	"flag"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"path/filepath"
	"strings"
)

func main() {
	path := flag.String("path", "", "file path to search in")
	search := flag.String("search", "", "search string to look for")
	flag.Parse()

	files, err := ioutil.ReadDir(*path)
	if err != nil {
		log.Fatal(err)
	}
	for _, file := range files {
		if file.Mode().IsDir() {
			//to do: traverse directories recursively
			continue
		}
		//file.Name() is only the base name, so join it with the search directory
		data, err := ioutil.ReadFile(filepath.Join(*path, file.Name()))
		if err != nil {
			log.Fatal(err)
		}
		//need to check for file type to detect binary content
		fileType := http.DetectContentType(data)
		for _, line := range strings.Split(string(data), "\n") {
			if strings.Index(line, *search) > -1 {
				if strings.Index(fileType, "text/plain") > -1 {
					fmt.Printf("%s: %s\n", file.Name(), line)
				} else {
					//best guess it is binary file
					//no need to go through all lines in binary file
					fmt.Printf("Binary file %s matches\n", file.Name())
					break
				}
			}
		}
	}
}

Source code at Github