Paulo Suzart

Functional programming and a bit of a lot.

Going back to Go (golang)

Intro

Hi people, a while ago I expressed my frustration with the Clojure ecosystem on the Clojure mailing list. I’m not going to share the link here because my post itself was confusing, and the answers even more so.

In summary, the frustration comes from too many '[ANN]' posts on the list, few of which really bring value to Clojure in the way other languages’ strong libraries and tools do.

OK, let’s move on. Back in 2011 I wrote a very simple tool called gb, just for the sake of learning Go. In the end it is quite good code, and useful if you want to build and use it. It is inspired by Apache Benchmark, with fewer features.

After that, I stopped with Go and went deep into Clojure. But destiny brought me back to Go (I may write about that later). I could write a lot about why Go, but you can find good articles and videos out there.

What am I doing?

I’m writing a kind of tracking system built with the [Go standard library](http://golang.org/pkg/), the Gorilla HTTP Toolkit, and BoltDB. I won’t share any of my code now; I hope to share the whole tool, which is about 60% done and evolving fast.

BoltDB is “pure Go key/value store” with the goals of providing “a simple, fast, and reliable database for projects that don’t require a full database server such as Postgres or MySQL”.

I came across BoltDB while learning about implementing a key/value storage. It is a very interesting topic, but I have no time - and possibly no brain - to implement my own. I then went for a simple persistent queue (like [Kestrel](https://github.com/twitter/kestrel)), but I did myself the favor of rm -rf’ing my source code folder, which had no git repository. OK, it has been superseded anyway :(

These days, @BoltDB shared an interesting benchmark with me:

# Sequentially insert 1M key/value pairs (in 1000 record batches).
$ bolt bench --count 1000000 --batch-size 1000
# Write 3.939999671s  (3.939us/op)  (253871 op/sec)
# Read  1.003326413s  (40ns/op) (25000000 op/sec)
 
---------
 
# Randomly insert 1M key/value pairs (in 1000 record batches).
$ bolt bench --count 1000000 --batch-size 1000 --write-mode rnd
# Write 56.84787703s  (56.847us/op) (17591 op/sec)
# Read  1.010560605s  (42ns/op) (23809523 op/sec)

If in the end I can reach half of this throughput after adding some logic on top of BoltDB, it will be many times more than enough.

And just so this isn’t a blog post without code, let’s take a look at a sample extracted from the real code:

type BoltStorage struct {
  DB         *bolt.DB
  writerChan chan [3]interface{} // not so agnostic, but enough for now
}

func (this *BoltStorage) writer() {
  for data := range this.writerChan {
    bucket := data[0].(string)
    keyId := data[1].(string)
    dataBytes := data[2].([]byte)
    err := this.DB.Update(func(tx *bolt.Tx) error {
      // CreateBucketIfNotExists, because CreateBucket returns an
      // error if the bucket already exists
      sessionBucket, err := tx.CreateBucketIfNotExists([]byte(bucket))
      if err != nil {
        return err
      }
      return sessionBucket.Put([]byte(keyId), dataBytes)
    })
    if err != nil {
      // TODO: Handle instead of panic
      panic(err)
    }
  }
}

func NewBoltStorage(dbPath string) *BoltStorage {
  db, err := bolt.Open(dbPath, 0666, nil)
  if err != nil {
    panic(err)
  }
  writerChan := make(chan [3]interface{})
  boltStorage := &BoltStorage{DB: db, writerChan: writerChan}

  go boltStorage.writer()
  return boltStorage
}

// somewhere else
writerChan <- [3]interface{}{"3212123", "1478812031", data}

Notice that there is a single channel consumed by a goroutine. This is the guy that will interact with BoltDB. I’m also using keys that sort in a meaningful order, so ForEach can iterate over a bucket of tracked data in that order.

Notice how simple it is to create a goroutine in a single line. And it is even easier to create a channel. Not only that, the BoltDB API is straightforward, as you can see.

BoltDB organizes data into buckets, and then keys and values. An important feature of BoltDB is transactions. This is a requirement for dealing with multiple insertions (the data itself plus any metadata stored somewhere else).

OK, I hope I can finish this project soon, so I can share more thoughts and lessons learned.

Cheers!
