\$\begingroup\$

I'm learning Go. I wrote a simple log parser for SLF4J in Python some time ago and tried to port it to Go as an exercise. The algorithm is identical, but the Go solution isn't quite as fast (the Python solution is about 1.5 times faster for large enough log files). Is there any way to get more speed out of it? I do not care which one is the fastest, but I want to know whether I could do better with Go. Any inspiration is welcome.

A typical logline looks like

[INFO 05:00:07] CoolServiceImpl.executeReminderCC(601) | ==================Finish transaction with id (REMINDER, 1394424006830)

Others include stack traces, which span multiple lines.

My code:

package main

import (
	"bufio"
	"flag"
	"fmt"
	"io"
	"log"
	"os"
	"regexp"
	"strings"
)

func isStampRelevant(level, timestamp string) bool {
	return level == "ERROR" && timestamp > "10:00:00" && timestamp < "10:10:00"
}

func isMessageRelevant(lineBuffer string) bool {
	return strings.Contains(lineBuffer, "4020829010703")
}

func readFile(filename string) {
	lineStart := regexp.MustCompile("(TRACE|DEBUG|INFO|WARN|ERROR|FATAL).*?(\\d{2}:\\d{2}:\\d{2})")
	lineBuffer := ""
	fileHandle, err := os.Open(filename)
	if err != nil {
		log.Fatal(err)
	}
	buffer := bufio.NewReader(fileHandle)
	for {
		line, isPrefix, err := buffer.ReadLine()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		if isPrefix {
			log.Fatal("Error: Unexpected long line reading", fileHandle.Name())
		}
		currentLine := string(line)
		matches := lineStart.FindStringSubmatch(currentLine)
		if len(matches) == 3 {
			level := matches[1]
			timestamp := matches[2]
			if len(lineBuffer) > 0 {
				if isMessageRelevant(lineBuffer) {
					fmt.Println(lineBuffer)
				}
			}
			if isStampRelevant(level, timestamp) {
				lineBuffer = currentLine
			} else {
				lineBuffer = ""
			}
		} else if len(lineBuffer) > 0 {
			lineBuffer += currentLine + "\n"
		}
	}
	if len(lineBuffer) > 0 {
		if isMessageRelevant(lineBuffer) {
			fmt.Println(lineBuffer)
		}
	}
}

func printUsage() {
	fmt.Fprintf(os.Stderr, "usage: %s [inputfile]\n", os.Args[0])
	flag.PrintDefaults()
	os.Exit(2)
}

func main() {
	if len(os.Args) < 2 {
		printUsage()
	}
	filename := os.Args[1]
	readFile(filename)
}
asked Apr 18, 2014 at 7:56
\$\endgroup\$

1 Answer

\$\begingroup\$

You should use bufio.Scanner rather than a bufio.Reader. It won't make your code any faster, but it'll simplify things a bit.

One reason your code is slow is that you're converting every line from []byte to string, which allocates and copies for each line. You can avoid this by working with the raw bytes from scanner.Bytes() until you actually need to print the entry. The regexp package provides functions for matching against []byte, so the translation is rather easy.

A secondary improvement would be to change your regexp to:

`^\[(TRACE|DEBUG|INFO|WARN|ERROR|FATAL) (\d{2}:\d{2}:\d{2})\]`

By anchoring your regexp and tightening the syntax it recognizes, you'll avoid scanning the entire log line when the regexp doesn't match. Using a raw string (delimited by backticks) means you don't have to escape the backslashes. This probably won't be a big speed improvement though, since most of your log entries should be single lines.
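A small sketch of the anchored pattern in use, matching against []byte. parseHeader is a hypothetical helper name, and the stack-trace line is an invented sample; only the pattern and the first log line come from this page.

```go
package main

import (
	"fmt"
	"regexp"
)

// entryStart only matches at the very start of a line, so a line that
// does not begin with "[LEVEL hh:mm:ss]" is rejected immediately.
var entryStart = regexp.MustCompile(`^\[(TRACE|DEBUG|INFO|WARN|ERROR|FATAL) (\d{2}:\d{2}:\d{2})\]`)

// parseHeader (hypothetical) returns the level and timestamp if line
// starts a new log entry. The returned slices alias line.
func parseHeader(line []byte) (level, timestamp []byte, ok bool) {
	m := entryStart.FindSubmatch(line)
	if m == nil {
		return nil, nil, false
	}
	return m[1], m[2], true
}

func main() {
	lines := []string{
		"[INFO 05:00:07] CoolServiceImpl.executeReminderCC(601) | ==================Finish transaction with id (REMINDER, 1394424006830)",
		"\tat com.example.Foo.bar(Foo.java:42)", // invented continuation line
	}
	for _, line := range lines {
		level, ts, ok := parseHeader([]byte(line))
		fmt.Printf("%q %q %v\n", level, ts, ok)
	}
}
```

A side effect of the tighter pattern is that a continuation line that merely contains a level name and a time somewhere in its text is no longer mistaken for the start of a new entry.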

In terms of structuring the code, I'd probably separate out the part that parses the logs from the part that selects log entries and prints them out.

Perhaps something like this:

type LogReader struct {
	s *bufio.Scanner
	// ... some internal state
}

type LogEntry struct {
	level     []byte
	timestamp []byte
	lines     [][]byte
}

func (lr *LogReader) Next() (LogEntry, error) {
	// ... code here
}

func (lr *LogReader) Done() bool {
	// ... code here
}

This code should deal entirely in []byte and not string, for the reasons given before.
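As an illustration of staying in []byte, the two predicates from the question could be ported roughly like this. The constants are the ones from the question; having isMessageRelevant take only the entry's lines is my assumption about the signature.

```go
package main

import (
	"bytes"
	"fmt"
)

// Converted once at startup instead of once per line.
var (
	levelError = []byte("ERROR")
	from       = []byte("10:00:00")
	to         = []byte("10:10:00")
	needle     = []byte("4020829010703")
)

// isStampRelevant mirrors the string version: bytes.Compare orders
// lexicographically, which works for fixed-width hh:mm:ss timestamps.
func isStampRelevant(level, timestamp []byte) bool {
	return bytes.Equal(level, levelError) &&
		bytes.Compare(timestamp, from) > 0 &&
		bytes.Compare(timestamp, to) < 0
}

// isMessageRelevant reports whether any line of the entry contains the
// search term.
func isMessageRelevant(lines [][]byte) bool {
	for _, l := range lines {
		if bytes.Contains(l, needle) {
			return true
		}
	}
	return false
}

func main() {
	lines := [][]byte{[]byte("[ERROR 10:05:00] A.c(2) | id 4020829010703 failed")}
	fmt.Println(isStampRelevant([]byte("ERROR"), []byte("10:05:00")), isMessageRelevant(lines))
}
```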

With this separation, the main code is much easier to read, and you have a nice LogReader abstraction that you can write unit tests for. Here's the main loop using a LogReader:

lr := NewLogReader(filename)
for !lr.Done() {
	e, err := lr.Next()
	if err != nil {
		// ... handle errors
	}
	if isStampRelevant(e.level, e.timestamp) && isMessageRelevant(e.level, e.lines) {
		fmt.Println(string(bytes.Join(e.lines, []byte("\n"))))
	}
}
answered Apr 21, 2014 at 10:27
\$\endgroup\$