Below is my client code which stream all the customer URLs from a golang grpc server. It takes Request
input parameter and streams customer URLs based on matching a particular clientId
. In my this code, I am streaming all customer URLs for ClientId 12345 and it works fine.
I am also creating an XML file with all the URLs in it for particular clientId
as shown below. I need to split an XML file into multiple XML files for same clientId
to ensure it conforms to these requirements:
- A single XML file should not be more than 50MB max. It can be approximate, doesn't have to be accurate.
- A single XML file should not have more than 50K URLs max.
I know it's weird that 50k URL limit will be reached sooner than 50MB limit but this is the requirement I have. Based on above logic, I need to make multiple XML files for particular clientId
. All those multiple files can be like 12345_abc_1.xml
, 12345_abc_2.xml
or any other better naming format.
Below is the code I got which works fine but I am opting for code review to see if anything can be improved or optimized so that it can be run efficiently.
func main() {
// this "clientId" will be configurable in future
clientId := 12345
timeout := time.Duration(1000) * time.Millisecond
ctx, _ := context.WithTimeout(context.Background(), timeout)
conn, err := grpc.DialContext(ctx, "localhost:50005", grpc.WithInsecure())
if err != nil {
log.Fatalf("can not connect with server %v", err)
}
// create stream
client := pb.NewCustomerServiceClient(conn)
req := &pb.Request{ClientId: clientId}
stream, err := client.FetchResponse(context.Background(), req)
if err != nil {
log.Fatalf("open stream error %v", err)
}
// create new object to populate all URL data in memory
urlHolder := NewClient()
t := time.Unix(0, 0).UTC()
done := make(chan bool)
go func() {
// create new object to populate all URL data in memory
urlHolder := NewClient()
urlCounter := 0
byteCounter := 0
fileCounter := 0
for {
resp, err := stream.Recv()
if err == io.EOF {
done <- true
file, _ := os.Create(fmt.Sprintf("%d_abc_%d.xml", clientId, fileCounter))
urlHolder.WriteTo(file)
return
}
if err != nil {
log.Fatalf("can not receive %v", err)
}
log.Printf("Resp received: %s", resp.GetCustomerUrl())
// I add the bytes of the URL here as a return
urlBytes := urlHolder.Add(&URL{
Loc: resp.GetCustomerUrl(),
LastMod: &t,
ChangeFreq: Daily,
Priority: 10.2,
})
byteCounter += urlBytes
urlCounter += 1
if byteCounter > 49000000 || urlCounter >= 50000 {
file, _ := os.Create(fmt.Sprintf("%d_abc_%d.xml", clientId, fileCounter))
urlHolder.WriteTo(file)
urlHolder = NewClient() // create a new object for next loop
fileCounter += 1 // prepare fileCounter for next loop
byteCounter = 0 // restart count variables
urlCounter = 0
}
}
}()
<-done
log.Printf("finished")
}
Here is my urlholder.go
file:
type URL struct {
Loc string `xml:"loc"`
LastMod *time.Time `xml:"lastmod"`
ChangeFreq ChangeFreq `xml:"changefreq"`
Priority float32 `xml:"priority"`
}
type UrlMap struct {
XMLName xml.Name `xml:"urlset"`
Xmlns string `xml:"xmlns,attr"`
URLs []*URL `xml:"url"`
Minify bool `xml:"-"`
}
func NewClient() *UrlMap {
return &UrlMap{
Xmlns: "http://www.sitemaps.org/schemas/sitemap/0.9",
URLs: make([]*URL, 0),
}
}
func (s *UrlMap) Add(u *URL) (int) {
s.URLs = append(s.URLs, u)
var urlBytes []byte
var err error
urlBytes, err = xml.Marshal(u) // Transform to bytes using xml.Marshal or xml.MarshalIndent
if err != nil {
panic(err) // or return the error
}
return len(urlBytes)
}
// WriteTo writes XML encoded urlMap to given io.Writer.
func (s *UrlMap) WriteTo(w io.Writer) (n int64, err error) {
cw := NewCounterWriter(w)
_, err = cw.Write([]byte(xml.Header))
if err != nil {
return cw.Count(), err
}
en := xml.NewEncoder(cw)
if !s.Minify {
en.Indent("", " ")
}
err = en.Encode(s)
cw.Write([]byte{'\n'})
return cw.Count(), err
}
Here is my CounterWriter
class -
type CounterWriter struct {
writer io.Writer
count int64
}
var _ io.Writer = (*CounterWriter)(nil)
func NewCounterWriter(w io.Writer) (cw *CounterWriter) {
return &CounterWriter{
writer: w,
}
}
func (cw *CounterWriter) Write(p []byte) (n int, err error) {
n, err = cw.writer.Write(p)
cw.count = cw.count + int64(n)
return n, err
}
// Count returns the number of bytes written to the Writer.
func (cw *CounterWriter) Count() (n int64) {
return cw.count
}
Problem Statement
I am looking for code review on above code:
- Is there anything that can be improved in terms of multiple XML file generation logic? I modified my
Add
method inurlholder.go
file to return bytes (which isint
) but I also haveWriteTo
method in same go file which also returns bytes (which isint64
). Can we simplify my above multipleXML
file generation logic? - Also do we need to use interface to split out above class and make it more clean?
I am just trying to learn on how to write good go code as this code will be running in production.
-
1\$\begingroup\$ I changed the title. Please check that I haven't misrepresented your code, and correct it if I have. \$\endgroup\$Toby Speight– Toby Speight2022年02月21日 08:22:14 +00:00Commented Feb 21, 2022 at 8:22
-
\$\begingroup\$ @TobySpeight Thank you Toby. Appreciate the edit. Looks good. \$\endgroup\$AndyP– AndyP2022年02月21日 17:27:05 +00:00Commented Feb 21, 2022 at 17:27
1 Answer 1
Your code is very clear and readable in general. I came up with a very nitpicky list of style suggestions :)
A few comments on main()
time.Duration(1000) * time.Millisecond
can be written as just1000 * time.Millisecond
- Call the cancel function from
context.WithTimeout
, otherwise your program has a resource leak (not that it matters from main, but still).ctx, cancel := context.WithTimeout(...) defer cancel()
- Use
ctx
consistently, there are some places inmain
where it refers tocontext.Background()
instead. - Handle the error from
os.Create
, or your code will be impossible to debug when it fails. - Similarly, you probably want to print an error if
urlHolder.WriteTo
fails. - I would also suggest making a helper function for creating the XML file and writing to it, in particular if you ever change the file naming pattern you would need to change it in two places.
- I think there's no need for a goroutine in main? You can just run it directly unless I'm missing something.
A few things on urlholder.go
- Time
time.Time
type is almost always used as a value and not as a pointer. TheURL.LastMod
field should be atime.Time
, not a*time.Time
. - This is subjective, but
UrlMap
should be spelled asURLMap
. - No need to
make
a slice of size zero, just usenil
. UrlMap.Add
should return an error instead of panicing.- The
UrlMap.Add
method uses a different method of serialization thanWriteTo
, so the byte count may not be correct. I would suggest callingWriteTo
from insideAdd
, as followsfunc (m *UrlMap) Add(u *URL) int { s.URLs = append(s.URLs, u) var out strings.Builder m.WriteTo(&out) return len(out.String()) }
I'm not sure what the point of CounterWriter
is when the result of urlHolder.WriteTo
is ignored anyway.
And since you asked about interfaces, there's no need to use an interface here. In general you should only refactor to use interfaces when you already have two implementations in mind. Using an interface preemptively for a "cleaner" design is an example of YAGNI: you ain't gonna need it.
-
\$\begingroup\$ Thanks for good suggestion. Can you provide an example for this point
making a helper function for creating the XML file
? Just confuse on how will I separate it out? or you mean to say just moveos.Create
line into a separate function? Also you mentionedno need for a goroutine in main
point - I am kinda confuse on what does this mean. If you can provide an example basis on my suggestion then it will help me great. \$\endgroup\$AndyP– AndyP2022年02月22日 07:01:00 +00:00Commented Feb 22, 2022 at 7:01 -
\$\begingroup\$ Also most importantly confusion I had earlier was I am returning bytes from
Add
method and also returning bytes fromWriteTo
method too. So I was thinking do I really need to return bytes fromAdd
method to deal with multiple XML file generation or I can also useWriteTo
method too? I mean what does make sense most for my use case? \$\endgroup\$AndyP– AndyP2022年02月22日 07:02:10 +00:00Commented Feb 22, 2022 at 7:02 -
\$\begingroup\$ I meant that
os.Create
andurlHolder.WriteTo
should together go in a helper function, but it's pretty subjective. Or just moving the file name pattern to a shared global variable would also reduce the risk of forgetting to change it in two places. \$\endgroup\$rose– rose2022年02月22日 13:35:15 +00:00Commented Feb 22, 2022 at 13:35 -
\$\begingroup\$ Your
main()
launches a goroutine and waits on it with<-done
, I think all the code inside thego func() {...}
can just be written directly in the top-levelmain
function. \$\endgroup\$rose– rose2022年02月22日 13:36:41 +00:00Commented Feb 22, 2022 at 13:36 -
1\$\begingroup\$ My suggestion is using the fact that
WriteTo
accepts anio.Writer
. If you pass an open file toWriteTo
, it will write to the file. If you pass astrings.Builder
toWriteTo
, it will instead serialize the result into a string rather than writing to a file. \$\endgroup\$rose– rose2022年02月22日 20:45:19 +00:00Commented Feb 22, 2022 at 20:45