【go程序】bytes.Buffer在http客户端中的坑

引子

将bytes.Buffer中的内容通过http发送出去，然后查看bytes.Buffer中的内容。代码如下：

package main
import (
    "fmt"
    "bytes"
    "net/http"
)

func main() {
    var buf bytes.Buffer
    buf.Write([]byte("hello world"))
    fmt.Println("before write, buf length:", buf.Len())
    client := &http.Client{}
    req, err := http.NewRequest("POST", "http://127.0.0.1:12345/test", &buf)
    _, err = client.Do(req)
    if err != nil {
        panic(err)
    }
    fmt.Println("after write, buf length:", buf.Len())
}

运行程序输出如下：

before write, buf length: 11
after write, buf length: 0

可以看出，在http.NewRequest操作buf之后，整个buf被重置。如果随后我们还继续使用这个buf中的内容的话，必然会发生意想不到的错误。

溯源分析

作者是个好奇猫，有时候会钻牛角尖。即使以上的现象告诉我们在经过http请求之后，buf会被清空，但总归需要找到一些证据，以上的结论只是推论。好的，以下作者使用golandIDE对以上代码设断点，逐步调试，找到一些证据如下，详细溯源过程不再分析，有兴趣可以自行溯源。

在client.Do(req)中，http库会启动两个routine来分别监测写请求和读响应两个操作（防止在写数据的过程中服务器强行终止）。读响应这里不多讨论，写请求的函数原型是func (pc *persistConn) writeLoop()，在这个函数中调用wr.req.Request.write方法，在这个方法中先是写http头，然后开始我们的正戏，写http body。函数实现如下：

func (t *transferWriter) WriteBody(w io.Writer) error {
    var err error
    var ncopy int64
        
    // Write body
    if t.Body != nil {
    	if chunked(t.TransferEncoding) {
    		if bw, ok := w.(*bufio.Writer); ok && !t.IsResponse {
    			w = &internal.FlushAfterChunkWriter{Writer: bw}
    		}
    		cw := internal.NewChunkedWriter(w)
    		_, err = io.Copy(cw, t.Body)
    		if err == nil {
    			err = cw.Close()
    		}
    	} else if t.ContentLength == -1 {
    		ncopy, err = io.Copy(w, t.Body)
    	} else {
    		ncopy, err = io.Copy(w, io.LimitReader(t.Body, t.ContentLength))
    		if err != nil {
    			return err
    		}
    		var nextra int64
    		nextra, err = io.Copy(ioutil.Discard, t.Body) // 我们只关心这行
    		ncopy += nextra
    	}
    	if err != nil {
    		return err
    	}
    }
    if t.BodyCloser != nil {
    	if err := t.BodyCloser.Close(); err != nil {
    		return err
    	}
    }
    
    if !t.ResponseToHEAD && t.ContentLength != -1 && t.ContentLength != ncopy {
    	return fmt.Errorf("http: ContentLength=%d with Body length %d",
    		t.ContentLength, ncopy)
    }
    
    if chunked(t.TransferEncoding) {
    	// Write Trailer header
    	if t.Trailer != nil {
    		if err := t.Trailer.Write(w); err != nil {
    			return err
    		}
    	}
    	// Last chunk, empty trailer
    	_, err = io.WriteString(w, "\r\n")
    }
    return err
}

以上代码有些长，然而我们只关心nextra, err = io.Copy(ioutil.Discard, t.Body)这行，因为正是在这步操作之后，我们的buf被清空。首先我们来看一下io.Copy的说明文档：

原型：func Copy(dst Writer, src Reader) (written int64, err error)

If src implements the WriterTo interface, the copy is implemented by calling src.WriteTo(dst). Otherwise, if dst implements the ReaderFrom interface, the copy is implemented by calling dst.ReadFrom(src).

以上可知如果src有WriteTo方法的话，就调用src的WriteTo方法。轻易可知bytes.Buffer是有这个方法的，继续看文档：

原型：func (b *Buffer) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes data to w until the buffer is drained or an error occurs. The return value n is the number of bytes written; it always fits into an int, but it is int64 to match the io.WriterTo interface. Any error encountered during the write is also returned.

以上说只有buffer中的数据耗尽或错误发生时，才会停止。注意此处的修饰词drained，猜测应该是这里将Buffer“卸磨杀驴”的。okay，talk is easy, show me the code:

func (b *Buffer) WriteTo(w io.Writer) (n int64, err error) {
	b.lastRead = opInvalid
	if b.off < len(b.buf) {
		nBytes := b.Len()
		m, e := w.Write(b.buf[b.off:])
		if m > nBytes {
			panic("bytes.Buffer.WriteTo: invalid Write count")
		}
		b.off += m
		n = int64(m)
		if e != nil {
			return n, e
		}
		// all bytes should have been written, by definition of
		// Write method in io.Writer
		if m != nBytes {
			return n, io.ErrShortWrite
		}
	}
	// Buffer is now empty; reset.
	b.Truncate(0)  // 注意此处
	return
}

以上我们终于找到了“罪魁祸首”b.Truncate(0)。这个方法官方文档的说法如下：

Truncate discards all but the first n unread bytes from the buffer but continues to use the same allocated storage. It panics if n is negative or greater than the length of the buffer.

简单说就是把除了n个未读取数据以外的内容清空。看代码也可以佐证这一点：

func (b *Buffer) Truncate(n int) {
	b.lastRead = opInvalid
	switch {
	case n < 0 || n > b.Len():
		panic("bytes.Buffer: truncation out of range")
	case n == 0:
		// Reuse buffer space.
		b.off = 0
	}
	b.buf = b.buf[0 : b.off+n]
}

Here it is!!! b.buf = b.buf[0 : b.off+n].

终于溯源成功，根据得出的结论我们可以写另一个程序测试一下。

另一个例子

package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
)

func main() {
	var buf bytes.Buffer
	buf.Write([]byte("hello world"))
	fmt.Println("before write, buf length:", buf.Len())

	n, err := buf.WriteTo(ioutil.Discard)
	if err != nil {
		panic(err)
	}
	fmt.Printf("write %d bytes\n", n)
	fmt.Println("after write, buf length:", buf.Len())

	buf.Write([]byte("hello world"))
	fmt.Println("before read, buf length:", buf.Len())
	buf2 := make([]byte, len("hello world"))
	n2, err := buf.Read(buf2)
	fmt.Printf("read %d bytes\n", n2)
	if err != nil {
		panic(err)
	}
	fmt.Println("after read, buf length:", buf.Len())
}

输出如下：

before write, buf length: 11
write 11 bytes
after write, buf length: 0
before read, buf length: 11
read 11 bytes
after read, buf length: 0

结论

bytes.Buffer中的数据无论以何种形式被读取之后，就会被清空Reset。fmt打印Buffer内容并不会将缓冲区的数据读取。

本站总访问量次本站访客数人次本文总阅读量次