0doc.go 9.8 KB

  1. // Copyright (c) 2012-2018 Ugorji Nwoke. All rights reserved.
  2. // Use of this source code is governed by a MIT license found in the LICENSE file.
  3. /*
  4. Package codec provides a
  5. High Performance, Feature-Rich Idiomatic Go 1.4+ codec/encoding library
  6. for binc, msgpack, cbor, json.
  7. Supported Serialization formats are:
  8. - msgpack: https://github.com/msgpack/msgpack
  9. - binc: http://github.com/ugorji/binc
  10. - cbor: http://cbor.io http://tools.ietf.org/html/rfc7049
  11. - json: http://json.org http://tools.ietf.org/html/rfc7159
  12. - simple:
  13. To install:
  14. go get github.com/ugorji/go/codec
  15. This package will carefully use 'unsafe' for performance reasons in specific places.
  16. You can build without unsafe use by passing the safe or appengine tag
  17. i.e. 'go install -tags=safe ...'. Note that unsafe is only supported for the last 3
  18. Go SDK versions, e.g. if the current Go release is go 1.9, we support unsafe use only from
  19. go 1.7+. This is because supporting unsafe requires knowledge of implementation details.
  20. For detailed usage information, read the primer at http://ugorji.net/blog/go-codec-primer .
  21. The idiomatic Go support is as seen in other encoding packages in
  22. the standard library (i.e. json, xml, gob, etc.).
  23. Rich Feature Set includes:
  24. - Simple but extremely powerful and feature-rich API
  25. - Support for go1.4 and above, while selectively using newer APIs for later releases
  26. - Excellent code coverage ( > 90% )
  27. - Very High Performance.
  28. Our extensive benchmarks show us outperforming Gob, Json, Bson, etc by 2-4X.
  29. - Carefully selected use of 'unsafe' for targeted performance gains.
  30. A 100% safe mode exists where 'unsafe' is not used at all.
  31. - Lock-free (sans mutex) concurrency for scaling to 100s of cores
  32. - Multiple conversions:
  33. Package coerces types where appropriate
  34. e.g. decode an int in the stream into a float, etc.
  35. - Corner Cases:
  36. Overflows, nil maps/slices, nil values in streams are handled correctly
  37. - Standard field renaming via tags
  38. - Support for omitting empty fields during an encoding
  39. - Encoding from any value and decoding into pointer to any value
  40. (struct, slice, map, primitives, pointers, interface{}, etc)
  41. - Extensions to support efficient encoding/decoding of any named types
  42. - Support encoding.(Binary|Text)(M|Unm)arshaler interfaces
  43. - Decoding without a schema (into an interface{}).
  44. Includes Options to configure what specific map or slice type to use
  45. when decoding an encoded list or map into a nil interface{}
  46. - Encode a struct as an array, and decode struct from an array in the data stream
  47. - Option to encode struct keys as numbers (instead of strings)
  48. (to support structured streams with fields encoded as numeric codes)
  49. - Comprehensive support for anonymous fields
  50. - Fast (no-reflection) encoding/decoding of common maps and slices
  51. - Code-generation for faster performance.
  52. - Support binary (e.g. messagepack, cbor) and text (e.g. json) formats
  53. - Support indefinite-length formats to enable true streaming
  54. (for formats which support it e.g. json, cbor)
  55. - Support canonical encoding, where a value is ALWAYS encoded as the same sequence of bytes.
  56. This mostly applies to maps, where iteration order is non-deterministic.
  57. - NIL in data stream decoded as zero value
  58. - Never silently skip data when decoding.
  59. User decides whether to return an error or silently skip data when keys or indexes
  60. in the data stream do not map to fields in the struct.
  61. - Detect and error when encoding a cyclic reference (instead of stack overflow shutdown)
  62. - Encode/Decode from/to chan types (for iterative streaming support)
  63. - Drop-in replacement for encoding/json. `json:` key in struct tag supported.
  64. - Provides an RPC Server and Client Codec for the net/rpc communication protocol.
  65. - Handle unique idiosyncrasies of codecs e.g.
  66. - For messagepack, configure how ambiguities in handling raw bytes are resolved
  67. - For messagepack, provide rpc server/client codec to support
  68. msgpack-rpc protocol defined at:
  69. https://github.com/msgpack-rpc/msgpack-rpc/blob/master/spec.md
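The canonical-encoding feature listed above addresses non-deterministic map iteration. The idea can be sketched with the standard library alone: collect the keys, sort them, and emit entries in that order. This is a minimal illustration under stated assumptions, not this package's implementation; the canonical function and its JSON-like output format are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonical renders m with its keys in sorted order, so the same map
// always produces the same byte sequence regardless of iteration order.
// (Hypothetical helper for illustration only.)
func canonical(m map[string]int) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var sb strings.Builder
	sb.WriteByte('{')
	for i, k := range keys {
		if i > 0 {
			sb.WriteByte(',')
		}
		fmt.Fprintf(&sb, "%q:%d", k, m[k])
	}
	sb.WriteByte('}')
	return sb.String()
}

func main() {
	m := map[string]int{"b": 2, "a": 1, "c": 3}
	fmt.Println(canonical(m)) // {"a":1,"b":2,"c":3}
}
```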
  70. Extension Support
  71. Users can register a function to handle the encoding or decoding of
  72. their custom types.
  73. There are no restrictions on what the custom type can be. Some examples:
  74. type BitSet []int
  75. type BitSet64 uint64
  76. type UUID string
  77. type MyStructWithUnexportedFields struct { a int; b bool; c []int; }
  78. type GifImage struct { ... }
  79. As an illustration, MyStructWithUnexportedFields would normally be
  80. encoded as an empty map because it has no exported fields, while UUID
  81. would be encoded as a string. However, with extension support, you can
  82. encode any of these however you like.
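As a rough illustration of what a user-supplied encode/decode pair for a custom type might look like, the sketch below converts a BitSet64 to and from a fixed 8-byte big-endian form using only the standard library. The function names encodeBitSet64 and decodeBitSet64 are hypothetical, and how such functions are registered with a Handle is not shown here.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// BitSet64 mirrors the example type named in the text above.
type BitSet64 uint64

// encodeBitSet64 is a hypothetical user-defined encoder:
// it writes the value as 8 big-endian bytes.
func encodeBitSet64(v BitSet64) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(v))
	return b
}

// decodeBitSet64 is the matching hypothetical decoder.
// It rejects input that is not exactly 8 bytes long.
func decodeBitSet64(b []byte) (BitSet64, error) {
	if len(b) != 8 {
		return 0, errors.New("bitset64: want exactly 8 bytes")
	}
	return BitSet64(binary.BigEndian.Uint64(b)), nil
}

func main() {
	orig := BitSet64(0x0102030405060708)
	b := encodeBitSet64(orig)
	v, err := decodeBitSet64(b)
	fmt.Println(len(b), v == orig, err)
}
```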
  83. Custom Encoding and Decoding
  84. This package maintains symmetry in the encoding and decoding halves.
  85. We determine how to encode or decode by walking this decision tree:
  86. - is type a codec.Selfer?
  87. - is there an extension registered for the type?
  88. - is format binary, and is type an encoding.BinaryMarshaler and BinaryUnmarshaler?
  89. - is format specifically json, and is type an encoding/json.Marshaler and Unmarshaler?
  90. - is format text-based, and is type an encoding.TextMarshaler and TextUnmarshaler?
  91. - else we use a pair of functions based on the "kind" of the type e.g. map, slice, int64, etc
  92. This symmetry is important to reduce chances of issues happening because the
  93. encoding and decoding sides are out of sync e.g. decoded via very specific
  94. encoding.TextUnmarshaler but encoded via kind-specific generalized mode.
  95. Consequently, if a type only defines one-half of the symmetry
  96. (e.g. it implements UnmarshalJSON() but not MarshalJSON() ),
  97. then that type doesn't satisfy the check and we will continue walking down the
  98. decision tree.
  99. RPC
  100. RPC Client and Server Codecs are implemented, so the codecs can be used
  101. with the standard net/rpc package.
  102. Usage
  103. The Handle is SAFE for concurrent READ, but NOT SAFE for concurrent modification.
  104. The Encoder and Decoder are NOT safe for concurrent use.
  105. Consequently, the usage model is basically:
  106. - Create and initialize the Handle before any use.
  107. Once created, DO NOT modify it.
  108. - Multiple Encoders or Decoders can now use the Handle concurrently.
  109. They only read information off the Handle (never write).
  110. - However, each Encoder or Decoder MUST NOT be used concurrently.
  111. - To re-use an Encoder/Decoder, call Reset(...) on it first.
  112. This allows you to reuse state maintained on the Encoder/Decoder.
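The concurrency model above (one shared read-only configuration, one encoder per goroutine) can be sketched with the standard library. This sketch deliberately uses encoding/json as a stand-in so that it is self-contained; the options type plays the role of a Handle and is hypothetical, as is encodeAll.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sync"
)

// options stands in for a Handle: it is configured once, before any use,
// and afterwards it is only read, never modified.
type options struct {
	Indent string
}

// encodeAll encodes each value in its own goroutine.
// Every goroutine creates its own encoder, so no encoder is shared.
func encodeAll(opts *options, values []int) []string {
	out := make([]string, len(values))
	var wg sync.WaitGroup
	for i, v := range values {
		wg.Add(1)
		go func(i, v int) {
			defer wg.Done()
			var buf bytes.Buffer
			enc := json.NewEncoder(&buf)   // private to this goroutine
			enc.SetIndent("", opts.Indent) // read-only access to shared config
			if err := enc.Encode(v); err != nil {
				return
			}
			out[i] = buf.String() // each goroutine writes a distinct index
		}(i, v)
	}
	wg.Wait()
	return out
}

func main() {
	opts := &options{} // create and configure first, then never modify
	fmt.Printf("%q\n", encodeAll(opts, []int{1, 2, 3}))
}
```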
  113. Sample usage model:
  114. // create and configure Handle
  115. var (
  116. bh codec.BincHandle
  117. mh codec.MsgpackHandle
  118. ch codec.CborHandle
  119. )
  120. mh.MapType = reflect.TypeOf(map[string]interface{}(nil))
  121. // configure extensions
  122. // e.g. for msgpack, define functions and enable Time support for tag 1
  123. // mh.SetExt(reflect.TypeOf(time.Time{}), 1, myExt)
  124. // create and use decoder/encoder
  125. var (
  126. r io.Reader
  127. w io.Writer
  128. b []byte
  129. h = &bh // or mh to use msgpack
  130. )
  131. dec = codec.NewDecoder(r, h)
  132. dec = codec.NewDecoderBytes(b, h)
  133. err = dec.Decode(&v)
  134. enc = codec.NewEncoder(w, h)
  135. enc = codec.NewEncoderBytes(&b, h)
  136. err = enc.Encode(v)
  137. //RPC Server
  138. go func() {
  139. for {
  140. conn, err := listener.Accept()
  141. rpcCodec := codec.GoRpc.ServerCodec(conn, h)
  142. //OR rpcCodec := codec.MsgpackSpecRpc.ServerCodec(conn, h)
  143. rpc.ServeCodec(rpcCodec)
  144. }
  145. }()
  146. //RPC Communication (client side)
  147. conn, err = net.Dial("tcp", "localhost:5555")
  148. rpcCodec := codec.GoRpc.ClientCodec(conn, h)
  149. //OR rpcCodec := codec.MsgpackSpecRpc.ClientCodec(conn, h)
  150. client := rpc.NewClientWithCodec(rpcCodec)
  151. Running Tests
  152. To run tests, use the following:
  153. go test
  154. To run the full suite of tests, use the following:
  155. go test -tags alltests -run Suite
  156. You can use the tag 'safe' to run tests or build in safe mode, e.g.
  157. go test -tags safe -run Json
  158. go test -tags "alltests safe" -run Suite
  159. Running Benchmarks
  160. Please see http://github.com/ugorji/go-codec-bench .
  161. Caveats
  162. Struct fields matching any of the following are ignored during encoding and decoding:
  163. - struct tag value set to -
  164. - func, complex numbers, unsafe pointers
  165. - unexported and not embedded
  166. - unexported embedded non-struct
  167. - unexported embedded pointers (from go1.10)
  168. Every other field in a struct will be encoded/decoded.
  169. Embedded fields are encoded as if they exist in the top-level struct,
  170. with some caveats. See Encode documentation.
  171. */
  172. package codec
  173. // TODO:
  174. // - In Go 1.10, when mid-stack inlining is enabled,
  175. // we should use committed functions for writeXXX and readXXX calls.
  176. // This involves uncommenting the methods for decReaderSwitch and encWriterSwitch
  177. // and using those (decReaderSwitch and encWriterSwitch) in all handles
  178. // instead of encWriter and decReader.
  179. // The benefit is that, for the (En|De)coder over []byte, the encWriter/decReader
  180. // will be inlined, giving a performance bump for that typical case.
  181. // However, it will only be inlined if mid-stack inlining is enabled,
  182. // as we call panic to raise errors, and panic currently prevents inlining.
  183. // - Clean up comments in the codebase
  184. // Remove all unnecessary comments, so code is clean.
  185. //
  186. // PUNTED:
  187. // - To make Handle comparable, make extHandle in BasicHandle a non-embedded pointer,
  188. // and use overlay methods on *BasicHandle to call through to extHandle after initializing
  189. // the "xh *extHandle" to point to a real slice.
  190. // - Before each release, look through and fix padding for each type, to eliminate false sharing
  191. // - pooled objects: decNaked, codecFner, typeInfoLoadArray, typeInfo
  192. // - small objects that we allocate and modify much (should be in owned cache lines)
  193. // - Objects used a lot (must live in own cache lines)
  194. // Decoder, Encoder, etc
  195. // - In all above, arrange values modified together to be close to each other.
  196. // Note: we MOSTLY care about the bottom part.