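What is protobuf?
Protocol Buffers (protobuf) is a language-neutral, platform-neutral, extensible way of serializing structured data. It serves the same purpose as JSON, but its binary encoding is smaller and faster.
Introduction to proto3 syntax
Let's start with a simple example: the request parameters for a paginated search. Create a new .proto file and write: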
syntax = "proto3";
message SearchRequest {
  string query = 1;
  int32 page_num = 2;
  int32 page_size = 3;
}
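The first line of the .proto file declares that the file uses proto3 syntax. If it is omitted, the protobuf compiler assumes proto2 syntax when parsing.
SearchRequest defines three fields. Each has a type and a name, followed by a unique field number that identifies the field in the serialized binary data. Field numbers 1 through 15 take one byte to encode, while 16 through 2047 take two bytes; the reason lies in how protobuf encodes the field tag as a varint. So reserve the numbers 1 through 15 for the fields that occur most often. The smallest number you can use is 1 and the largest is $2^{29}-1$. You also cannot use 19000 through 19999: these numbers are reserved for the protobuf implementation, and the compiler rejects them.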
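The one-byte versus two-byte boundary can be checked with a small sketch. It assumes the standard wire-format rule that a field tag is the varint encoding of (field_number << 3) | wire_type; the helper name tagSize is made up for illustration and is not part of any protobuf library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// tagSize (hypothetical helper) returns how many bytes a field tag occupies
// on the wire. The tag is the varint encoding of (fieldNumber << 3) | wireType.
func tagSize(fieldNumber, wireType uint64) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, (fieldNumber<<3)|wireType)
}

func main() {
	// Wire type 0 (varint) is used here; the wire type only fills the low 3 bits.
	for _, n := range []uint64{1, 15, 16, 2047, 2048} {
		fmt.Printf("field %d -> %d byte(s)\n", n, tagSize(n, 0))
	}
}
```

Field 15 still fits in one byte, field 16 needs two, and field 2048 is the first to need three, which matches the ranges described above.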
A single .proto file can define multiple message types:
syntax = "proto3";
message SearchRequest {
  string query = 1;
  int32 page_num = 2;
  int32 page_size = 3;
}
message SearchResponse {
  repeated string data = 1;
  int32 total = 2;
}
repeated marks a field that can hold any number of values; think of it as an array.
Comment syntax:
syntax = "proto3";
/* request structure */
message SearchRequest {
  string query = 1;
  int32 page_num = 2;
  int32 page_size = 3;
}
/* response structure */
message SearchResponse {
  repeated string data = 1;
  int32 total = 2; // total count
}
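If you update a message type by simply deleting a field or commenting it out, someone who updates the file later may reuse the number of the deleted field. If an older version of the same .proto is then loaded, this leads to errors. To prevent this, protobuf provides a reservation mechanism: the reserved keyword reserves field numbers or field names.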
message Foo {
  reserved 2, 15, 9 to 11;
  reserved "foo", "bar";
  int32 foo = 2; // error: both the number 2 and the name "foo" are reserved
}
Enums:
/* request structure */
message SearchRequest {
  string query = 1;
  int32 page_num = 2;
  int32 page_size = 3;
  enum Corpus {
    UNIVERSAL = 0;
    WEB = 1;
    IMAGES = 2;
    LOCAL = 3;
    NEWS = 4;
    PRODUCTS = 5;
    VIDEO = 6;
  }
  Corpus corpus = 4;
}
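Note that the first enum value must be 0. This lets 0 serve as the numeric default, and, for compatibility with proto2, the first enum value is always the default.
Enum aliases: if you want several enum constants to stand for the same value, you can use the alias mechanism by setting option allow_alias to true: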
syntax = "proto3";
/* request structure */
message SearchRequest {
  string query = 1;
  int32 page_num = 2;
  int32 page_size = 3;
  enum Corpus {
    UNIVERSAL = 0;
    WEB = 1;
    IMAGES = 2;
    LOCAL = 3;
    NEWS = 4;
    PRODUCTS = 5;
    VIDEO = 6;
  }
  Corpus corpus = 4;
}
enum Corpus {
  UNIVERSAL = 0;
  WEB = 1;
  IMAGES = 2;
  LOCAL = 3;
  NEWS = 4;
  PRODUCTS = 5;
  VIDEO = 6;
}
message MyMessage1 {
  enum EnumAllowingAlias {
    option allow_alias = true; // allow aliases
    UNKNOWN = 0;
    STARTED = 1;
    RUNNING = 1; // alias of STARTED
  }
  EnumAllowingAlias enum = 1;
  SearchRequest.Corpus corpus = 2; // enum declared inside another message
  Corpus corpus2 = 3; // top-level enum
}
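As shown above, an enum does not have to be declared inside a message, and a message can also reference an enum declared inside another message.
Enum values must fit in a 32-bit integer. Because enum values use variable-length (varint) encoding on the wire, negative values are inefficient and not recommended.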
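To see why negative values are costly, here is a minimal sketch. It assumes the documented behavior that a negative 32-bit value is sign-extended to 64 bits before being varint-encoded; varintSize is a hypothetical helper, not a protobuf API.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// varintSize (hypothetical helper) returns the number of bytes needed to
// varint-encode a 32-bit value after sign-extending it to 64 bits.
func varintSize(v int32) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, uint64(int64(v)))
}

func main() {
	fmt.Println(varintSize(6))  // a small non-negative enum value such as VIDEO = 6
	fmt.Println(varintSize(-1)) // a negative enum value
}
```

The small positive value takes a single byte, while -1 takes ten bytes, because all 64 bits of the sign-extended value are set.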
To import another .proto file, first configure the proto search path in GoLand and add the local project directory.
Then use the import keyword in the .proto file:
import "pb/route_guide.proto";
The import path here is resolved relative to the project root directory.
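Maps in proto:
message M {
  map<string, SearchRequest> req_map = 1;
}
On the wire, a map field is equivalent to the following, where key_type, value_type and N are placeholders:
message MapFieldEntry {
  key_type key = 1;
  value_type value = 2;
}
repeated MapFieldEntry map_field = N;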
Notes:
- Map fields cannot be repeated.
- The order of the entries in a map is not fixed.
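Since the generated Go code exposes a map field as an ordinary Go map, code that needs a stable order has to impose one itself. The sketch below shows one way to do that by sorting the keys; pageSizes is a made-up stand-in for a generated map field and sortedKeys is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys (hypothetical helper) returns the keys of m in ascending order,
// so callers can iterate deterministically even though map order is unspecified.
func sortedKeys(m map[string]int32) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Stand-in for a map<string, int32> field on a generated message.
	pageSizes := map[string]int32{"web": 20, "images": 50, "news": 10}
	for _, k := range sortedKeys(pageSizes) {
		fmt.Println(k, pageSizes[k])
	}
}
```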
Generating code, using Go as an example
To generate Go code, add an option go_package definition to the .proto file. The option defines the import path of the package that will contain all the generated code for this file; the Go package name will be the last path component of the import path.
syntax = "proto3";
package tutorial; // the package declaration is the namespace of the .proto file; it distinguishes messages that share the same name
option go_package = "github.com/yicixin/mysite/pb";
To generate code you also need to install the Go plugin for the compiler (the protoc binary itself is assumed to be installed already):
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
Next, run protoc to generate the code. Because the output is Go, the flag is --go_out; other languages have similar flags. Specify $SRC_DIR (where your application's source code lives; the current directory is used if you don't provide a value), $DST_DIR (where you want the generated code to go; often the same as $SRC_DIR), and the path to your .proto file.
protoc -I=$SRC_DIR --go_out=$DST_DIR $SRC_DIR/addressbook.proto
Advanced options:
For details, see: https://developers.google.com/protocol-buffers/docs/reference/go-generated
The protocol buffer compiler produces Go output when invoked with the go_out flag. The argument to the go_out flag is the directory where you want the compiler to write your Go output.
protoc --proto_path=. \
--go_out=. --go_opt=paths=source_relative pb/address.proto
Where in the output directory the generated .pb.go file is placed depends on the compiler flags. There are several output modes:
- If the paths=import flag is specified, the output file is placed in a directory named after the Go package's import path. For example, an input file protos/buzz.proto with a Go import path of example.com/project/protos/fizz results in an output file at example.com/project/protos/fizz/buzz.pb.go. This is the default output mode if a paths flag is not specified.
- If the module=$PREFIX flag is specified, the output file is placed in a directory named after the Go package's import path, but with the specified directory prefix removed from the output filename. For example, an input file protos/buzz.proto with a Go import path of example.com/project/protos/fizz and example.com/project specified as the module prefix results in an output file at protos/fizz/buzz.pb.go. Generating any Go packages outside the module path results in an error. This mode is useful for outputting generated files directly into a Go module.
- If the paths=source_relative flag is specified, the output file is placed in the same relative directory as the input file. For example, an input file protos/buzz.proto results in an output file at protos/buzz.pb.go.
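The three modes can be compared side by side with a small sketch. It only mirrors the rules quoted above; outputPath is a hypothetical helper written for illustration and is not how protoc-gen-go is implemented.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// outputPath (hypothetical helper, not part of protoc) computes where the
// .pb.go file would land, relative to the go_out directory, for each mode.
func outputPath(mode, protoFile, importPath, modulePrefix string) (string, error) {
	base := strings.TrimSuffix(path.Base(protoFile), ".proto") + ".pb.go"
	switch mode {
	case "import":
		// Directory named after the Go import path.
		return path.Join(importPath, base), nil
	case "module":
		// Same as "import", with the module prefix stripped.
		if importPath != modulePrefix && !strings.HasPrefix(importPath, modulePrefix+"/") {
			return "", fmt.Errorf("package %q is outside module %q", importPath, modulePrefix)
		}
		rel := strings.TrimPrefix(strings.TrimPrefix(importPath, modulePrefix), "/")
		return path.Join(rel, base), nil
	case "source_relative":
		// Same relative directory as the input file.
		return strings.TrimSuffix(protoFile, ".proto") + ".pb.go", nil
	default:
		return "", fmt.Errorf("unknown mode %q", mode)
	}
}

func main() {
	const file = "protos/buzz.proto"
	const pkg = "example.com/project/protos/fizz"
	for _, mode := range []string{"import", "module", "source_relative"} {
		out, err := outputPath(mode, file, pkg, "example.com/project")
		fmt.Println(mode, out, err)
	}
}
```

With the example values from the list, the sketch yields example.com/project/protos/fizz/buzz.pb.go, protos/fizz/buzz.pb.go and protos/buzz.pb.go respectively.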
References
Language Guide (proto3) | Protocol Buffers | Google Developers