上一篇文章介绍了分布式链路的标准 OpenTracing, 在 OpenTracing 的官网我们可以看到这样一条信息 OpenTracing and OpenCensus have merged to form OpenTelemetry!。 可以看到 OpenTracing 和 OpenCensus 已经被合并为 OpenTelemetry 了。

OpenCensus 是什么呢? OpenTracing 是最早为分布式追踪制定了一套平台无关、厂商无关的协议标准的项目,并以此成为了 CNCF 的孵化项目。 在之后,谷歌牵头,微软加入,创建了 OpenCensus 项目统一 Metrics 基础指标监控的使用方式,还做了 OpenTracing 的老本行:分布式追踪。

OpenTelemetry

OpenTelemetry 的自身定位十分明确:数据采集和标准规范的统一,对于数据如何去使用、存储、展示、告警,官方是不涉及的。

OpenTelemetry 的终极目标十分伟大:实现 Metrics、Tracing、Logging 的融合及大一统,作为 APM 的数据采集终极解决方案。

目前 OpenTelemetry 正式成为 CNCF 的孵化项目,OpenTracing 和 OpenCensus 不再维护, OpenTracing 目前是 CNCF 的存档项目。

OpenTelemetry 的一些基础知识是兼容 OpenTracing 的, 只有一些 API 是不同的。

OpenTelemetry for go

我们继续使用上文的代码, 对上文的代码进行修改。

install 依赖

go get go.opentelemetry.io/otel@v1.0.0-RC1 go.opentelemetry.io/otel/sdk@v1.0.0-RC1 go.opentelemetry.io/otel/exporters/stdout/stdouttrace@v1.0.0-RC1 go.opentelemetry.io/otel/trace@v1.0.0-RC1
go get -u go.opentelemetry.io/otel/exporters/jaeger

OpenTelemetry 官方为多种开源框架提供了开箱即用的 Instrumentation Packages ,如 gin , beego , mux , go-kit 等,当然也支持 net/http 标准库 ,更多可浏览opentelemetry-go-contrib 仓库。

我们使用的是 gin, 所以安装相应的依赖 :

go get go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin

对原有代码进行修改, 修改后的代码:

package main

import (
	"context"
	"log"
	"math/rand"
	"net/http"
	"time"

	"github.com/gin-gonic/gin"

	"go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/jaeger"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/sdk/resource"
	tracesdk "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.4.0"
	"go.opentelemetry.io/otel/trace"
)

var (
	addr = ":8080"
)

// tracerProvider returns an OpenTelemetry TracerProvider configured to use
// the Jaeger exporter that will send spans to the provided url. The returned
// TracerProvider will also use a Resource configured with all the information
// about the application.
func tracerProvider(url string) (*tracesdk.TracerProvider, error) {
	// Create the Jaeger exporter
	exp, err := jaeger.New(jaeger.WithCollectorEndpoint(jaeger.WithEndpoint(url)))
	if err != nil {
		return nil, err
	}
	tp := tracesdk.NewTracerProvider(
		tracesdk.WithSampler(tracesdk.AlwaysSample()),
		tracesdk.WithBatcher(exp),
		tracesdk.WithResource(resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceNameKey.String("opentelemetry-overstarry"), // 服务名
			semconv.ServiceVersionKey.String("0.0.1"),
			attribute.String("environment", "test"),
		)),
	)
	otel.SetTracerProvider(tp)
	otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator(propagation.TraceContext{}, propagation.Baggage{}))
	return tp, nil
}

func main() {
	tp, err := tracerProvider("http://localhost:14268/api/traces")
	if err != nil {
		log.Fatal(err)
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Cleanly shutdown and flush telemetry when the application exits.
	defer func(ctx context.Context) {
		// Do not make the application hang when it is shutdown.
		ctx, cancel = context.WithTimeout(ctx, time.Second*5)
		defer cancel()
		if err := tp.Shutdown(ctx); err != nil {
			log.Fatal(err)
		}
	}(ctx)

	engine := gin.New()

	engine.Use(otelgin.Middleware("server"))
	engine.GET("/", indexHandler)
	engine.GET("/home", homeHandler)
	engine.GET("/async", serviceHandler)
	engine.GET("/service", serviceHandler)
	engine.GET("/db", dbHandler)
	err = engine.Run(addr)
	if err != nil {
		return
	}
}

func dbHandler(c *gin.Context) {
	ctx := c.Request.Context()
	span := trace.SpanFromContext(otel.GetTextMapPropagator().Extract(ctx, propagation.HeaderCarrier(c.Request.Header)))
	defer span.End()

	time.Sleep(time.Duration(rand.Intn(200)) * time.Millisecond)
}

func serviceHandler(c *gin.Context) {
	ctx := c.Request.Context()
	// 通过http header,提取span元数据信息
	span := trace.SpanFromContext(otel.GetTextMapPropagator().Extract(ctx, propagation.HeaderCarrier(c.Request.Header)))
	defer span.End()

	dbReq, _ := http.NewRequest("GET", "http://localhost:8080/db", nil)
	otel.GetTextMapPropagator().Inject(ctx, propagation.HeaderCarrier(dbReq.Header))
	if _, err := http.DefaultClient.Do(dbReq); err != nil {
		span.RecordError(err)
		attribute.String("请求 /db error", err.Error())
	}
	time.Sleep(time.Duration(rand.Intn(200)) * time.Millisecond)
}

func homeHandler(c *gin.Context) {
	c.Header("Content-Type", "text/html; charset=utf-8")

	c.String(200, "开始请求...\n")
	ctx := c.Request.Context()
	// 设置一个根节点 span
	span := trace.SpanFromContext(ctx)
	defer span.End()

	asyncReq, _ := http.NewRequest("GET", "http://localhost:8080/async", nil)
	otel.GetTextMapPropagator().Inject(ctx, propagation.HeaderCarrier(asyncReq.Header))
	go func() {
		if _, err := http.DefaultClient.Do(asyncReq); err != nil {
			span.RecordError(err)
			span.SetAttributes(attribute.String("请求 /async error", err.Error()))
		}
	}()

	time.Sleep(time.Duration(rand.Intn(200)) * time.Millisecond)

	syncReq, _ := http.NewRequest("GET", "http://localhost:8080/service", nil)
	otel.GetTextMapPropagator().Inject(ctx, propagation.HeaderCarrier(syncReq.Header))

	if _, err := http.DefaultClient.Do(syncReq); err != nil {
		span.RecordError(err)
		span.SetAttributes(attribute.String("请求 /service error", err.Error()))
	}
	c.String(200, "请求结束!")
}

func indexHandler(c *gin.Context) {
	c.Header("Content-Type", "text/html; charset=utf-8")
	c.String(200, string(`<a href="/home"> 点击发起请求 </a>`))
}

运行项目

代码修改完,现在运行项目,现在访问 Jaeger UI, 可以看到已经有请求调用的链条了。 img.png

小结

今天继续讲了分布式链路追踪的最新标准 OpenTelemetry, 并使用相应 SDK 编写了相应的代码。

相应代码: https://github.com/overstarry/tracing-demo

参考链接