-
-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update HTTP Semantic Conventions to Stable Versions #118
Conversation
…e latest stable semantics if the environment variable OTEL_SEMCONV_STABILITY_OPT_IN is "http"
…_STABILITY_OPT_IN
… a comma-separated list
…le is read correctly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished self-review.
api/hs-opentelemetry-api.cabal
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are all auto-generated changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this file should be in the api/src/OpenTelemetry
directory or somewhere else. This seemed like the most appropriate place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine. I'd probably prefer expanding the module name to OpenTelemetry.SemanticConventionsStabilityOptIn
for clarity.
getSemConvStabilityOptIn :: IO SemConvStabilityOptIn | ||
getSemConvStabilityOptIn = do | ||
menv <- lookupEnv "OTEL_SEMCONV_STABILITY_OPT_IN" | ||
let menvs = fmap T.strip . T.splitOn "," . T.pack <$> menv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There might be a better library to use than Data.Text for parsing. Also, T.strip strips whitespace from the environment variable and I'm not sure if that's necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me
hie.yaml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Auto-generated using the command gen-hie > hie.yaml
. This allows HLS to work in VSCode for this repository.
H.union | ||
[ ("db.connection_string", fromString $ showsPrecConnectInfoMasked 0 ci "") | ||
[ ("db.connection_string" :: Text, fromString $ showsPrecConnectInfoMasked 0 ci "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:: Text
is used because of OverloadedStrings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Auto-generated
@@ -28,6 +28,7 @@ dependencies: | |||
- bytestring | |||
- network | |||
- iproute | |||
- unordered-containers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for Data.HashMap.Strict
-- , ("http.response_content_length_uncompressed", _) | ||
-- , ("net.transport") | ||
-- , ("net.peer.name") | ||
-- , ("net.peer.ip") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
net.peer.ip
is deleted because it is superseded by server.address
which can either contain the contents of net.peer.name
or net.peer.ip
.
Stable -> | ||
usefulCallsite | ||
`H.union` [ | ||
( "user_agent.original" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This attribute must be provided to the sampler and is therefore has to be added when the span is created and not after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work so far, Evan! This looks like excellent progress and is very thoughtful. I'll defer to Ian for approval, but this appears to be going in a good direction to me.
data SemConvStabilityOptIn = Stable | Both | Old deriving (Show, Eq) | ||
|
||
|
||
getSemConvStabilityOptIn :: IO SemConvStabilityOptIn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could consider splitting this into two functions: a pure function that does the parsing and this that does the lookupEnv
. Something like:
getSemConvStabilityOptIn :: IO SemConvStabilityOptIn
getSemConvStabilityOptIn = parseSemConvStabilityOptIn <$> lookupEnv "OTEL_SEMCONV_STABILITY_OPT_IN"
parseSemConvStabilityOptIn :: Maybe Text -> SemConvStabilityOptIn
...
The pure function can make testing a little bit easier, avoiding the set and unset calls. You could make an argument for skipping tests on getSemConvStabilityOptIn
claiming that its implementation is trivial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I split it into two functions, but I am thinking about keeping the tests the same. I like that the current tests make sure that the program is reading the env var correctly, and the fact that getSemConvStabilityOptIn
is trivial means that it is still a good test for parseSemConvStabilityOptIn
. It might be better to test the pure function though, not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
your tests are pretty nice, compact and legible, but one thing you could do with the pure function is reduce the number of env var tests to two (set/unset), and replace the rest with tests of the pure function that could be one-liners like parseSemConvStabiltyOptIn (Just "http")
shouldBe Stable
. if the setup was more onerous than setEnv "blah"
I might feel more strongly about suggesting that approach, but as it stands I think it's just an aesthetic choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the tests to use a looping structure like Max suggested. Does that satisfy you're aesthetic preference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's good too :)
getSemConvStabilityOptIn :: IO SemConvStabilityOptIn | ||
getSemConvStabilityOptIn = do | ||
menv <- lookupEnv "OTEL_SEMCONV_STABILITY_OPT_IN" | ||
let menvs = fmap T.strip . T.splitOn "," . T.pack <$> menv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me
[New HTTP semantic conventions have been declared stable.](https://opentelemetry.io/blog/2023/http-conventions-declared-stable/#migration-plan) Opt-in by setting the environment variable OTEL_SEMCONV_STABILITY_OPT_IN to | ||
- "http" - to use the stable conventions | ||
- "http/dup" - to emit both the old and the stable conventions | ||
Otherwise, the old conventions will be used. The stable conventions will replace the old conventions in the next major release of this library. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent
t <- httpTracerProvider | ||
inSpan'' t "withResponse" (addAttributesToSpanArguments callerAttributes spanArgs) $ \_wrSpan -> do | ||
tracer <- httpTracerProvider | ||
inSpan'' tracer "withResponse" (addAttributesToSpanArguments callerAttributes spanArgs) $ \_wrSpan -> do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the variable rename is reasonable. Haskellers can lean on the side of being terse and I think that partly stems from mathematics backgrounds, where single letter variable naming is common.
case semConvStabilityOptIn of | ||
Stable -> addStableAttributes | ||
Both -> addStableAttributes >> addOldAttributes | ||
Old -> addOldAttributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the structure of this... very clean.
instrumentation/http-client/src/OpenTelemetry/Instrumentation/HttpClient/Raw.hs
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was surprised to see changes here, particularly special-casing on the same OTEL_SEMCONV_STABILITY_OPT_IN values. On the face of it, those seem unrelated. However, upon looking at the database conventions, they reference this. It appears that the overall changes aren't HTTP-only. So it seems like you are doing the correct thing according to the spec.
it "defaults to 'Old' when env var has incorrect value" $ do | ||
setEnv envVarName "http/du" -- intentionally similar to "http/dup" and "http" | ||
semConvStabilityOptIn <- getSemConvStabilityOptIn | ||
semConvStabilityOptIn `shouldBe` Old | ||
it "is 'Stable' when env var is 'http'" $ do | ||
setEnv envVarName "http" | ||
semConvStabilityOptIn <- getSemConvStabilityOptIn | ||
semConvStabilityOptIn `shouldBe` Stable | ||
it "is 'Both' when env var is 'http/dup'" $ do | ||
setEnv envVarName "http/dup" | ||
semConvStabilityOptIn <- getSemConvStabilityOptIn | ||
semConvStabilityOptIn `shouldBe` Both | ||
it "is 'Both' when env var is 'http/dup,http'" $ do | ||
setEnv envVarName "http/dup,http" | ||
semConvStabilityOptIn <- getSemConvStabilityOptIn | ||
semConvStabilityOptIn `shouldBe` Both | ||
it "is 'Both' when env var is 'http,http/dup'" $ do | ||
setEnv envVarName "http,http/dup" | ||
semConvStabilityOptIn <- getSemConvStabilityOptIn | ||
semConvStabilityOptIn `shouldBe` Both | ||
it "is 'Both' when env var is 'http,something-random,http/dup'" $ do | ||
setEnv envVarName "http,http/dup" | ||
semConvStabilityOptIn <- getSemConvStabilityOptIn | ||
semConvStabilityOptIn `shouldBe` Both |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your way is preferable, just throwing out an option is to do these tests as a loop through [("testname", "envvarname", valueItShouldBe), ...]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I was on the edge but I think it increases readability to do it that way because there is less repeated information to look through.
instrumentation/yesod/src/OpenTelemetry/Instrumentation/Yesod.hs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The direction here is really great! Left some feedback on how I'd evolve the interface a bit to future-proof it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine. I'd probably prefer expanding the module name to OpenTelemetry.SemanticConventionsStabilityOptIn
for clarity.
@@ -0,0 +1,26 @@ | |||
{-# LANGUAGE OverloadedStrings #-} | |||
|
|||
module OpenTelemetry.SemConvStabilityOptIn ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to haddock all the things in this module.
|
||
module OpenTelemetry.SemConvStabilityOptIn ( | ||
getSemConvStabilityOptIn, | ||
SemConvStabilityOptIn (Stable, Both, Old), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do SemConvStabilityOptIn(..)
here to export all of the options.
import System.Environment (lookupEnv) | ||
|
||
|
||
data SemConvStabilityOptIn = Stable | Both | Old deriving (Show, Eq) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the spec for this environment variable says that it could expand to encompass other things besides the current set of values, I think we need to tweak this a bit so that it's a bit more forwards compatible. If something like sql/dup
shows up in the future, we'd need some way to recognize it and let users consume it.
data SemanticConventionStabilityOptIn
= HttpStableSemantics
| HttpOldAndStableSemantics
-- note that I'm deleting the current `Old` value here, see the next comment for what I'm thinking.
| UnknownSemanticConvention Text
deriving (Show, Eq, Generic)
instance Hashable SemanticConventionStabilityOptIn
data SemConvStabilityOptIn = Stable | Both | Old deriving (Show, Eq) | ||
|
||
|
||
parseSemConvStabilityOptIn :: Maybe String -> SemConvStabilityOptIn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we need to account for other semantic convention options in the future, I think it'd be nice to produce an opt-in collection data type where we aim to expose functionality via functions that we export.
newtype SemanticConventionStabilityOptIns = SemanticConventionStabilityOptIns -- hide the constructor export
{ optIns :: HashSet SemanticConventionStabilityOptIn
}
semanticConventionOptionIsSet :: SemanticConventionStabilityOptIn -> SemanticConventionStabilityOptIns -> Bool
semanticConventionOptionIsSet opt (SemanticConventionStabilityOptIn opts) = HashSet.member opt opts
useNewHttpSemanticConventions :: SemanticConventionStabilityOptIns -> Bool
useNewHttpSemanticConventions opts =
semanticConventionOptionIsSet HttpStableSemantics opts ||
semanticConventionOptionIsSetHashSet.member HttpOldAndStableSemantics optIns
useOldHttpSemanticConventions :: SemanticConventionStabilityOptIns -> Bool
useOldHttpSemanticConventions SemanticConventionStabilityOptIns{..} =
semanticConventionOptionIsSet HttpOldAndStableSemantics optIns ||
not (semanticConventionOptionIsSet HttpStableSemantics optIns)
parseSemanticConventionStabilityOptIns :: Maybe String -> SemanticConventionStabilityOptIns
parseSemanticConventionStabilityOptIns Nothing = SemanticConventionStabilityOptIns HashSet.empty
parseSemanticConventionStabilityOptIns (Just env) = SemanticConventionStabilityOptIns $ HashSet.fromList $ fmap parseItem envs
where
envs = fmap T.strip . T.splitOn "," . T.pack $ env
parseItem "http/dup" = HttpOldAndStableSemantics
parseItem "http" = HttpStableSemantics
parseItem other = UnknownSemanticConvention
$ requestHeadersToRecord conf | ||
|
||
semConvStabilityOptIn <- liftIO getSemConvStabilityOptIn | ||
case semConvStabilityOptIn of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use the set-membership approach I suggested, then here you'd do something like:
when (useStableHttpSemanticConventions semConvStabilityOptIn) addStableAttributes
when (useOldHttpSemanticConventions semConvStabilityOptIn) addOldAttributes
(\h -> (\v -> ("http.request.header." <> T.decodeUtf8 (foldedCase h), toAttribute (T.decodeUtf8 v))) <$> lookup h (requestHeaders req)) | ||
$ requestHeadersToRecord conf | ||
|
||
semConvStabilityOptIn <- liftIO getSemConvStabilityOptIn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate trying to avoid breaking the interface. I don't think it's great to have to read the environment variable and parse the functionality on each web request though, so I think we need to memoize the result if we're not going to pass in the opt-ins directly.
I'd recommend adding something like this to the Semantic Convention module:
import Control.Exception
import Data.IORef
import System.IO.Unsafe
-- | Create a new memoized IO action using an 'IORef' under the surface. Note that
-- the action may be run in multiple threads simultaneously, so this may not be
-- thread safe (depending on the underlying action). For the sake of reading an environment
-- variable and parsing some stuff, we don't have to be concerned about thread-safety.
memoize :: IO a -> IO (IO a)
memoize action = do
ref <- newIORef Nothing
pure $ do
mres <- readIORef ref
res <- case mres of
Just res -> pure res
Nothing -> do
res <- tryAny action
writeIORef ref $ Just res
pure res
either throwIO pure res
-- This uses the global IORef trick:
-- https://www.parsonsmatt.org/2021/04/21/global_ioref_in_template_haskell.html
memoizedGetSemanticConventionOptIns :: IO SemanticConventionStabilityOptIns
memoizedGetSemanticConventionOptIns = unsafePerformIO $ memoize getSemConvStabilityOptIn
{-# NOINLINE memoizedGetSemanticConventionOptIns #-}
Then in places where you're calling getSemConvStabilityOptIn
currently, use memoizedGetSemanticConventionOptIns
instead.
let attrs' = | ||
case semConvStabilityOptIn of | ||
Stable -> addStableAttributes attrs | ||
Both -> addStableAttributes . addOldAttributes $ attrs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a (very) minor preference towards using $
unless I'm doing pointfree style. It makes it a little bit easier to move things around in my experience.
That said, you could just write addStableAttributes $ addOldAttributes attrs
and do away with the dot entirely.
…instead of booleans
Co-authored-by: Ian Duncan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished re-review
| StableAndOld | ||
| Old | ||
deriving (Show, Eq) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used this instead of set membership because I did not like that if the pattern was to use when (useStableHttpSemantics options) useStableAttributes
it would be easy to forget to check both useStable
and useOld
because the compiler does not enforce it. This way, the compiler will say "missing cases in pattern match".
parseHttpOption envs | ||
| "http/dup" `elem` envs = StableAndOld | ||
| "http" `elem` envs = Stable | ||
| otherwise = httpOption defaultOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using httpOption defaultOptions
means that the value if the environment variable is unset will always be the same as if it is set but missing configuration for http.
parseSemanticsOptions (Just env) = SemanticsOptions {..} | ||
where | ||
envs = fmap T.strip $ T.splitOn "," $ T.pack env | ||
httpOption = parseHttpOption envs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler will give a warning if one of the fields is unset.
unless it is necessary to retrieve the value of OTEL_SEMCONV_STABILITY_OPT_IN every time getSemanticsOptions' is called. | ||
-} | ||
getSemanticsOptions' :: IO SemanticsOptions | ||
getSemanticsOptions' = parseSemanticsOptions <$> lookupEnv "OTEL_SEMCONV_STABILITY_OPT_IN" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used function'
to indicate a different version of function
. I'm not sure if that is the best way to indicate the difference. I want function
to be the default and function'
to only be used in necessary circumstances.
describe "HttpOption" $ do | ||
it "defaults to 'Old' when env var has no value" $ do | ||
unsetEnv envVarName | ||
semanticsOptions <- getSemanticsOptions' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uses getSemanticsOptions'
so that the tests are not affected by memoization. Memoization is tested later.
semanticsOptions <- liftIO getSemanticsOptions | ||
case httpOption semanticsOptions of | ||
Stable -> addStableAttributes | ||
StableAndOld -> addStableAttributes >> addOldAttributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StableAndOld
seemed like a more specific name than Both
.
Context
The OTEL spec declared the new HTTP semantic conventions as stable, so this PR is updating several instrumentation libraries to follow the new convenions. The updated libraries are http-client, wai, postgresql-simple, persistent-mysql, and yesod. Programmers can opt into the new conventions using the environment variable
OTEL_SEMCONV_STABILITY_OPT_IN
. The stable conventions will replace the old conventions in the next major release of hs-opentelemetry.Related: Issue #103
Testing