Add Cross Navigation Enrichment #821
Closed
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
{ | ||
"schema": "iglu:com.snowplowanalytics.snowplow.enrichments/cross_navigation_config/jsonschema/1-0-0", | ||
|
||
"data": { | ||
"enabled": false, | ||
|
||
"vendor": "com.snowplowanalytics.snowplow.enrichments", | ||
"name": "cross_navigation_config" | ||
} | ||
} |
230 changes: 230 additions & 0 deletions
...plowanalytics.snowplow.enrich/common/enrichments/registry/CrossNavigationEnrichment.scala
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,230 @@ | ||
/** | ||
* Copyright (c) 2023 Snowplow Analytics Ltd. All rights reserved. | ||
* | ||
* This program is licensed to you under the Apache License Version 2.0, | ||
* and you may not use this file except in compliance with the Apache License Version 2.0. | ||
* You may obtain a copy of the Apache License Version 2.0 at http://www.apache.org/licenses/LICENSE-2.0. | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the Apache License Version 2.0 is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the Apache License Version 2.0 for the specific language governing permissions and limitations there under. | ||
*/ | ||
package com.snowplowanalytics.snowplow.enrich.common.enrichments.registry | ||
|
||
import java.time.format.DateTimeFormatter | ||
|
||
import cats.data.ValidatedNel | ||
import cats.syntax.either._ | ||
import cats.syntax.option._ | ||
import cats.syntax.traverse._ | ||
|
||
import io.circe.Json | ||
import io.circe.syntax._ | ||
|
||
import com.snowplowanalytics.iglu.core.{SchemaCriterion, SchemaKey, SchemaVer, SelfDescribingData} | ||
import com.snowplowanalytics.snowplow.badrows.FailureDetails | ||
import com.snowplowanalytics.snowplow.enrich.common.enrichments.{EventEnrichments => EE} | ||
import com.snowplowanalytics.snowplow.enrich.common.enrichments.registry.EnrichmentConf.CrossNavigationConf | ||
import com.snowplowanalytics.snowplow.enrich.common.utils.{ConversionUtils => CU} | ||
import com.snowplowanalytics.snowplow.enrich.common.QueryStringParameters | ||
|
||
/** | ||
* Companion object to create an instance of CrossNavigationEnrichment | ||
* from the configuration. | ||
*/ | ||
object CrossNavigationEnrichment extends ParseableEnrichment { | ||
|
||
type CrossNavTransformation = String => Either[FailureDetails.EnrichmentFailure, Option[String]] | ||
|
||
val supportedSchema = SchemaCriterion( | ||
"com.snowplowanalytics.snowplow.enrichments", | ||
"cross_navigation_config", | ||
"jsonschema", | ||
1, | ||
0 | ||
) | ||
|
||
val outputSchema = SchemaKey( | ||
"com.snowplowanalytics.snowplow", | ||
"cross_navigation", | ||
"jsonschema", | ||
SchemaVer.Full(1, 0, 0) | ||
) | ||
|
||
/** | ||
* Creates a CrossNavigationConf instance from a Json. | ||
* @param config The cross_navigation_config enrichment JSON | ||
* @param schemaKey provided for the enrichment, must be supported by this enrichment | ||
* @param localMode not used by this enrichment | ||
* @return the CrossNavigation configuration, or the reasons the config could not be parsed | ||
*/ | ||
override def parse( | ||
config: Json, | ||
schemaKey: SchemaKey, | ||
localMode: Boolean = false | ||
): ValidatedNel[String, CrossNavigationConf] = | ||
isParseable(config, schemaKey) | ||
.map(_ => CrossNavigationConf(schemaKey)) | ||
.toValidatedNel | ||
|
||
/** | ||
* Extracts the cross navigation fields from the "_sp" querystring parameter. | ||
* The value starts with "{{DUID}}.{{TSTAMP}}" and may carry further dot-separated fields. | ||
* | ||
* @param qsMap The querystring parameters | ||
* @return either the parsed CrossDomainMap (empty if "_sp" is missing or empty) or an enrichment failure | ||
*/ | ||
def parseCrossDomain(qsMap: QueryStringParameters): Either[FailureDetails.EnrichmentFailure, CrossDomainMap] = | ||
qsMap.toMap | ||
.map { case (k, v) => (k, v.getOrElse("")) } | ||
.get("_sp") match { | ||
case Some("") => CrossDomainMap.empty.asRight | ||
case Some(sp) => CrossDomainMap.makeCrossDomainMap(sp) | ||
case None => CrossDomainMap.empty.asRight | ||
} | ||
|
||
case class CrossDomainMap(domainMap: Map[String, Option[String]]) { | ||
|
||
/** | ||
* Gets the cross navigation parameters as self-describing JSON. | ||
* Nothing is emitted unless both the domain user ID and the timestamp are present. | ||
* | ||
* @return the cross navigation context wrapped in a List, or Nil | ||
*/ | ||
def getCrossNavigationContext: List[SelfDescribingData[Json]] = | ||
(duid, tstamp) match { | ||
case (Some(_), Some(_)) => | ||
List( | ||
SelfDescribingData( | ||
CrossNavigationEnrichment.outputSchema, | ||
finalizeCrossNavigationMap.asJson | ||
) | ||
) | ||
case _ => Nil | ||
} | ||
|
||
def duid: Option[String] = domainMap.get(CrossDomainMap.domainUserIdFieldName).flatten | ||
|
||
def tstamp: Option[String] = domainMap.get(CrossDomainMap.timestampFieldName).flatten | ||
|
||
/** | ||
* Finalizes the cross navigation map by reformatting the value under its timestamp key | ||
* | ||
* @return The finalized Map | ||
*/ | ||
private def finalizeCrossNavigationMap: Map[String, Option[String]] = | ||
domainMap | ||
.map { | ||
case ("timestamp", t) => ("timestamp" -> CrossDomainMap.reformatTstamp(t)) | ||
case kvPair => kvPair | ||
} | ||
} | ||
|
||
object CrossDomainMap { | ||
val domainUserIdFieldName = "domain_user_id" | ||
val timestampFieldName = "timestamp" | ||
val CrossNavProps: List[(String, CrossNavTransformation)] = | ||
List( | ||
(domainUserIdFieldName, CU.fixTabsNewlines(_).asRight), | ||
(timestampFieldName, extractTstamp), | ||
("session_id", Option(_: String).filter(_.trim.nonEmpty).asRight), | ||
("user_id", decodeWithFailure), | ||
("source_id", decodeWithFailure), | ||
("source_platform", Option(_: String).filter(_.trim.nonEmpty).asRight), | ||
("reason", decodeWithFailure) | ||
) | ||
|
||
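/** | ||
* Parses the value of the "_sp" querystring parameter into a CrossDomainMap | ||
* @param sp The dot-separated value of the "_sp" parameter | ||
* @return either the parsed map (empty when the value has more fields than expected) or enrichment failure | ||
*/ | ||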
def makeCrossDomainMap(sp: String): Either[FailureDetails.EnrichmentFailure, CrossDomainMap] = { | ||
val values = sp | ||
.split("\\.", -1) | ||
.padTo( | ||
CrossNavProps.size, | ||
"" | ||
) | ||
.toList | ||
val result = | ||
if (values.size == CrossNavProps.size) | ||
values | ||
.zip(CrossNavProps) | ||
.map { | ||
case (value, (propName, f)) => f(value).map(propName -> _) | ||
} | ||
.sequence | ||
.map(_.toMap) | ||
else Map.empty[String, Option[String]].asRight | ||
result.map(CrossDomainMap(_)) | ||
} | ||
|
||
def empty: CrossDomainMap = CrossDomainMap(Map.empty) | ||
|
||
/** | ||
* Wrapper around CU.decodeBase64Url. | ||
* If passed an empty string returns Right(None). | ||
* | ||
* @param str The string to decode | ||
* @return either the decoded string or enrichment failure | ||
*/ | ||
private def decodeWithFailure(str: String): Either[FailureDetails.EnrichmentFailure, Option[String]] = | ||
CU.decodeBase64Url(str) match { | ||
case Right(r) => Option(r).filter(_.trim.nonEmpty).asRight | ||
case Left(msg) => | ||
FailureDetails | ||
.EnrichmentFailure( | ||
None, | ||
FailureDetails.EnrichmentFailureMessage.Simple(msg) | ||
) | ||
.asLeft | ||
} | ||
|
||
/** | ||
* Wrapper around EE.extractTimestamp | ||
* If passed an empty string returns Right(None). | ||
* | ||
* @param str The string to extract the timestamp from | ||
* @return either the extracted timestamp or enrichment failure | ||
*/ | ||
private def extractTstamp(str: String): Either[FailureDetails.EnrichmentFailure, Option[String]] = | ||
str match { | ||
case "" => None.asRight | ||
case s => EE.extractTimestamp("sp_dtm", s).map(_.some) | ||
} | ||
|
||
/** | ||
* Converts a timestamp to an ISO-8601 format | ||
* | ||
* @param tstamp The timestamp expected as output of EE.extractTimestamp | ||
* @return ISO-8601 timestamp | ||
*/ | ||
private def reformatTstamp(tstamp: Option[String]): Option[String] = { | ||
val pFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS") | ||
val formatter = DateTimeFormatter.ISO_DATE_TIME | ||
tstamp.map(t => formatter.format(pFormatter.parse(t)).replaceAll(" ", "T") + "Z") | ||
} | ||
} | ||
} | ||
|
||
/** | ||
* Enrichment adding cross navigation context | ||
*/ | ||
final case class CrossNavigationEnrichment(schemaKey: SchemaKey) extends Enrichment { | ||
private val enrichmentInfo = | ||
FailureDetails.EnrichmentInformation(schemaKey, "cross-navigation").some | ||
|
||
/** | ||
* Given an EnrichmentFailure, returns one with the cross-navigation | ||
* enrichment information added. | ||
* @param failure The input enrichment failure | ||
* @return the EnrichmentFailure with cross-navigation enrichment information | ||
*/ | ||
def addEnrichmentInfo(failure: FailureDetails.EnrichmentFailure): FailureDetails.EnrichmentFailure = | ||
failure.copy(enrichment = enrichmentInfo) | ||
|
||
} |
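For readers following the diff, here is a minimal, stdlib-only sketch of how a `_sp` value might be split into named fields. It mirrors the split, pad and zip steps of `makeCrossDomainMap` but is not the PR's code. `CrossNavSketch`, `splitSp`, `rawFields`, `decodeBase64Url` and `millisToIso` are hypothetical names; the Base64 helper is an assumed stand-in for `CU.decodeBase64Url`, and the timestamp helper assumes the timestamp field carries epoch milliseconds.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.time.Instant
import java.time.format.DateTimeFormatter
import java.util.Base64

object CrossNavSketch {
  // Field order mirrors CrossNavProps in the diff above.
  val fieldNames: List[String] = List(
    "domain_user_id", "timestamp", "session_id", "user_id",
    "source_id", "source_platform", "reason"
  )

  // Split on every dot (limit -1 keeps trailing empty fields),
  // then pad short values with empty strings up to the expected size.
  def splitSp(sp: String): List[String] =
    sp.split("\\.", -1).toList.padTo(fieldNames.size, "")

  // Pair each field name with its raw value. Blank values become None.
  // A value with more fields than expected yields an empty map,
  // which is what the size check in makeCrossDomainMap does.
  def rawFields(sp: String): Map[String, Option[String]] = {
    val values = splitSp(sp)
    if (values.size == fieldNames.size)
      fieldNames.zip(values).map { case (k, v) => k -> Option(v).filter(_.trim.nonEmpty) }.toMap
    else Map.empty
  }

  // Assumption: plain URL-safe Base64 carrying UTF-8 text.
  def decodeBase64Url(s: String): String =
    new String(Base64.getUrlDecoder.decode(s), UTF_8)

  // Assumption: the timestamp field is epoch milliseconds; output is ISO-8601 in UTC.
  def millisToIso(ms: String): String =
    DateTimeFormatter.ISO_INSTANT.format(Instant.ofEpochMilli(ms.toLong))
}
```

With this sketch, `CrossNavSketch.rawFields("abc.1704067200123")` gives `Some` values for `domain_user_id` and `timestamp` and `None` for the five remaining fields, while a value with eight dot-separated parts gives an empty map.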
Can we have a discussion about the choice of this default?
On one hand, enabling the enrichment by default would suddenly add a new entity for users who previously weren't receiving it. But this would only affect a small number of users and a small number of events (cross-domain tracking is not a commonly used feature). They would still get the old information, so as far as I can understand it is not a breaking change.
On the other hand, if we have it off by default, we are complicating things for two reasons:
I may have misunderstood, but I thought this file serves more as an example and does not denote the default.
Perfectly fine by me to enable this by default. I did have a concern about the new entity suddenly appearing for users, but as long as users can disable it, this is OK by me.
The part that is handled inside the class relates only to the differences when the enrichment is enabled.
Do you mean to remove the config completely and hardcode the behaviour in the manager/page?
Never mind the part about the flow; we can get better feedback from DP on this. But I do think that this configuration suggests the default: at least when I ran enrich locally, I had to change it to true in order for the new behaviour to work.
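To make the trade-off in this thread concrete, here is a minimal sketch of the behaviour being discussed. It assumes that the legacy referrer fields are filled whether or not the enrichment is enabled, and that only the extra entity depends on the flag. All names (`CrossNavToggleSketch`, `Parsed`, `Output`, `enrich`) are hypothetical and the types are simplified stand-ins, not the ones used in Enrich.

```scala
object CrossNavToggleSketch {
  // Hypothetical, simplified stand-in for the parsed "_sp" value.
  final case class Parsed(duid: Option[String], tstamp: Option[String])

  // Hypothetical, simplified stand-in for the enriched event.
  final case class Output(
    refrDomainUserId: Option[String],
    refrDeviceTstamp: Option[String],
    contexts: List[String]
  )

  // Assumption: the old fields are always populated, so users keep
  // the old information; the new entity appears only when enabled
  // and when both the domain user ID and the timestamp are present.
  def enrich(parsed: Parsed, enabled: Boolean): Output = {
    val entity = (parsed.duid, parsed.tstamp) match {
      case (Some(d), Some(t)) if enabled => List(s"cross_navigation($d, $t)")
      case _                             => Nil
    }
    Output(parsed.duid, parsed.tstamp, entity)
  }
}
```

Under these assumptions, switching `enabled` from `false` to `true` changes only `contexts`; the two referrer fields are identical in both cases, which is the sense in which enabling by default would not be a breaking change.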