-
Notifications
You must be signed in to change notification settings - Fork 752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scrubber refactoring #3108
Scrubber refactoring #3108
Conversation
exchange/utils.go
Outdated
@@ -177,48 +172,79 @@ func (rs *requestSplitter) cleanOpenRTBRequests(ctx context.Context, | |||
} | |||
} | |||
|
|||
ipConf := privacy.IPConf{IPV6: auctionReq.Account.Privacy.IPv6Config, IPV4: auctionReq.Account.Privacy.IPv4Config} | |||
|
|||
reqWrapper := cloneBidderReq(bidderRequest.BidRequest) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a good compromise. As a follow-up, we'll want to pass the request wrapper all the way down, but that's not possible right now since we lose it in getAuctionBidderRequests
and that's too much work for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Better to refactor it separately.
exchange/utils.go
Outdated
//UFPD | ||
privacy.ScrubDeviceIDs(reqWrapper) | ||
privacy.ScrubUserIDs(reqWrapper) | ||
privacy.ScrubUserExt(reqWrapper, "data") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general approach, I recommend calling just one scrubber method and manage the groupings in the scrubber code. This could be ScrubUserFPD
, which per the spec includes:
user.eids, user.ext.data.*, user.data.*, user.{id, buyeruid, yob, gender} and device-specific IDs
Looks like we're missing at least eids.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Refactored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add more unit tests for these cases
exchange/utils.go
Outdated
if !auctionPermissions.PassID { | ||
privacy.ScrubDeviceIDs(reqWrapper) | ||
privacy.ScrubUserDemographics(reqWrapper) | ||
privacy.ScrubUserExt(reqWrapper, "eids") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider calling something like privacy.ScrubGdprID
and move the grouping into scrubber code. Might we be good with calling ScrubUserFPD
here instead? What would be the difference in information removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored to the separate functions. It looks cleaner to me when we have separate function per activity in this case, rather than combining then into one, despite they are very similar to each other. I'm open to refactor it to one function.
var newReq *openrtb2.BidRequest | ||
newReq = ptrutil.Clone(req) | ||
|
||
if req.User != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use ortb.CloneUser
instead.
} | ||
} | ||
|
||
if req.Device != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add ortb.CloneDevice
and call it from here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no CloneDevice
and CloneSource
functions. Is there a reason not to have them? Should I add them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clone functions added
} | ||
} | ||
|
||
if req.Source != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add ortb.CloneSource
and call it from here.
var newReq *openrtb2.BidRequest | ||
newReq = ptrutil.Clone(req) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these lines be rewritten as newReq := ptrutil.Clone(req)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be, yes, but IDE will miss the variable type (because it's generic after Clone) and underlines follow up code to red. We discussed it in one of the previous PRs. It's ok to leave the variable instantiation, it doesn't take memory and CPU and makes IDE happy.
exchange/utils.go
Outdated
var userCopy *openrtb2.User | ||
userCopy = ptrutil.Clone(req.User) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these lines be rewritten as userCopy := ptrutil.Clone(req.User)
. I see a similar pattern throughout this function as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewritten to userCopy := ortb.CloneUser(req.User)
I didn't know this function exists.
privacy/scrubber.go
Outdated
func ScrubDeviceIP(reqWrapper *openrtb_ext.RequestWrapper, ipConf IPConf) { | ||
if reqWrapper.Device != nil { | ||
reqWrapper.Device.IP = scrubIP(reqWrapper.Device.IP, ipConf.IPV4.AnonKeepBits, iputil.IPv4BitSize) | ||
reqWrapper.Device.IPv6 = scrubIP(reqWrapper.Device.IPv6, ipConf.IPV6.AnonKeepBits, iputil.IPv6BitSize) | ||
} | ||
} | ||
|
||
userCopy := *user | ||
|
||
if strategy == ScrubStrategyUserIDAndDemographic { | ||
userCopy.BuyerUID = "" | ||
userCopy.ID = "" | ||
userCopy.Ext = scrubExtIDs(userCopy.Ext, "eids") | ||
userCopy.Yob = 0 | ||
userCopy.Gender = "" | ||
} | ||
func ScrubDeviceIDsIPsUserDemoExt(reqWrapper *openrtb_ext.RequestWrapper, ipConf IPConf, fieldName string, scrubFullGeo bool) { | ||
ScrubDeviceIDs(reqWrapper) | ||
ScrubDeviceIP(reqWrapper, ipConf) | ||
ScrubUserDemographics(reqWrapper) | ||
ScrubUserExt(reqWrapper, fieldName) | ||
|
||
switch geo { | ||
case ScrubStrategyGeoFull: | ||
userCopy.Geo = scrubGeoFull(user.Geo) | ||
case ScrubStrategyGeoReducedPrecision: | ||
userCopy.Geo = scrubGeoPrecision(user.Geo) | ||
if scrubFullGeo { | ||
ScrubGeoFull(reqWrapper) | ||
} else { | ||
ScrubGEO(reqWrapper) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed a lack of test coverage here. Is this out of the scope of this PR to update this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More unit tests are added. Consolidated scrubber functions like ScrubUserFPD
are covered by integration tests in utils_test.go
-> TestCleanOpenRTBRequestsActivities
Prebid Server 2.0 has been released and Go Module name has changed from Please merge the |
# Conflicts: # exchange/utils.go # exchange/utils_test.go # privacy/enforcement.go # privacy/scrubber.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, left test coverage and spelling comments
privacy/scrubber.go
Outdated
func (scrubber) ScrubUser(user *openrtb2.User, strategy ScrubStrategyUser, geo ScrubStrategyGeo) *openrtb2.User { | ||
if user == nil { | ||
return nil | ||
func ScrubDeviceIP(reqWrapper *openrtb_ext.RequestWrapper, ipConf IPConf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add tests to cover this function?
|
||
userCopy := *user | ||
func ScrubDeviceIDsIPsUserDemoExt(reqWrapper *openrtb_ext.RequestWrapper, ipConf IPConf, fieldName string, scrubFullGeo bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add tests to cover this function?
case ScrubStrategyGeoReducedPrecision: | ||
userCopy.Geo = scrubGeoPrecision(user.Geo) | ||
} | ||
func ScrubUserFPD(reqWrapper *openrtb_ext.RequestWrapper) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add tests to cover this function?
privacy/scrubber.go
Outdated
func ScrubGdprID(reqWrapper *openrtb_ext.RequestWrapper) { | ||
ScrubDeviceIDs(reqWrapper) | ||
ScrubUserDemographics(reqWrapper) | ||
ScrubUserExt(reqWrapper, "eids") | ||
} | ||
|
||
func ScrubGeoAndDeviceIP(reqWrapper *openrtb_ext.RequestWrapper, ipConf IPConf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add tests to cover these functions?
ortb/clone.go
Outdated
return &c | ||
} | ||
|
||
func CloseSupplyChainNodes(s []openrtb2.SupplyChainNode) []openrtb2.SupplyChainNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a type in this function name?
CloseSupplyChainNodes
--> CloneSupplyChainNodes
These functions are covered from utils_test.go. |
I generally agree with this approach. Even though these functions are public and I prefer to have unit tests for functions of this kind, most of the complex logic lives in private functions which are thoroughly tested in their own unit tests. There are also adequate tests in |
privacy/scrubber.go
Outdated
delete(userExtParsed, "data") | ||
userExtModified = true | ||
} | ||
func ScrubUserExt(reqWrapper *openrtb_ext.RequestWrapper, fieldName string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a private function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed.
privacy/scrubber.go
Outdated
if userExtModified { | ||
userExt, _ := jsonutil.Marshal(userExtParsed) | ||
userCopy.Ext = userExt | ||
func ScrubEids(reqWrapper *openrtb_ext.RequestWrapper) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: ScrubEIDs
Edit: This function isn't used. Do we need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a function to support transmitEids
activity, it will be added in the next PR.
Renamed
privacy/scrubber.go
Outdated
if reqWrapper.Device != nil { | ||
if reqWrapper.Device.Geo != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: flatten by combining these two if
statements with &&
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, refactored
privacy/scrubber.go
Outdated
type scrubber struct { | ||
ipV6 config.IPv6 | ||
ipV4 config.IPv4 | ||
func ScrubDeviceIDs(reqWrapper *openrtb_ext.RequestWrapper) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about making some of these functions private that are meant to only be used within this package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. All functions that are used in this package only are now private
ortb/clone.go
Outdated
mobileCopy := &s.Mobile | ||
c.Mobile = *mobileCopy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: I think it is more idiomatic to write this as such:
mobileCopy := *s.Mobile
c.Mobile = &mobileCopy
They should be functionally equivalent but typically when I see a variable named Copy
I think it is the value that has been copied and not the address. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, modified, good point
ortb/clone_test.go
Outdated
TID: "Tid", | ||
PChain: "PChain", | ||
SChain: &openrtb2.SupplyChain{ | ||
Complete: 1, Nodes: []openrtb2.SupplyChainNode{{ASI: "asi", Ext: json.RawMessage(`{"anyField":1}`)}}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you want the populated ext to be openrtb2.SupplyChain.Ext
instead of openrtb2.SupplyChainNode.Ext
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, refactored
ortb/clone_test.go
Outdated
|
||
t.Run("populated", func(t *testing.T) { | ||
given := &openrtb2.SupplyChain{ | ||
Complete: 1, Nodes: []openrtb2.SupplyChainNode{{ASI: "asi", Ext: json.RawMessage(`{"anyField":1}`)}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you want the populated ext to be openrtb2.SupplyChain.Ext
instead of openrtb2.SupplyChainNode.Ext
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like this change was made.
ortb/clone_test.go
Outdated
}) | ||
} | ||
|
||
func TestCloseSupplyChainNodes(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: TestCloneSupplyChainNodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
assert.NotSame(t, given[1], result[1], "item1-pointer") | ||
assert.NotSame(t, given[1].Ext, result[1].Ext, "item1-pointer-ext") | ||
assert.NotSame(t, given[1].HP, result[1].HP, "item1-pointer-hp") | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're missing a couple of asserts here:
assert.NotSame(t, given[2].Ext, result[2].Ext, "item2-pointer-ext")
assert.NotSame(t, given[2].HP, result[2].HP, "item2-pointer-hp")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, added
return &dntCopy | ||
} | ||
|
||
func CloneUserAgent(s *openrtb2.UserAgent) *openrtb2.UserAgent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a unit test for this function. It appears to be an oversight as I see all other functions have their own dedicated unit test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, good catch! Sorry I don't know why I didn't add this test.
privacy/scrubber_test.go
Outdated
expectedDevice: &openrtb2.Device{Geo: nil}, | ||
}, | ||
{ | ||
name: "with_user_geo", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: "with_device_geo"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, renamed
assert.NotSame(t, given[0], result[0], "item0-pointer") | ||
assert.NotSame(t, given[0].Ext, result[0].Ext, "item0-pointer-ext") | ||
assert.NotSame(t, given[1], result[1], "item1-pointer") | ||
assert.NotSame(t, given[1].Ext, result[1].Ext, "item1-pointer-ext") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you missed making this change here.
ortb/clone_test.go
Outdated
|
||
t.Run("populated", func(t *testing.T) { | ||
given := &openrtb2.SupplyChain{ | ||
Complete: 1, Nodes: []openrtb2.SupplyChainNode{{ASI: "asi", Ext: json.RawMessage(`{"anyField":1}`)}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like this change was made.
analytics/build/build.go
Outdated
if !ac.Allow(privacy.ActivityTransmitUserFPD, component, privacy.ActivityRequest{}) { | ||
// scrub UFPD from ao.RequestWrapper | ||
} | ||
if !ac.Allow(privacy.ActivityTransmitPreciseGeo, component, privacy.ActivityRequest{}) { | ||
// scrub geo from ao.RequestWrapper | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to add this in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a part of a future PR remained in my local changes, yes, removed.
ortb/clone_test.go
Outdated
assert.NotSame(t, given.Architecture, result.Architecture, "architecture") | ||
assert.NotSame(t, given.Bitness, result.Bitness, "bitness") | ||
assert.NotSame(t, given.Model, result.Model, "model") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: Can we delete these three assert statements since these fields are just string values? I think we only need the NotSame
asserts for pointers right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. Yes non-pointers should be copied correctly as a part of shallow copy
assert.NotSame(t, given, result, "pointer") | ||
assert.NotSame(t, given.SChain, result.SChain, "schain") | ||
assert.NotSame(t, given.SChain.Ext, result.SChain.Ext, "schain.ext") | ||
assert.NotSame(t, given.Ext, result.Ext, "ext") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add assert.NotSame(t, given.SChain.Nodes[0].Ext, result.SChain.Nodes[0].Ext, "schain.nodes.ext")
here?
No description provided.