fix(twilio): deduplicate matches to prevent O(N×M) result explosion#4954
fix(twilio): deduplicate matches to prevent O(N×M) result explosion#4954kashifkhan0771 wants to merge 1 commit into
Conversation
| req.SetBasicAuth(apiKey, secret) | ||
|
|
||
| resp, err := client.Do(req) | ||
| resp, err := detectors.DoWithDedup(client, detector_typepb.DetectorType_TwilioApiKey, apiKey+secret, req) |
There was a problem hiding this comment.
This detector falsely sets the detector type of the results as DetectorType_Twilio but you are passing DetectorType_TwilioApiKey as an argument here. Will that create any issues?
There was a problem hiding this comment.
Nope, and that’s actually a good thing. The secrets detected by both detectors can overlap, and we accidentally assigned the same detector type to both. Preserving the actual detector type for the Dedup client won’t change the result - it just ensures that identical secrets coming from both detectors are treated separately instead of overlapping.
MuneebUllahKhan222
left a comment
There was a problem hiding this comment.
LGTM, Just one non blocking comment.
| for apiKey := range uniqueAPIKeys { | ||
| for secret := range uniqueSecrets { | ||
| s1 := detectors.Result{ | ||
| DetectorType: detector_typepb.DetectorType_Twilio, |
There was a problem hiding this comment.
Since we're already making changes here, we can probably update the DetectorType as well.
There was a problem hiding this comment.
Actually, that’s the tricky part. If we update DetectorType now, all previously discovered secrets associated with this detector will become orphaned.
Description:
Both Twilio detectors (twilio, twilioapikey) were emitting O(N×M) duplicate results when scanning. A document with the same SID appearing N times and the same API key M times produced N×M findings with identical secret values.
This fix deduplicates regex matches before the nested loop so only unique (sid, key) pairs are processed. The detectors also adopt
NewClientWithDedup/DoWithDedupto combine concurrent verification requests for the same credential pair into a single network call. Two engine-level issues (dedup condition logic and LRU cache size) are noted in engine.go as known issues for a follow-up pass.Checklist:
make test-community)?make lintthis requires golangci-lint)?Note
Medium Risk
Changes Twilio/TwilioApiKey detectors’ matching/verification flow to reduce duplicate findings and network calls, which could alter result counts and verification behavior in edge cases. Engine changes are comments only but highlight an existing dedup/cache limitation that remains unresolved.
Overview
Fixes Twilio secret scanning to avoid an O(N×M) explosion of duplicate findings by deduplicating regex matches before generating
(sid,key)/(apiKey,secret)pairs.Twilio verifications now use
detectors.NewClientWithDedupplusdetectors.DoWithDedupto coalesce concurrent verification requests for the same credential pair, and simplifies merging verificationExtraDataviamaps.Copy.Adds known-issue notes in
engine.gocalling out the small LRU dedupe cache size and a dedupe condition that still allows duplicates for the same decoder type (no functional engine change in this PR).Reviewed by Cursor Bugbot for commit 186678b. Bugbot is set up for automated code reviews on this repo. Configure here.