Ignoring domains? #92

Open
0x646e78 opened this issue May 7, 2020 · 8 comments
Labels
enhancement New feature or request

Comments

@0x646e78

0x646e78 commented May 7, 2020

I'm wondering if there's a mechanism to blacklist the adding of domains and hosts to the db?

For example, I don't want google.com or any of its subdomains or hosts added, but these are often picked up in various domain recon activities.

If not, happy to help get this in if it seems useful to others.

@0x646e78
Author

0x646e78 commented May 17, 2020

I'm wondering if something like a 'scope' db could be introduced, which could set scoping bounds on the recon tasks. Something like the following:

  value          db          type
  %gmail.com     contacts    blacklist
  example.com    domains     whitelist

But then there's also the question of whether we add a table column, wrap these up into an abstraction, or do away with the db column and just have the table column. That last option might actually be best. I'll think it through a bit more and try some things out there.

This could be hooked into the insert function in core/framework.py here

I think it'd be useful to do a LIKE, thus allowing say %mx%google.com in the value field for domains, but this could also get confusing for people and perhaps introduce a bug or two along the way.
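To make the LIKE idea concrete, here is a rough sketch against an in-memory SQLite database. The `scope` table layout (`value`, `tbl`, `action`) and the `is_blacklisted` helper are made up for illustration and are not part of recon-ng.

```python
import sqlite3

# Hypothetical scope table; the column names are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scope (value TEXT, tbl TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO scope VALUES (?, ?, ?)",
    [
        ("%mx%google.com", "domains", "blacklist"),
        ("%gmail.com", "contacts", "blacklist"),
    ],
)


def is_blacklisted(conn, table, candidate):
    """Return True if any blacklist pattern for `table` LIKE-matches `candidate`."""
    # The candidate is the left operand; the stored value is the LIKE pattern.
    row = conn.execute(
        "SELECT 1 FROM scope "
        "WHERE tbl = ? AND action = 'blacklist' AND ? LIKE value LIMIT 1",
        (table, candidate),
    ).fetchone()
    return row is not None


print(is_blacklisted(conn, "domains", "aspmx.l.google.com"))
print(is_blacklisted(conn, "domains", "example.com"))
```

The confusing part is that `%` matches anything, including nothing, so `%mx%google.com` also catches a name like `mxevilgoogle.com`, and `%gmail.com` catches `bob@notgmail.com`.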

Keen on any thoughts about this, @lanmaster53. I'll work on it if it sounds worthwhile.

@0x646e78
Author

I've made a first attempt on a branch here: https://github.com/0x646e78/recon-ng/tree/scoping_table

So far it matches on regex, looking something like this:

[recon-ng][sc][certificate_transparency] > show scope

  +-----------------------------------------------------------------------+
  | rowid |       value       | column |   action  | notes |    module    |
  +-----------------------------------------------------------------------+
  | 1     | .*mx.*google\.com | host   | blacklist |       | user_defined |
  | 2     | .*googlemail\.com | host   | blacklist |       | user_defined |
  +-----------------------------------------------------------------------+

[*] 2 rows returned

I'll open a WIP PR once I'm a bit further along. Any suggestions would be great.
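As a rough illustration of the idea (not the code on the branch), a scope check in front of an insert could look like the sketch below. The table layouts, the `col` column name, and the `in_scope` and `insert_host` helpers are assumptions made up for this example, as is the choice of `re.fullmatch` over `re.match` or `re.search`.

```python
import re
import sqlite3

# Hypothetical tables for this sketch; not recon-ng's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (host TEXT)")
conn.execute("CREATE TABLE scope (value TEXT, col TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO scope VALUES (?, ?, ?)",
    [
        (r".*mx.*google\.com", "host", "blacklist"),
        (r".*googlemail\.com", "host", "blacklist"),
    ],
)


def in_scope(conn, column, candidate):
    """Return False if any blacklist regex for `column` matches the whole candidate."""
    rules = conn.execute(
        "SELECT value FROM scope WHERE col = ? AND action = 'blacklist'",
        (column,),
    ).fetchall()
    return not any(re.fullmatch(pattern, candidate) for (pattern,) in rules)


def insert_host(conn, host):
    """Insert the host only if it passes the scope check; report whether it was stored."""
    if not in_scope(conn, "host", host):
        return False
    conn.execute("INSERT INTO hosts (host) VALUES (?)", (host,))
    return True


insert_host(conn, "alt1.aspmx.l.google.com")  # dropped by the first rule
insert_host(conn, "www.example.com")          # stored
```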

@kpcyrd

kpcyrd commented May 23, 2020

For some inspiration on how this is handled in other projects, feel free to have a look at the autonoscope feature in sn0int: https://sn0int.readthedocs.io/en/stable/autonoscope.html

We have a hierarchical system that allows blacklist/whitelist rules for domains, ips and urls. We're basically doing this with "tree"-style matching. This allows setting up layered rules like:

  • default is accept
  • ignore everything . [blacklist]
  • except if it's .com [whitelist]
  • except if it's example.com [blacklist]
  • except if it's a.b.c.d.example.com [whitelist]

The most specific matching rule wins. To avoid having to exclude all kinds of special characters we don't support wildcards though. It also avoids the problem that .*googlemail\.com would match notgooglemail.com. I think there are advantages/disadvantages in both of them, just wanted to share some other approaches.
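The layered rules above can be sketched in a few lines of Python. This is only an illustration of "most specific rule wins" on label boundaries, not sn0int's actual implementation; the `RULES` dict, `label_suffixes` and `is_accepted` are hypothetical names, and the `.com` rule is keyed as `com` here.

```python
def label_suffixes(domain):
    """Yield suffixes from most to least specific, ending with '.' for the root."""
    labels = domain.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        yield ".".join(labels[i:])
    yield "."


def is_accepted(rules, domain, default=True):
    """Return the verdict of the most specific rule, or `default` if none applies."""
    for suffix in label_suffixes(domain):
        if suffix in rules:
            return rules[suffix]
    return default


# True = whitelist (accept), False = blacklist (ignore).
RULES = {
    ".": False,
    "com": True,
    "example.com": False,
    "a.b.c.d.example.com": True,
}

for name in ("example.org", "google.com", "www.example.com", "notexample.com"):
    print(name, is_accepted(RULES, name))
```

Because matching only happens on whole labels, `notexample.com` falls through to the `com` rule instead of being caught by the `example.com` rule.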

@0x646e78
Author

@kpcyrd that's a pretty nice approach, will certainly take inspiration from it. sn0int looks good too, Rust-based is cool, will take it for a spin. I realised last night after making this comment that I'd left the literal dot out of the regexes too, hence that googlemail match ;)
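For anyone following along, the googlemail case is easy to reproduce with Python's `re` module. The `STRICT` pattern is just one possible way to require a label boundary, and the `matches` helper is a made-up name for this sketch.

```python
import re

# Pattern from the scope table above: nothing forces a dot before "googlemail".
LOOSE = r".*googlemail\.com"
# One possible fix: any prefix must end in a literal dot, or be absent entirely.
STRICT = r"(?:.*\.)?googlemail\.com"


def matches(pattern, host):
    """Return True if the pattern matches the whole host name."""
    return re.fullmatch(pattern, host) is not None


for host in ("googlemail.com", "mail.googlemail.com", "notgooglemail.com"):
    print(host, matches(LOOSE, host), matches(STRICT, host))
```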

@lanmaster53
Owner

lanmaster53 commented Jun 8, 2020

We've been kicking around the idea of a validation system for all harvested data as well (#34). So, for instance, any time Recon-ng tries to write harvested data to the ip_address column of the hosts table, it will validate that it is actually an IP address. Modules return some unexpected stuff when resources change, etc., and can make a real mess of the database. The reason I mention this is that this system would tie in closely with that one. Something to think about. Regardless, I'd like to add both of these capabilities.
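A minimal sketch of what the ip_address check could look like, using the standard library's `ipaddress` module. The `is_valid_ip` helper is a hypothetical name for illustration and not existing Recon-ng code.

```python
import ipaddress


def is_valid_ip(value):
    """Return True only if `value` is a string holding a valid IPv4 or IPv6 address."""
    # ip_address() also accepts integers, so restrict the check to strings.
    if not isinstance(value, str):
        return False
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


for candidate in ("192.0.2.10", "2001:db8::1", "999.1.1.1", "Not found"):
    print(candidate, is_valid_ip(candidate))
```

A scope check and a validator like this would both sit in front of the same insert, which is presumably why the two systems tie together.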

lanmaster53 added the enhancement label Jun 8, 2020

@0x646e78
Author

0x646e78 commented Jun 9, 2020

Ahhh cool. I was thinking of that sort of thing too. Good to know. I've progressed down the regex path for domains, I appreciate the sn0int approach but I also really like the flexibility of regex matches, and the options afforded that way.

@lanmaster53
Owner

Feel free to hop in the slack and collaborate with us on a solution. There's at least one other person that I believe was actively working on a solution. I had worked on some code as well, but I'm just so busy at the moment. Perhaps I'll drop my stuff in a new branch and everyone can start working on that. Thoughts? Interested?

@0x646e78
Author

0x646e78 commented Oct 9, 2020

Well, the last few months have been tumultuous for me, but I have a bit of breathing space to look at this again now.

I've been running with the scoping functionality I built in May, and it's been really useful: https://github.com/0x646e78/recon-ng/blob/6b2659762567838889510b92c82e2256ccb9990d/recon/core/framework.py#L666

I'll bring up a discussion in Slack in the coming days to see if I can get something together that'll work for people.


3 participants