
# simple-web-crawler

A simple crawler that takes a URL as input, crawls every web page linked from that page, then crawls the pages discovered there, and so on. It is designed in a multithreaded way, allowing several crawlers to run concurrently.

- The user enters the starting URL for the crawling process and the maximum number of pages that he/she wants the crawler to crawl.

- The crawler crawls the entered URL, retrieves all the URLs that the starting page contains, and then crawls the retrieved URLs in a recursive manner, as sketched below.
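The following is a minimal sketch of this design, assuming only the Python standard library; the identifiers (`crawl`, `LinkExtractor`, the worker count, etc.) are illustrative, not the project's actual code:

```python
import threading
import queue
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def crawl(frontier, visited, lock, max_pages):
    """Worker loop: fetch a page, enqueue its links, repeat."""
    while True:
        try:
            url = frontier.get(timeout=2)  # exit once the frontier stays empty
        except queue.Empty:
            return
        with lock:
            if url in visited or len(visited) >= max_pages:
                frontier.task_done()
                continue
            visited.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
            parser = LinkExtractor(url)
            parser.feed(html)
            for link in parser.links:
                if urlparse(link).scheme in ("http", "https"):
                    frontier.put(link)  # crawled pages feed the frontier in turn
        except Exception as exc:
            print(f"failed: {url} ({exc})")
        finally:
            frontier.task_done()


if __name__ == "__main__":
    start_url = input("Starting URL: ")
    max_pages = int(input("Max pages to crawl: "))
    frontier = queue.Queue()
    frontier.put(start_url)
    visited, lock = set(), threading.Lock()
    # Several crawler threads share the frontier queue and the visited set.
    workers = [threading.Thread(target=crawl,
                                args=(frontier, visited, lock, max_pages))
               for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(f"Crawled {len(visited)} pages.")
```

The shared `visited` set, guarded by a lock, keeps the threads from crawling the same page twice and enforces the page limit that the user entered.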