Room Crawler System crawls room listings from two sites, “muabannhadat.vn” and “nhadat24h.net”, and can be extended to support more sites.
The system helps people build a large collection of data about rooms that are for sale or for rent on these websites.
Technologies used in the project:
- Ruby on Rails, HTML5/CSS3, Bootstrap
- Mechanize & Nokogiri gems (libraries)
- MongoDB
- GitHub for version control
Set up the development environment:
Set up the account for basic authentication:
#config/local_env.yml
BASIC_AUTHEN_USERNAME: username
BASIC_AUTHEN_PASSWORD: password
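The settings above are plain key/value pairs, so one common way to make them available to the application is to copy them into ENV at boot. The following is only a minimal sketch of that idea: `load_local_env` is a hypothetical helper, not something taken from this project, and it reads YAML text passed in as a string so it can run outside Rails.

```ruby
require 'yaml'

# Hypothetical helper (not part of this project): copy key/value pairs from
# a YAML document into an environment-like object such as ENV or a Hash.
# Values that are already set are kept, so real environment variables can
# override the file.
def load_local_env(yaml_text, env = ENV)
  settings = YAML.safe_load(yaml_text) || {}
  settings.each do |key, value|
    env[key.to_s] ||= value.to_s
  end
  env
end

# Same keys as config/local_env.yml; the values are placeholders.
sample = <<~YAML
  BASIC_AUTHEN_USERNAME: username
  BASIC_AUTHEN_PASSWORD: password
YAML

# A plain Hash stands in for ENV so the demo does not touch the real
# environment of the current process.
fake_env = {}
load_local_env(sample, fake_env)
puts fake_env['BASIC_AUTHEN_USERNAME']
```

In a Rails controller, values loaded this way could then be passed to `http_basic_authenticate_with name: ..., password: ...`; whether this project wires them up exactly like that is an assumption.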
Install the required gems:
bundle install
Bundler will connect to https://rubygems.org (and any other sources that you declared) and find a list of all the required gems that meet the requirements you specified.
In a terminal, run the commands below to crawl rooms from the external sites:
#crawl rooms from muabannhadat.vn
rake crawler:rooms:crawl_from_muabannhadat
#crawl rooms from nhadat24h.net
rake crawler:rooms:crawl_from_nhadat24h
Check log/crawling_development.log for the crawling log.
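Since there is one crawl task per site and each run is logged, the crawlers could be organised as one class per site behind a small lookup table. The sketch below is an assumption about such a layout, not the project's actual code: `BaseCrawler`, `MuabannhadatCrawler`, `Nhadat24hCrawler`, `CRAWLERS` and `crawl_site` are all hypothetical names, and fetching is stubbed out where a real crawler would use Mechanize and Nokogiri to download and parse pages.

```ruby
require 'logger'
require 'stringio'

# Hypothetical base class: each supported site gets one subclass.
class BaseCrawler
  def initialize(logger)
    @logger = logger
  end

  # Subclasses return an array of hashes, one per room.
  def fetch_rooms
    raise NotImplementedError, "#{self.class} must implement fetch_rooms"
  end

  # Fetch the rooms and record how many were found in the crawl log.
  def crawl
    rooms = fetch_rooms
    @logger.info("#{self.class.name}: crawled #{rooms.size} rooms")
    rooms
  end
end

class MuabannhadatCrawler < BaseCrawler
  def fetch_rooms
    [{ provider: 'muabannhadat.vn', code: 'demo-1' }] # stub data, not real
  end
end

class Nhadat24hCrawler < BaseCrawler
  def fetch_rooms
    [{ provider: 'nhadat24h.net', code: 'demo-2' }] # stub data, not real
  end
end

# Supporting another site means adding one subclass and one entry here.
CRAWLERS = {
  'muabannhadat' => MuabannhadatCrawler,
  'nhadat24h'    => Nhadat24hCrawler
}.freeze

# Look up the crawler for a site name and run it.
def crawl_site(name, logger)
  crawler_class = CRAWLERS.fetch(name) { raise ArgumentError, "unknown site: #{name}" }
  crawler_class.new(logger).crawl
end

# Demo: log into memory instead of log/crawling_development.log.
log_output = StringIO.new
rooms = crawl_site('nhadat24h', Logger.new(log_output))
puts rooms.first[:provider]
puts log_output.string
```

With this layout, each rake task would only need to call `crawl_site` with its site name and a logger that writes to the crawling log file.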
Start the web server. In Rails 5.0, Puma is the default web server.
rails server
Room list view: http://localhost:3000/rooms
Room detail view: http://localhost:3000/rooms/[:room_id]
Search for rooms by any of 5 conditions:
- provider site
- code
- city or district or address
- area
- price
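Because any subset of the five conditions may be filled in, the search can be thought of as building a MongoDB-style selector from only the non-empty parameters. The sketch below illustrates that idea under several assumptions: the parameter names (`provider`, `code`, `location`, `area`, `price`), the field names (`city`, `district`, `address`), and the choice to treat area and price as upper bounds are all hypothetical, since the actual Room model and search form are not shown here.

```ruby
# Hypothetical parameter names; the real search form may use different ones.
SEARCHABLE_KEYS = %w[provider code location area price].freeze

# Build a MongoDB-style selector hash from whichever conditions are present.
# Blank values and unknown keys are ignored.
def build_room_selector(params)
  selector = {}
  params.each do |raw_key, raw_value|
    key = raw_key.to_s
    value = raw_value.to_s.strip
    next unless SEARCHABLE_KEYS.include?(key)
    next if value.empty?

    case key
    when 'provider', 'code'
      # Exact match on the provider site or the listing code.
      selector[key] = value
    when 'location'
      # One text input matched against city, district, or address.
      pattern = /#{Regexp.escape(value)}/i
      selector['$or'] = [
        { 'city' => pattern },
        { 'district' => pattern },
        { 'address' => pattern }
      ]
    when 'area', 'price'
      # Assumption: the given number is treated as a maximum.
      selector[key] = { '$lte' => Float(value) }
    end
  end
  selector
end

selector = build_room_selector('provider' => 'nhadat24h.net', 'price' => '5000', 'code' => '')
p selector
```

With Mongoid, a hash like this could be passed to something like `Room.where(selector)`, where `Room` is an assumed model name.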