Add a new site to the SI
------------------------

This document describes how to add another site to the Synthetic Internet (SI).
The site can be hosted on the existing SI VM or on another machine placed at an
appropriate point in the network.

---------------
Get the site(s)
---------------

First, make sure you have permission to scrape and use the site.

Second, use ``wget`` or a similar tool to create a mirror of the site with a command like::

    wget -m -k -K -E https://www.irs.gov

Be prepared for this to take a very long time, depending on the site you are fetching.

- Next, :ref:`dns_update` (most likely on the SI) to include an entry for the new site.

-------------------
Hosting on a new VM
-------------------

- Create or clone a VM to be the web host for the new site; any web server will do.
- Copy the files to the web server's default location, e.g., ``/var/www/html`` for Apache on RHEL-type systems.
- Profit!

-------------------------------
Updating SI to include new site
-------------------------------

- Create a directory under ``/var/www/html``.
- Copy the new files to that directory.
- Edit ``/etc/httpd/conf.d/vhosts.conf`` to include the new site:

.. code-block:: apache

    ServerName hostname.domain.tld
    DocumentRoot /path/to/where/the/files/are
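
The entry above shows only the two directives that change per site. As a minimal sketch, assuming name-based virtual hosting on port 80 under Apache 2.4, a complete entry might look like the following. The host name ``newsite.example.com`` and the directory ``/var/www/html/newsite`` are hypothetical placeholders, not values taken from the SI itself.

.. code-block:: apache

    # Hypothetical vhost entry; replace the name and path with the real ones.
    <VirtualHost *:80>
        # Should match the DNS entry created for the new site
        ServerName newsite.example.com
        # Directory holding the mirrored files
        DocumentRoot /var/www/html/newsite

        <Directory /var/www/html/newsite>
            # Allow all clients to read the mirrored content (Apache 2.4 syntax)
            Require all granted
        </Directory>
    </VirtualHost>

After editing, checking the syntax with ``apachectl configtest`` and then reloading the service (for example ``systemctl reload httpd`` on RHEL-type systems) should be enough for the new site to be served.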