Log collection in Khaleesi
==========================

Collecting and saving the logs from the affected machines after a job run is
an important step. Khaleesi has a playbook dedicated to this process.

To collect the logs from the machines, run the
``khaleesi/playbooks/collect_logs.yml`` playbook right after the job run, with
the same settings and host file.

The localhost and the machine called ``host0`` are excluded from the log
collection. They are usually long-running machines (the slave and, if it
exists, a virtual host) that have a large amount of mostly irrelevant logs.

What files are gathered?
------------------------

A number of diagnostic commands (listed in the playbook) are run on the
machines, and then a set of log files is collected.

If a particular setting requires additional logs to be collected, it's
practical to add these files to that setting's yaml::

    job:
        archive:
        - /var/foo/*.log
        - /var/bar/engine.log
        - /opt/baz/*.xml
        ...

A missing file won't cause the log collection to fail.

Methods of gathering logs
-------------------------

By default the logs are stored on the machine running ansible in
``khaleesi/collected_files``, with each machine's logs in a separate
``tar.gz`` file.

This behavior can be changed with the ``--logs=gzip`` option, which gzips the
files individually instead.

Uploading the logs
------------------

When using the gzip method, it's possible to upload the logs to an artifact
server. Ideally the logs are then exposed on an HTTP server for online
browsing.

To set up a new site, add a new option to
``khaleesi/settings/logs/gzip/site``, then use the ``--logs-site`` option when
running ksgen. An example site definition looks like::

    job:
        rsync_logs: true
        rsync_path: myuser@example.com:/opt/artifacts
        artifact_url: http://example.com/artifacts

The ``rsync_path`` should be something that rsync understands as a
destination, and the ``artifact_url`` will be used to generate the link to the
logs.

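
As a rough illustration of how the two settings relate, the sketch below
derives an upload destination and a browsing link for one build. It is not the
playbook's actual implementation: the helper names ``upload_target`` and
``logs_link`` are hypothetical, and it assumes each build gets its own
directory named after the Jenkins ``$BUILD_TAG`` variable::

    #!/bin/sh
    # Sketch only: the real upload is done by the collect_logs playbook.
    # Assumption: each build gets its own directory named after $BUILD_TAG.

    # Build the rsync destination for one build.
    upload_target() {  # args: rsync_path build_tag
        printf '%s/%s/\n' "$1" "$2"
    }

    # Build the browsable link for the same build.
    logs_link() {  # args: artifact_url build_tag
        printf '%s/%s/\n' "$1" "$2"
    }

    # Hypothetical fallback tag so the sketch also runs outside Jenkins.
    build_tag="${BUILD_TAG:-jenkins-example-job-1}"

    echo "upload to: $(upload_target 'myuser@example.com:/opt/artifacts' "$build_tag")"
    echo "browse at: $(logs_link 'http://example.com/artifacts' "$build_tag")"

The point of the sketch is that ``rsync_path`` and ``artifact_url`` must refer
to the same directory on the artifact server, once as an rsync destination and
once as a URL.
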
This method assumes the job runs on a Jenkins server, so the ``$BUILD_TAG``
variable should be set in the environment.

After the upload, the logs are deleted from the local machine and a link file
is created as ``khaleesi/collected_files/full_logs.html``. This file should be
added as an artifact to the Jenkins job definition and can be used as a
one-click redirect to the job-specific artifacts.

Setting up an artifact storage
------------------------------

The machine running the job should be able to upload the logs to the artifact
server without a password.

When using Apache to expose the logs for browsing, the following httpd
settings will allow transparent browsing of the gzipped files::

    Alias /artifacts /opt/artifacts
    <Directory /opt/artifacts>
        Options +Indexes
        RewriteEngine On
        RewriteCond %{HTTP:Accept-Encoding} gzip
        RewriteCond %{LA-U:REQUEST_FILENAME}.gz -f
        RewriteRule ^(.+)$ $1.gz [L]
        <FilesMatch "\.gz$">
            ForceType text/plain
            AddDefaultCharset UTF-8
            AddEncoding x-gzip gz
        </FilesMatch>
    </Directory>

It's recommended to put these settings in a separate file in the
``/etc/httpd/conf.d`` directory.

Pruning old logs
----------------

The artifact directory can grow too big over time, so it's useful to set up a
cron job for deleting obsolete logs. On the artifact server, add a line
similar to this to ``/etc/crontab``::

    0 0 * * * rhos-ci find /opt/artifacts -mindepth 1 -maxdepth 1 -type d -ctime +14 -exec rm -rf {} +

This will delete any artifact directory in ``/opt/artifacts`` that is older
than 14 days. It's useful to match the Jenkins artifact retention time with
the time specified here to avoid broken links in Jenkins.

Use in Jenkins job definitions
------------------------------

To use the advanced gzip+upload method, modify your jobs in the following way.

Add ``--logs=gzip --logs-site=mysite`` to the ksgen invocation of your
builder, for example::

    ksgen --config-dir settings generate \
        --provisioner=manual \
        ...
        --logs=gzip \
        --logs-site=downstream \
        ...
        ksgen_settings.yml

Add ``**/full_logs.html`` to the list of artifacts::

    - publisher:
        name: default-publishers
        publishers:
            - archive:
                artifacts: '**/collect_logs.txt, **/cleanup.txt, **/nosetests.xml, **/ksgen_settings.yml, **/full_logs.html'

After regenerating the jobs, the logs should start appearing on the artifact
server.

It's practical to match the Jenkins artifact retention time with the artifact
server retention time to avoid broken links in Jenkins::

    - defaults:
        name: job-defaults
        ...
        logrotate:
            daysToKeep: 14
            artifactDaysToKeep: 14
        ...
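
Before relying on matched retention times, it can help to preview which
directories a 14-day pruning job on the artifact server would remove. The
following is a minimal dry-run sketch that only lists candidates and deletes
nothing. The ``prune_candidates`` helper and the ``retention_days`` variable
are hypothetical names, and ``-mindepth``/``-maxdepth`` assume GNU or BSD
``find``::

    #!/bin/sh
    # Sketch only: lists what a 14-day pruning job would remove, deletes nothing.
    # retention_days is a hypothetical variable; keep it equal to daysToKeep.
    retention_days=14

    prune_candidates() {  # args: artifact_dir days
        # -mindepth 1 keeps the artifact directory itself out of the results.
        find "$1" -mindepth 1 -maxdepth 1 -type d -ctime +"$2" -print
    }

    # Only query the directory when it exists on this machine.
    if [ -d /opt/artifacts ]; then
        prune_candidates /opt/artifacts "$retention_days"
    fi

Any directory printed here would have a broken link in Jenkins if the job's
``artifactDaysToKeep`` were larger than ``retention_days``.
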