 Short-term needs | Best Practice | Ongoing
Sysadmin
  • Walk through an actual backup recovery. Document the steps and how we determine whether data has been lost.

Note: these tasks result in related ongoing maintenance.

  • set up service monitoring
  • set up log analysis
  • Perform risk assessments and business impact analyses (BIA); keep these up-to-date
  • Help design and implement redundancies (e.g. a failover server) for needs identified by the BIA. Execute redundancies as needed
  • OS-level updates and upgrades
  • Security patches / monitoring this space
  • Backup script maintenance
  • AWS expertise
  • Own and maintain deployment scripts
  • Help coordinate and perform large-scale upgrades (e.g. those that require spinning up new boxes and doing switch-overs of drives or DNS entries)
  • Keep tabs on storage use over time and coordinate projections thereof
  • Create and manage SSL certs
  • Manage user (server) accounts
  • Firewall configuration
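
As a rough illustration of the storage-tracking item above, the sketch below reads current disk usage and makes a simple straight-line projection of when a volume would fill. The function names, the sample format, and the linear-growth assumption are all hypothetical and not part of any existing script here; real usage would likely draw on the monitoring data once that is set up.

```python
import shutil
from datetime import date


def usage_fraction(path="/"):
    """Return the fraction (0.0 to 1.0) of the filesystem at `path` in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def days_until_full(samples, capacity_bytes):
    """Project days until the volume is full, assuming linear growth.

    `samples` is a hypothetical list of (date, used_bytes) tuples, oldest
    first. Only the first and last samples are used. Returns None when
    usage is flat or shrinking, or when the samples span no time.
    """
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    elapsed_days = (last_day - first_day).days
    if elapsed_days <= 0 or last_used <= first_used:
        return None
    growth_per_day = (last_used - first_used) / elapsed_days
    return (capacity_bytes - last_used) / growth_per_day


if __name__ == "__main__":
    print(f"Root volume is {usage_fraction('/'):.0%} full")
```

Recording one sample per day (for example from a cron job) would be enough to feed `days_until_full`; a straight line is only a first approximation and will mislead if growth is bursty.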
Grey area: responsibility shared, unclear, or variable 
  • database administration / tuning
  • Integrate Hydra user accounts with CHF LDAP server
  • monitor and benchmark the JVM; adjust heap size and garbage collection as needed
 
Ops 
  • set up CI server or service
  • set up security filters for incoming / outgoing code
  • modify the new Ansible project to work with Vagrant to create a development environment
  • configure differences between staging, prod, and test environments in Ansible and Capistrano
