Service Management

Last Published December 12, 2018

The service provides a fully managed and supported records management service consisting of:

  • The Records365 platform, which includes secure, managed networking, compute, storage, and platform infrastructure.
  • The Records365 support portal, which provides the ability to log and track service incidents, the resolution of acknowledged service bugs, and the processing of enhancements, as well as access to RecordPoint online resources.
  • The background IT service and system management systems, which are used to provide issue, problem, incident, change, and release management.

Systems Management

A variety of systems are in place to ensure that the systems that underpin Records365 are properly managed and monitored. These consist of:

  • Help Desk System – System for clients to log support requests and work with RecordPoint support representatives
  • Application Monitoring Tools
    • Realtime monitoring of critical application & infrastructure metrics
    • Realtime alerting based on the metrics collected by the monitoring tooling
    • Historical reporting for trend & capacity analysis
  • Security Monitoring
    • Continuous vulnerability assessments for virtual hosts and virtual networks
    • Realtime monitoring of network traffic and security events for threat analysis and protection
  • Backup Tools – Capabilities such as virtual machine snapshotting and backup of critical storage components
  • Change Tracking System – Platform for raising, evaluating and tracking changes to production systems
  • Source Control System – Platform for engineers to submit and peer review enhancements and bug fixes to the various components that make up the service

Service Monitoring and Capacity Management

The Records365 service is monitored 24 hours per day by our Site Reliability Engineering (SRE) team.

Applications, virtual hosts, networks and storage are continuously monitored to ensure that the service operates within desired service levels. Capacity is monitored to ensure that there are adequate compute, storage and networking resources to support the current service utilisation plus any anticipated growth in utilisation.
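
As an illustration only, the capacity check described above can be sketched as a comparison between provisioned capacity and current utilisation plus anticipated growth. The resource names, figures, and the `growth_rate` field are hypothetical and are not taken from the Records365 platform.

```python
from dataclasses import dataclass


@dataclass
class ResourceUsage:
    name: str           # e.g. "compute", "storage", "networking"
    capacity: float     # total provisioned units
    used: float         # units currently consumed
    growth_rate: float  # anticipated fractional growth (hypothetical planning input)


def has_adequate_capacity(resource: ResourceUsage) -> bool:
    """True if current utilisation plus anticipated growth fits within capacity."""
    projected = resource.used * (1 + resource.growth_rate)
    return projected <= resource.capacity


def resources_needing_attention(resources):
    """Return the names of resources whose projected utilisation exceeds capacity."""
    return [r.name for r in resources if not has_adequate_capacity(r)]
```

For example, a resource with 500 units provisioned, 450 in use and 20% anticipated growth would be flagged, because its projected utilisation of 540 units exceeds capacity.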

We also collect and store anonymous usage data for capacity planning and monitoring purposes. This data allows us to track system trends, user activity, and the technical effects of on-boarding new customers and new releases.

The following table shows indicative response times to proactively monitored events.

Alert Severity   Response   Typical Restoration Time   Typical Resolution Time
Critical         1 hour     2 hours                    4 hours
Non-Critical     1 day      8 hours                    16 hours
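
As a minimal sketch, the indicative times in the table can be held in a lookup keyed by alert severity and used to work out target times for a given alert. It assumes that each duration is measured from the moment the alert is raised and that "1 day" means 24 calendar hours; the source does not state either point. The names `TARGETS` and `target_times` are hypothetical.

```python
from datetime import datetime, timedelta

# Indicative targets copied from the table above.
# Assumption: each duration is measured from the time the alert is raised,
# and "1 day" is treated as 24 calendar hours (the source does not say
# whether calendar or business time is meant).
TARGETS = {
    "Critical": {
        "response": timedelta(hours=1),
        "restoration": timedelta(hours=2),
        "resolution": timedelta(hours=4),
    },
    "Non-Critical": {
        "response": timedelta(days=1),
        "restoration": timedelta(hours=8),
        "resolution": timedelta(hours=16),
    },
}


def target_times(severity: str, raised_at: datetime) -> dict:
    """Return the indicative target time for each stage of an alert."""
    return {stage: raised_at + delta for stage, delta in TARGETS[severity].items()}
```

Under these assumptions, a Critical alert raised at 09:00 would have an indicative response target of 10:00, restoration by 11:00, and resolution by 13:00 on the same day.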