Tuesday, October 15, 2013

Oracle ZFS analytics capabilities

When Oracle acquired Sun Microsystems a couple of years ago, one of the acquired technologies was the ZFS filesystem. ZFS is a combined file system and logical volume manager designed by Sun Microsystems. Its features include protection against data corruption, support for high storage capacities, efficient data compression, integration of the concepts of filesystem and volume management, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, RAID-Z and native NFSv4 ACLs. The ZFS filesystem is (theoretically) capable of holding a maximum of 16 exbibytes of data.
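To put the 16 exbibyte figure in perspective, here is a small sketch of the arithmetic. One exbibyte is 2^60 bytes, so 16 exbibytes works out to exactly 2^64 bytes. The variable names are only illustrative, and reading the limit as the range of a 64-bit byte count is my interpretation, not something stated above.

```python
# One exbibyte (EiB) is 2**60 bytes.
EIB = 2 ** 60

# The theoretical maximum mentioned above: 16 EiB.
max_bytes = 16 * EIB

# 16 * 2**60 equals 2**64, the number of values a 64-bit counter can hold.
is_64_bit_limit = max_bytes == 2 ** 64

print(f"{max_bytes:,} bytes")
print(f"Equal to 2**64: {is_64_bit_limit}")
```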

Oracle's storage strategy reflects this: the company is currently building and shipping a number of storage and backup appliances based on ZFS technology. At this moment these are the ZFS storage appliances ZS3-2, ZS3-4, 7120, 7320 and 7420, as well as the Sun ZFS Backup Appliance.

A lot of exciting technologies are included in the ZFS storage appliances, at both the hardware and the software level, which can help you get performance gains, especially when used in combination with Oracle databases. What is often forgotten, however, is that there is also a management suite to manage and monitor your storage appliances.

Oracle's strategic product roadmap, which is not officially communicated, shows that all management and monitoring solutions for both hardware and software should be integrated within Oracle Enterprise Manager, or at least (for now) interact with it. For the ZFS storage appliances a plugin is available that adds functionality for managing and monitoring ZFS appliances from Oracle Enterprise Manager. You can download the plugin from the Oracle website.

However, next to the integrated way, there is a standalone solution for managing and monitoring your ZFS appliances. This solution includes the ZFS storage appliance Analytics, which help you tune your storage to an optimum. The entire analytics solution is based on DTrace, which means that a deep analysis, down to the core of the system, can be done.



In the above video you can see a bit more about the Analytics that you are able to pull out of a ZFS storage appliance and how they can help you tune your storage more efficiently.

The common analytics that are provided are:
- CPU: Percent utilization
- Cache: ARC accesses
- Cache: L2ARC I/O bytes
- Cache: L2ARC accesses
- Capacity: Capacity bytes used
- Capacity: Capacity percent used
- Capacity: System pool bytes used
- Capacity: System pool percent used
- Data Movement: Shadow migration bytes
- Data Movement: Shadow migration ops
- Data Movement: Shadow migration requests
- Data Movement: NDMP bytes statistics
- Data Movement: NDMP operations statistics
- Data Movement: Replication bytes
- Data Movement: Replication operations
- Disk: Disks
- Disk: I/O bytes
- Disk: I/O operations
- Network: Device bytes
- Network: Interface bytes
- Protocol: SMB operations
- Protocol: Fibre Channel bytes
- Protocol: Fibre Channel operations
- Protocol: FTP bytes
- Protocol: HTTP/WebDAV requests
- Protocol: iSCSI bytes
- Protocol: iSCSI operations
- Protocol: NFSv bytes
- Protocol: NFSv operations
- Protocol: SFTP bytes
- Protocol: SRP bytes
- Protocol: SRP operations

Next to the common analytics, there are also a number of more detailed and more advanced analytics available:
- CPU: CPUs
- CPU: Kernel spins
- Cache: ARC adaptive parameter
- Cache: ARC evicted bytes
- Cache: ARC size
- Cache: ARC target size
- Cache: DNLC accesses
- Cache: DNLC entries
- Cache: L2ARC errors
- Cache: L2ARC size
- Data Movement: NDMP bytes transferred to/from disk
- Data Movement: NDMP bytes transferred to/from tape
- Data Movement: NDMP file system operations
- Data Movement: NDMP jobs
- Data Movement: Replication latencies
- Disk: Percent utilization
- Disk: ZFS DMU operations
- Disk: ZFS logical I/O bytes
- Disk: ZFS logical I/O operations
- Memory: Dynamic memory usage
- Memory: Kernel memory
- Memory: Kernel memory in use
- Memory: Kernel memory lost to fragmentation
- Network: IP bytes
- Network: IP packets
- Network: TCP bytes
- Network: TCP packets
- System: NSCD backend requests
- System: NSCD operations
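The names in both lists follow a "Category: statistic" pattern, which makes them easy to organize if you keep your own notes or reports on them. The sketch below groups such names by category. The function name group_statistics and the sample list are purely illustrative and are not part of the appliance or any Oracle tooling.

```python
from collections import defaultdict


def group_statistics(names):
    """Group 'Category: statistic' strings into a dict keyed by category."""
    groups = defaultdict(list)
    for name in names:
        # Split on the first ': ' only; the statistic itself may contain spaces.
        category, _, statistic = name.partition(": ")
        groups[category].append(statistic)
    return dict(groups)


# A handful of names copied from the lists above (hypothetical selection).
sample = [
    "CPU: Percent utilization",
    "Cache: ARC accesses",
    "Cache: L2ARC accesses",
    "Disk: I/O bytes",
    "Disk: I/O operations",
]

grouped = group_statistics(sample)
for category, stats in grouped.items():
    print(f"{category}: {', '.join(stats)}")
```

Grouping this way gives a quick overview of which areas (CPU, cache, disk, network, protocol) you are actually watching.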

All of these analytics are available via the GUI that is provided by Oracle. They can help you tune your appliance and the way applications interact with it. One thing, however, is of vital importance: that you have a deep understanding of what the figures mean. A good starting point is the Analytics guide from Oracle. However, this alone will not be sufficient. When you run a mission-critical system that is based upon a ZFS storage appliance and you have to deliver optimal performance, a deep knowledge of ZFS and of storage solutions will be needed.
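As a small illustration of what interpreting the figures can look like, the sketch below turns cache hit and miss counts into a hit ratio. It assumes you have read the number of ARC hits and misses for some interval from the "Cache: ARC accesses" statistic. The function name and the numbers are made up for the example, and a single ratio like this is only a starting point, not a tuning verdict.

```python
def arc_hit_ratio(hits, misses):
    """Return the fraction of ARC accesses served from cache, or None if idle."""
    total = hits + misses
    if total == 0:
        # No accesses in the interval, so a ratio is not meaningful.
        return None
    return hits / total


# Hypothetical counts for one sampling interval.
ratio = arc_hit_ratio(hits=9_000, misses=1_000)
print(f"ARC hit ratio: {ratio:.1%}")
```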
