EMC released ScaleIO 2.0 a couple of days ago. More information – https://community.emc.com/docs/DOC-52581
Some new features (source: ScaleIO 2.0 release notes):
- Extended MDM cluster – introduces the option of a 5-node MDM cluster, which is able to withstand two points of failure.
- Read Flash Cache (RFcache) – use PCI flash cards and/or SSDs for caching of the HDDs in the SDS.
- User authentication using Active Directory (AD) over LDAP.
- The multiple SDS feature – allows the installation of multiple SDSs on a single Linux or VMware-based server.
- Oscillating failure handling – provides the ability to handle error situations, and to reduce their impact on normal system operation. This feature detects and reports various oscillating failures, in cases when components fail repeatedly and cause unnecessary failovers.
- Instant maintenance mode – allows you to restart a server that hosts an SDS, without initiating data migration or exposing the system to the danger of having only a single copy of data.
- Communication between the ScaleIO system and ESRS (EMC Secure Remote Support) servers is now supported – this feature replaces the call-home mechanism. It allows authorized access to the ScaleIO system for support sessions.
- Authenticate communication between the ScaleIO MDM and SDS components, and between the MDM and external components, using a Public and Private Key (Key-Pair) associated with a certificate – this will allow strong authentication of components associated with a given ScaleIO system. A Certificate Authority certificate or self-signed certificate can be used.
- In-flight checksum protection provided for data reads and writes – this feature addresses errors that change the payload during the transit through the ScaleIO system.
- Performance profiles – predefined settings that affect system performance.
ScaleIO can be downloaded from the EMC website – http://www.emc.com/products-solutions/trial-software-download/scaleio.htm. ScaleIO 2.0 supports VMware (5.5 and 6.0), Linux and Windows.
More info about ScaleIO 2.0 can be found on Chad Sakac's blog: http://virtualgeek.typepad.com/virtual_geek/2016/03/scaleio-20-the-march-towards-a-software-defined-future-continues.html
Check out all of my posts about ScaleIO here.
ScaleIO does not natively support deduplication and compression at the moment. Since ScaleIO can use almost any disk device, I decided to test ScaleIO combined with the QUADStor storage virtualization software, which enables deduplication and compression.
For the test I built a small setup – three CentOS 7 servers, each with a 200GB local disk, running QUADStor and the ScaleIO SDS software, plus one Windows Server 2012 R2 machine acting as the ScaleIO MDM server and ScaleIO client. On each CentOS server QUADStor was used to create a 150GB disk with compression and deduplication enabled. That same 150GB disk was then used by the ScaleIO SDS as its storage device.
To the client machine I presented one 200GB disk. To test deduplication I copied some ISO files to that disk. Below you can see that my test data resulted in an almost 2x deduplication ratio. The deduplication ratio is affected by the way ScaleIO works – it distributes data across several nodes. For example: block “A” from “dataset1” may end up on servers “One” and “Two”, while block “A” from “dataset2” ends up on servers “One” and “Three”. On server “One” block “A” will be deduplicated, since the server already holds that block, but on server “Three” block “A” will not be deduplicated, since it is unique to that server.
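The effect described above can be sketched with a small simulation. This is a toy model only – the block placement is random, not ScaleIO's real placement algorithm, and the server names and sizes are hypothetical – but it shows why per-server deduplication recovers less than the logical duplication in the data:

```python
import random

# Toy model: every logical block is written as two copies on two
# different servers; each server deduplicates only its own data.
# Illustrative sketch, not ScaleIO's actual placement logic.
random.seed(1)
servers = {"One": [], "Two": [], "Three": []}

# 1000 logical blocks, but only 500 distinct payloads (2x duplication).
blocks = [f"payload-{i % 500}" for i in range(1000)]

for block in blocks:
    # Place the two copies on two randomly chosen servers.
    first, second = random.sample(list(servers), 2)
    servers[first].append(block)
    servers[second].append(block)

stored = sum(len(s) for s in servers.values())       # raw copies written
unique = sum(len(set(s)) for s in servers.values())  # after per-server dedup
print(f"copies written: {stored}, stored after dedup: {unique}")
print(f"effective dedup ratio: {stored / unique:.2f}")
```

Because copies of the same payload land on different servers, each server sees fewer duplicates than exist globally, so the effective ratio comes out below the 2x duplication present in the input data.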
I did not run any performance tests, since my test systems were running on a single host and a single SSD drive.
In conclusion, using third-party software it is possible to add features to ScaleIO – deduplication, tiering, etc. Mixing and matching different software adds complexity, but sometimes the added value makes it worthwhile.
Enabling data deduplication in Linux with QUADStor
Speeding up writes for ScaleIO with DRAM
Automatic storage tiering with ScaleIO
In my previous post “Automatic storage tiering with ScaleIO” I described how I used Windows Storage Spaces to add an SSD write cache and automatic tiering to ScaleIO. But sometimes there are no SSDs available. In that case it is possible to use software that turns DRAM into a read and/or write cache – Romex PrimoCache (homepage) or SuperSpeed SuperCache (homepage). Adding even a small amount of DRAM write cache turns random IO into more sequential IO, which increases the performance of a spinning disk.
Introducing volatile DRAM as a write IO destination requires careful planning, as it increases the risk of losing data. Since ScaleIO writes data into two fault domains, it is important to minimize the chances of simultaneous failures in multiple fault domains. Things to consider: dual power supplies, battery backup, different blade enclosures, different racks and even different server rooms.
In my test I used PrimoCache – 1GB of DRAM as a write-only cache with a 5-second deferred write. Deferred write is the key option here – it allows data to stay in memory for 5 seconds before it is flushed to disk. The deferred write time is configurable from 1 second up to infinity.
With a DRAM write cache in front of a spinning disk, random IO performance increases significantly, as IO is captured in DRAM and then flushed to disk as sequential IO. The screenshot below shows how PrimoCache flushes writes to disk every 5 seconds. The Device Details page in ScaleIO shows that the average write latency is about half of what it is for the other two tiered, SSD-based ScaleIO SDS nodes. Another option is to put a DRAM write cache with deferred write in front of SSD-based nodes as well, to speed up write IO and reduce wear on the SSDs.
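The core mechanism – absorbing random writes in memory, coalescing rewrites, and flushing in address order – can be sketched in a few lines. This is a conceptual model of deferred write-back caching in general, not PrimoCache's implementation; the class and LBA values are made up for illustration:

```python
class DeferredWriteCache:
    """Toy write-back cache: absorbs random writes in memory and
    flushes them to 'disk' in LBA order, turning a random write
    stream into sequential IO (illustrative sketch only)."""

    def __init__(self):
        self.dirty = {}          # lba -> data; rewrites coalesce here
        self.disk_writes = []    # order in which blocks reach the disk

    def write(self, lba, data):
        # A rewrite of the same LBA within the deferred-write window
        # overwrites the cached copy and never touches the disk.
        self.dirty[lba] = data

    def flush(self):
        # Flush in ascending LBA order -> sequential IO for the spindle.
        for lba in sorted(self.dirty):
            self.disk_writes.append(lba)
        self.dirty.clear()

cache = DeferredWriteCache()
for lba in [90, 5, 42, 5, 17, 90, 63]:   # random writes, with rewrites
    cache.write(lba, b"x")
cache.flush()                            # e.g. after a 5-second timer
print(cache.disk_writes)                 # [5, 17, 42, 63, 90]
```

Seven logical writes become five physical writes, issued in ascending order – exactly why a spinning disk behind such a cache sees mostly sequential IO.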
Since ScaleIO is software-only, it allows many different configurations to be combined into a single cluster. I have mixed different hardware vendors, hardware generations and operating systems in a single ScaleIO cluster. I recommend ScaleIO to everyone who is interested in hyper-converged solutions.
Automatic storage tiering with ScaleIO
PrimoCache – Disk caching software for Windows
Since EMC ScaleIO does not natively have automatic storage tiering, I decided to try a solution combining Windows Server 2012 R2 Storage Spaces tiering with ScaleIO.
I have 3 servers:
- with a bunch of local disks – two of the servers also have SSDs, one has only 10k SAS disks
- all running VMware ESXi 5.5
- all disks have a VMFS datastore on them
On each of the servers I installed a Windows Server 2012 R2 VM to be used as a ScaleIO SDS server. On the servers with SSDs I created a Storage Spaces pool with one 200GB SSD and two 1TB HDDs. In that pool I initially created one virtual disk – a 100GB SSD tier and a 1TB HDD tier with a 1GB write-back cache – which I use for the ScaleIO SDS.
With this setup most writes will land on the SSD, and hot blocks will eventually be tiered to the SSD, giving the solution much better overall performance. I included a screenshot from perfmon to show how the SSD (disk 1) is serving all the IO for the ScaleIO disk E (disk 4). A 100% SSD hit rate means that my current working set is smaller than my SSD tier. When checking Device Details in ScaleIO, both of these servers show read and write latency below 1 ms.
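The relationship between working set and hit rate can be put in numbers. A minimal sketch, assuming accesses are spread evenly over the working set (the sizes below are hypothetical; only the 100GB SSD tier matches my setup):

```python
# Toy model: if the hot working set fits in the SSD tier, every IO
# is an SSD hit; otherwise hits scale with the fraction that fits.
ssd_tier_gb = 100  # size of the SSD tier in the virtual disk

for working_set_gb in (60, 100, 250):  # hypothetical working sets
    hit_rate = min(1.0, ssd_tier_gb / working_set_gb)
    print(f"working set {working_set_gb}GB -> SSD hit rate {hit_rate:.0%}")
```

So the 100% hit rate in the perfmon screenshot simply says the blocks ScaleIO is actively touching fit inside the 100GB SSD tier; once the working set outgrows it, a share of IO falls through to the HDDs.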
As you may have noticed, one of my servers did not have any SSD disks. I will soon write about how I increased the performance of that server.
Related ScaleIO posts
Using a file as a device in ScaleIO
Speeding up writes for ScaleIO with DRAM
In most cases one would use a whole unformatted disk as a device for ScaleIO, but sometimes this is not possible. To use the free space on an already partitioned disk, the ScaleIO SDS component ships with a command-line tool called create_file_storage, located in the SDS component folder. The tool creates a pre-allocated file of the specified size at the specified location.
Command to create file:
create_file_storage --create_file --size_gb <SIZE_IN_GB> --file_name <FILE_NAME>
create_file_storage --create_file --size_gb 120 --file_name C:\scaleio_devices\file1.io
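Conceptually, the tool just sets aside a file of the requested size on an existing file system. A rough Python equivalent (the function name is mine, and unlike the real tool this variant may produce a sparse file on file systems that support it):

```python
import os

def create_backing_file(path, size_gb):
    """Create a file of size_gb gigabytes, roughly what the
    create_file_storage tool does (illustrative sketch; the real
    tool pre-allocates, while seek-and-write may leave the file sparse)."""
    size_bytes = size_gb * 1024 ** 3
    with open(path, "wb") as f:
        f.seek(size_bytes - 1)  # jump to the last byte...
        f.write(b"\0")          # ...and write it to fix the file size

create_backing_file("file1.io", 1)   # 1GB demo file
print(os.path.getsize("file1.io"))   # 1073741824
```

Either way, the result is a fixed-size file sitting next to the existing data, which the SDS can then treat as if it were a raw device.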
When adding a device to an SDS, specify the full path of the file as the “Path” for the device.
The “file as device” option makes it possible to consume free space for ScaleIO from disks that already have partitions and file systems on them.