Evolution of the z System Channel – FICON Security, Resiliency, and Performance

This is the 7th blog post by Patty Driever, whom I highly recommend you follow on Twitter @MainframeIOLady.

To understand what was gained in the move from ESCON to FICON, let’s start with a few comparisons. ESCON had a limit of 1024 device addresses (unit addresses) per channel; with FICON this was expanded to 16,384. An ESCON channel could execute 2000-2500 I/Os per second (each transferring 4 KB of data), while the initial FICON channel could complete 6000 such I/Os per second, and the channels on the latest z13 processors are capable of over 100,000 I/Os per second. This was achieved through increases in link speed along with a series of architectural enhancements. Distance improved as well: with ESCON, bandwidth degradation was observed once the server and storage controller were more than 9 km apart, whereas FICON extends that distance to beyond 100 km.

In Fibre Channel (FC), a frame is the smallest architected data unit of transmission. A ‘Sequence’ consists of one or more frames, and contains one of a variety of types of architected ‘Information Units’ (IUs) (such as ‘solicited data’, ‘unsolicited command’, ‘command status’, etc.). An ‘Exchange’ is used to flow a set of these IUs between an originator and a responder. Within an exchange, an upper-level protocol (such as FICON) maps its protocol-specific data blocks to these FC IUs.
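To make the frame / Sequence / Exchange hierarchy concrete, here is a minimal Python sketch of how one IU might be carried as a Sequence of frames inside an Exchange. The class and field names are illustrative assumptions, not the FC wire format, and the 2112-byte per-frame payload limit is used only as a default for the example.

```python
from dataclasses import dataclass, field
from typing import List

MAX_PAYLOAD = 2112  # assumed per-frame payload limit in bytes (illustrative default)


@dataclass
class Frame:
    seq_id: int     # which Sequence this frame belongs to
    seq_cnt: int    # position of the frame within its Sequence
    payload: bytes


@dataclass
class Sequence:
    seq_id: int
    iu_type: str    # e.g. "unsolicited command", "solicited data"
    frames: List[Frame] = field(default_factory=list)


@dataclass
class Exchange:
    originator: str
    responder: str
    sequences: List[Sequence] = field(default_factory=list)

    def send_iu(self, iu_type: str, data: bytes,
                max_payload: int = MAX_PAYLOAD) -> Sequence:
        """Carry one IU as a Sequence of one or more frames."""
        seq = Sequence(seq_id=len(self.sequences), iu_type=iu_type)
        # Split the IU into frame-sized chunks; an empty IU still uses one frame.
        chunks = [data[i:i + max_payload]
                  for i in range(0, len(data), max_payload)] or [b""]
        for cnt, chunk in enumerate(chunks):
            seq.frames.append(Frame(seq.seq_id, cnt, chunk))
        self.sequences.append(seq)
        return seq


def reassemble(seq: Sequence) -> bytes:
    """Rebuild the IU data by ordering frames on their sequence count."""
    ordered = sorted(seq.frames, key=lambda f: f.seq_cnt)
    return b"".join(f.payload for f in ordered)
```

In this sketch, a 5000-byte ‘solicited data’ IU would be split across three frames in one Sequence, and a later IU in the same Exchange would get the next Sequence ID.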

Built into FC are means to ensure the integrity of both the link itself and the communications flowing over the link, and FICON adds further measures of its own. FC contains a set of primitive sequences that are used when issues are detected with the quality of the received transmission signals. Information maintained on a Sequence basis (e.g., Sequence ID, Sequence Count, sequence initiative) is used to ensure in-order delivery of data within an exchange, as well as to detect and recover from out-of-order or missing frames. Similarly, within its mapping of the IUs, FICON provides additional mechanisms, such as IU numbers and CCW numbers, to associate a response with the particular command that was sent. The FC standard itself provides for a Cyclic Redundancy Check (CRC) value at the link level to ensure that data is not corrupted as it flows on the link. FICON adds to that a separate CRC to ensure the integrity of the data at a higher layer, end-to-end between hosts. Working with industry partners, IBM also added Forward Error Correction (FEC) to the FICON Express16S features so that the bit errors often seen at higher link speeds are more likely to be corrected by technology in the port optical transmitters and receivers. FC also provides an option that fabrics may deploy to provide security measures regarding switch membership in the fabric. FICON channels require these measures to be in place in any fabric to which they attach.
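The layering of checks described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual FC or FICON algorithms: `zlib.crc32` stands in for both CRCs, each frame is modelled as a hypothetical `(sequence count, payload, CRC)` tuple, and the real standards define exactly which fields each CRC covers.

```python
import zlib
from typing import Iterable, Tuple

FrameTuple = Tuple[int, bytes, int]  # (sequence count, payload, per-frame CRC)


def make_frame(seq_cnt: int, payload: bytes) -> FrameTuple:
    """Build a frame carrying a link-level CRC over its payload (stand-in CRC)."""
    return (seq_cnt, payload, zlib.crc32(payload))


def receive(frames: Iterable[FrameTuple], expected_e2e_crc: int) -> bytes:
    """Apply three checks: frame ordering, per-frame CRC, end-to-end CRC."""
    data = bytearray()
    expected_cnt = 0
    for seq_cnt, payload, crc in frames:
        # Sequence count check: detects missing or out-of-order frames.
        if seq_cnt != expected_cnt:
            raise ValueError(
                f"missing or out-of-order frame: expected {expected_cnt}, got {seq_cnt}")
        # Link-level check: detects corruption of an individual frame.
        if zlib.crc32(payload) != crc:
            raise ValueError(f"link-level CRC mismatch in frame {seq_cnt}")
        data += payload
        expected_cnt += 1
    # Higher-layer check: detects corruption that each individual link missed.
    if zlib.crc32(bytes(data)) != expected_e2e_crc:
        raise ValueError("end-to-end CRC mismatch")
    return bytes(data)
```

The point of the second CRC is that a per-link check only covers one hop; a value computed by the sender over the whole data and verified by the final receiver also catches corruption introduced between links, for example inside a switch.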

Other fabric security and resiliency constructs were also carried over into FC/FICON from ESCON, such as Registered State Change Notification (RSCN). With RSCN, the fabric notifies registered ports (channel or control unit) of events it detects that may have affected the state of one or more other ports in the fabric (e.g. loss of light on a link or a new entity joining the fabric). Resiliency begins with knowing something went wrong. Being notified of a ‘remote’ link failure explicitly, as soon as the fabric detects it, avoids the alternative: enduring numerous upper-level protocol timeouts on operations, determining that the common denominator of those operations is that they all flow over a particular link, and only then deducing that the link must have gone away.
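The register-then-notify idea behind RSCN can be sketched as a simple callback pattern. Everything here is hypothetical naming for illustration (`Fabric`, `Channel`, the port labels and event strings); the sketch also passes the event description directly to the callback to stay short, whereas a real port would typically follow up a notification by discovering the new state of the affected port.

```python
from typing import Callable, Dict, Set

Callback = Callable[[str, str], None]  # (affected port, event description)


class Fabric:
    """Toy fabric that tells registered ports about state changes elsewhere."""

    def __init__(self) -> None:
        self._registered: Dict[str, Callback] = {}

    def register(self, port: str, callback: Callback) -> None:
        # A port must register to receive state change notifications.
        self._registered[port] = callback

    def report_event(self, affected_port: str, event: str) -> None:
        # Notify every registered port except the one the event is about.
        for port, callback in self._registered.items():
            if port != affected_port:
                callback(affected_port, event)


class Channel:
    """Toy channel that tracks which remote ports it believes are unreachable."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.offline_ports: Set[str] = set()

    def on_state_change(self, affected_port: str, event: str) -> None:
        if event == "loss of light":
            # Learned immediately, without waiting for I/O timeouts.
            self.offline_ports.add(affected_port)
        elif event == "joined fabric":
            self.offline_ports.discard(affected_port)
```

A channel registered this way learns that a control unit port has gone away from a single notification, instead of inferring it from a pattern of timed-out operations.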

Performance management is another strong focus area in all layers of the FC standard, including at the FICON upper level. The FC standard provides for link flow control through a buffer-to-buffer credit scheme in which the end points on a link (nearest neighbors, e.g. a channel and the attached switch port) each specify the maximum number of frames that the other port can transmit across the link before receiving an explicit indication that another frame can be transferred. When a frame is sent, the amount of available credit is reduced by one. When the frame has been processed on the other side, an indication (R_RDY) is sent back to allow the credit to be reclaimed. With both ends of the link adhering to this scheme, a frame is never sent that the recipient port does not have room to hold, making FC a highly reliable transport. How many credits a port makes available to its peer has a direct bearing on how far apart they can be while still maintaining the full link frame rate (i.e. whether a 16Gb link can actually operate at full bandwidth). FICON additionally provides an architectural end-to-end flow control mechanism, called IU Pacing, which prevents an originator from flooding a target port. Since FICON supports the pipelining of multiple commands and data frames to a control unit, IU Pacing is used to prevent overrunning the control unit. A feature referred to as ‘extended distance’ support enables a control unit to dynamically change the pacing count to adjust for distance by sending a new value within a ‘command response’ control IU.
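The credit bookkeeping, and the reason credit count limits distance, can be sketched as follows. This is a rough model under stated assumptions rather than anything taken from the standard: `CreditedLink` and `credits_for_distance` are hypothetical names, and the defaults (a 2148-byte full-size frame, a nominal 1600 MB/s for a 16Gb link, about 5 microseconds per km of fibre) are approximations. The estimate ignores processing time at either end.

```python
class CreditedLink:
    """Sender-side view of buffer-to-buffer credit on one link."""

    def __init__(self, bb_credit: int) -> None:
        self.available = bb_credit  # credits granted by the peer port
        self.in_flight = 0          # frames sent but not yet acknowledged

    def try_send(self) -> bool:
        """Send one frame if credit remains; otherwise the sender must wait."""
        if self.available == 0:
            return False
        self.available -= 1
        self.in_flight += 1
        return True

    def r_rdy(self) -> None:
        """Peer signalled R_RDY: one receive buffer is free again."""
        if self.in_flight == 0:
            raise RuntimeError("R_RDY received with no frame outstanding")
        self.in_flight -= 1
        self.available += 1


def credits_for_distance(km: int,
                         frame_bytes: int = 2148,               # assumed frame size
                         bytes_per_sec: int = 1_600_000_000,    # assumed nominal rate
                         ns_per_km: int = 5_000) -> int:        # assumed fibre delay
    """Rough number of credits needed to keep the link busy for one round trip."""
    round_trip_ns = 2 * km * ns_per_km
    # Frames that can be serialized during one round trip, rounded up.
    # Integer arithmetic avoids floating-point rounding surprises.
    numerator = round_trip_ns * bytes_per_sec
    denominator = frame_bytes * 1_000_000_000
    return max(1, -(-numerator // denominator))
```

With too few credits the sender runs out before the first R_RDY returns and the link sits idle for part of every round trip, which is why the required credit count in this model grows linearly with distance and with link speed.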

Performance is also a key factor in decisions made within the mainframe channel subsystem technology and architecture. Improvements in the internal bus between the channel and channel subsystem, the microprocessor technology used in the channel engine, and higher capacity channel subsystem engines have all contributed to increased throughput and reduced latency of I/O operations. Performance is also a factor in the design of the channel subsystem algorithm used to dispatch work to the channels, and in the design of the internal mechanisms for exchanging control commands and responses between the channels and channel subsystem. Architectural enhancements, such as the Multiple Indirect Data Address Word (MIDAW) facility, improved the performance of long chains of small blocks of data, providing a more efficient means than the earlier ‘data chaining’ architecture.

The most significant growth in both I/O rates and throughput for FICON channels has come from an architectural enhancement known as System z High Performance FICON (zHPF), the topic of the next (and final!) blog entry in this series.

Stay tuned for the final post in this series next week…









