What Goes In Must Come Out
FIFOs are hardware analogs of a sort of reverse stack. Here's
how they work.
Published in Embedded Systems Programming, July 1993
It seems no computer text is complete without a comprehensive
(read "yawn") discussion of simple data structures.
Favorite subjects include stacks, queues and dequeues (sometimes
spelled "deque"). Though these are vitally important
subjects, the chasm that separates hardware and software design
often makes us forget that hardware versions of these concepts
have been around for years.
Few cigar-chomping, whisky-guzzling, grizzled old TTL veterans
may recognize the name "queue", but FIFOs, functionally
the same concept, have been a prime part of data communications
devices for decades. FIFO is yet another example of the wonderful
way we techies torture the English language, converting the adjective
phrase "First In First Out" to the noun FIFO. Drifting
around the Embedded Systems Conference, overhearing scraps of
conversation, I could only think of poor Professor Higgins; our
nounized, acronym-ridden speech could make him abandon his Pygmalion
quest in frustration. "70% chance precip in plains of Spain".
LIFOs, Last In First Out buffers, seem to be rarely used in embedded
hardware. Actually, I can't think of a single example, as the
concept seems to violate the nature of real time systems where
you want to respond right now to an event. I'm sure someone will
read this and prove me wrong. Great!
Of course, we all use LIFOs as software stacks, though without
any unique embedded twist. All modern processors give us some
sort of hardware to deal with stacks easily. Did you know RCA's
ancient 1802 did not? The first ultra-low power micro had no
inherent subroutine mechanism; the user was forced to write a
bit of nasty code to handle calls and returns. I rather like
the system pioneered by the PDP-11, and later adopted by Motorola,
of providing auto increment and decrement addressing modes so
you can use any register as a stack pointer.
In some cases this is a significant advantage. A single stack,
as provided by the 80x86 and other families, is not always enough.
Years ago I wrote a multitasking compiler targeted at embedded
applications (it was a flop); the doubly recursive expression
parser used three stacks to track operands and their modes. The 68000
version was a breeze to code; others were much more trying.
Hardware FIFOs
A surprising number of embedded systems use FIFOs built of hardware,
as opposed to a software implementation. The most common application
seems to be data synchronization. If the processor cannot handle
the incoming data stream, a FIFO may be a logical addition to
queue up the bytes till the CPU is ready for them.
An example is the Compact Disk player. Musical data must be played
at a predefined, exact rate, yet the error correction algorithm,
platter speed variations, etc., all conspire against this. The
FIFOs keep enough data backlogged to ensure that, despite all of
these problems, time domain fidelity is maintained.
Specialty chip vendors, like IDT and Cypress, make a wide variety
of hardware FIFOs in chip form. Interestingly enough, most are
9 bits wide, reflecting their wide usage in the data communications
industry (a byte plus parity). Hardware FIFOs are defined by their
width (typically 9 bits), depth (how many 9 bit "words"
can be queued - typically 1k to 4k), and speed.
Conceptually, a FIFO is nothing more than a dual port RAM whose
addresses are created by internal counters. One counter places
bytes sequentially in the array, while another tracks removal.
Unlike a conventional RAM, you can read from and write to most
FIFOs at the same time.
The devices are quite simple, consisting of data in and out busses
(permitting simultaneous reads and writes), a write line that
clocks data in, a read line to clock data out, and one or more
flag bits indicating FIFO Full, Empty or sometimes even Half Full.
Each data byte is strobed into the device with the write line.
If the FIFO is 1k deep, then, assuming no reads take place, after
1024 write cycles the FIFO Full flag is asserted, and each additional
write generally deletes the oldest data and clocks in the new.
Thus, just like a software FIFO, it keeps only the most recent
1024 data items.
Any read pulls the oldest saved data from the chip. If the device
was full, a single read will clear the FIFO Full flag. In most
applications data is written to and read from the device asynchronously,
so the Full, Empty, and Half-Full flags may toggle erratically,
reflecting the chaotic nature of the synchronization between input
and output devices.
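In software terms, the generic part just described behaves roughly
like this sketch in C - a circular buffer indexed by separate write
and read counters. The names are mine, and the overwrite-the-oldest-
when-full policy follows the generic description above rather than
any particular datasheet (as we'll see in a moment, real parts don't
always work that way):

    #include <stdbool.h>
    #include <stdint.h>

    #define DEPTH 1024u                  /* a "1k deep" FIFO */

    static uint8_t  mem[DEPTH];
    static unsigned wr, rd, count;       /* the two internal counters */

    bool fifo_full(void)  { return count == DEPTH; }
    bool fifo_empty(void) { return count == 0; }

    /* Write strobe: queue a byte. When full, the oldest byte is
     * discarded, so the FIFO holds the most recent DEPTH items. */
    void fifo_write(uint8_t byte)
    {
        if (fifo_full()) {
            rd = (rd + 1) % DEPTH;       /* drop the oldest item */
            count--;
        }
        mem[wr] = byte;
        wr = (wr + 1) % DEPTH;
        count++;
    }

    /* Read strobe: pop the oldest byte; false if nothing is queued. */
    bool fifo_read(uint8_t *byte)
    {
        if (fifo_empty())
            return false;
        *byte = mem[rd];
        rd = (rd + 1) % DEPTH;
        count--;
        return true;
    }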
Different devices behave in startlingly different ways. I've used
the IDT 7100 family extensively, and have found that once you
exceed the chip's depth it does not, in fact, keep only the
newest data; it stops accepting new data entirely.
A close examination of the data sheet shows (always closely examine
the data sheets - it's surprising what you'll find) that once
the FIFO fills you must read it before writing to it - the read
clears the oldest data out, making room for the new byte. In effect,
they've put part of the burden of making a real FIFO on the user.
It's not too difficult to implement, but it requires close attention
to the timing specs.
How do you decide to use a hardware FIFO, as opposed to a purely
software implementation handled in an interrupt service routine?
I generally prefer software solutions, as the recurring costs
(that is, manufacturing costs) are essentially zero, whereas every
piece of hardware detracts from the company's bottom line. I know
of only two situations where a hardware FIFO makes sense in an
embedded system. The first is dealing with high-speed bursts of
data. The second is precise synchronizing of data to time.
Burst Data
High speed data bursts can exceed any CPU's processing capability.
A hardware FIFO is a natural choice if the data arrives (or leaves)
so fast that an interrupt service routine cannot keep up, or would
use too much of the total compute resources available.
But beware! FIFOs are no magic bullet. They are primarily for
dealing with bursts of high speed data. If the processor cannot
keep up with a data stream, then a FIFO will help only if the
line goes idle from time to time. In effect, the FIFO spreads
the data burst out over a longer time frame, giving the CPU a
chance to catch up. Sustained high speed data will defeat the
FIFO every time.
No doubt there is some clever way to compute the required depth
of a FIFO as a function of data rate, available processor time
to read the data from the FIFO, and size of the data burst. I'm
reminded of a great example from elementary calculus: the perfect
Martini is 1 part Vermouth to 5 parts Gin. If a frat house starts
pouring x gallons of Gin per minute into a bathtub with a 1 inch
hole in it, and 3 minutes later starts adding Vermouth at y gallons
per minute, when will the mixture be perfect?
A thorough analysis is probably impossible (of the FIFO problem
- the Martini is left as a student exercise), as it will also
be a function of the other interrupts in the system. If the CPU
is shut down periodically handling lots of high priority external
devices, be sure that the FIFO is deep enough to catch all data
arriving during the interrupts-off period.
Certainly, if the maximum number of bytes per second is greater
than the number the processor can handle, no FIFO will be deep
enough to guarantee that every data byte is read.
I've never resorted to a sophisticated mathematical treatment
of this, as it's awfully hard to quantify the behavior of the
software. Instead, I weigh the parameters and try to get a gut
feel for the likely depth of the queued data. I know - it's crude,
but so far it has been effective. It's important, though, to take
some measurements after the system is complete to see just how
close your estimates are. Look at the Half-Full flag on an oscilloscope
- is it always asserted? This would be a sure sign of marginal
design.
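If you do want a crude first-cut number to sanity-check that gut
feel, a back-of-the-envelope calculation along these lines can help.
The function and its parameters are my own invention - it assumes a
single burst of known size and rate, a worst-case interrupt-off
(blackout) period, and an average rate at which the ISR drains the
FIFO:

    #include <stdio.h>

    /* Rough first-cut FIFO depth estimate - a sketch, not a rigorous
     * analysis. Assumes one burst of burst_bytes arriving at in_rate
     * bytes/sec, the ISR draining out_rate bytes/sec once it runs, and
     * a worst-case blackout_sec during which no reads happen at all. */
    static double fifo_depth_estimate(double burst_bytes, double in_rate,
                                      double out_rate, double blackout_sec)
    {
        double burst_sec = burst_bytes / in_rate;     /* burst duration     */
        double blind_sec = blackout_sec < burst_sec ? blackout_sec : burst_sec;
        double backlog   = in_rate * blind_sec;       /* queued while blind */

        /* For the rest of the burst the backlog grows only if input
         * outpaces output. */
        if (in_rate > out_rate)
            backlog += (in_rate - out_rate) * (burst_sec - blind_sec);

        return backlog;                               /* depth, in bytes    */
    }

    int main(void)
    {
        /* Example: 2000 byte burst at 500 kB/s, ISR drains 200 kB/s,
         * 1 msec worst-case blackout. */
        printf("depth needed: about %.0f bytes\n",
               fifo_depth_estimate(2000.0, 500e3, 200e3, 1e-3));
        return 0;
    }

With the example numbers the answer comes out around 1400 bytes; pad
it generously, pick the next standard depth, and then check the
Half-Full flag on the scope as described above.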
Time Synchronizing
Another important application is precision time synchronizing
of data. Processors cannot measure the exact time of arrival of
data items, as even the fastest interrupt structure has some latency.
In the best of cases, with interrupts always enabled and the data
interrupt given the highest priority, just the uncertainty due
to varying instruction execution times will cause perhaps a microsecond
or more of dither. If the data could arrive during the servicing
of another interrupt, with interrupts off, then this uncertainty
could be much, much longer.
By using multiple FIFOs the processor can use an external time
base to get exact arrival times of the data. Put the data in one
FIFO, and the time base in another. The devices will be organized
logically as one very wide FIFO. The time will be queued exactly
with the data.
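Here's a sketch of how the processor side might drain such a pair,
assuming both FIFOs are clocked by the same external strobe and show
up at memory-mapped ports. The addresses, the status bit, and the
sample structure are all hypothetical:

    #include <stdint.h>

    /* Hypothetical memory-mapped ports for the paired FIFOs. */
    #define DATA_FIFO   (*(volatile uint8_t  *)0x40000000u)  /* data bytes      */
    #define TIME_FIFO   (*(volatile uint16_t *)0x40000004u)  /* time base       */
    #define PAIR_EMPTY  (*(volatile uint8_t  *)0x40000008u)  /* nonzero = empty */

    struct sample {
        uint8_t  data;    /* the received byte               */
        uint16_t ticks;   /* time base value latched with it */
    };

    /* Pull every queued (data, time) pair. Both FIFOs were written by
     * the same external strobe, so reading them in lockstep keeps the
     * pairs aligned. */
    static int drain_samples(struct sample *buf, int max)
    {
        int n = 0;
        while (!PAIR_EMPTY && n < max) {
            buf[n].data  = DATA_FIFO;   /* oldest queued byte       */
            buf[n].ticks = TIME_FIFO;   /* exact hardware timestamp */
            n++;
        }
        return n;
    }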
The reverse is true as well. In the example of playing a Compact
Disk, any time skew in the data provided to the digital to analog
converters will result in signal distortion. You just cannot predict
with any certainty the exact timing of a CPU, which can resemble
the harried executive responding to many different requests, from
interrupts to DMA to refresh generation to irregular instruction
timing. So, the processor can write data to FIFOs, while a precision
time base clocks data out of them. The FIFO then becomes a metronome-like
conductor, taking erratically timed data and playing it back with
a regular beat.
Design Considerations
It's easy to connect a FIFO to a microprocessor, but be careful
to properly use the FIFO's status flags.
Most designs seem to connect FIFO FULL to an interrupt line. Conceptually
simple, this results in the processor being interrupted only after
the entire buffer is full. If a little extra latency causes a
short delay before the CPU reads the FIFO, then an extra data
byte arriving before the FIFO is read will be lost.
An alternative is using the inverse of FIFO EMPTY. A single byte
arriving will cause the micro to read the FIFO. This has the advantage
of keeping the FIFOs relatively empty, minimizing the chance of
losing data. It also makes the biggest demand on CPU time, generating
interrupts with practically every byte received.
I think it makes more sense to connect FIFO HALF-FULL, if the
signal exists on the FIFOs you've selected, to the interrupt line.
HALF-FULL is a nice compromise, deferring processor cycles until
a reasonable hunk of data is received, yet leaving free buffer
space for more data during the ISR cycles.
Some processors do amazing things to service an interrupt, stacking
addresses and vectoring indirectly all over memory. The ISR itself
no doubt pushes lots of registers, perhaps also preserving other
machine information. If the HALF-FULL line generates the interrupt,
then you have the a priori information that lots of other data
is already queued and waiting for processor time. Save overhead
by making the ISR read the FIFOs until the EMPTY flag is set.
You'll have to connect the EMPTY flag to a parallel port, so the
software can read it, but the increase in performance is well
worth it.
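A minimal sketch of that ISR might look like the following, assuming
HALF-FULL drives the interrupt, the FIFO's read port is memory mapped,
and the EMPTY flag appears as a bit on a parallel port. The addresses
and names are hypothetical, and the interrupt vector plumbing is left
to your compiler or startup code:

    #include <stdint.h>

    #define FIFO_DATA    (*(volatile uint8_t *)0x40001000u)  /* FIFO read port */
    #define STATUS_PORT  (*(volatile uint8_t *)0x40001004u)  /* parallel port  */
    #define EMPTY_BIT    0x01u                        /* set when FIFO is empty */

    static uint8_t  rx_buf[2048];       /* wherever the drained data goes */
    static unsigned rx_count;

    /* Triggered by HALF-FULL: we already know plenty of data is queued,
     * so pay the interrupt overhead once and keep reading until the
     * EMPTY flag sets. */
    void fifo_half_full_isr(void)
    {
        while ((STATUS_PORT & EMPTY_BIT) == 0 && rx_count < sizeof rx_buf)
            rx_buf[rx_count++] = FIFO_DATA;  /* each read pops the oldest byte */
    }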
In mission-critical systems it might also make sense to design
a simple circuit that latches the combination of FIFO-FULL and
an incoming new data item. This overflow condition could be disastrous,
and should be signaled to the processor.
Noise
I hate to get back on my noise soapbox, but FIFOs seem to be particularly
sensitive to electronic noise and glitches. Some older FIFOs had
no hysteresis on the write line, so the smallest glitch could
cause erratic extra writes. This was particularly apparent with
wide systems, where several FIFOs were ganged in parallel to achieve
an 18 bit or larger word. A little noise could cause the FIFOs
to get out of sequence, with each one at different fill states.
The 7202 1k by 9 bit FIFOs were particularly subject to input
noise, though newer designs have largely cured these problems.
In one case, a company I know observed a 1 nsec glitch that caused
strange clocking. 1 nsec - that's so fast that most scopes will
not see it. Even though the problem has been largely eliminated
with new silicon, most vendors recommend the use of high quality
multilayer circuit boards and careful transmission line design
practices around the write line.
Conclusion
It's tempting to resort to a hardware FIFO as an alternative to
careful software design. Resist this impulse unless you have a
compelling reason! It's surprising just how often designers add
hardware to make up for either lazy or poor software design.
If you elect to use a software FIFOing scheme, though, model it
like a hardware device. You never know what might get changed
later. Access it via device drivers, perhaps even using a software
interrupt or OS call to queue data in and out. One mantra of the
90's, a fundamental part of object oriented programming, is surely
data abstraction.
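As one sketch of what that abstraction might buy you - the names and
depth here are mine, and the interrupt-control stubs stand in for
whatever intrinsics your toolchain provides - every access funnels
through a pair of driver calls, which is also the natural place to
guard the shared ISR/mainline buffer against reentrancy trouble:

    #include <stdbool.h>
    #include <stdint.h>

    /* Stand-ins for your toolchain's interrupt-control intrinsics
     * (e.g. __disable_irq()/__enable_irq() on many ARM compilers). */
    static void disable_interrupts(void) { }
    static void enable_interrupts(void)  { }

    #define DEPTH 256u
    static uint8_t  q[DEPTH];
    static unsigned head, tail, count;

    /* Driver call used by the receive ISR to queue one byte. */
    bool dev_put(uint8_t byte)
    {
        if (count == DEPTH)
            return false;            /* overflow; the caller decides what to do */
        q[head] = byte;
        head = (head + 1) % DEPTH;
        count++;
        return true;
    }

    /* Driver call used by the application. The critical section keeps
     * the indices consistent even if the receive ISR fires mid-update. */
    bool dev_get(uint8_t *byte)
    {
        bool ok = false;
        disable_interrupts();
        if (count != 0) {
            *byte = q[tail];
            tail = (tail + 1) % DEPTH;
            count--;
            ok = true;
        }
        enable_interrupts();
        return ok;
    }

If a hardware FIFO chip goes in later, only the internals of dev_put()
and dev_get() change; the callers never know the difference.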