Like the IRQ, the FIQ has a single point of entry from the vector table. You must inspect the interrupt controller and branch based on a bit/number to handle the specific FIQ. With several FIQ sources, this tends to negate the advantage of the banked registers, as all the routines would have to share them. It is possible to have one FIQ routine own the banked registers and have the others explicitly save and restore them.
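As a rough illustration of branching on the source number, here is a minimal user-space sketch in C. Everything in it is hypothetical: the names `fiq_dispatch`, `fiq_table`, `ssi_fiq` and `timer_fiq`, the source numbers, and the idea that `pending` is a 32-bit status word read from the interrupt controller. A real handler would read the controller's register and would most likely be written in assembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_FIQ_SOURCES 4u  /* hypothetical number of FIQ lines */
static_assert(NUM_FIQ_SOURCES <= 32, "pending mask is a 32-bit word");

typedef void (*fiq_handler_t)(void);

/* Counters stand in for real device servicing so the sketch runs in user space. */
static unsigned int ssi_count, timer_count;

static void ssi_fiq(void)   { ssi_count++; }
static void timer_fiq(void) { timer_count++; }

/* One slot per source number; NULL means no handler is installed. */
static const fiq_handler_t fiq_table[NUM_FIQ_SOURCES] = {
    [1] = ssi_fiq,    /* hypothetical source number */
    [3] = timer_fiq,  /* hypothetical source number */
};

/*
 * Single entry point. 'pending' models the value read from the interrupt
 * controller's FIQ status register. The lowest-numbered pending source that
 * has a handler is serviced. Returns that source number, or -1 if none.
 */
static int fiq_dispatch(uint32_t pending)
{
    for (unsigned int src = 0; src < NUM_FIQ_SOURCES; src++) {
        if ((pending & (UINT32_C(1) << src)) && fiq_table[src] != NULL) {
            fiq_table[src]();
            return (int)src;
        }
    }
    return -1;
}
```

For example, `fiq_dispatch(0x2)` runs `ssi_fiq` and returns 1, while `fiq_dispatch(0x1)` returns -1 because nothing is installed for source 0. The table lookup is the "pointer on source number" variant; a chain of compare-and-branch instructions does the same job.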
The current Linux FIQ code supports stacking of FIQ handlers, not multiple simultaneous FIQ sources. Your code can use set_fiq_regs() to initialize the FIQ registers. You may assign an interrupt controller base address to one of those registers and have code that inspects the interrupt source and branches to the appropriate handler. Note: the kernel doesn't provide any communication mechanism with the FIQ. You will have to write your own interlocks. I think that the FIFO implementations should be FIQ safe, as well as other lock-free kernel patterns.
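To make the interlock point concrete, here is a sketch of a single-producer/single-consumer ring buffer using C11 atomics. It is not the kernel's FIFO code, only a user-space illustration of the same lock-free pattern. The names `fiq_ring`, `ring_put`, `ring_get` and `RING_SIZE` are made up for the example, and it assumes exactly one producer (the FIQ side) and one consumer (normal kernel context).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u  /* hypothetical capacity; must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0,
              "RING_SIZE must be a power of two");

struct fiq_ring {
    uint16_t slots[RING_SIZE];
    atomic_uint head;  /* written only by the producer (FIQ side) */
    atomic_uint tail;  /* written only by the consumer */
};

/* Producer side: never blocks; reports failure when the ring is full. */
static bool ring_put(struct fiq_ring *r, uint16_t sample)
{
    unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false;  /* full */
    r->slots[head & (RING_SIZE - 1u)] = sample;
    /* Release: the slot contents become visible before the new head does. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns false when there is nothing to read. */
static bool ring_get(struct fiq_ring *r, uint16_t *out)
{
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;  /* empty */
    *out = r->slots[tail & (RING_SIZE - 1u)];
    /* Release: the slot is handed back only after it has been read. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}
```

Because each index is written by only one side, neither side needs a lock or needs to mask the other, which is what makes this kind of structure a candidate for FIQ use.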
Edit: Here is a sample of FIQ in the mainline code. It is an IMX SSI driver.
SSI assembler, Symbol interface, main file.

FIQ is also known as soft DMA. The FIQ latency is very small, which should allow high service frequencies. Usually there is only a single device that needs this attention. If there are several, you can demultiplex in your handler (branch, function call, or pointer indexed by source number). The reason a FIQ is often written in assembler is that anyone using a FIQ implicitly cares about performance. Also, the FIQ will not normally be masked, so the time spent in it increases IRQ latency for the rest of the system. Making the handler faster by coding it in assembler reduces that added IRQ latency.
See also: FIQ-IRQ difference