The PCI Express hard IP block in Xilinx Virtex-5 and later families provides
a Transaction Layer Packet (TLP) interface for the user (FPGA fabric)
side. The TLP interface has a 64-bit data path and runs at a
frequency dependent upon the number of PCIe lanes: 62.5, 125, or 250
MHz. There are separate receive and transmit TLP interfaces, each
with strobes that qualify the packet data on that interface. As on
the parallel PCI bus, these strobes allow either end of the
interface, the core side or the user side, to insert wait states.
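The wait-state behavior described above can be sketched in a few lines
of Python. This is only an illustrative model, not the core's actual
signaling: the names transfer, src_wait, dst_wait, periodic_wait, and
random_wait are hypothetical, and the sketch assumes a word moves only
on a cycle where both sides are ready.

```python
import random

def transfer(words, src_wait, dst_wait):
    """Move words across a two-sided handshake; return (received, cycles).

    src_wait and dst_wait are callables that return True on a cycle
    where that side inserts a wait state (is not ready).
    """
    received = []
    cycles = 0
    index = 0
    while index < len(words):
        src_rdy = not src_wait()
        dst_rdy = not dst_wait()
        # A word is accepted only when both ends are ready in the same cycle.
        if src_rdy and dst_rdy:
            received.append(words[index])
            index += 1
        cycles += 1
    return received, cycles

def periodic_wait(period):
    """Fixed pattern: ready on every period-th cycle, waiting otherwise."""
    state = {"count": 0}
    def wait():
        state["count"] += 1
        return state["count"] % period != 0
    return wait

def random_wait(probability, rng):
    """Random pattern: wait on each cycle with the given probability (< 1)."""
    return lambda: rng.random() < probability
```

For example, transfer(list(range(4)), periodic_wait(1), periodic_wait(2))
models a source that is always ready and a destination that is ready on
every second cycle, so four words take eight cycles.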
TLP Buffering
On the Virtex-5 implementation, the core buffers a maximum of 8
transmit TLPs of each type: posted, non-posted, and completion. This
limit cannot be exceeded, and it is reduced for larger TLPs. The
buffers hold a maximum of three 512-byte or seven 256-byte TLPs for
either posted or completion, and a maximum of eight non-posted TLPs.
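As a rough sketch, user logic or a testbench could track these limits
with a small lookup. The function names transmit_limit and can_queue
are hypothetical, and the table contains only the figures quoted above;
the sketch deliberately refuses to guess a limit for payload sizes the
text does not mention.

```python
# Limits quoted in the text for the Virtex-5 transmit buffers.
MAX_TLPS_PER_TYPE = 8
# Payload size in bytes -> maximum buffered posted or completion TLPs.
LARGE_PAYLOAD_LIMITS = {512: 3, 256: 7}
TLP_TYPES = ("posted", "non-posted", "completion")

def transmit_limit(tlp_type, payload_bytes=None):
    """Return the maximum number of buffered transmit TLPs for a type."""
    if tlp_type not in TLP_TYPES:
        raise ValueError(f"unknown TLP type: {tlp_type}")
    if tlp_type == "non-posted":
        return MAX_TLPS_PER_TYPE
    if payload_bytes not in LARGE_PAYLOAD_LIMITS:
        # Assumption: no figure is given for other sizes, so do not invent one.
        raise ValueError("no limit is given for this payload size")
    return LARGE_PAYLOAD_LIMITS[payload_bytes]

def can_queue(tlp_type, outstanding, payload_bytes=None):
    """True if one more TLP fits, given the count already buffered."""
    return outstanding < transmit_limit(tlp_type, payload_bytes)
```

With seven 256-byte completions already buffered,
can_queue("completion", 7, 256) returns False.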
PCIe Simulation Test
Xilinx provides a PCI Express simulation model, which appears to be
based upon an instance of the hard IP. Out of the box, this model has
a number of shortcomings. First, since it models a serial interface,
it simulates very slowly. In addition, every simulation must run for a
long time at the start just to complete link training in the core.
Second, there is no way to test TLP wait states on both sides of the
TLP interface. This is a major drawback, since wait-state handling is
essentially the complex part of the TLP interface to the core. Third,
other features cannot easily be tested because the core prevents them
from being exercised (such as the Expansion ROM Base Address Register
response, FIFO flow control, and others). Lastly, it isn't a true
pseudo-code model, in that read data is not returned to the model (our
version has been modified to provide this). This means that when the
stimulus code performs a read, it cannot test the returned value and
act on it.
At Verien, we wrote our own TLP pseudo-code behavioral model to test
our PCI Express designs. It generates cycles at the TLP interface
rather than the serial interface, so it is about an order of magnitude
faster and needs no link training. It also allows wait states to be
tested for both the receiver and transmitter, either for a fixed
amount or applied at random. Since read data is returned to the model,
the test vectors can be written much like a diagnostic.
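To illustrate what "written much like a diagnostic" means, here is a
minimal Python sketch. It is not the Verien model: ToyTlpModel,
mem_write, mem_read, and walking_ones_test are hypothetical names, and
a dictionary stands in for the design under test. The point is only
that the read returns data, so the stimulus can compare it and branch
on the result.

```python
class ToyTlpModel:
    """Hypothetical stand-in for a TLP-level behavioral model."""

    def __init__(self):
        self._registers = {}  # stands in for the design under test
        self.log = []         # record of the TLPs that would be issued

    def mem_write(self, address, data):
        """Issue a memory write (posted) of one 32-bit word."""
        self.log.append(("MWr", address, data))
        self._registers[address] = data & 0xFFFFFFFF

    def mem_read(self, address):
        """Issue a memory read and return the completion data."""
        self.log.append(("MRd", address))
        data = self._registers.get(address, 0)
        self.log.append(("CplD", address, data))
        return data

def walking_ones_test(model, address):
    """Diagnostic-style test: write, read back, and act on the value."""
    for bit in range(32):
        pattern = 1 << bit
        model.mem_write(address, pattern)
        value = model.mem_read(address)
        if value != pattern:
            return f"FAIL bit {bit}: wrote {pattern:#010x}, read {value:#010x}"
    return "PASS"
```

Calling walking_ones_test(ToyTlpModel(), 0x10) runs 32 write and
read-back pairs and stops at the first mismatch.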
Please contact us if you have any questions on
this, or to provide feedback. Thank you!