Do not use ripple or gated clocks

From this alteraforum post

Divided clocks and on/off gated clocks are common cases where you can use clock enables or PLLs instead of ripple or gated clocks.

Divided clocks:

You can always avoid using a ripple clock to do a divide-by-n function. If you are writing new HDL, use a PLL or clock enable from the beginning. If you are reusing existing HDL that has a ripple clock, consider changing the design to use a PLL or clock enable.

If the divided clock is fast enough to be driven by a PLL and if a PLL output is available in the design, consider doing the frequency division in a PLL.

If there are synchronous data paths between the full-speed clock domain and the divide-by-n clock domain, skew created by the PLL will be minimized by driving the full-speed clock with a x1 output of the PLL (PLL output at same frequency as input) instead of using the clock input of the PLL to clock the full-speed data registers. With a x1 output of the PLL driving the full-speed clock, the PLL compensation delay in the clock paths will be the same for both the full-speed and divide-by-n domains; the PLL compensation delay will not create skew. If the full-speed clock domain uses the clock input of the PLL, then cross-domain data paths will have clock skew because only the divide-by-n domain will have the PLL compensation delay in the clock paths. Even in that case the PLL implementation is preferred over a ripple clock; the clock skew created by the PLL will potentially have less variation over the range of operating conditions than clock skew created by a ripple clock. (See the post for design guideline #3 for more about clock skew.)

Any divide-by-n function can be implemented with a divide-by-n clock enable instead of dividing the frequency of the actual clock. Clock the registers in the divide-by-n domain with the full-speed clock. Enable the registers with a clock enable that is asserted every nth clock cycle of the full-speed clock. The functional behavior will be the same as having a separate clock domain running at the divide-by-n frequency.
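The equivalence described above can be checked with a small cycle-level model. This is only a sketch in Python, not HDL: the function names are made up for illustration, and it assumes the enable is asserted on the last cycle of each group of n full-speed cycles, aligned with the active edge of the divided clock it replaces.

```python
def enable_pulses(n, cycles):
    """Return a list that is True on every nth full-speed clock cycle."""
    count = 0
    pulses = []
    for _ in range(cycles):
        pulses.append(count == n - 1)
        # Wrap the counter after n cycles, as a divide-by-n counter would.
        count = 0 if count == n - 1 else count + 1
    return pulses


def run_with_clock_enable(data, n):
    """Register clocked on every full-speed cycle, loading only when enabled."""
    captured = []
    for d, en in zip(data, enable_pulses(n, len(data))):
        if en:  # models 'if (enable) q <= d;' evaluated at each clock edge
            captured.append(d)
    return captured


def run_with_divided_clock(data, n):
    """Register clocked by a separate divide-by-n clock (one edge per n cycles)."""
    return [data[i] for i in range(n - 1, len(data), n)]


samples = list(range(12))
print(run_with_clock_enable(samples, 4))
print(run_with_divided_clock(samples, 4))
```

Both functions capture the same input samples, which is the sense in which the two approaches are functionally the same. The model says nothing about timing, skew, or power.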

If you are converting existing HDL from a divide-by-n logic-driven clock, it might be less work to convert to a divided clock driven by a PLL than to convert to a divide-by-n clock enable.

A divide-by-n clock enable has some advantages over doing the frequency division in a PLL: A clock enable does not have a lower frequency limit. A clock enable does not introduce jitter, static phase error, or any other cause of clock uncertainty. A clock enable does not consume a PLL resource. A clock enable does not cause clock skew.

For recommended HDL coding styles using clock enables, see the templates in the Quartus II text editor. In version 7.2, the templates are at “Verilog HDL –> Logic –> Registers” and “VHDL –> Logic –> Registers” in the “Insert Template” dialog box.

For either a PLL or a divide-by-n clock enable, the data paths within the divide-by-n domain have n times the full-speed-clock period for setup. For a PLL, the timing analyzer knows the setup requirement based on the PLL output clock period. For a divide-by-n clock enable, use multicycle exceptions so that the timing analyzer can compute the setup requirement. Set the multicycle setup to n. For TimeQuest, set the multicycle hold to n minus 1; for the Classic Timing Analyzer, set the multicycle hold to n. An example of how to use multicycle exceptions for a clock enable in TimeQuest is mirrored here.
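The multicycle rule above can be written down as a small helper. This is a sketch, not output from any Altera tool: the function names and the register patterns passed in are hypothetical, and the generated lines use only the standard SDC form of set_multicycle_path with -setup, -hold, -from, and -to.

```python
def multicycle_values(n, analyzer="timequest"):
    """Return the setup and hold multicycle values for a divide-by-n clock enable."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if analyzer == "timequest":
        return {"setup": n, "hold": n - 1}
    if analyzer == "classic":
        return {"setup": n, "hold": n}
    raise ValueError("analyzer must be 'timequest' or 'classic'")


def sdc_multicycle(n, src, dst):
    """Build TimeQuest SDC lines for paths between registers enabled every nth cycle.

    src and dst are collection expressions chosen by the caller; the ones
    used below are placeholders, not names from a real design.
    """
    mc = multicycle_values(n, "timequest")
    return [
        f"set_multicycle_path -setup -from {src} -to {dst} {mc['setup']}",
        f"set_multicycle_path -hold -from {src} -to {dst} {mc['hold']}",
    ]


# Hypothetical register pattern for the divide-by-4 domain.
regs = "[get_registers {div4_domain|*}]"
for line in sdc_multicycle(4, regs, regs):
    print(line)
```

The exceptions should cover only the data paths between the enabled registers. The path that delivers the enable itself is still a single-cycle path.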

No matter what the n value is for a divide-by-n clock enable, the clock enable paths from the clock enable source to the data registers have to operate in a single clock cycle.

If you decide to do the divide-by-n function with a ripple clock despite this design guideline, then tell the timing analyzer that the ripple clock is derived from a base clock with a divide-by-n frequency. In TimeQuest use create_generated_clock.

On/off gated clocks:

Instead of gating a clock to stop it, you can use a clock enable.

A clock enable has advantages over a gated clock: A clock enable does not cause clock skew. No special design considerations are necessary to avoid timing hazards like glitches or runt pulses with a clock enable.
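The runt-pulse hazard can be illustrated with a toy waveform model. This is a sketch under simplifying assumptions: time is quantized into quarter-period steps, the gate is a plain AND with the clock, and the gate signal is assumed to change in the middle of a clock-high phase. All names are made up for illustration.

```python
def clock_wave(cycles, half_period=2):
    """Square wave as a list of 0/1 samples, starting with the high phase."""
    wave = []
    for _ in range(cycles):
        wave += [1] * half_period + [0] * half_period
    return wave


def high_pulse_widths(wave):
    """Width, in samples, of every high pulse in the waveform."""
    widths, run = [], 0
    for level in wave:
        if level:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


def rising_edges(wave):
    """Sample indices at which the waveform goes from low to high."""
    return [i for i, v in enumerate(wave) if v and (i == 0 or not wave[i - 1])]


clk = clock_wave(4)
# Control signal that drops at step 5, in the middle of the second high phase.
control = [1] * 5 + [0] * (len(clk) - 5)

# Gated clock: the control signal is ANDed directly into the clock path.
gated = [c & g for c, g in zip(clk, control)]
runts = [w for w in high_pulse_widths(gated) if w < 2]

# Clock enable: the clock is untouched; control is only sampled at clock edges.
enabled_edges = [i for i in rising_edges(clk) if control[i]]

print(high_pulse_widths(gated), runts)
print(high_pulse_widths(clk), enabled_edges)
```

In the gated version the second clock pulse is cut short, which is the kind of runt pulse a real gated clock needs special design care to avoid. In the enable version every clock pulse keeps its full width and the control signal only decides which edges update the register.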

Either a gated clock or a clock enable will reduce power by reducing the toggling of registers, but only a gated clock reduces the power from toggling on the clock network.

Timing closure on clock enable paths:

Clock enable paths have to operate in a single clock cycle. This is the case for a divide-by-n clock enable as well as for a clock enable that provides the functionality of an on/off gated clock.

Because there is a large delay associated with the global buffer, the timing might be better with the clock enable signal using nonglobal routing instead of global even for a high-fan-out clock enable. If the clock enable has a high fan-out, there might be significant interconnect delay from nonglobal routing. However, the significant delay for the global buffer itself might be worse for timing. Usually the biggest advantage of global routing is to minimize skew, and skew does not matter for a clock enable as it does for a clock. Using the “Global Signal” assignment in the Assignment Editor, you can try the clock enable using both global and nonglobal routing to see which has better timing.

Some people fear that it will be hard to meet the timing requirement on clock enable paths for a high-fan-out clock enable that must operate in a single clock cycle. First, even a high fan-out on a clock enable is not likely to be the main culprit if timing closure on these paths is challenging. Second, the clock enable might not have as high a fan-out on a single signal as you would expect from the RTL. Synthesis tools tend to include other logic in the clock enable in addition to what is directly implied by the HDL “if” statement for the RTL clock enable. That’s why you often see a large number of clock enable signals in the “Control Signals” table in the Fitter compilation report.

If you do have a timing problem from the fan-out on a clock enable using nonglobal routing, then replicate the source of the clock enable. There are multiple ways to do this. At one extreme, you let the tools do a brute-force replication without regard to where the clock enable destinations need to be placed. At the other, you do a manual replication in the RTL that groups the destinations according to where they will be placed on the device. Brute-force methods available in the Quartus II software include the “Maximum Fan-Out” assignment in the Assignment Editor and the equivalent maxfan synthesis attribute.

Most likely you will not have a problem meeting the timing requirement on the clock enable paths. Even if you do, the extra work to do something like clock enable replication in the RTL will give you a better design than a ripple or gated clock, especially if you cannot follow design guideline #2 for the derived clock.

Related Information

Most FPGA designs today are largely synchronous circuits controlled by clock networks. These clock networks are typically composed of both externally generated clocks (Absolute Clocks) and internally generated clocks (Gated and Derived Clocks). The focus of this Tech Note is the latter, internally generated clocks. The various types of both Gated Clocks and Derived Clocks are discussed, with recommendations for their implementation in Altera FPGAs, and their timing analysis. Many of the more complex clock networks are difficult if not impossible to time properly in the Classic Timing Analysis engine; however Altera’s TimeQuest Timing Analyzer handles these situations easily when constrained properly. The proper method of designing and constraining internally generated clocks is the main focus of this Tech Note.

Timing_Analysis_of_Internally_Generated_Clocks_in_Timequest_v2.0.pdf (839 KB) Andrew Kohlsmith, 04/24/2012 08:25 PM
