Laboratory tests have been used to measure condom quality since the 1930s, but not until the 1980s did scientists begin investigating the ability of these tests to predict condom failure during human use.
This chapter reviews the six published studies and two of the major unpublished studies that have compared laboratory tests with failure in human use. Ideally, laboratory tests would be able to predict condom failure during human use, and hence, could be used to determine which condoms would be most reliable. However, this research area lacks studies of sufficient size and quality to be able to say definitively which laboratory tests best correlate with failure rates in human use.
The published studies have involved a total of only 551 couples, more than half of them in one study (see Table 6-1). The studies do not reach the same conclusions. Moreover, new information about the importance of latex formulation (stress and strain properties), packaging and lubricants is now becoming available. These factors were generally not considered in the studies reviewed here. All of these studies have looked at condom breakage, although recent research has measured condom slippage as well. The term "condom failure" refers to condom breakage or complete slippage.
At this point, in the absence of definitive answers from research, there is no single indicator of potential failure in human use. The most reliable approach is to use several tests, in combination with the age of the condom. When monitoring condoms stored in the field, FHI now tests condoms that have been stored for more than three years using the package integrity, air burst and lubricant quantity tests. The results of these tests in combination can help assess whether condoms in storage are reliable.
Methodological challenges have limited research in this field. Also, user characteristics and condom design can vary, complicating the research process. Many factors can contribute to deterioration of latex over time, including light, excessive heat, high humidity, poor packaging, and the latex formulation. It is difficult in small-scale studies to measure the relative importance of the variables that influence condom breakage in human use. The reader needs to keep this important caveat in mind as the studies are summarized below.
The results of the six published studies are presented here in chronological order to show how thinking on these issues has evolved. Two major unpublished studies are then reviewed. The chapter concludes with a summary of what is known about the relationship between laboratory tests and performance in human use, and of the research questions that remain open.
It is important to note whether these studies were conducted using condoms from the same lot (also called a batch) or different lots. Condoms within a single lot are assumed to have less variability than condoms from multiple lots. A condom lot can be defined in various ways, using machinery, time of production and/or number of condoms. For example, USAID defines a condom lot as the quantity produced on one production line during a 24-hour period. Lot sizes can vary from 7,200 to more than 500,000 condoms, depending on a manufacturer's ability to control the process.
Artificial Aging
To simulate the impact of natural aging of condoms, laboratories typically age condoms artificially in chambers using heat (70°C). They sometimes use ultraviolet (UV) light or gamma radiation for the aging process, but these approaches are not representative of conditions likely to be encountered in natural aging and hence are not specified in any standards for condom testing. After artificial aging in heat chambers, laboratories subject these condoms to the air burst or tensile test, or both.
The first two published studies comparing laboratory test results to breakage in human use were coordinated by PATH and relied substantially on condoms artificially aged with UV light. In a 1980 study, six participants in the U.S. each received 36 condoms from one condom lot, divided into six groups of six condoms each (total of 216 condoms). Each condom group had been exposed to a different level of UV light. The study used the air burst test to measure the strength of condom samples from each group before distributing the condoms to the participants. In 1980, no international standard included the air burst test, so the study was partly designed to assess the usefulness of this test. All used condoms were returned for inspection. (Free 1980)
Among the condoms that had deteriorated by less than 25 percent based on the results of the air burst test, not a single one broke during human use. While this was a small sample, the study concluded that the air burst test can effectively measure changes in condom strength. It also concluded that the minimum test standard being used in western laboratories was well above the minimum required for effective use; the study did not report the specific standards that were used.
In a 1986 study, PATH used artificially aged condoms as well as condoms aged in the field. The study had 130 participants in Indonesia use seven condoms each. Three of the condoms were new, but two of these three were artificially aged -- one exposed to UV light for five hours, a second for 10 hours. The other four condoms in each group had been naturally aged under typical tropical conditions for about 42 months. All used condoms were returned and additional air burst testing was conducted with the unbroken condoms. (Free 1986)
While all condom groups had high breakage rates during human use, the naturally aged condoms had by far the highest rate: 49 percent. Those aged by 10 hours of UV light had a breakage rate of about 20 percent. If the air burst test correlated with breakage in human use, one would expect better air burst results among the artificially aged condoms since they had the lower breakage rates in human use. However, this was not the case. The air burst inflation results were slightly lower for the condoms aged by UV light than for those stored in the field.
These findings illustrate how difficult it is in a small study to isolate those variables that could explain the breakage rates. The authors gave three possible explanations for the apparent inconsistency in the data. First, there was a different agent of deterioration between the two batches of condoms: heat/humidity among those stored in the field versus ultraviolet light among those artificially aged. Second, there could have been selective deterioration of the condom tip in the UV light-exposed condoms, compared to a more generalized deterioration throughout the length of the condom in the field-aged group. Third, the field-aged condoms had an average 50 mm lay-flat width, a tighter fit than the 52 mm lay-flat width of the other condom batches, which could have led to more breakage.

Cross-Sectional Study
An FHI study published in 1992 tested condoms from 20 lots that differed in age, storage history and manufacturing dates. (Steiner) Four of the lots were new condoms obtained from four different U.S. manufacturers. The remaining 16 lots were from a single U.S. manufacturer and were recovered from warehouses overseas where they had been sent as part of the USAID family planning commodities program. The study used a "cross-sectional" approach, which means that condoms of different ages were collected at one point in time and tested in the lab as well as in human use. This contrasts with a prospective study design, discussed below, which follows condoms for a prescribed period of time and tests them at various points in time.
All of the condoms were sent to the PATH condom laboratory in Seattle, WA, USA, where random samples were evaluated using the tensile and air burst tests. The remaining condoms were then distributed to 300 study participants in North Carolina. Each couple was instructed to use 20 condoms during vaginal intercourse, one from each of the 20 lots, and to record their experience on data forms. The correlations between the lab tests and the human use breakage rates were then calculated.
Of the 300 couples, 262 completed the study, although some of them did not use all 20 condoms; a total of 4,589 condoms were used. Breakage rates ranged from 3.5 percent of new condoms to 18.6 percent for condoms that were almost seven years old and had been stored under adverse conditions.
The study found that the age of the condom, rather than any of the laboratory tests, was the best predictor of failure during human use, with a coefficient of determination (R²) of 0.92. (The closer R² is to 1.0, the stronger the association between the two factors being measured.)
Several laboratory measurements were also closely related to condom breakage in human use. The percent elongation from the tensile test had an R² of 0.81. Two outcomes from the air burst test followed: a measurement called condom quality index (CQI) had an R² of 0.74, and the percent of condoms failing the air burst volume test had an R² of 0.69. CQI is a mathematical treatment of air burst volume data to assess quality of condom stocks in storage. USAID used this measurement, among others, during the early 1990s to assess the quality of condom stock aged in the field.*
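To make the R-squared figures above more concrete, the following minimal sketch computes R-squared for one predictor (lot age) against one outcome (breakage rate), using the squared Pearson correlation for a straight-line fit. The function name and the five data points are hypothetical and invented purely for illustration; they are not the study's data, and the study's own statistical method may have differed.

```python
from math import fsum


def r_squared(xs, ys):
    """R-squared of a least-squares straight line through (xs, ys).

    For one predictor this equals the squared Pearson correlation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R-squared is undefined when a variable is constant")
    return (sxy * sxy) / (sxx * syy)


# HYPOTHETICAL lot-level values, invented for illustration only.
lot_age_years = [0.5, 2.0, 3.5, 5.0, 6.5]
breakage_pct = [3.0, 6.0, 8.0, 13.0, 17.0]

print(round(r_squared(lot_age_years, breakage_pct), 2))
```

A value near 1.0 means a straight line through the lot-level points accounts for most of the variation in breakage rates; it does not by itself show that the predictor causes the breakage.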
In this study, the high correlation of breakage during use with age may have resulted to some extent from the cross-sectional study design. The 16 lots of condoms from overseas storage were manufactured in different years by the same company and could have had different product attributes. During this time span, manufacturers and researchers learned a lot more about the product attributes that affect breakage, including the relationship of stress and strain formulation properties to oxidation and vulcanization, condom thickness, packaging and other issues discussed in Chapter 4. Hence, the very strong association between condom age and breakage rates may be partially due to the improved quality of the more recently manufactured condoms. That is, lower breakage rates of younger condoms may have been a function of more careful attention to the quality of the finished product rather than a function of a shorter aging period.
An additional problem with the cross-sectional study design was that information was not available on the transport and storage conditions of these condoms throughout their life. This limits the knowledge needed to assess all of the factors that might affect the quality of the condoms in storage.
However, the limitations may be outweighed by two strengths of the cross-sectional study design. It ensured that all lots were tested under similar conditions, because the laboratory tests were done at one point in time at one site. Also, having each couple use a condom from each lot allowed for easy control of variation in sexual practices and user characteristics among couples.
Prospective Study
A study published in 1992 by the Population Council in collaboration with FHI used a prospective study design. Condoms from the same lot were tested in the laboratory in 1988 and shortly thereafter used by study participants in Barbados and St. Lucia, two small Caribbean countries. In 1990, the same lot was retested in the laboratory and used by U.S. participants. In each case, 50 male condom users were recruited at each site. (Russell-Brown)
In both 1988 and 1990, the condom lot passed ASTM requirements for tensile strength and percent elongation prior to the human use tests. It narrowly failed the ISO air burst test in 1988 and barely passed in 1990. The breakage rates during human use in Barbados and St. Lucia in 1988 were 12.9 percent and 10.1 percent, respectively. Two years later, the breakage rate was 6.7 percent among the U.S. participants.
The condom breakage rates in both 1988 and 1990 were unexpectedly high. Hence, the study concluded that existing laboratory tests, as used with the current pass/fail standards, were either not sufficiently sensitive or not well-defined enough to reliably predict condom performance during human use. The study pointed out that the air burst tests revealed some failing values while the tensile tests did not. Hence, the study concluded that the air burst test might be a more accurate indicator of condom quality than the tensile test.
The study also concluded that sexual practices and user characteristics may be important factors in determining condom breakage. The higher breakage rates in the Caribbean countries, even though the condoms were two years younger, suggest that user behaviors or other characteristics are important factors. The study called for more research about behavioral differences among couples (see Chapter 3).
A prospective approach has several weaknesses in studying the correlation of laboratory tests with condom use. The variation in sexual practices and characteristics among couples is not controlled for from one time period to the next, because a new cohort of participants is enrolled at each time interval. Thus, any observed change in the failure rate of a condom lot may be due to a difference in the study population rather than to the characteristics of the condoms. Improvements in laboratory testing equipment and test methods may also affect the comparability of test data over the span of the study. Finally, the studies take a long time to conduct, since the condoms are naturally aged.
Alternative Study Designs
During the past decade, PATH, in collaboration with the FDA, conducted a series of studies looking at how well laboratory tests predict shelf life of condoms. (Free 1996) In one of the studies, condoms from a single production lot were stored in warehouses in Pakistan, Thailand and Mexico and sampled periodically for laboratory testing. This aspect of the study incorporated a prospective design; that is, it tested condoms from the same lot at different points in time.
The study compared the mean air burst volume of these condoms, collected prospectively, with the data from the 1992 FHI study, collected cross-sectionally. (Steiner) The air burst volumes of the condoms in the two studies decreased over time in a similar fashion. Although this study did not collect any human-use breakage data, the study nonetheless concluded that the air burst test is a reliable way to detect changes in stored condoms due to shelf vulcanization and oxidation, both of which could reduce resistance to breakage during human use.
A small study published in 1991 used an unusual design to see how the results of the tensile test correlated with human use during anal intercourse. Men attending an STD clinic were provided 48 condoms each and asked to return their first used, unbroken condom and all the condoms that broke. Due to problems with recruitment, only 11 condoms from 10 different men that had been used during anal intercourse could be tested in the laboratory. Six of the condoms were unbroken and five broken. After being gamma-irradiated for infection control, the condoms underwent the tensile test, with results of the broken and unbroken condoms being virtually the same. Although results from such a small study are not statistically significant, the authors concluded that the tensile test was not sufficient to predict strong products. (Gerofi)
Unpublished Studies
In addition to the varying results from the published studies, the results of two unpublished studies by FHI add to the difficulty in reaching a clear conclusion about the predictive value of the laboratory tests.
One study initiated in 1994 examined four lots of condoms of the same age but in two different storage conditions. Two lots were stored in warehouses in Burkina Faso, a West African country with tropical conditions, and two lots were stored in the U.S. under ideal conditions. Of the 100 couples recruited for this study, 76 used one condom from each of the four condom lots provided. The study compared the Condom Quality Index (CQI) and the condom breakage rates among the four lots. It tested the hypothesis that condoms with low CQI values would have higher condom breakage rates. As mentioned earlier, CQI is a numerical rating of condoms derived from a mathematical treatment of air burst volume measurements.
Measured prior to human use, the CQI values were 35 and 41 for the Burkina Faso lots and 60 and 79 for the U.S. lots. One would expect the lot with the lowest CQI to have the highest failure rate. Instead, the lots with CQI values of 35, 60 and 79 all had a failure rate of 5.3 percent, while the lot with a CQI of 41 had a slightly higher rate of 6.3 percent. The study indicated that the CQI alone was not a reliable predictor of condom functionality. (FHI 1994)
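The comparison described above can be laid out as a small worked check. The sketch below uses only the four CQI values and failure rates reported in the text; the variable names, the helper function and the choice of a Pearson correlation are assumptions made for illustration and are not the analysis used in the FHI report. With only four lots, the correlation is descriptive at best.

```python
from math import sqrt

# CQI values and failure rates as reported in the text (FHI 1994).
lots = [
    {"storage": "Burkina Faso", "cqi": 35, "failure_pct": 5.3},
    {"storage": "Burkina Faso", "cqi": 41, "failure_pct": 6.3},
    {"storage": "U.S.", "cqi": 60, "failure_pct": 5.3},
    {"storage": "U.S.", "cqi": 79, "failure_pct": 5.3},
]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothesis under test: the lot with the lowest CQI fails most often.
lowest_cqi_lot = min(lots, key=lambda lot: lot["cqi"])
highest_failure_lot = max(lots, key=lambda lot: lot["failure_pct"])
hypothesis_holds = lowest_cqi_lot is highest_failure_lot

r = pearson_r(
    [lot["cqi"] for lot in lots],
    [lot["failure_pct"] for lot in lots],
)

print(hypothesis_holds, round(r, 2))
```

The lot with the lowest CQI (35) is not the lot with the highest failure rate (the lot with a CQI of 41), and the correlation across the four lots is weak, which is consistent with the study's conclusion that CQI alone was not a reliable predictor.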
A subsequent FHI study sought to assess how well CQI values and other laboratory results predict condom functionality in human use. Three lots of latex condoms, each manufactured in 1990 by a different U.S.-based manufacturer, were evaluated in this study. Each lot was made of a latex formulation specific to its manufacturer. The three lots were divided into five sub-lots and packaged in either plastic or foil and lubricated with silicone or nonoxynol-9. Note that none of the published studies involving human use had attempted to isolate the variables of packaging and lubrication.
These five sub-lots were stored at FHI under ideal conditions and in three different locations in Mexico (Mexico City, Merida and Juarez). At approximately annual intervals, condoms from each sub-lot were sampled for laboratory testing. In addition, 125 couples were recruited annually to use the condoms stored at FHI and in Merida (worst storage condition among the Mexican sites). An analysis of the data after four years found that there was a slight upward trend in the breakage and total failure rates over time. The final report concluded that neither storage condition, packaging nor latex formulation appeared to influence these trends. (FHI 1996)
Conclusion
Which laboratory tests can best predict condom performance during human use? This research area lacks studies of sufficient size and rigor to be able to answer the question adequately. Another recent review of the literature, which included several unpublished studies, also concluded that the available data do not yet present a clear picture. (Enersol nd) While the studies have not provided a definitive answer, they have contributed to the improvement of existing testing standards.
The 1992 FHI study was the first large-scale study that attempted to correlate results from the air burst and tensile tests to breakage rates in use. (Steiner) These two tests were considered the preeminent tests for measuring condom strength. As discussed above, in that study the age of the condom -- not the air burst or tensile tests -- correlated most closely with breakage.
Based on that study, USAID established a policy of monitoring condoms in the field that are two or more years of age and are still stored at central or regional levels of the distribution system. USAID currently assesses the performance of these condoms using the air burst volume measurement, as well as the package integrity and lubricant quantity tests. The published studies generally point toward using air burst volume measurements to monitor the natural aging process. The 1996 PATH laboratory study showed the importance of packaging and lubricants in contributing to the stability of condoms stored in adverse conditions.
Currently, as a practical approach for the field, the combination of age with the air burst, package integrity and lubricant tests is used to assess the quality of condoms in storage. This approach can provide some guidance for what combination of tests can be used on newly manufactured condoms to predict breakage problems. But future research should identify more specifically the predictive value of laboratory tests for newly manufactured condoms.
A greater understanding is emerging of how field conditions and human use interact with the chemical and mechanical properties of condoms. For example, recent laboratory studies and some small studies by manufacturers have focused increasingly on the importance of the stress and strain qualities of the latex formulation (see Chapter 4). The type of packaging, package integrity, and the type and quantity of lubricant have also emerged as critical issues to understand.
None of the studies discussed here has totally corroborated another. This lack of corroboration prevents the standards organizations from reaching consensus on which tests and what performance limits to require for new condoms, and has resulted in various standards and specifications being used throughout the world. Hence, regulatory agencies in many countries have been forced to choose which standard to adopt or to create an independent standard. This situation has become problematic, particularly for donor agencies and international manufacturers that must adhere to the regulatory requirements of different governments.
Another variable is the quality of the laboratories used to test condoms. In many laboratories, condom testing is not the primary function and sometimes does not receive the level of attention needed to generate accurate and precise test measurements. Also, some laboratories have poorly designed and poorly maintained test equipment, with inexperienced or unqualified technicians. Only a few manufacturers of air burst test equipment distribute internationally, and the equipment is usually expensive, requiring skilled personnel and extensive preventive maintenance. Because of these costs, many laboratories are forced to design and construct their own test equipment and use unskilled labor. Testing consistency among laboratories has not improved to an acceptable level although some laboratories are working toward more uniformity through internationally sponsored inter-laboratory studies, including the USAID-sponsored CITNET effort (see Chapter 5) and work by the Enersol group in Australia.
Throughout this chapter, we have discussed the relationship between laboratory tests and condom failure during human use. Even if we could implement laboratory tests that are able to identify the most reliable condoms, this is only one step in the many efforts needed to ensure condom effectiveness on a societal level. Aside from condom quality, there are many other factors that determine the effectiveness of condoms, such as non-use and user behaviors (see Chapter 3). A recent overview of the condom literature developed what it called a "condom effectiveness matrix" to provide a way to consider these broader questions. (Spencer)
The predictive ability of existing tests can be summarized in terms of what we know and what we still need to learn.
What we know:
- Latex condoms will degrade rapidly when not properly protected from heat, moisture or ozone.
- Foil packaging is known to be superior to other packaging materials and is rapidly becoming the standard for latex condoms.
- Latex condoms packaged in foil can remain reliable for use in excess of five years if package seals are not compromised and if the condoms have the proper formulation.
- Taken together, air burst, package integrity and lubricant quantity testing are useful in evaluating the reliability of condoms stored in the field.
- There is no single laboratory test that accurately predicts condom performance. Human use studies conducted thus far have not produced results that prove that any single existing laboratory test can be used as a surrogate for performance in use.
What we need to know more about:
- What are the best laboratory test surrogates to use as predictors of performance in human use?
- How do human sexual behaviors affect condom performance in use?
- If properly packaged in foil, will condoms remain stable for five years (or more) in any environment, regardless of formulation?
- Can laboratory tests predict the impact of variations in sexual behaviors on the condom?
- Can we improve the precision among laboratories? Can inter-laboratory studies be designed to accomplish this?
*The method of calculating CQI is explained in Condom Quality Testing Handbook, produced by PATH.
by Eli J. Carter and Markus J. Steiner
References
- Enersol Consulting Engineers, Draft summary of data linking inflation and clinical breakage. Unpublished paper. Enersol, 1996.
- Enersol/Macfarlane Burnet Study. Unpublished study. Enersol, nd.
- FHI. Study to determine condom breakage in human use of four lots of condoms of the same age but different CQI's -- FHI final report. Unpublished paper. Family Health International, 1994.
- FHI. Condom prospective aging study -- human use component. Unpublished paper. Family Health International, 1996.
- Free MJ, Hutchings J, Lubis F, et al. An assessment of burst strength distribution data for monitoring quality of condom stocks in developing countries. Contraception 1986;33:285-99.
- Free MJ, Skiens EW, Morrow MM. Relationship between condom strength and failure during use. Contraception 1980;22:31-37.
- Free MJ, Srisamang V, Vail J, et al. Latex rubber condoms: predicting and extending shelf life. Contraception 1996;53:221-29.
- Gerofi J, Shelley G, Donovan B. A study of the relationship between tensile testing of condoms and breakage in use. Contraception 1991;43:177-85.
- Russell-Brown P, Piedrahita C, Foldesy R, et al. Comparison of condom breakage during human use with performance in laboratory testing. Contraception 1992;45:429-37.
- Spencer BE. The Condom Effectiveness Matrix -- An Analytical Tool for Defining Condom Research Priorities. New York, NY: SMPF, Inc. French Publications, 1996.
- Steiner M, Foldesy R, Cole D, et al. Study to determine the correlation between condom breakage in human use and laboratory test results. Contraception 1992;46:279-88.