A step-wise process is used to characterize glycans and understand the functioning of a molecule for biosimilar development.
The structures of protein drugs such as monoclonal antibodies are made more complex by post-translational modifications. The most notable of these is glycosylation, where carbohydrate residues are attached to the protein chain.
The variety of ways in which the monosaccharides in a glycan can be linked leads to a large diversity of structures that can be created from a limited number of building blocks. The pattern cannot be predicted from the genetic code, and the exact nature of the glycosylation depends on the cells and the conditions in which they are cultured. This adds a further layer of complexity when proving biosimilarity because, even if the amino acid sequence is the same, the glycosylation pattern may be different, potentially altering the protein’s biological properties. It is, therefore, important to use a cell line that produces an appropriate glycosylation profile or the same glycosylation profile as the reference material.
There are two main types of glycosylation commonly found on glycoproteins: N-glycosylation and O-glycosylation. In N-glycosylation, the sugar residues are attached to the side chain of the amino acid asparagine, the initial step of the process being the transfer of a large, lipid-linked glycan donor to the protein. This immature glycan is composed of three glucose residues, nine mannoses, and two N-acetylglucosamine units. These residues are trimmed back to a core structure and rebuilt using other monosaccharides that are attached via specific glycosyltransferases. This is where much of the variation arises: the transferases available are a function of the genetic profile of the cell. Furthermore, the environment of the cell and the rate of protein synthesis also have an impact on the final glycosylation profile.
The process for the biosynthesis of O-glycosylation is different. Both serine and threonine residues can be glycosylated. First, a specific transferase attaches the monosaccharide N-acetylgalactosamine to the side chain of the amino acid. This is then extended into relatively short structures, typically only three or four monosaccharides long, using glycosyltransferases, some of which are identical to those involved in N-glycan biosynthesis.
The nature of the glycosylation process can result in significant heterogeneity in glycan structures. Nonetheless, the glycans must be characterized in order to understand the functioning of the molecule, assess the outcome of the manufacturing process, and fulfill the requirements of the International Council for Harmonization (ICH) guideline Q6B, Specifications: Test Procedures and Acceptance Criteria for Biotechnological/Biological Products (1).
Determining glycosylation patterns
Glycosylation pattern analysis is a step-wise process. As a simple first step, it may be enough to assess the type and amount of each monosaccharide composing the glycans. Typically, this involves breaking the sample down into its individual monosaccharides, which can be identified and quantified by the retention time and fragmentation pattern using gas chromatography-mass spectrometry (GC-MS).
If this information is combined with amino acid analysis to quantitate the amount of protein in the sample, the levels of different monosaccharides per unit protein can be determined. This is a simple test that gives a lot of information. For example, the presence of the monosaccharide N-acetylgalactosamine suggests the presence of O-glycosylation on the molecule. Fucose and galactose are both indicators of complex type glycosylation patterns. Monosaccharide analysis can also give a purity check: high levels of glucose might suggest the breakdown of Sephadex beads, for example, in the purification process, and high mannose levels may indicate yeast contamination.
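The arithmetic behind this test can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the sample amounts, and the 0.1 mol/mol reporting threshold are assumptions made for the example, not values from any actual analysis.

```python
def per_protein_ratios(monosaccharide_nmol, protein_nmol):
    """Convert measured monosaccharide amounts (nmol, e.g. from GC-MS)
    into mol of monosaccharide per mol of protein."""
    if protein_nmol <= 0:
        raise ValueError("protein amount must be positive")
    return {name: amount / protein_nmol
            for name, amount in monosaccharide_nmol.items()}


def interpret(ratios, threshold=0.1):
    """Return simple interpretive notes; the threshold is a hypothetical cut-off."""
    notes = []
    if ratios.get("GalNAc", 0.0) > threshold:
        notes.append("GalNAc detected: suggests O-glycosylation")
    if ratios.get("Fuc", 0.0) > threshold or ratios.get("Gal", 0.0) > threshold:
        notes.append("Fuc/Gal detected: consistent with complex-type glycans")
    return notes


# Hypothetical measurements: nmol of each monosaccharide in the sample,
# and nmol of protein from amino acid analysis of the same sample.
sample = {"Fuc": 1.8, "Man": 6.1, "Gal": 2.2, "GlcNAc": 8.0, "GalNAc": 0.0}
ratios = per_protein_ratios(sample, protein_nmol=2.0)
notes = interpret(ratios)
```

In practice the interpretation would be made by an analyst rather than by fixed thresholds; the point of the sketch is only that normalizing to the protein amount makes results comparable between samples.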
With this information in hand, the next steps are to study the glycoprotein more closely to determine the structures of the glycans and where they are attached on the protein backbone. The intact glycoprotein must first be processed to make it more amenable to glycan release. To do this, the disulfide bridges between cysteine residues are first split to give free thiol groups. These disulfide bonds are a key element in creating the 3D structure of proteins and preventing them from unfolding; clearly, for further analysis, unfolding the protein from its tight globular conformation is advantageous. This is achieved via reduction, followed by carboxymethylation to block the released thiol groups and prevent refolding.
The next stage is to use a protease to split the protein into smaller peptide units. This digestion is typically done using trypsin, but other enzymes can be used if trypsin is not appropriate. The peptides are then treated with the enzyme PNGase F to cleave the N-glycans from the molecule. On the rare occasion that the protein has been generated in a plant or insect cell line, a different enzyme, PNGase A, must be used instead. The released N-glycans are then purified. Any O-glycans are still present in the peptide fraction; these are then released via chemical reductive elimination and purified.
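The peptides expected from the proteolysis step can be predicted in advance. The sketch below assumes the commonly applied rule that trypsin cleaves on the C-terminal side of lysine (K) and arginine (R) unless the next residue is proline (P); the sequence shown is a made-up example, and missed cleavages are not modeled.

```python
def tryptic_digest(sequence):
    """Split a one-letter amino acid sequence after K or R, except before P.

    Simplified rule (an assumption of this sketch): no missed cleavages
    and no other exceptions are considered.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


# Hypothetical sequence, for illustration only.
peptides = tryptic_digest("AGKPLRNGTKVS")
```

If the predicted peptides are too large or too small to be useful, this is the kind of situation in which a different protease might be chosen.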
Finally, a permethylation procedure is performed to convert all of the free hydroxyl groups in the released carbohydrates into their methoxy derivatives. This leaves released, derivatized carbohydrates, which is advantageous in each of the three mass spectrometry-based analytical techniques used to determine the glycosylation pattern.
In matrix-assisted laser desorption ionization (MALDI)-MS, the use of permethylated glycans facilitates the clear identification of structures that would otherwise be very similar in mass and levels the playing field in terms of ionization, particularly for sialylated species.
Electrospray ionization-MS is used to look at the antennal structures of N-glycans. These are extensions of the core structures produced during biosynthesis. The fragmentation in this technique follows a handful of well-defined pathways, and the methyl groups greatly aid fragment identification. For example, the gal-alpha-gal epitope, which commonly causes immunogenic reactions, gives a signal at m/z 668.
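The signal quoted above can be reproduced from residue masses. The following sketch assumes that the fragment is a singly charged oxonium ion from a permethylated glycan, whose mass is the sum of the permethylated residue masses plus one methyl group on the non-reducing end, less one electron. The residue masses are monoisotopic values calculated from elemental compositions; the function name is hypothetical.

```python
# Monoisotopic masses of permethylated residues (Da), from elemental formulas.
RESIDUE_MASS = {
    "Hex": 204.099775,     # C9H16O5
    "HexNAc": 245.126324,  # C11H19NO5
    "Fuc": 174.089210,     # C8H14O4
    "NeuAc": 361.173669,   # C16H27NO8
}
METHYL = 15.023475    # CH3 capping the non-reducing end
ELECTRON = 0.000549   # removed to give the positive charge


def oxonium_mz(residues):
    """m/z of a singly charged oxonium ion built from the listed residues."""
    return sum(RESIDUE_MASS[r] for r in residues) + METHYL - ELECTRON


gal_gal_glcnac = oxonium_mz(["Hex", "Hex", "HexNAc"])   # Hex2HexNAc antenna
lacnac = oxonium_mz(["Hex", "HexNAc"])                   # Hex-HexNAc antenna
sialyl_lacnac = oxonium_mz(["NeuAc", "Hex", "HexNAc"])   # sialylated antenna
```

Note that the calculation works at the level of composition: any Hex2HexNAc antenna gives the same m/z, so a signal at this value indicates that gal-alpha-gal may be present but does not by itself prove the linkage.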
The final analytical stage is GC-MS to identify how the monosaccharides are linked to one another. For this analysis, the methylated glycans are hydrolyzed to their individual monosaccharides, and following several chemical steps, the newly released hydroxyl groups are acetylated. The GC-MS fragmentation patterns are definitive for identification of how the monosaccharide units are linked together.
Analytical examples
Figure 1. Matrix-assisted laser desorption ionization-mass spectrometry (MALDI-MS) analysis of released permethylated N-glycans.
An example of a MALDI-MS analysis of N-glycans is shown in Figure 1. This compares the MS traces of the N-glycans released from two different glycoproteins: a classic antibody and a secretory glycoprotein. Antibodies produce glycans that are relatively small and simple compared to other glycoproteins because the two sites of glycosylation on the antibody’s heavy chain are relatively inaccessible to the glycosyltransferases. Although the oligosaccharyltransferase will install the precursor, which is then trimmed back, it will only be rebuilt in a limited way. The result is two or three main structures, in this case the biantennary structures evident in the MS trace, with only relatively small masses.
In contrast, the trace for the secretory glycoprotein shows much larger glycans; this is typical of most non-antibody glycoproteins. The glycosylation sites are more exposed in these molecules, making it easier for the glycosyltransferases to create larger and potentially more heterogeneous structures, depending on the repertoire of transferases expressed in the cell. The mammalian cells that are typically used for the culture can produce up to four ‘arms,’ or tetra-antennary structures. These structures can be seen at the high mass (right) side of the upper mass spectrum in Figure 1. Structures with two (biantennary) or three (triantennary) arms can also be formed by the cell, and these can be seen in the middle region of the upper mass spectrum in Figure 1.
While structures appear in the figure next to the peaks, it is unlikely that it would be possible to identify exactly what those peaks represent at this stage. While it should be clear that a particular mass represents, say, six hexose residues and five N-acetylhexosamines, their precise nature is not necessarily readily determined from the data. MALDI data gives compositional clues, allowing possibilities that are biosynthetically feasible to be drawn, but drawing definitive structures requires more data.
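The step from an observed mass to a candidate composition can be illustrated as follows. This is a sketch under stated assumptions: the glycans are permethylated with a free (non-reduced) reducing end, they are observed as singly charged sodium adducts, and the search ranges and the 0.05 Da tolerance are arbitrary choices for the example. All function names are hypothetical.

```python
from itertools import product

# Monoisotopic masses of permethylated residues (Da), from elemental formulas.
RESIDUE_MASS = {
    "Hex": 204.099775,     # C9H16O5
    "HexNAc": 245.126324,  # C11H19NO5
    "Fuc": 174.089210,     # C8H14O4
    "NeuAc": 361.173669,   # C16H27NO8
}
END_GROUPS = 46.041865  # CH3 on the non-reducing end plus OCH3 on the reducing end
SODIUM_ION = 22.989221  # Na+ adduct


def sodiated_mz(composition):
    """m/z of the [M+Na]+ ion for a permethylated glycan composition."""
    residues = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return residues + END_GROUPS + SODIUM_ION


def candidate_compositions(observed_mz, tolerance=0.05):
    """Enumerate compositions whose calculated m/z matches the observed value."""
    ranges = {"Hex": range(3, 10), "HexNAc": range(2, 7),
              "Fuc": range(0, 3), "NeuAc": range(0, 5)}
    names = list(ranges)
    hits = []
    for counts in product(*(ranges[name] for name in names)):
        composition = dict(zip(names, counts))
        if abs(sodiated_mz(composition) - observed_mz) <= tolerance:
            hits.append(composition)
    return hits


hex6_hexnac5 = sodiated_mz({"Hex": 6, "HexNAc": 5})
hits = candidate_compositions(2519.26)
```

As the text notes, a matching composition such as six hexoses and five N-acetylhexosamines says nothing about which hexoses are present or how they are arranged; the search only narrows the possibilities to those that fit the mass.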
Figure 2 shows an example of a fragmentation pathway in electrospray ionization. As the carbohydrates have been permethylated, fragmentation is driven down a limited number of pathways, and one of the main pathways, the formation of the A-type oxonium ion, is depicted. This fragmentation mechanism produces a positive charge on the fragment ion, and this represents the non-reducing end of the molecule at which residues such as N-acetylglucosamine, galactose, fucose, or sialic acid are attached to form the antennae. This fragmentation pathway occurs most readily on the reducing side of N-acetylhexosamine or sialic acid species; it does not readily occur on the reducing side of hexose residues.
By combining electrospray ionization with MS or MS/MS analysis in a quadrupole time-of-flight (Q-TOF) mass spectrometer, information can be gleaned about the different structural antennae that are present, allowing structures for the different combinations identified in the MALDI spectra to be pieced together.
Figure 3. Nanospray mass spectrometry-mass spectrometry (MS-MS) analysis (performed on a quadrupole time-of-flight mass spectrometer) showing the fragment ions obtained from the selected mass. Both A-type fragment ions and subfragments generated by elimination of methanol from the A-type ions are observed.
An example MS/MS trace is shown in Figure 3. The instrument has been tuned specifically to one particular mass, which is fragmented, and the trace shows that fragmentation pattern. The use of selected energies in the source of the mass spectrometer allows non-biased fragmentation of all the glycans that are present, and it allows compositions for antennal structures to be determined. It is still not definitive (whether an N-acetylhexosamine is N-acetylglucosamine or N-acetylgalactosamine is not clear, for example), but knowledge already built up about the biosynthetic pathway can be used to cut down the number of options.
Linkage analysis is used to determine how the monosaccharides in the glycans are attached to one another.
Figure 4 shows the difference between the GC-MS traces of 1,2-linked and 1,3,6-linked mannose. The starting point is the permethylated glycans, which are acid-cleaved to produce monosaccharides with free hydroxyl groups where the links were split. A reduction step converts the hexoses to linear carbohydrate chains to simplify the fragmentation pathway, and if a deuterated reductant is used, C1 in the chain will bear a characteristic deuterium atom in place of a hydrogen atom. This is a useful way of distinguishing the two ends of the chain, particularly if the structure is otherwise symmetrical. Finally, the molecule is acetylated, converting the newly produced hydroxyl groups to acetyl groups and resulting in the formation of so-called partially methylated alditol acetates.
These molecules will fragment down specific pathways in the GC-MS instrument. The methyl and acetyl groups are positioned differently depending on where the original linkages were, resulting in differences in the fragmentation pattern. These represent unique fingerprints for the structures, and the trace can be searched for the key fragmentation ions expected for the different linkages.
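The way the derivatization pattern encodes the linkage can be sketched in code. The model below is a deliberate simplification and rests on several assumptions: the residue is a hexopyranose, so C1 and C5 always end up acetylated; C1 carries a deuterium from the reductant; the chain cleaves between adjacent carbons with the charge retained on a fragment whose carbon at the break bears a methoxy group; and nominal (integer) masses are used. Secondary ions from loss of methanol or acetic acid, and relative intensities, are not modeled. Linked positions are given without C1, so 1,2-linked mannose is entered as [2] and 1,3,6-linked mannose as [3, 6].

```python
def pmaa_substituents(linked_positions):
    """Map each carbon (1-6) of a hexopyranose-derived alditol to 'Ac' or 'Me'.

    C1 and C5 (formerly the ring) and every linked position carry acetyl
    groups; all other positions were methylated before hydrolysis.
    """
    acetylated = {1, 5} | set(linked_positions)
    return {c: ("Ac" if c in acetylated else "Me") for c in range(1, 7)}


def carbon_unit_mass(carbon, group):
    """Nominal mass of one carbon of the chain with its substituents."""
    mass = 12 + 16 + (43 if group == "Ac" else 15)  # C, O, acetyl or methyl
    mass += 2 if carbon in (1, 6) else 1            # terminal CH2 versus CH
    if carbon == 1:
        mass += 1                                   # deuterium in place of H
    return mass


def candidate_primary_ions(linked_positions):
    """Nominal m/z of candidate primary fragments under the simplified rule."""
    groups = pmaa_substituents(linked_positions)
    masses = {c: carbon_unit_mass(c, g) for c, g in groups.items()}
    ions = set()
    for k in range(1, 6):  # break the bond between Ck and Ck+1
        if groups[k] == "Me":
            ions.add(sum(masses[c] for c in range(1, k + 1)))
        if groups[k + 1] == "Me":
            ions.add(sum(masses[c] for c in range(k + 1, 7)))
    return sorted(ions)


two_linked = candidate_primary_ions([2])       # 1,2-linked mannose
branched = candidate_primary_ions([3, 6])      # 1,3,6-linked mannose
```

Even this crude model yields different ion sets for the two linkages, which is the basis of the fingerprint idea; real spectra should be interpreted against authentic standards or reference spectra.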
Figure 4. Linkage analysis by gas chromatography-mass spectrometry (GC-MS): spectra of 1,2-linked and 1,3,6-linked mannose as their partially methylated alditol acetate derivatives. The fragment ion profiles generated are unique to different linkages of monosaccharide. GC/EI-MS is gas chromatography/electron impact ionization-mass spectrometry.
The final piece of the jigsaw is to take the MALDI, electrospray, and linkage data, and use them to construct biosynthetically possible glycan structures that meet all of the spectral requirements. This combined analysis also highlights the potential presence of problematic immunogenic epitopes such as gal-alpha-gal, which can be further confirmed by enzymatic digestion and subsequent MALDI-MS analysis of the products of digestion.
Confirmation of biosimilarity
Clearly, confirming biosimilarity runs deeper than simply characterizing structure, and chromatographic profiling of glycans provides a way to produce a pattern of the glycan population of the glycoprotein of interest. One such technique uses fluorescent tagging of released native glycans with 2-aminobenzamide (2-AB) followed by hydrophilic interaction chromatography (HILIC). Figure 5 shows stacked 2-AB chromatograms for five separate monoclonal antibody N-glycan preparations. The interaction of the glycans with the column matrix depends on the overall structure of the glycans, not just their mass.
The various antennary structures give different retention times, and thus the chromatogram provides a fingerprint profile. If the eluent is first passed through a fluorescence cell, and then into a mass spectrometer, information about the identity can be obtained, as well as the relative quantities of the different glycans in the sample. Chromatograms can then be assessed for similarity.
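One simple way to compare such profiles numerically is to convert integrated peak areas to relative abundances and look at the largest difference between a batch and the reference. The sketch below is illustrative only: the peak labels, peak areas, and the idea of using the maximum absolute difference as the similarity measure are assumptions, and any acceptance limit would have to be set and justified for the product in question.

```python
def relative_abundance(peak_areas):
    """Express each integrated peak area as a percentage of the total."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {peak: 100.0 * area / total for peak, area in peak_areas.items()}


def max_abs_difference(profile_a, profile_b):
    """Largest difference (percentage points) between two profiles.

    A peak missing from one profile is treated as 0% in that profile.
    """
    peaks = set(profile_a) | set(profile_b)
    return max(abs(profile_a.get(p, 0.0) - profile_b.get(p, 0.0)) for p in peaks)


# Hypothetical fluorescence peak areas for a reference and a test batch.
reference = relative_abundance({"G0F": 50.0, "G1F": 35.0, "G2F": 10.0, "Man5": 5.0})
batch = relative_abundance({"G0F": 96.0, "G1F": 72.0, "G2F": 24.0, "Man5": 8.0})
difference = max_abs_difference(reference, batch)
```

Working in relative abundances removes the effect of differing sample loads, so that only changes in the shape of the profile contribute to the comparison.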
Figure 5. 2-AB stacked chromatograms of five separate N-glycan preparations of an antibody analyzed using hydrophilic interaction chromatography. The data demonstrate that reproducible glycan profiles are obtained, which can be used as comparators for other batches.
In summary, glycosylation is a post-translational modification present on many proteins of pharmaceutical significance, and as such must be characterized and monitored closely. MS analysis provides detailed structural information on the composition, antennal structure and linkage, while chromatography can be used to determine a unique glycan profile because of the interaction between different structures and the column matrix.
The use of specific proteolytic digestion procedures, in conjunction with chromatographic separation, allows the isolation of glycopeptides, which can then be analyzed using the techniques above to determine the structures of the glycans at individual sites within the molecule.
These are all important factors in proving biosimilarity. It is not enough simply to know what glycans are present on a protein: their positioning is equally important. The profile built up by careful analysis can be used as a reference standard against which batches can be tested to prove that they are, indeed, biosimilar.
1. ICH, Q6B, Specifications: Test Procedures and Acceptance Criteria for Biotechnological/Biological Products (ICH, March 10, 1999).
All figures are courtesy of the author.