Exactly defined molecular weight poly(ethylene glycol) allows for facile identification of PEGylation sites on proteins | Nature Communications


Nov 13, 2024


Nature Communications volume 15, Article number: 9814 (2024)


PEGylation (the covalent attachment of one or more poly(ethylene glycol) (PEG) units to a therapeutic) is a well-established technique in the pharmaceutical industry to increase blood-residence time and decrease immunogenicity. A challenging aspect of PEGylation is the dispersity of PEGylation agents, which results in batch-to-batch variations and analytical limitations. Herein, we present an approach to overcome these limitations by manufacturing a defined molecular weight (dispersity-free) PEGylation agent. We synthesise a defined molecular weight (Mw), linear 5 kDa methoxy-PEG (mPEG) active ester in an efficient and scalable manner using an iterative liquid-phase approach based on Nanostar Sieving. We then perform a comparative study on the random PEGylation and subsequent characterisation of the protein bovine serum albumin (BSA), using both the defined Mw, dispersity-free mPEG active ester, and a commercially available disperse 5 kDa mPEG active ester. We demonstrate that the defined Mw PEG both allows for facile monitoring of chemical modification reactions during the synthesis of the PEGylation agents, and facilitates straightforward identification of the PEGylated fragments within a PEGylated protein via a simple peptide mapping approach using UPLC-MS.

Since its discovery by Abuchowski et al. in 1977 (ref. 1), the masking of small drugs, proteins or peptides with linear or branched poly(ethylene glycol) (PEG) has become a well-established method for increasing the blood-residence time of therapeutics, and reducing their immunogenicity2,3,4,5,6,7,8,9,10. These effects are a consequence of the extremely hydrophilic character of PEG (resulting in a high hydrodynamic radius), as well as the stealth effect5,6,11. This phenomenon provides a protective barrier for polymer-modified entities by making them less accessible to blood components such as antibodies or enzymes. There are 40 PEGylated pharmaceuticals approved by the FDA (as of 18/12/2023), 31 of which are based on proteins.

Despite the oxymoron inherent in the term “monodisperse”, it is commonly used to describe commercially available PEGylation agents, although all currently used PEGylation agents show some degree of dispersity. The disperse nature of these PEGylation agents results in batch-to-batch variations and difficulties in purification and characterisation of the PEGylated therapeutics8,12,13,14. This is a result of compromised chromatographic behaviour and unclear mass spectrometric analysis. These problems become more pronounced both with the increasing molecular weight of the PEG and with an increasing number of PEGamers (species with varying numbers of pendant PEG molecules) or positional isomers incorporated in the sample. Despite the availability of advanced analytical methods, the exact binding site of the PEG in a PEGylated protein often remains unclear, or PEG-related impurities remain unidentified13,15,16,17.

Even if a protein is selectively mono-PEGylated at its N-terminus, the analytical confirmation is often complicated. An example of these analytical difficulties can be found in the FDA product quality review of the therapeutic Fulphila, a biosimilar of the blockbuster drug Neulasta (pegfilgrastim), approved by the FDA in 2018 (ref. 15). This therapeutic is widely used to treat neutropenia and is composed of the 18 kDa protein filgrastim that is selectively mono-PEGylated with a 20 kDa linear mPEG aldehyde. In the FDA product quality review, the analytical similarities between the original U.S.-licensed drug Neulasta and the new biosimilar Fulphila (MYL-1401H) are compared. It is shown that the average PEG molecular weights of the two samples are not identical, despite 20 kDa PEG being used for both products. Different analytical methods aiming to confirm the presence and purity of the N-terminally PEGylated protein species remain inconclusive, due to the dispersity of the PEG residues, which results in unclear mass spectrometric data and impeded chromatographic behaviour. A series of three different peptide mapping approaches, MALDI-TOF and MS-MS experiments was needed to obtain strong enough evidence to confirm the PEGylation of the N-terminus. Moreover, heterogeneity of the currently used PEGylation agents not only results in batch-to-batch variations, but also presents difficulties during the purification and characterisation of PEGylated proteins.

Using defined PEG instead of the currently used disperse derivatives would overcome these analytical problems and could have a substantial impact on the overall purity and quality of PEGylated therapeutics. Furthermore, we postulate that therapeutics PEGylated with defined PEG could potentially exhibit an improved biodistribution profile compared to the disperse derivatives. The correlation curve of PEG half-life in blood and the PEG molecular weight is particularly steep between 0 and 40 kDa18, which is in the range of most reported PEGylation agents. This suggests that using defined PEGylation agents could result in a decreased volume of distribution, which might have a significant impact on the dosing controllability, and the efficiency of the PEGylated therapeutic.

Because of this, there has been substantial interest in producing dispersity-free PEGs. PEG is conventionally synthesised via the anionic ring opening polymerisation of ethylene oxide, which results in a Gaussian distribution of PEG chains of different lengths19. One approach to generating dispersity-free PEGs is to chromatographically purify disperse PEG, but the limits of this approach arise at molecular weights where it becomes difficult to separate Egn from Egn+1. For example, low-pressure sample displacement chromatography has been used to increase the homogeneity of low-molecular weight, disperse PEG, thereby reaching PEG of high chain length purity20,21. The longest defined Mw PEG derivative reported to be available is a methoxy-PEG alcohol with 45 Eg repeating units (about 2000 g mol−1)22.

A second approach is to iteratively couple dispersity-free PEG oligomers (Egn) to each other. Different iterative approaches have been developed, with the early attempts exclusively relying on bidirectional growth methods resulting in homofunctional products up to Eg48 (refs. 23–30). Because most PEGylation applications require heterobifunctional PEGs of higher molecular weight (5 kDa and higher), an increased focus has been placed on the development of desymmetrisation techniques and unidirectional growth methods30,31,32,33,34,35,36,37,38,39,40,41. In 2014 our group presented a unidirectional synthesis approach, in which PEG chains were iteratively grown from a 3-armed, benzylic nanostar (Hub) using an Eg8 building block. At the time, this was the longest defined, heterobifunctional PEG derivative (Eg56), and was synthesised in a 12% overall yield on a gram scale33,34. Alternatives to the conventional orthogonal protecting group chemistries have been developed, for example, an approach based on macrocyclic sulfates, resulting in the longest yet defined heterobifunctional PEG with 64 ethylene glycol repeating units37. However, these approaches rely on tedious chromatographic techniques for separating the growing PEG molecule from the reaction debris after each chain elongation step. Problems include low yields, increasing separation difficulties with increasing chain length, and poor scalability.

To synthesise defined, heterobifunctional PEG building blocks up to Eg16 in high overall yield (62% for HO-Eg16-OTs), Kinbara et al. recently presented a method solely based on extraction, which has the potential to be deployed at an industrial scale39. Nevertheless, this method is only applicable up to a chain length of 16 Eg units and therefore only allows for the large-scale synthesis of shorter building blocks. Different solid-phase approaches have also been published35,40,41. The most recent approach was an automated process that combined a peptide synthesiser, Wang resin and a base-labile protecting group41. This reduced the overall workload for every chain extension step but has only limited scalability, and slow kinetics due to the solid phase reactions.

An approach based on organic solvent nanofiltration (OSN) has previously been used to synthesise low molecular weight, multi-functional, sequence-defined polymers via a combination of liquid-phase synthesis and molecular sieving using poly(benzimidazole) (PBI) membranes42. This method involved the use of small, PEG-based building blocks (up to Eg4) and produced polymers up to a weight of 2.5 kDa. A similar OSN-based method was shown to be effective in producing highly pure peptides in the liquid phase, also resulting in relatively low molecular weights (up to ten amino acids)43.

Recently, Wang et al. have presented a solid-phase process to synthesise defined Mw oligo(ethylene glycol) polyamides with a Mw higher than 4 kDa, which have PEG-like characteristics, but do not exhibit a pure ethylene glycol backbone44.

To the best of the authors' knowledge, there is no work yet that investigates the potential benefits of using defined PEGylation agents in comparison to the currently used disperse derivatives. This is because PEGylation agents require a molecular weight of at least 5 kDa, whereas the viable synthesis routes for defined PEG have only succeeded in producing defined PEG with a molecular weight of less than 2.5 kDa. Therefore, in this work we describe the iterative, chromatography-free synthesis of a defined Mw (dispersity-free), 5 kDa methoxy PEG-alcohol (mPEG-OH), employing Nanostar Sieving. We also show that a similar synthesis using chromatographic separation results in only half the yield of the membrane-based approach, precluding advancing the synthesis above 5 kDa via chromatography.

The defined 5 kDa mPEG-OH was converted into an mPEG-succinimidyl propionate (mPEG-SP5135, Mw = 5135.11 g mol−1), a PEGylation agent active towards primary amine groups. This was subsequently used to randomly PEGylate native bovine serum albumin (BSA, Fig. 1). The same reaction was performed with a commercially available, disperse derivative. The mono-PEGylated derivative (PEGamer) was then isolated from both reaction mixtures using size exclusion chromatography and its purity was confirmed via MALDI-TOF. Both mono-PEG BSA samples were enzymatically digested with trypsin, resulting in a mixture of peptide and PEG-peptide fragments, which were separated and analysed using UPLC-MS. The chain length homogeneity of the defined PEG allowed for separation and exact mass determination of the different PEG-peptide fragments. The exact sequences of the PEGylated fragments were identified by calculating the differences between the molecular weights of PEG-peptide fragments and the attached PEG residue and comparing those masses with a peptide mapping online database (ProteinProspector v 6.4.5). These results show that defined PEGs are a powerful tool to easily identify the PEGylated regions within a protein using relatively straightforward techniques. Furthermore, the general characterisation of the PEGylation agent, as well as the PEGylated therapeutic, is more straightforward than for disperse agents. Together these findings can be used to increase the overall quality of PEGylated therapeutics.

a Conventional PEGylation: uses disperse PEGylation agent. Random PEGylation of the primary amine groups of BSA, using disperse 5 kDa linear methoxy PEG-succinimidyl propionate (mPEG-SP) (i). Size-exclusion chromatography (SEC) to isolate the mono-PEG BSA isomers (ii). Digestion: enzymatic digestion of the mono-PEG BSA results in a mixture of PEGylated and un-PEGylated peptide fragments (iii). The different peptide/ PEG-peptide species are then separated/ analysed via UPLC-MS (iv). In case of the disperse PEG the resulting data cannot be interpreted. b In this work a defined, 5 kDa PEGylation agent is used to randomly PEGylate BSA. After steps (i–iii), in-line ESI-MS allows for extraction of the masses for individual PEG-peptide fragments. The masses of the PEGylated peptide fragments can be calculated and (equivalent to a standard peptide mapping procedure) these can then be compared to an online digestion database to identify the PEGylated fragments.

The Nanostar Sieving approach for the synthesis of defined, linear 5 kDa PEG used a three-arm benzylic hub molecule42, with an Eg28 oligomer attached to each arm via a cleavable linker. Subsequent synthesis of the defined PEG 5 kDa (Eg112) was based on the repetition of a three step cycle, consisting of: chain extension (Fig. 2a, i), deprotection (Fig. 2a, ii) and diafiltration (Fig. 2a, iii). The chain extension reaction was based on a Williamson etherification reaction to couple a single PEG-based building block (of length Egn) per cycle to the growing PEG chain, which is attached to the 3-armed, benzylic hub. An Eg28 oligomer was used as the building block as this was the highest Mw of defined PEG that was commercially available; we could equally have started with an oligomer for which n < 28, but this would have required a greater number of iterative steps to reach n = 112. Using the Eg28 building block, three cycles were necessary to go from Hub-Eg28-OH to the final length of Hub-Eg112-OH. Following methylation (Fig. 2b, v) and cleavage (Fig. 2b, vi) the corresponding defined 5 kDa mPEG-OH (Mw= 4965.98 g mol−1, mPEG-OH4966) was reached. Clean and complete reactions are crucial for producing a high purity final product, since any impurities that are formed via reaction and which are incorporated into the growing PEG-nanostar cannot be removed through membrane filtration. The building blocks were equipped with a good leaving group at one end, to allow for rapid and complete reaction with the terminal -OH. Furthermore, a stable protecting group was added to the other end, which withstands the chain extension conditions to avoid higher-molecular weight addition impurities ( + Egn), but which can be removed after complete chain extension without any side reactions. In our work, p-toluene sulfonate (Tos or LG in Fig. 2) was chosen as the leaving group and 4,4’-dimethoxytrityl (Dmtr or PG in Fig. 2) as the protecting group. 
The latter can be easily removed using dichloroacetic acid. Incomplete deprotection at any stage would reduce chain-length purity due to deletion impurities (-Eg28, -Eg56, -Eg84). Chemical modifications performed on the PEG-nanostar (i.e., the chain extension or deprotection) could be followed through UPLC-MS, because the nanostar has a unique absorption maximum at 290 nm and the PEGs have a defined mass which enables identification through UPLC-MS. It is important to note that the building block (DmtrO-Eg28-OTs) can react with itself to form a dimer species (DmtrO-Eg56-ODmtr) due to the inevitable ingress of traces of water during etherification. This was present as the corresponding diol after deprotection and is the largest (and hence the separation-limiting) species that must be removed during diafiltration. Therefore, the chain extension reaction was monitored through UPLC-MS (example shown in Supplementary Figs. 27 and 28) and quenched as soon as it reached completion, to keep the amount of dimer impurity as low as possible.
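The cycle count and the quoted molecular weight can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes rounded standard average atomic masses, and the function names are hypothetical rather than taken from the paper.

```python
# Rounded standard average atomic masses in g/mol (assumed values).
C, H, O = 12.011, 1.008, 15.999

EG_UNIT = 2 * C + 4 * H + O   # one -CH2CH2O- repeat unit
END_GROUPS = C + 4 * H + O    # CH3O- plus the terminal H (formally CH3OH)


def cycles_needed(start_units: int, target_units: int, block_units: int) -> int:
    """Chain-extension cycles needed when each cycle adds one building block."""
    missing = target_units - start_units
    return -(-missing // block_units)  # integer ceiling division


def mpeg_oh_mass(n_units: int) -> float:
    """Average molecular weight of CH3O-(CH2CH2O)n-H."""
    return n_units * EG_UNIT + END_GROUPS


print(cycles_needed(28, 112, 28))      # cycles from Hub-Eg28-OH to Hub-Eg112-OH
print(round(mpeg_oh_mass(112), 2))     # compare with the 4965.98 g/mol in the text
```

Under the same assumptions, `cycles_needed(14, 112, 14)` returns 7, which illustrates why a shorter building block would have meant more iterations.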

a Chain elongation cycle consisting of the coupling of the building block (DmtrO-Eg28-OTs) to the free nanostar-PEG alcohol (chain extension, (i)), removal of the protecting group (deprotection, (ii)) and diafiltration (iii). Repeating this cycle three times results in the nanostar-Eg112-OH species. b The terminal hydroxyls are then methylated (v) and subsequently the PEG chains are cleaved from the nanostar (vi) to obtain defined mPEG-OH with a molecular weight of 4966 Da (mPEG-OH4966).

Due to their high solvent permeance, high stability and high separation factors, a range of DBX (α,α′-dibromo-p-xylene)-crosslinked PBI membranes were screened to investigate their suitability for the defined PEG synthesis. As previous work showed, the selectivity of PBI membranes can be varied by altering the concentration of PBI in the dope solution45. To select the best membranes for the different separation steps, PBI membranes synthesised with different concentrations of PBI in the dope solution were screened. Additionally, crosslinked PBI membranes modified with 2 kDa poly(propylene glycol) (PPG) chains are known to exhibit improved anti-fouling behaviour in methanol, an inexpensive, low-boiling solvent, which can easily dissolve all PEG-based species present in the reaction mixtures45. Hence, PPG-modified, crosslinked PBI membranes with methanol as the solvent were chosen for the diafiltrations in this work.

For the final membrane-based synthesis two diafiltration configurations were used. For the first two cycles the rejection (calculated according to Supplementary Equation 1) of the nanostar-PEG species was relatively low, at only 90 %, and hence a two-stage system (Supplementary Fig. 1) was used, with the permeate from the first stage being further filtered in a second stage to recycle the nanostar-PEG species back to the first stage. This configuration has been shown to increase the diafiltration yield significantly46. This resulted in yields of 87 % and 88 % for the first and second cycle respectively (Fig. 3a) using a PBI membrane cast at 16 wt% PBI in the dope solution. Due to their increased size the third (Hub(Eg84-OH)3) and fourth (Hub(Eg112-OH)3) intermediates allowed for purification using the 16 wt% membrane in a single stage system, which significantly decreased the number of diavolumes required for complete purification. The synthesis was performed on a multi-gram scale and resulted in an overall yield of 63 % with 8.3 g of final Hub(Eg112-OH)3 after four iterations of the cyclic process starting from the hub-tribromide.
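A rough sketch shows why a product rejection of 90 % is costly in a single stage. It uses the textbook model for ideal, well-mixed, constant-volume diafiltration, in which the fraction of a solute left in the retentate after N diavolumes is exp(-N(1 - R)). This model is an assumption here, not the authors' Supplementary Equation 1, and the impurity rejection of 30 % is a hypothetical value chosen only for illustration.

```python
import math


def fraction_retained(rejection: float, diavolumes: float) -> float:
    """Fraction of a solute left in the retentate (ideal constant-volume model)."""
    return math.exp(-diavolumes * (1.0 - rejection))


def diavolumes_for_removal(rejection: float, residual_fraction: float) -> float:
    """Diavolumes needed until only `residual_fraction` of a solute remains."""
    return -math.log(residual_fraction) / (1.0 - rejection)


product_rejection = 0.90    # value quoted in the text for the early cycles
impurity_rejection = 0.30   # hypothetical, for illustration only

# Wash until 1 % of the impurity is left, then see how much product survived.
n = diavolumes_for_removal(impurity_rejection, 0.01)
single_stage_yield = fraction_retained(product_rejection, n)
print(f"{n:.1f} diavolumes, product retained: {single_stage_yield:.0%}")
```

In this idealised picture roughly half of the product is lost to the permeate, which is the loss that a second stage recycling the permeate is meant to recover.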

a Yields for every cycle of the membrane- and chromatography-based syntheses of Hub(Eg112-OH)3. * DF limiting species defined as the biggest species to remove during the diafiltration (having the highest rejection after the product); **The first step in the chromatography-based synthesis needed two separate purification batches due to column loading constraints, hence yields are given for each batch. b MALDI-TOF MS of mPEG-OH4966 synthesised via the membrane-based approach in this work (top) and commercially available disperse mPEG-OH (bottom). c Functionalisation reaction to obtain UV-active derivative 4. d UV-chromatograms (254 nm, 15 cm C18 column, 30 min 20–95% B, A: H2O + 5 mM NH4OAc, B MeCN: MeOH 4:1) of 5 kDa MeO-Eg112-p-methoxybenzoate derived from membrane-based synthesis (black trace) and chromatography-based synthesis (red trace); peak P: MeO-Eg112-p-methoxybenzoate (4), Peak I1: -Eg28 impurity (≙ MeO-Eg84-p-methoxybenzoate), Peak I2: +Eg28 impurity (≙ MeO-Eg140-p-methoxybenzoate), Peaks I3 + I4: unidentified chemical impurities. e Summary of yields and purities for chromatography-based and membrane-based synthesis of MeO-Eg112-OH; the chemical and chain length purities were calculated based on Supplementary Equations 2 and 3.

The final Hub(Eg112-OH)3 was methylated with iodomethane and cleaved from the nanostar using BCl3 to release the desired MeO-Eg112-OH (mPEG-OH4966), which can be functionalised at its hydroxyl-end to provide active PEGylation reagents. MALDI-TOF characterisation demonstrated the product’s unprecedented chain length purity compared to the currently marketed disperse derivatives (Fig. 3b).

An equivalent synthesis (starting from the same amount of starting material) was performed using automated reversed-phase chromatography for the purification in each cycle, whilst keeping all chain extension and deprotection conditions the same. The final MeO-Eg112-OH (mPEG-OH4966) was then analysed for its purity in direct comparison with the same material produced from the membrane-based process. Despite MALDI-TOF being widely used for determining a final polymer’s chain length purity, it is not a quantitative method. Even though the available UPLC-MS system was equipped with an ELSD detector, this could not be used to quantify the PEG polymers of different chain lengths, in their unfunctionalised form (mPEG-OH). This is because the ELSD signal is non-linear across molecules of different sizes, so a calibration with a pure sample of every chain length would have been required. Hence, we employed UPLC-UV instead, using the MS trace for identifying each species, and the UV trace for their quantification. The two MeO-Eg112-OH samples (derived from the membrane-based synthesis vs. from the chromatography-based synthesis) were functionalised with anisoyl chloride to make them UV-active (Fig. 3c) and hence detectable using a UV-Vis detector. A method was developed for a 15 cm long C18 UPLC column, which resulted in successful separation and relative quantification of the MeO-Eg112-anisoyl derivative 4 and the respective impurities, including +/− Eg28 species. -Egx impurities resulting from Eg27 in the Eg28 starting material, or potential unzipping of the PEG chain cannot be separated and hence were not quantified. It should be noted that each peak showed a clear underlying MS trace (ES+) due to the uniformity of the PEG. This allowed for identification of the exact nature of all respective species.

The M-Eg28 species, derived from either incomplete chain extension or deprotection, was detected for both samples (Fig. 3d, Peak “I1” corresponds to MeO-Eg84-X, shorter derivatives were not detected). Also, the M + Eg28 species (Fig. 3d, Peak “I2” corresponds to MeO-Eg140-X, longer derivatives were not detected) was detected for both samples, which can be attributed to small amounts of TsO-Eg28-OH present in the building block. From this, the chain length purities (in terms of +/− Eg28 species) were calculated as 97 % for the chromatography-based synthesis and 96 % for the membrane-based synthesis. The samples derived from the membrane-based synthesis showed two more peaks at higher retention times, which could not be identified (Fig. 3d, Peak “I3” and “I4”). It is suspected that they are derived from side reactions on the PEG chain end, that result in impurities with a reverse-phase retention behaviour significantly different to the Hub-PEG-alcohol species. This would result in removal during chromatographic purification, whereas they cannot be removed by membrane-purification due to the relatively similar sizes of the different PEG-nanostar species. Due to this the overall purity for the membrane-based process is slightly lower (95.5%) than for the chromatography-based process (98.2%). However, with the membrane-based process an overall yield of 63% was achieved, which is a significant improvement compared to the synthesis relying on chromatography (33%). The improved yield achieved through the Nanostar Sieving process is highly significant, as the main material cost in the production of the 5 kDa PEG is the Eg28 building block. A higher yield, i.e., more output for the same input, loosely translates to lower cost and better sustainability. Thus, obtaining a high yield is crucial to ensure the overall viability of the process. 
This becomes even more important in our ongoing research on synthesising defined PEGylation agents beyond 5 kDa, to eventually cover the entire range of PEGylation agents up to 60 kDa.
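The cost argument can be made concrete with a back-of-the-envelope calculation. The sketch assumes that the amount of starting material consumed per unit of product scales inversely with the overall yield; this is a simplification that ignores reagent excess and recycling, and the function name is hypothetical.

```python
def feed_per_unit_product(overall_yield: float) -> float:
    """Starting material needed per unit of product, relative to a 100 % yield."""
    return 1.0 / overall_yield


membrane_yield = 0.63        # overall yield, membrane-based synthesis (from the text)
chromatography_yield = 0.33  # overall yield, chromatography-based synthesis (from the text)

ratio = feed_per_unit_product(chromatography_yield) / feed_per_unit_product(membrane_yield)
print(f"Chromatography route needs about {ratio:.1f}x the input per unit of product")
```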

The final mPEG-OH4966 derived from the membrane-based synthesis was then modified at its hydroxy terminus to synthesise the mPEG-SP5135 derivative (Fig. 4a, (2)) which is reactive towards the primary amine groups within a protein.

a Synthetic route from the starting material mPEG-OH4966 (3) using Ethyl 3-bromopropionate (EBP) to reach the corresponding ester (5), which was then hydrolysed with NaOH to obtain the acid (6), which was finally reacted with N-hydroxysuccinimide (NHS) to provide the desired active ester (2). b UPLC-ELSD traces of the resultant defined mPEG-SP5135 (top) and the commercially available disperse derivative (bottom) (15 min 29-95 % B gradient with A: H2O + 5 mM NH4OAc, B: MeCN/MeOH 4:1). The defined sample shows 4 peaks (I1: traces of starting material mPEG-OH4966, I2: hydrolysed product/ carboxylic acid, P: product (mPEG-SP), I3: NHS-breakdown product). For the disperse sample peak I1 is broad and peaks I2 and P are not separated. c For the defined sample the underlying ESI+ traces are consistent with the expected mass spectrum, e.g., (shown) the m/z series 1301.5, 1044.8, 873.7, 751.6, 659.8 corresponds to a Mw of 5135.11, which is the mPEG-SP5135 (1301.5 = [M + 4 NH4]4+, 1044.8 = [M + 5 NH4]5+, 873.7 = [M + 6 NH4]6+, 751.6 = [M + 7 NH4]7+, 659.8 = [M + 8 NH4]8+). For the disperse sample the determination of a precise molecular weight is not possible.

In a first step, the mPEG-OH4966 (Fig. 4a, (3), MeO-Eg112-OH) was reacted with ethyl 3-bromopropionate (EBP) to reach the corresponding ester (5). It must be noted that this reaction was unsuccessful (with a maximum of 40% conversion), when performed with 5 eq. of sodium hydride and adding the EBP at room temperature. This was likely due to the EBP undergoing an elimination reaction, resulting in formation of the acrylate. Adding only 2.1 eq. of sodium hydride to the PEG starting material first, and then adding the EBP (in excess) at 0 °C resulted in full conversion to the PEG-ester (5). During the synthesis the defined nature of the PEG allowed for clear monitoring of the different reaction steps via simple UPLC (equipped with an ultraviolet diode array detector, an evaporative light scattering detector and an electrospray mass spectrometer) and for straightforward identification and quantification of any intermediates or side-products; for example, in the final UPLC-ELSD chromatograms (Fig. 4b) and in the corresponding MS traces of the defined mPEG-SP5135 (Fig. 4c) the product and the two impurities were clearly distinguishable based on their distinct m/z patterns and the corresponding masses (Peak I1: MeO-Eg112-OH and Peak I2: the hydrolysed NHS-ester/carboxylic acid). Baseline separation of the peaks allowed for quantification via integration of the ELSD trace. In the case of the disperse derivative, the mass-distribution resulted in poor separation, as well as an un-analysable ESI+ trace.
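The ammonium-adduct series reported for mPEG-SP5135 can be reproduced approximately from the stated molecular weight. The sketch below assumes an NH4+ adduct mass of about 18.04 and a repeat-unit mass of about 44.05; both are rounded values that do not come from the paper, and the helper names are hypothetical.

```python
NH4 = 18.04      # approximate mass of one NH4+ adduct (assumed, rounded)
EG_UNIT = 44.05  # approximate mass of one ethylene glycol repeat unit (assumed, rounded)


def adduct_mz(neutral_mass: float, charge: int, adduct: float = NH4) -> float:
    """m/z of the ion [M + z NH4]z+ for a neutral mass M."""
    return (neutral_mass + charge * adduct) / charge


def ladder_spacing(charge: int) -> float:
    """m/z gap between neighbouring chain lengths of a disperse PEG at one charge state."""
    return EG_UNIT / charge


M = 5135.11  # mPEG-SP5135, from the text
reported = {4: 1301.5, 5: 1044.8, 6: 873.7, 7: 751.6, 8: 659.8}  # from the Fig. 4c caption

for z, observed in reported.items():
    print(z, round(adduct_mz(M, z), 1), observed)
```

For a disperse sample, every charge state additionally carries a ladder of peaks spaced `ladder_spacing(z)` apart, which is consistent with the overlapping, hard-to-assign ESI+ trace described above.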

We then undertook a comparative study on the application of the defined PEG derivative synthesised in this work and a commercially available disperse derivative in the PEGylation of a protein, with the emphasis on the purification and characterisation of the resultant PEGylated protein. The 66.5 kDa protein bovine serum albumin (BSA) was chosen as a widely available test candidate for random PEGylation on its primary amine residues. Random PEGylation generally results in a heterogeneous mixture of species with different numbers of PEG molecules bound to the protein (PEGamers), with every PEGamer usually having different positional isomers. It is of great interest in pharmaceutical research to fully separate and characterise these positional isomers in terms of their PEGylation site, in order to predict potential interactions of the PEG with the protein's or enzyme's bioactivity and to control batch-to-batch variations. This is hindered by the PEG dispersity, which makes the separation of the different PEGamers more difficult and mass-related analytics inconclusive.

For this work, BSA was randomly PEGylated with the previously synthesised defined 5 kDa mPEG-SP5135 and a commercially available disperse derivative. The mono-PEG BSA (identification via MALDI-TOF MS, Fig. 5a) for both samples was isolated via size exclusion chromatography.

a Size-exclusion chromatogram of PEGylated BSA (right) and the MALDI-TOF spectrum of the fraction of mono-PEG BSA (left). b TIC trace of UPLC chromatogram of tryptic digest samples of mono-PEG BSA, PEGylated with a disperse (bottom, red) and defined (top, blue) 5 kDa mPEG-succinimidyl propionate. Examples are shown for the ESI+ traces extracted from the chromatograms. For the defined species (RT 20.6 min), the m/z series (1045.8, 899.0, 788.9, 703.2, 634.7) corresponds to a total mass of 6166.68 Da. Subtracting the PEG mass (5021.03 Da) identified the PEGylated fragment (1145.65 Da) as pos. 236–245 (AS Sequence -AWSVARLSQK-). The disperse derivative does not allow for extraction of exact masses. c Summary of the seven identified PEGylated fragments in mono-PEGylated BSA; bold letters in “Identified Sequences” indicate possible PEG binding sites.

The PEGylation reaction was initially performed with mPEG-succinimidyl succinate, but the PEG-ester bond hydrolysed during tryptic digestion and, apart from the peptide fingerprint region, the final UPLC-MS trace only showed a peak for the mPEG-alcohol. Hence, the mPEG-SP derivative was selected instead, resulting in a stable bond with the protein. Trypsin is known to cleave proteins at their lysine and arginine residues and was chosen for enzymatic digestion of the mono-PEG BSA. Because of potential inhibition of the digestion due to PEGylation, the protein was denatured prior to digestion using dithiothreitol (DTT, reduction of disulfide bonds), to make the amino acid sequence more accessible. The digestion reaction was quenched after 16 h. The resulting peptide/PEG-peptide mixtures were purified using ultracentrifugation, and then characterised on a UPLC equipped with a 5 cm C18 column and a mass spectrometer with an electrospray/single quadrupole set-up. The analyses were performed in positive mode, using NH4OAc as the buffer, so that only NH4+ adducts were formed. Comparison of the total ion count traces (Fig. 5b) of both samples (derived from PEGylation with the defined and the disperse mPEG-SP, referred to as “defined PEG” and “disperse PEG”) revealed significant differences. Although both chromatograms show the same characteristics in the protein fingerprint region (with the same underlying MS traces), at retention times over 14 minutes the defined PEG sample shows a clear separation of species with discrete m/z series, whilst the disperse PEG sample is composed of broad and overlapping peaks. For the disperse PEG sample, the underlying m/z series show the characteristic dispersity pattern for PEG, with a ladder having a mass difference of +/− 44 for every ion species, which does not allow for further analysis. 
In contrast, the TIC of the defined PEG sample showed seven discrete m/z series (seven PEG-Peptide fragments), for which the corresponding molecular weights were calculated (Supplementary Figs. 29–35). After subtracting the molecular weight of the attached PEG (5021.03 Da) from every mass found, the resulting masses were matched with digestion data obtained from an online database search and analytical tool (sequence from Uniprot ID P02769 and digestion data from ProteinProspector v 6.4.5) to identify the seven PEGylated peptide fragments derived from the mono-PEG BSA fraction (Fig. 5c). For example, at retention time 20.6 min the underlying ESI+ trace shows an m/z series of 1046, 899, 789, 703, 635 which corresponds to a molecular weight of 6167 Da (1046= [M + 6 NH4+]6+, 899= [M + 7 NH4+]7+, 789= [M + 8 NH4+]8+, 703= [M + 9 NH4+]9+, 635= [M + 10 NH4+]10+). This means the corresponding peptide fragment has a molecular weight of around 1146 Da and the only matching digestion fragment is the sequence position 236–245, with a molecular weight of 1145.65 Da.
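The worked example at 20.6 min can be retraced in a few lines. This is a minimal sketch rather than the authors' workflow: it assumes an NH4+ adduct mass of about 18.04, a matching tolerance of 2 Da (chosen loosely to reflect the rounded m/z values), and a candidate list containing only the single fragment quoted in the text. A real search would load the full tryptic digest list for BSA.

```python
NH4 = 18.04         # approximate NH4+ adduct mass (assumed, rounded)
PEG_MASS = 5021.03  # mass subtracted for the attached PEG, from the text


def infer_charge(mz_high: float, mz_low: float, adduct: float = NH4) -> int:
    """Charge of the higher-m/z peak, given that the next peak carries one more charge.

    From z * (mz_high - a) = (z + 1) * (mz_low - a) it follows that
    z = (mz_low - a) / (mz_high - mz_low).
    """
    return round((mz_low - adduct) / (mz_high - mz_low))


def neutral_mass(series: list[float], adduct: float = NH4) -> float:
    """Average neutral mass from an adduct series sorted by decreasing m/z."""
    z0 = infer_charge(series[0], series[1], adduct)
    masses = [(z0 + i) * (mz - adduct) for i, mz in enumerate(series)]
    return sum(masses) / len(masses)


series = [1046, 899, 789, 703, 635]           # m/z values from the text
candidates = {"236-245 AWSVARLSQK": 1145.65}  # only the fragment quoted in the text

conjugate = neutral_mass(series)
peptide = conjugate - PEG_MASS
matches = [name for name, mass in candidates.items() if abs(mass - peptide) <= 2.0]
print(round(conjugate), round(peptide), matches)
```

Inferring the charge from two neighbouring peaks avoids having to assume the charge states in advance, which only works because the defined PEG gives one discrete series per species.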

In this way all seven positional isomers contained in the previously isolated mono-PEG BSA sample were identified down to the PEGylated sequence (Fig. 5c). The strategy presented here provides a fast and efficient technique for identifying the conjugation sites within a PEGylated protein. It can be applied to PEG-protein species of varying sizes and to mixtures consisting of different positional isomers. Knowing the PEGylation sites within a therapeutic is important for understanding and predicting the drug’s efficacy, as well as for quality control and safety assessments. Using defined PEGylation agents positively affects the synthesis and purification processes in terms of improved reliability and controllability. Overall, this could offer significant improvements in terms of purity and required dosages, thereby decreasing potential risks and minimising adverse effects for patients.

We have successfully synthesised a defined, highly pure 5 kDa PEGylation agent using a scalable liquid-phase approach solely relying on membrane-filtration for purification. We have shown that the PEGylation of proteins with defined PEGs results in significant advantages in terms of characterisation and allows for straightforward identification of the PEGylated site within the protein. Using this class of defined PEGylation agents has the potential for better controllability of PEGylated therapeutics, increasing the overall purity, reducing batch-to-batch variations, and could result in enhanced biodistribution behaviour.

Eg28 Diol (Octacosaethylene glycol, Polypure AS), Boron trichloride solution (1.0 M in methylene chloride) (Sigma-Aldrich), Bovine Serum Albumin (heat shock fraction, protease free, fatty acid free, essentially globulin free, pH 7, ≥98%) (Sigma-Aldrich), Dichloroacetic Acid (>98%) (TCI), Ethyl 3-bromopropionate (EBP) (Alfa Aesar), Iodomethane (copper as stabiliser, 99%) (Sigma-Aldrich), N-Hydroxysuccinimide (98%) (Sigma-Aldrich), N,N’-Dicyclohexylcarbodiimide (DCC, 99%) (Sigma-Aldrich), mPEG-Succinimidyl Propionate (Creative PEGworks), Phosphate Buffered Saline (PBS, pH 7.4, liquid, sterile-filtered) (Sigma-Aldrich), Sinapinic acid (matrix substance for MALDI, ≥99%)(Merck), Pyrrole (>99.0%) (TCI), Sodium Hydride (60% in mineral oil) (Sigma-Aldrich), trans-2-[3-(4-tert-Butylphenyl)-2-methyl-2-propenylidene]malononitrile (DCTB, matrix substance for MALDI, ≥99%) (TCI), Triethylamine (for synthesis) (Sigma-Aldrich), Trypsin (from bovine pancreas, Type I, ~10,000 BAEE units/mg protein) (Sigma-Aldrich), 1,4-Dithiothreitol (DTT) (Merck) were used as received without further purification. All solvents used for the syntheses, purifications and analytics were purchased from VWR.

1H NMR and 13C NMR spectra were recorded on a Bruker AVANCE III HD-400 spectrometer, with working frequencies of 400 (1H) and 101 (13C) MHz using deuterated chloroform (CDCl3) or DMSO (DMSO-d6) as a solvent at 293 K.

UPLC-MS analysis was performed on a Waters Acquity UPLC stack equipped with a PDA eλ detector and a micromass ZQ mass spectrometer (operated in positive electrospray ionisation mode) or on an Agilent 1290 Infinity II stack equipped with a DAD Detector, a 1290 Infinity II ELSD detector and a 6130 quadrupole mass spectrometer (operated in positive electrospray ionisation mode). The columns used were either an Acquity UPLC BEH C18 1.7 µm, 2.1 × 50 mm or an Acquity UPLC BEH C18 1.7 µm, 2.1 × 150 mm. The mobile phases used were H2O (with 50 mM NH4OAc) and MeCN-MeOH (4:1) at a flow rate of 0.3 ml min−1 and a column temperature of 60 °C. The column was equilibrated with the initial solvent composition prior to injection (the injection volume was 4 µL).

MALDI-TOF MS was performed on a Bruker autoflex MALDI-TOF spectrometer at 70% laser power, using either sinapinic acid (for samples containing protein species) or DCTB (for samples containing pure PEG) as the matrices. The typical sample preparation protocols are described in the following.

Protein-containing samples: 10 µL of sample (derived from PEGylation mixture or after SEC) were purified using a C4 ZipTip and eluted into 10 µL of saturated sinapinic acid (in MeCN:H2O 30%:70% with 0.1% TFA). 0.8 µL of the solution was spotted onto a MALDI plate and left to dry.

PEG samples: a 1 mg ml−1 solution of the sample in MeCN and a saturated solution of the matrix (DCTB) in 50 mM NaOAc in H2O-MeCN 1:1 were prepared. Both solutions were mixed in a 1:1 ratio, and 0.8 µL of the mixture was spotted onto the plate and left to dry.

All reactions were performed under argon, unless otherwise stated. Solvents labelled as “dry” were dried over activated 3 Å molecular sieves for 24 h and tested for their water content using a Karl-Fischer coulometric titrator (<50 ppm).

For the first coupling, DmtrO-Eg28-OH (S2 (Supplementary Fig. 11a), 5.35 g, 3.44 mmol, 4 eq.) was azeotroped from dry acetonitrile and then dissolved in dry THF (20 ml) to which was added sodium hydride (60 % dispersion in mineral oil, 0.34 g, 8.60 mmol, 10 eq.). The solution was stirred for 5 min and then the Hub-tribromide (S1, 0.70 g, 0.86 mmol) was added and the reaction was stirred at 40 °C for 4 h. The reaction was allowed to cool down to room temperature and then quenched with sat. ammonium chloride solution. In the case of a batch based on chromatographic purification, the crude material was purified via chromatography twice. Before loading onto silanised silica, the crude was diluted with THF and filtered through a paper filter. For deprotection, the crude was redissolved in DCM, and pyrrole (1.19 ml, 17.2 mmol, 20 eq.) and DCA (1.42 mL, 17.2 mmol, 20 eq.) were added. The solution was stirred until the reaction was completed (confirmed by UPLC-MS) and was then neutralised with triethylamine until pH ≈ 7. The DCM was removed under reduced pressure and the residue was dissolved in MeOH, any solids were removed by paper filtration and the clear solution was either transferred to the diafiltration rig for membrane purification, or the material was loaded onto silanised silica for chromatographic purification. After complete purification (procedures below), the product (S3a) was obtained as a yellow, waxy solid (membrane purification: 3.24 g, 87%, chromatographic purification: 2.46 g, 86%/78%).

1H NMR (400 MHz, CDCl3). δ = 7.86 (s, 3H), 7.78 (d, J = 8.2 Hz, 6H), 7.70 (d, J = 8.0 Hz, 6H), 7.62 (d, J = 8.0 Hz, 6H), 7.43 (d, J = 7.9 Hz, 6H), 4.60 (s, 6H), 3.72–3.53 (m, 336H).

13C NMR (101 MHz, CDCl3). δ = 142.01, 140.23, 140.02, 139.97, 137.61, 131.06, 128.34, 127.77, 127.58, 127.09, 73.01, 72.62, 70.60–70.58, 70.32, 69.59, 61.72.

m/z (ESI+) = 1461 [M + 3 NH4+]3+, 1101 [M + 4 NH4+]4+, 884 [M + 5 NH4+]5+; calc. (C213H372O87) = 1460 [M + 3 NH4+]3+, 1099 [M + 4 NH4+]4+, 883 [M + 5 NH4+]5+.
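The calculated values for such ammonium-adduct series follow from the molecular formula as m/z = (M + z·m(NH4+))/z. The sketch below is a minimal illustration that assumes average atomic masses and neglects the electron mass; the helper names are hypothetical, and results may differ by about one unit from listed figures depending on the mass convention and rounding used.

```python
import re

# Average atomic masses (g/mol); electron mass is neglected for the adduct ion.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
NH4_MASS = ATOMIC_MASS["N"] + 4 * ATOMIC_MASS["H"]


def formula_mass(formula: str) -> float:
    """Average molecular mass of a simple formula such as 'C213H372O87'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


def adduct_mz(formula: str, charge: int) -> float:
    """m/z of [M + z NH4+]z+ for the given formula and charge z."""
    return (formula_mass(formula) + charge * NH4_MASS) / charge


# Charge states reported for Hub(Eg28-OH)3 (S3a).
for z in (3, 4, 5):
    print(z, round(adduct_mz("C213H372O87", z)))
```

For C213H372O87 this gives 1460, 1099 and 883 for z = 3, 4 and 5, in line with the calculated values quoted for S3a.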

For every chain extension, the combined nanostar-PEG-alcohol (S3a-d, Hub(Egn-OH)3, amounts can be found in Supplementary Tables 1 and 2) and building block (S4, DmtrO-Eg28-OTs, see Supplementary Table 1 for eq.) were azeotroped from dry acetonitrile, then dissolved in dry THF. Sodium hydride (60 wt% dispersion in mineral oil, 20 eq.) was added under argon and the reaction was stirred in an oil bath set to 40 °C until the reaction was fully complete (confirmation via UPLC-MS, reaction times between 1.5 h and 4 h). The reaction was quenched with sat. ammonium chloride and the solvent was removed under reduced pressure. For detritylation, the crude was redissolved in DCM, and pyrrole (20 eq.) then DCA (20 eq.) were added. The solution was stirred until the reaction was complete (confirmation via UPLC-MS) and was then neutralised with triethylamine until pH ≈7. The DCM was removed under reduced pressure and the crude was dissolved in MeOH, any solids were removed by paper filtration and the clear solution was either transferred to the diafiltration rig for membrane purification, or the material loaded onto silanised silica for chromatographic purification.

The diafiltration rig was a 2-stage, crossflow system (Supplementary Fig. 1), the first stage consisting of three circular cells, each equipped with a flat sheet membrane with an active area of 51 cm2, and the second stage consisting of two cells with membranes of the same area. A diaphragm pump (Wanner, Hydra-Cell G20) provided both the pressure and the circulation flow (1.76 ml min−1) in the first stage, and a gear pump provided circulation flow in the second stage; rapid cross-flow over the membranes is required to minimise concentration polarisation. Depending on the rejection of the Hub-PEG-alcohol species, the rig was fitted with different types of membranes. The rig was either configured in the two-stage set-up to maximise selectivity at lower molecular weight nanostar, or the second stage was disconnected at high molecular weight nanostar to accelerate diafiltration. Each set of membranes was compacted in methanol at 10 bar for at least 8 h before it was used for purification.

After each chain extension/deprotection cycle, the deprotected Hub-PEG-alcohol species was transferred to the diafiltration rig dissolved in MeOH and the sample flask was diluted to reach a total system volume of 500 ml. The diafiltration rig was run at a transmembrane pressure of 10 bar and with an average permeance of 1.7 L m−2 h−1 bar−1. The diafiltration progress was monitored by taking samples from the first stage, the second stage, and the permeate, which were analysed via UPLC-MS. The diol species (Eg28 and Eg56) were detected using an ELSD detector (a calibration curve was prepared beforehand, Supplementary Fig. 2), whereas the 290 nm UV trace was used for the nanostar species. The diafiltration was stopped once the diol reached <1% of the initial concentration. The purified product was collected from the first and second stages and the methanol was evaporated to obtain the Hub-PEG-OH as a waxy solid (Hub(Eg56-OH)3 (S3b): membrane purification 5.29 g (88%), chromatographic purification 3.30 g (74%); Hub(Eg84-OH)3 (S3c): membrane purification 6.60 g (86%), chromatographic purification 3.54 g (75%); Hub(Eg112-OH)3 (S3d): membrane purification 8.20 g (95%), chromatographic purification 4.12 g (89%)).
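To get a feel for these operating figures, the following sketch applies the textbook single-stage, constant-volume diafiltration relation c/c0 = exp(−(1 − R)·N), where R is the rejection of the impurity and N the number of diavolumes. It is an idealisation, not a model of the two-stage rig: the diol rejection used here is a hypothetical placeholder (no value is given in the text), and it is assumed that the three first-stage cells permeate in parallel.

```python
import math


def permeate_flow_l_per_h(permeance: float, pressure_bar: float, area_m2: float) -> float:
    """Permeate flow = permeance (L m^-2 h^-1 bar^-1) x pressure x membrane area."""
    return permeance * pressure_bar * area_m2


def diavolumes_needed(rejection: float, remaining_fraction: float) -> float:
    """Ideal constant-volume diafiltration: c/c0 = exp(-(1 - R) * N); solve for N."""
    return math.log(1.0 / remaining_fraction) / (1.0 - rejection)


# Values from the text: 1.7 L m^-2 h^-1 bar^-1, 10 bar, 51 cm^2 per cell, 500 ml.
area = 3 * 51e-4  # m^2, ASSUMING all three first-stage cells permeate in parallel
flow = permeate_flow_l_per_h(1.7, 10.0, area)

diol_rejection = 0.2  # HYPOTHETICAL value, not reported in the text
n = diavolumes_needed(diol_rejection, 0.01)  # target: diol below 1% of initial
solvent_l = n * 0.5  # system volume of 500 ml

print(f"flow ~ {flow:.2f} L/h, {n:.1f} diavolumes, "
      f"~{solvent_l:.1f} L MeOH, ~{solvent_l / flow:.0f} h")
```

The point of the sketch is the scaling: the solvent demand grows with 1/(1 − R), so a membrane that retains more of the diol lengthens the run considerably, while the nanostar must be rejected almost completely to avoid yield loss.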

(S3b) (Hub(Eg56-OH)3)

1H NMR (400 MHz, CDCl3). δ = 7.86 (s, 3H), 7.78 (d, J = 8.0 Hz, 6H), 7.70 (d, J = 8.0 Hz, 6H), 7.63 (d, J = 8.0 Hz, 6H), 7.43 (d, J = 8.0 Hz, 6H), 4.61 (s, 6H), 3.79–3.44 (m, 672H).

13C NMR (101 MHz, CDCl3). δ = 142.04, 140.26, 139.98, 139.95, 137.64, 131.04, 128.36, 127.80, 127.61, 127.12, 73.04, 72.63, 70.69–70.48, 70.27, 69.62, 61.67.

m/z (ESI+) = 1625 [M + 5 NH4+]5+, 1357 [M + 6 NH4+]6+, 1166 [M + 7 NH4+]7+, 1023 [M + 8 NH4+]8+, 911 [M + 9 NH4+]9+; calc. (C381H708O171) = 1623 [M + 5 NH4+]5+, 1355 [M + 6 NH4+]6+, 1164 [M + 7 NH4+]7+, 1021 [M + 8 NH4+]8+, 910 [M + 9 NH4+]9+.

(S3c) (Hub(Eg84-OH)3)

1H NMR (400 MHz, CDCl3). δ = 7.85 (s, 3H), 7.78 (d, J = 8.0 Hz, 6H), 7.69 (d, J = 8.0 Hz, 6H), 7.62 (d, J = 8.0 Hz, 6H), 7.42 (d, J = 8.0 Hz, 6H), 4.60 (s, 6H), 3.81–3.39 (m, 1008H).

13C NMR (101 MHz, CDCl3). δ = 141.99, 140.23, 140.02, 139.98, 137.62, 131.03, 128.34, 127.77, 127.58, 127.09, 73.01, 72.68, 70.79–70.41, 70.34, 69.60, 61.73.

m/z (ESI+) = 1484 [M + 8 NH4+]8+, 1321 [M + 9 NH4+]9+, 1191 [M + 10 NH4+]10+, 1084 [M + 11 NH4+]11+, 996 [M + 12 NH4+]12+, 921 [M + 13 NH4+]13+, 856 [M + 14 NH4+]14+, 800 [M + 15 NH4+]15+, 751 [M + 16 NH4+]16+, 708 [M + 17 NH4+]17+; calc. (C549H1044O255) = 1484 [M + 8 NH4+]8+, 1321 [M + 9 NH4+]9+, 1191 [M + 10 NH4+]10+, 1084 [M + 11 NH4+]11+, 995 [M + 12 NH4+]12+, 920 [M + 13 NH4+]13+, 856 [M + 14 NH4+]14+, 800 [M + 15 NH4+]15+, 751 [M + 16 NH4+]16+, 708 [M + 17 NH4+]17+.

(S3d) (Hub(Eg112-OH)3)

1H NMR (400 MHz, CDCl3). δ = 7.83 (s, 3H), 7.75 (d, J = 8.0 Hz, 6H), 7.67 (d, J = 8.0 Hz, 6H), 7.59 (d, J = 8.0 Hz, 6H), 7.40 (d, J = 8.0 Hz, 6H), 4.57 (s, 6H), 3.80–3.36 (m, 1344H).

13C NMR (101 MHz, CDCl3). δ = 141.92, 140.13, 139.93, 137.62, 137.54, 131.03, 128.25, 127.69, 127.50, 127.00, 72.92, 72.5, 71.00–69.99, 70.25, 69.52, 61.6.

m/z (ESI+) = 1946 [M + 8 NH4+]8+, 1732 [M + 9 NH4+]9+, 1561 [M + 10 NH4+]10+, 1421 [M + 11 NH4+]11+, 1304 [M + 12 NH4+]12+, 1205 [M + 13 NH4+]13+, 1120 [M + 14 NH4+]14+, 1047 [M + 15 NH4+]15+, 982 [M + 16 NH4+]16+, 926 [M + 17 NH4+]17+, 875 [M + 18 NH4+]18+, 830 [M + 19 NH4+]19+, 789 [M + 20 NH4+]20+; calc. (C717H1380O339) = 1946 [M + 8 NH4+]8+, 1732 [M + 9 NH4+]9+, 1560 [M + 10 NH4+]10+, 1420 [M + 11 NH4+]11+, 1303 [M + 12 NH4+]12+, 1204 [M + 13 NH4+]13+, 1120 [M + 14 NH4+]14+, 1046 [M + 15 NH4+]15+, 982 [M + 16 NH4+]16+, 925 [M + 17 NH4+]17+, 875 [M + 18 NH4+]18+, 830 [M + 19 NH4+]19+, 789 [M + 20 NH4+]20+.

Chromatographic purification was performed using a Biotage Isolera flash chromatography instrument equipped with a UV detector.

The crude material was loaded onto silanised silica and packed in a Biotage dry loading vessel. The separation was conducted on a Biotage Sfaer C18 D column (120 g, 100 Å, 30 µm) using H2O (+0.5% formic acid) and MeCN/MeOH (4:1) as the mobile phase. The gradient was 10–95% B (MeCN/MeOH) over 12 column volumes. Fractions were detected and collected at 290 nm and subsequently analysed via UPLC-MS. Pure fractions were combined, and the organic solvent was stripped off on the rotary evaporator. The aqueous solution was saturated with NaCl and extracted with chloroform (5 times). The combined organic layers were dried over sodium sulfate and filtered, and the solvent was removed under reduced pressure. The products S3b–d were obtained as waxy solids with the yields stated in Supplementary Table 2.

The purified Hub(Eg112-OH)3 (S3d, 1.00 g, 0.065 mmol) was azeotroped from dry acetonitrile and then dissolved in dry THF (5 ml). Sodium hydride (60% dispersion in mineral oil, 0.05 g, 1.30 mmol, 20 eq.) was added, followed immediately by iodomethane (0.08 ml, 1.30 mmol, 20 eq.). The flask was placed in an oil bath set to 30 °C and the reaction mixture was stirred for 2 hours. After completion of the reaction had been confirmed via the UPLC-UV trace, it was quenched with a solution of triethylammonium chloride (Et3NCl, 0.18 g, 1.30 mmol, 20 eq.), filtered, dissolved in MeOH and then transferred to the diafiltration rig for purification. After diafiltration purification, 0.96 g of pure Hub(Eg112-OMe)3 (S6) was obtained (0.062 mmol, y = 95.4%).

1H NMR (400 MHz, CDCl3). δ = 7.83 (s, 3H), 7.75 (d, J = 8.0 Hz, 6H), 7.67 (d, J = 8.1 Hz, 6H), 7.59 (d, J = 7.8 Hz, 6H), 7.40 (d, J = 7.8 Hz, 6H), 4.57 (s, 6H), 3.84–3.28 (m, 1344H), 3.25 (s, 3H).

13C NMR (101 MHz, CDCl3). δ = 141.92, 140.13, 139.93, 137.62, 137.54, 131.03, 128.25, 127.69, 127.50, 127.00, 72.92, 72.58, 71.00–69.99, 70.25, 69.52, 61.61, 59.09.

m/z (ESI+) = 1952 [M + 8 NH4+]8+, 1736 [M + 9 NH4+]9+, 1565 [M + 10 NH4+]10+, 1425 [M + 11 NH4+]11+, 1307 [M + 12 NH4+]12+, 1208 [M + 13 NH4+]13+; calc. (C720H1386O339) = 1951 [M + 8 NH4+]8+, 1736 [M + 9 NH4+]9+, 1564 [M + 10 NH4+]10+, 1424 [M + 11 NH4+]11+, 1307 [M + 12 NH4+]12+, 1208 [M + 13 NH4+]13+.

For cleaving MeO-Eg112-OH off the nanostar, Hub(Eg112-OMe)3 (S6, 0.50 g, 0.03 mmol) was dissolved in DCM. The flask was placed in a cooling bath consisting of acetone and dry ice to reach a temperature of −70 °C. Once cooled down, boron trichloride (1 M solution in DCM) was added to the stirred reaction solution and it was allowed to warm up to −30 °C. The reaction was maintained at this temperature until it reached completion (confirmation via UPLC-MS) and then cooled back to −70 °C, after which MeOH and sodium bicarbonate were added slowly to quench the reaction. The solution was concentrated under reduced pressure and then precipitated in ice-cold ether. The precipitate was filtered off, washed with ice-cold ether and then dried under high vacuum. Pure MeO-Eg112-OH (3, 0.38 g, y = 85%) was obtained as a colourless, crystalline solid.

1H NMR (400 MHz, CDCl3). δ = 3.83–3.42 (m, 448H), 3.37 (s, 3H).

13C NMR (101 MHz, CDCl3). δ = 72.66, 71.96, 70.59, 61.71, 59.09.

m/z (ESI+) = 1673 [M + 3 NH4+]3+, 1260 [M + 4 NH4+]4+, 1011 [M + 5 NH4+]5+, 846 [M + 6 NH4+]6+; calc. (C225H452O113) = 1673 [M + 3 NH4+]3+, 1259 [M + 4 NH4+]4+, 1011 [M + 5 NH4+]5+, 846 [M + 6 NH4+]6+.

m/z (MALDI-ToF+) = 5049.1 ([M + 2 MeCN + H]+).

MeO-Eg112-OH (3, 0.30 g, 0.060 mmol) was azeotroped from dry acetonitrile and then dissolved in dry THF (6 ml). The solution was cooled down to 0 °C and first sodium hydride (0.005 g, 0.125 mmol, 2.1 eq.) was added and then ethyl 3-bromopropionate (EBP, 75 µL, 0.60 mmol, 10.0 eq.). The reaction was allowed to warm up from 0 °C to room temperature under stirring and then heated to 40 °C. The reaction was complete after 1 h (monitored via UPLC-MS) and quenched with sat. NH4Cl. Aqueous NaOH was added to hydrolyse the ester 5. The THF was evaporated off and the remaining aqueous phase was adjusted to pH ≈ 4, saturated with NaCl and extracted with dichloromethane (3×). The combined organic layers were dried over anhydrous Na2SO4 and filtered, and the solvent was stripped off under reduced pressure. Subsequently, compound 6 (0.20 g, 0.04 mmol) was dissolved in dry DCM (3 ml) and N-hydroxysuccinimide (NHS, 0.018 g, 0.16 mmol, 4.0 eq.) was added under stirring. DCC (0.033 g, 0.16 mmol, 4.0 eq.) was dissolved in dry THF and added to the solution dropwise, and the solution was stirred under argon for four hours. After the reaction was complete, the solvent was stripped off on the rotary evaporator. The crude was redissolved in acetonitrile and filtered through a paper filter, the filtrate was collected, and the solvent was removed on the rotary evaporator. This process was repeated twice. Subsequently, the remaining solid was dissolved in a small amount of DCM and precipitated in ice-cold diethyl ether. The precipitate was filtered off, washed with more ice-cold diethyl ether, and then dried under reduced pressure. Compound 2 was obtained as a colourless, waxy solid (0.16 g, 0.03 mmol, y = 75%).

1H NMR (400 MHz, DMSO-d6). δ = 3.73–3.31 (m, 448H), 3.24 (s, 3H), 2.87–2.75 (m, 4H), 2.64–2.57 (m, 4H).

13C NMR (101 MHz, CDCl3). δ = 72.01, 70.78, 65.82, 34.05, 25.67.

m/z (ESI+) = 1302 ([M + 4 NH4]4+), 1045 ([M + 5 NH4]5+), 874 ([M + 6 NH4]6+), 752 ([M + 7 NH4]7+), 660 ([M + 8 NH4]8+); calc. (C232H459O117N): 1302 ([M + 4 NH4]4+), 1045 ([M + 5 NH4]5+), 874 ([M + 6 NH4]6+), 752 ([M + 7 NH4]7+), 660 ([M + 8 NH4]8+).
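The coupling-reagent quantities in the NHS-ester step can be cross-checked from standard molar masses. The snippet below is a small sanity-check sketch with hypothetical helper names; it only converts the stated molar amounts (0.16 mmol of each reagent, 0.04 mmol of the acid) into masses and equivalents.

```python
# Standard molar masses in g/mol.
MW = {"NHS": 115.09, "DCC": 206.33}


def mass_mg(mmol: float, molar_mass: float) -> float:
    """Mass in mg for a given amount in mmol (mmol x g/mol = mg)."""
    return mmol * molar_mass


def equivalents(reagent_mmol: float, substrate_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the limiting substrate."""
    return reagent_mmol / substrate_mmol


substrate_mmol = 0.04  # mPEG-propionic acid (compound 6), from the text
reagent_mmol = 0.16    # NHS and DCC, from the text

for name in ("NHS", "DCC"):
    print(f"{name}: {mass_mg(reagent_mmol, MW[name]):.1f} mg, "
          f"{equivalents(reagent_mmol, substrate_mmol):.1f} eq.")
```

For 0.16 mmol this gives roughly 18 mg of NHS and 33 mg of DCC, each corresponding to 4 equivalents relative to the acid.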

The procedure for the small-scale batch PEGylation of BSA with 5 kDa mPEG-SP was based on a paper published by van Alstine and Fee in 2004 (ref. 47).

50 mg of bovine serum albumin (BSA, 66 kDa) was dissolved in 5 ml of PBS buffer (pH = 7.4). The solution was stirred in an open vial at room temperature and 50 mg of PEGylation agent was added (either disperse or defined 5 kDa mPEG-SP). The reaction was finished after 1 h (monitored via SEC) and quenched with 0.1 M HCl.
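Because equal masses of protein and reagent are used, the molar excess of PEG follows directly from the two molecular weights. The sketch below estimates it, assuming the nominal 5 kDa for the PEGylation agent (the exact molar mass differs slightly between the defined and the disperse reagent); the helper name is hypothetical.

```python
def micromol(mass_mg: float, molar_mass_g_per_mol: float) -> float:
    """Amount in µmol from a mass in mg and a molar mass in g/mol."""
    return mass_mg / molar_mass_g_per_mol * 1000.0


bsa = micromol(50.0, 66_000.0)  # 50 mg BSA, 66 kDa (from the text)
peg = micromol(50.0, 5_000.0)   # 50 mg PEG reagent, nominal 5 kDa (ASSUMED exact)

print(f"BSA {bsa:.2f} µmol, PEG {peg:.1f} µmol, ratio {peg / bsa:.1f}:1")
```

Under these assumptions the reagent is present in roughly a 13-fold molar excess over the protein.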

A sample of 0.5 ml was pushed through a 0.25 micron syringe filter and 100 µl were injected onto a Cytiva Superdex 200 Increase 10/300 GL SEC column, which was installed in an Agilent 1100 HPLC stack with a UV detector. 0.1 M NH4OAc was used as the mobile phase. Flow rate: 0.5 ml/min, run length: 60 min, injection volume: 50 µl, T = 30 °C.

0.5 ml (1 min) fractions were collected starting from 18 min, up to 40 min. Each sample was analysed via MALDI-TOF, as described above.

The 23–24 min sample (disperse PEG) and 24–25 min sample (defined PEG) contained only mono-PEGylated BSA (according to MALDI spectrum). The concentration was estimated by calculating the relative peak area of the mono-PEG peak, compared to the other protein-related peaks. The samples were subsequently used for enzymatic digestion experiments.

Three of the previously collected SEC samples containing mono-PEG BSA were concentrated into one through ultracentrifugation (using an Amicon Ultra 0.5 mL/50 K centrifugal filter) and redissolved in 0.5 ml of 5 mM DTT/50 mM NaHCO3 in ultrapure water. The solution was incubated at 37 °C for 1 h to denature the protein. Subsequently, 25 µl of trypsin solution (1 mg ml−1 in 50 mM acetic acid) was added to the sample to reach an enzyme:protein ratio of 1:20 (w:w). The sample was incubated at 37 °C overnight and purified through ultracentrifugation (using an Amicon Ultra 0.5 mL/30 K centrifugal filter). The ultracentrifugation permeate (containing peptide and PEG-peptide fragments without undigested protein) was then analysed via UPLC-MS (mobile phase: MeCN/MeOH 4:1 and 0.1% formic acid in water, 15 cm C18 column, 30 min 1–50% gradient). The underlying ESI+ trace of each peak was matched with the corresponding peptide fragment using ProteinProspector v 6.4.5 (sequence from Uniprot ID P02769) and Supplementary Equation 4.

The data supporting the findings of this study are available in the article and its supplementary files. Should any raw data files be needed in another format, they are available from the corresponding author. The BSA sequence used for this study was derived from Uniprot ID P02769, and ProteinProspector v 6.4.5 (MS-Digest tool) was used to obtain the digestion data (https://prospector.ucsf.edu). Source data are provided with this paper.

1. Abuchowski, A., van Es, T., Palczuk, N. C. & Davis, F. F. Alteration of immunological properties of bovine serum albumin by covalent attachment of polyethylene glycol. J. Biol. Chem. 252, 3578–3581 (1977).

2. Srichana, T. & Suwandecha, T. Biodegradable Polymers in Clinical Use and Clinical Development (John Wiley & Sons, Inc., 2011).

3. Zalipsky, S. & Harris, J. M. Introduction to Chemistry and Biological Applications of Poly(ethylene glycol). (American Chemical Society, 1997).

4. Roberts, M. J., Bentley, M. D. & Harris, J. M. Chemistry for peptide and protein PEGylation. Adv. Drug Deliv. Rev. 54, 459–476 (2002).

5. Knop, K., Hoogenboom, R., Fischer, D. & Schubert, U. S. Poly(ethylene glycol) in drug delivery: pros and cons as well as potential alternatives. Angew. Chem. Int. Ed. Engl. 49, 6288–6308 (2010).

6. D’souza, A. A. & Shegokar, R. Polyethylene glycol (PEG): a versatile polymer for pharmaceutical applications. Expert Opin. Drug Deliv. 13, 1257–1275 (2016).

7. Harris, J. M. & Chess, R. B. Effect of pegylation on pharmaceuticals. Nat. Rev. Drug Discov. 2, 214–221 (2003).

8. Veronese, F. M. Peptide and protein PEGylation: a review of problems and solutions. Biomaterials 22, 405–417 (2001).

9. Swierczewska, M., Lee, K. C. & Lee, S. What is the future of PEGylated therapies? Expert Opin. Emerg. Drugs 20, 531–536 (2015).

10. Turecek, P. L., Bossard, M. J., Schoetens, F. & Ivens, I. A. PEGylation of biopharmaceuticals: a review of chemistry and nonclinical safety information of approved drugs. J. Pharm. Sci. 105, 460–475 (2016).

11. Veronese, F. M. & Pasut, G. PEGylation, successful approach to drug delivery. Drug Discov. Today 10, 1451–1458 (2005).

12. Gupta, V. et al. Protein PEGylation for cancer therapy: bench to bedside. J. Cell Commun. Signal 13, 319–330 (2019).

13. Park, E. J. & Na, D. H. Characterization of the reversed-phase chromatographic behavior of PEGylated peptides based on the Poly(ethylene glycol) dispersity. Anal. Chem. 88, 10848–10853 (2016).

14. Monkarsh, S. P. et al. Positional isomers of monopegylated interferon α-2a: isolation, characterization, and biological activity. Anal. Biochem. 247, 434–440 (1997).

15. Kirwan, P. Product Quality Review(s) 761075Orig1s000. 92-153 (U.S. Food & Drug Administration, 2018).

16. Gerislioglu, S., Adams, S. R. & Wesdemiotis, C. Characterization of singly and multiply PEGylated insulin isomers by reversed-phase ultra-performance liquid chromatography interfaced with ion mobility mass spectrometry. Anal. Chim. Acta 1004, 58–66 (2018).

17. Qian, X. et al. Characterization of a site-specific PEGylated analog of exendin-4 and determination of the PEGylation site. Int. J. Pharm. 454, 553–558 (2013).

18. Caliceti, P. & Veronese, F. M. Pharmacokinetic and biodistribution properties of poly(ethylene glycol)–protein conjugates. Adv. Drug Deliv. Rev. 55, 1261–1277 (2003).

19. Herzberger, J. et al. Polymerization of ethylene oxide, propylene oxide, and other alkylene oxides: synthesis, novel polymer architectures, and bioconjugation. Chem. Rev. 116, 2170–2243 (2016).

20. Agner, E. Method for displacement chromatography. United States patent US 6,576,134 B1 (2003).

21. Agner, E. Purification of Peptides and Oligonucleotides by Sample Displacement Chromatography process and apparatus. United States patent US 6,245,238 B1 (1999).

22. Polypure A. S. mPEG-45 (Catalogue Number 1011-4595), https://polypure.com/products/peg/peg-28 (2024).

23. Perry, S. Z. & Hibbert, H. Studies on reactions relating to carbohydrates and polysaccharides: XLVIII. Ethylene Oxide and related compounds: synthesis of the polyethylene glycols. Can. J. Res. 14b, 77–83 (1936).

24. Fordyce, R., Lovell, E. L. & Hibbert, H. Studies on reactions relating to carbohydrates and polysaccharides. LVI. The synthesis of the higher polyoxyethylene glycols. J. Am. Chem. Soc. 61, 1905–1910 (1939).

25. Keegstra, E. M. D., Zwikker, J. W., Roest, M. R. & Jenneskens, L. W. A highly selective synthesis of monodisperse oligo(ethylene glycols). J. Org. Chem. 57, 6678–6680 (1992).

26. Harada, A., Li, J. & Kamachi, M. Preparation and characterization of a polyrotaxane consisting of monodisperse Poly(ethylene glycol) and α-Cyclodextrins. J. Am. Chem. Soc. 116, 3192–3196 (1994).

27. Ahmed, S. A. & Tanaka, M. Synthesis of Oligo(ethylene glycol) toward 44-mer. J. Org. Chem. 71, 9884–9886 (2006).

28. Maranski, K., Andreev, Y. G. & Bruce, P. G. Synthesis of poly(ethylene oxide) approaching monodispersity. Angew. Chem. Int. Ed. 53, 6411–6413 (2014).

29. Bohn, P. & Meier, M. A. R. Uniform poly(ethylene glycol): a comparative study. Polym. J. 52, 165–178 (2020).

30. French, A. C., Thompson, A. L. & Davis, B. G. High-purity discrete PEG-oligomer crystals allow structural insight. Angew. Chem. Int. Ed. 48, 1248–1252 (2009).

31. Loiseau, F. A., Hii, K. K. & Hill, A. M. Multigram synthesis of well-defined extended bifunctional polyethylene glycol (PEG) chains. J. Org. Chem. 69, 639–647 (2004).

32. Niculescu-Duvaz, D., Getaz, J. & Springer, C. J. Long functionalized poly(ethylene glycol)s of defined molecular weight: synthesis and application in solid-phase synthesis of conjugates. Bioconjug. Chem. 19, 973–981 (2008).

33. Szekely, G., Schaepertoens, M., Gaffney, P. R. & Livingston, A. G. Beyond PEG2000: synthesis and functionalisation of monodisperse PEGylated homostars and clickable bivalent polyethyleneglycols. Chem. Eur. J. 20, 10038–10051 (2014).

34. Székely, G., Schaepertoens, M., Gaffney, P. R. J. & Livingston, A. G. Iterative synthesis of monodisperse PEG homostars and linear heterobifunctional PEG. Polym. Chem. 5, 694–697 (2014).

35. Li, Y. et al. Fluorous synthesis of mono-dispersed poly(ethylene glycols). Tetrahedron Lett. 55, 2110–2113 (2014).

36. Li, Y., Qiu, X. & Jiang, Z.-X. Macrocyclic sulfates as versatile building blocks in the synthesis of monodisperse poly(ethylene glycol)s and monofunctionalized derivatives. Org. Process Res. Dev. 19, 800–805 (2015).

37. Zhang, H. et al. Highly efficient synthesis of monodisperse poly(ethylene glycols) and derivatives through macrocyclization of oligo(ethylene glycols). Angew. Chem. Int. Ed. 54, 3763–3767 (2015).

38. Zhang, Q., Ren, H. & Baker, G. L. A practical and scalable process to selectively monofunctionalize water-soluble α,ω-diols. Tetrahedron Lett. 55, 3384–3386 (2014).

39. Wawro, A., Muraoka, T. & Kinbara, K. Chromatography-free synthesis of monodisperse oligo(ethylene glycol) mono-p-toluenesulfonates and quantitative analysis of oligomer purity. Polym. Chem. 7, 2389–2394 (2016).

40. Khanal, A. & Fang, S. Solid phase stepwise synthesis of polyethylene glycols. Chem. Eur. J. 23, 15133–15142 (2017).

41. Eriyagama, D. N. A. M., Yin, Y. & Fang, S. Automated stepwise PEG synthesis using a base-labile protecting group. Tetrahedron 119, 132861 (2022).

42. Dong, R. et al. Sequence-defined multifunctional polyethers via liquid-phase synthesis with molecular sieving. Nat. Chem. 11, 136–145 (2019).

43. Yeo, J. et al. Liquid phase peptide synthesis via one-pot nanostar sieving (PEPSTAR). Angew. Chem. Int. Ed. 60, 7786–7795 (2021).

44. Wang, J. et al. Monodisperse and polydisperse PEGylation of peptides and proteins: a comparative study. Biomacromolecules 21, 3134–3139 (2020).

45. Oxley, A. & Livingston, A. G. Anti-fouling membranes for organic solvent nanofiltration (OSN) and organic solvent ultrafiltration (OSU): graft modified polybenzimidazole (PBI). J. Membr. Sci. 662, 120977 (2022).

46. Kim, J. F., Székely, G., Valtcheva, I. B. & Livingston, A. G. Increasing the sustainability of membrane processes through cascade approach and solvent recovery—pharmaceutical purification case study. Green Chem. 16, 133–145 (2014).

47. Fee, C. J. & Van Alstine, J. M. Prediction of the viscosity radius and the size exclusion chromatography behavior of PEGylated proteins. Bioconjug. Chem. 15, 1304–1313 (2004).

This work was funded by the European Research Council (Advanced Grant 786398—EXACTYMER; M.J.B., N.A.Z, A.G.L, P.R.J.G.). A.G.L. acknowledges contribution of Kay Wenden.

School of Engineering and Materials Science, Queen Mary University of London, London, UK

Maria J. Burggraef, Adam Oxley, Naveed A. Zaidi, Piers R. J. Gaffney & Andrew G. Livingston

Department of Chemical Engineering, Imperial College London, London, UK

Maria J. Burggraef

Exactmer Ltd., Londoneast-UK Business and Technical Park, The CUBE, London, UK

Adam Oxley, Piers R. J. Gaffney & Andrew G. Livingston

Barts Cancer Institute, Queen Mary University of London, London, UK

Pedro R. Cutillas

M.J.B. and A.G.L. designed the idea and experiments. P.R.J.G. and P.R.C. contributed to the idea, experiments, and preliminary work. A.O. contributed to the membrane-based synthesis of defined PEG and synthesised the membranes used for this work. N.A.Z. supported with synthesising the PEG starting materials. M.J.B. synthesised and characterised the defined PEG, performed the PEGylation and digestion reactions, and obtained and interpreted the analytical data from the digestion experiments. All authors contributed to the draft of the paper.

Correspondence to Andrew G. Livingston.

Queen Mary University of London has filed a UK patent application (no. 2402623.9) related to defined molecular weight PEGylation agents. A.G.L. and M.J.B. are listed as inventors. All other authors declare no competing interests.

Nature Communications thanks Zhongxing Jiang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Burggraef, M.J., Oxley, A., Zaidi, N.A. et al. Exactly defined molecular weight poly(ethylene glycol) allows for facile identification of PEGylation sites on proteins. Nat Commun 15, 9814 (2024). https://doi.org/10.1038/s41467-024-54076-6

Received: 16 May 2024

Accepted: 31 October 2024

Published: 13 November 2024

DOI: https://doi.org/10.1038/s41467-024-54076-6
