# TBT4135 - Biopolymerkjemi

 Facts, autumn 2009. Lecturer: Kurt Ingar Draget. Assessment: written exam (100 %). Exam date: 10.12.09. Syllabus: Biopolymerkjemi (Smidsrød and Moe, 1995) and Tillegskompendium i biopolymerkjemi (Christensen).
 Exercises, autumn 2009. Required number approved: 5 of 6. Hand-in location: Department of Biotechnology office. Deadline: Mondays at 16:00.
 Lab, autumn 2009. Intensive lab, 2 weeks with a short lab roughly every day.

Introduction to biological polymers (polysaccharides, proteins), with laboratory exercises in applied techniques.

# Summary

## DNA

Deoxyribose (2-deoxy-D-ribose) is attached to a phosphate group at the 5' position (towards the 5' end) and to the next phosphate group at the 3' position (towards the 3' end). It is $\beta$-linked at C-1 to a pyrimidine (cytosine, thymine or, in RNA, uracil) or a purine (adenine or guanine).

DNA adopts a double helix conformation. The chains are anti-parallel and the helix is right-handed. There are about 10 base pairs per turn and a pitch of 3.4 nm, i.e. a rise of 0.34 nm per base pair. The backbone has many rotation possibilities and is relatively flexible.

### PCR

Melt the DNA (double-stranded -> single-stranded). Add short primers of known sequence that flank the region of interest, nucleotides and DNA polymerase. Repeat the cycle of melting, primer annealing and extension to amplify the region.

### Sequencing

Maxam-Gilbert: Base-specific cleavage of DNA after labelling the 5' end with radioactive phosphate; the fragments are separated by gel electrophoresis. Can sequence up to about 200 bases.

Dideoxy: Synthesize DNA by biological methods, but add small amounts of one type of dideoxy nucleotide, which stops synthesis wherever that base occurs. Separate the fragments by gel electrophoresis and piece together the sequence.

## Proteins

### Amino acids

All proteins are built from L-amino acids.

There are 20 standard amino acids.

Non-polar amino acids

• Alanine (Ala, A): methyl
• Valine (Val, V): isopropyl
• Leucine (Leu, L): isobutyl
• Isoleucine (Ile, I): sec-butyl (1-methylpropyl)
• Proline (Pro, P): propyl chain linked back to the main-chain amine
• Phenylalanine (Phe, F): alanine with a phenyl group
• Tryptophan (Trp, W): alanine with an indole group
• Methionine (Met, M): (CH2)2-S-CH3, can be synthesized from cysteine.

Polar amino acids

• Glycine (Gly, G): H
• Serine (Ser, S): methanol
• Threonine (Thr, T): 1-hydroxyethyl
• Cysteine (Cys, C): methanethiol
• Tyrosine (Tyr, Y): Phenylalanine with hydroxy in para.
• Asparagine (Asn, N): Aspartic acid with amino instead of hydroxy
• Glutamine (Gln, Q): Glutamic acid with amino instead of hydroxy

Acidic amino acids

• Aspartic acid (Asp, D): Acetic acid
• Glutamic acid (Glu, E): Propionic acid

Basic amino acids

• Lysine (Lys, K): aminobutyl
• Arginine (Arg, R): propyl-guanidinium
• Histidine (His, H): methyl-imidazole

In general the $\alpha$-carboxyl group has a pKa of about 2 and the $\alpha$-amino group a pKa of around 9.5. The side chains of Asp and Glu have a pKa around 4, those of Cys, Tyr and Lys around 10 and that of Arg around 12.5, while histidine is special at around 6. To find the pI, calculate the net charge of the protein at a trial pH and adjust the pH until the net charge is zero.
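The trial-and-error search for the pI can be sketched as a bisection on the net charge. This is a minimal sketch, not a validated tool: it uses the rounded pKa values quoted above, treats every group as independent (Henderson-Hasselbalch), and takes tyrosine at about 10 as an assumption. The function names are hypothetical.

```python
# Approximate pKa values from the text. The sign is the charge of the
# ionised form: +1 for basic groups, -1 for acidic groups.
PKA_SIDE = {
    "D": (4.0, -1), "E": (4.0, -1),
    "C": (10.0, -1), "Y": (10.0, -1),  # Y at ~10 is an assumption
    "K": (10.0, +1), "R": (12.5, +1), "H": (6.0, +1),
}
PKA_C_TERMINUS = 2.0
PKA_N_TERMINUS = 9.5


def net_charge(sequence, ph):
    """Net charge of a peptide (one-letter codes) at a given pH."""
    groups = [(PKA_C_TERMINUS, -1), (PKA_N_TERMINUS, +1)]
    groups += [PKA_SIDE[aa] for aa in sequence if aa in PKA_SIDE]
    charge = 0.0
    for pka, sign in groups:
        if sign > 0:
            charge += 1.0 / (1.0 + 10 ** (ph - pka))  # protonated base
        else:
            charge -= 1.0 / (1.0 + 10 ** (pka - ph))  # deprotonated acid
    return charge


def isoelectric_point(sequence, lo=0.0, hi=14.0, tol=1e-4):
    """Bisect on pH: the net charge falls monotonically with pH."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid  # still positive, pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For a single glycine the only ionisable groups are the two termini, so the result should land midway between their pKa values.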

### Sequencing

Sanger's method: Attach dinitrofluorobenzene to N-terminal, degrade protein completely and then identify amino acid that is attached to reagent. Can do similar to carboxyl end. This can be used to sequence di- or tripeptides (middle amino acids identified by chromatography). Use mild degradation to obtain mixture, put together puzzle. Not used anymore.

Edman's method: Disconnect only the amino-terminal amino acid, identify it, and repeat. Can be done automatically and in parallel.

Gene coding: Find the first few amino acids (7-10), use this to make a DNA probe (primer in PCR), amplify gene and sequence. Only works on prokaryotes, due to introns in eukaryotes.

### Structure

The partial double-bond character of the peptide bond hinders rotation in the peptide chain except on each side of the $\alpha$-carbon (with the R-group), and even there rotation is limited to certain angles. The R-groups alternate between the two sides of the chain.

Anfinsen's experiment: Disrupt disulfide bonds with mercaptoethanol and denature with 8 M urea, then reverse the treatment and regain most of the activity. The native fold is therefore a low-energy state.

Denaturation can cause changes in acid/base properties and IEP, changes in charge, higher accessibility for proteases, higher reactivity of many side groups, and general conformation changes that influence solution properties.

#### $\alpha$-helixes

Right-handed helix with a full turn every 3.6 amino acids, a rise of about 1.5 Å per residue and a pitch of 5.4 Å. The R-groups face out of the helix. Stabilized by uncharged, medium-sized amino acids: Ala, Leu, Phe, Tyr, Trp, Cys, Met, His and Asn. Small or large R-groups, or charged amino acids, destabilize the helix: Gly, Ile, Glu, Asp, Lys, Arg, Ser, Thr. Proline and hydroxyproline break the helix, due to hindered rotation in the pyrrolidine ring. The side-chain hydroxyls of threonine and serine form hydrogen bonds that compete with the backbone hydrogen bonds of the helix. There are other types of helixes, such as the $3_{10}$ and $\pi$ helixes, which are similar but have fewer or more amino acid residues per turn. In the standard helix amino acids i and i+4 hydrogen bond. Keratin is rich in $\alpha$-helixes and $\beta$-sheets (see below).

Collagen has a triple helical structure that is right-handed with about 20 amino acids per turn, i.e. a much looser structure. The typical sequence is Gly-X-Y, where X is often proline and Y is often hydroxyproline. These lock the bond angles to favour this type of helix. Gelatin is denatured collagen that partially reforms the helixes upon gelation. Collagen triple helixes form intermolecular hydrogen bonds to other helixes to make strong filaments, but do not form intramolecular hydrogen bonds.

Collagen is the most abundant protein in mammals, 25-35% of total protein content. Collagen types I, II and III are fibrillar collagens and are found in most connective tissues and bone (I), cartilage and vitreous humor (II), and extensible connective tissues (III). Collagen type IV is part of the basal laminae. Together they account for >90% of the collagen in the body. There are different chain types in the different forms of collagen: type I has two $\alpha$1(I) chains and one $\alpha$2(I) chain, type II has three $\alpha$1(II) chains, type III has three $\alpha$1(III) chains, while type IV has a mixture.

Collagen is built by first forming tropocollagen (three helixes bound together). In the ECM the ends are cleaved and the tropocollagen assembles into fiber bundles. The bundles have a striated appearance. Allysine and lysine residues form Schiff base covalent crosslinks.

Elastin is another ECM protein with a random coil shape, rich in glycine, valine, alanine and proline. It gives flexibility to the ECM.

There are many diseases associated with ECM disorders. Marfan syndrome (long arms and legs, extra stretchy) is caused by a mutation in fibrillin, an important structural protein holding elastin in place. Ehlers-Danlos syndrome (stretchy skin, lesions, bruises, bendable limbs) is caused by a mutation in collagen III.

#### $\beta$-sheets

Stretched $\alpha$-keratins, $\beta$-keratins and silk fibroin have a common protein structure called a $\beta$-sheet. Hydrogen bonds are formed between the backbone amide groups of neighbouring strands, while the R-groups stick out above and below the plane of the sheet, where they form additional bonds.

From the amino acid sequence the secondary structures above can sometimes be estimated. Hydropathy plots can also be made to map regions rich in hydrophilic or hydrophobic residues, to see where they are most likely to be found in a tertiary structure. Tertiary structures can be assembled into quaternary structures, which are stabilized by weak interactions or disulfide bonds between segments.

## Polysaccharides

Polysaccharides are the most abundant biopolymer. The basic building blocks are monosaccharides.

### Monosaccharides

They are designated D or L depending on the orientation of the highest-numbered chiral carbon atom. If the hydroxy group points right in the Fischer projection it is a D-sugar. This corresponds to the non-ring carbon pointing up in the Haworth projection. The opposite holds for L-sugars. L-sugars are mirror images of the D-sugars with the corresponding name, i.e. all groups are mirrored. If only one group is mirrored, at carbon X, the sugars are C-X epimers. If the hydroxy group on C-1 is cis to the non-ring carbon group the sugar is $\beta$, or $\alpha$ if trans. In the codes below, R and L give the side (right or left) of the hydroxy group in the Fischer projection, from C-2 upwards, for the D-series. Pentoses are ribose (RR), arabinose (LR), xylose (RL) and lyxose (LL). Hexoses are allose (RRR), altrose (LRR), glucose (RLR), mannose (LLR), gulose (RRL), idose (LRL), galactose (RLL) and talose (LLL). The hexoses can be in furanose (5-ring) or pyranose (6-ring) forms.

These sugars can adopt three conformations: chair, half-chair or boat. Chair is by far the most common. The chair form can be $^4C_1$ or $^1C_4$. In general the sugars will be in the form that minimizes the number of bulky axial groups. The glycosidic bonds between monomers can be axial-axial, equatorial-axial, axial-equatorial or equatorial-equatorial, which greatly influences the secondary and tertiary structure of the polysaccharide.

There are many modifications that can be made to the monosaccharides. Some of the most common are: D-glucuronic acid (carboxylic acid at C-6), L-rhamnose (6-deoxy-L-mannose), D-glucosamine (2-amino-2-deoxy-D-glucose), N-acetyl-D-glucosamine, D-galactose-4-sulphate, D-glucose-6-phosphate and D-mannose-4,6-pyruvate. Some common disaccharides are the glucose dimers maltose ($\alpha$1-4 ax-eq) and cellobiose ($\beta$1-4 eq-eq).

### Polysaccharides

#### Cellulose

$\beta$1-4 linked D-glucopyranose, with eq-eq bonds, unbranched. Insoluble in water. Forms fibrous bundles with high degree of crystallinity, but can also be amorphous. Microcrystalline cellulose is purely crystalline cellulose because the amorphous cellulose has been removed by acid hydrolysis.

There are two main types of cellulose: Cellulose I and II. Cellulose I is the naturally occurring form. The cellulose chains are arranged in a parallel fashion in fully stretched chains, and each glucose is turned 180 degrees compared to its neighbours. C-2 hydrogen bonds with C-6 and the ring oxygen hydrogen bonds with C-3 of the next monomer, and interchain hydrogen bonds stabilise the sheets/fibers. Cellulose II is formed when Cellulose I is swelled or dissolved and then precipitated. This form is more thermodynamically stable and has anti-parallel chains arranged in a slightly tilted way.

To functionalize cellulose it is first treated with a strong base so the hydroxyls deprotonate somewhat, then other reagents are introduced to modify the cellulose. Examples of cellulose ethers are carboxymethylcellulose (react with chloroacetic acid), hydroxyethylcellulose (react with ethylene oxide, an epoxide) and methylcellulose (react with methyl chloride). Examples of cellulose esters are cellulose acetate (react with acetic anhydride) and cellulose nitrate (react with nitric acid).

#### Amylose, amylopectin, glycogen

$\alpha$1-4 linked glucose, i.e. axial-equatorial bonds. Amylose is linear, and may form ordered structures. Amylopectin has a branched structure consisting of amylose linked together with $\alpha$1-6 at branching points, which are every 12-15 monomers. Each branch keeps on branching, so the whole structure increases in thickness as one moves from the reducing end. Packed together in starch granules. Glycogen is similar to amylopectin, but has a somewhat less regular branching structure and is produced by animals and not plants.

#### Dextran

$\alpha$1-6 linked backbone with $\alpha$1-3 linked branching chains. Commercial dextran used for many experiments as a reference biopolymer is mostly unbranched.

#### Pullulan

Pullulan is a microbial polymer produced by the yeast-like fungus A. pullulans. It consists of maltotriose units (three $\alpha$1-4 linked D-glucose units) linked together with the flexible $\alpha$1-6 linkage. It is easily soluble in water, flexible and available in monodisperse samples.

#### Chitin and chitosan

Linear $\beta$1-4 linked N-acetylglucosamine (2-acetamido-2-deoxy-D-glucose). Similar to cellulose, chitin is insoluble in water in unmodified form. Deacetylation of chitin leads to chitosan ($F_A$ less than 80%), which is soluble at low pH when the amine group is protonated. The pKa of the amine group is about 6. Chitosan is practically the only positively charged biopolymer at pH 7, and therefore interacts with many biological materials. Between 40 and 60% deacetylation the chitosan chains are so irregular that they cannot crystallize, and are therefore soluble at all pH.

#### Galactans

Galactans are linear polymers. The basic building blocks are galactose residues that are alternately $\alpha$1-3 and $\beta$1-4 linked. The galactans are often modified. Carrageenan has 4-linked 3,6-anhydro-$\alpha$-D-galactose or just $\alpha$-D-galactose, and 3-linked $\beta$-D-galactose. $\kappa$-carrageenan is sulphated at C-4 of the 3-linked residue, while $\iota$-carrageenan is sulphated at both repeating residues. Agarose is similar, but the anhydro residue is 3,6-anhydro-$\alpha$-L-galactose instead. Agarose is the ideal unmodified form, while agaropectin is agarose with charged substituents. All linkages in galactans are equatorial, and galactans in nature are structural units in red algae. Carrageenans gel thermoreversibly in the presence of K+ ions, and low sulphate content creates stiffer gels.

#### Xanthan

Produced by the bacterium X. campestris and has a complex pentasaccharide repeating unit. The backbone is a cellulose chain ($\beta$1-4 linked D-glucose), and every second unit carries a side chain consisting of: $\alpha$-D-mannose acetylated at C6, $\beta$-D-glucuronic acid and 4,6-pyruvated-$\beta$-D-mannose. Xanthan forms a double helix in solution, making it very stiff with high intrinsic viscosity and shear thinning properties. It has a high molecular weight and a persistence length of about 100 nm.

#### Pectins

Pectins occur in the cell walls of fruits and are rather complex. They have smooth regions consisting of unbranched $\alpha$-D-galacturonate, which can be partially esterified. The linkages are diaxial, leading to an eggbox structure that can bind Ca2+ ions and induce gelling. Commercially, pectins are used to gel jams, and are then highly esterified (no charge) and used at low pH. The sucrose content needs to be high so that the activity of water is lowered.

The hairy regions are highly branched with complex branches, and the backbone can be the same as the smooth regions or alternating rhamnose and galacturonate residues.

#### Alginate

Alginate is a polysaccharide produced by brown algae. It is linear, and has two components: $\beta$-D-mannuronate (M) and $\alpha$-L-guluronate (G). The G-G link is axial-axial, while the M-M link is equatorial-equatorial. Alginate is produced by first making long chains of poly-mannuronic acid, then enzymes (epimerases) epimerize selected residues at C-5, changing the sugar from $\beta$-D-mannuronic acid to $\alpha$-L-guluronic acid. Depending on the enzymes this can give long M-blocks, long G-blocks, alternating MG-blocks, or a mixture. $F_G$ and $F_M$ denote the fractions of G and M in the alginate, and these fractions can be extended to e.g. $F_{GG}$ for G-dimers, etc. Due to the axial linkages in G-blocks they form eggbox-like structures that bind Ca2+ in the cavities. This associates chains and causes gelation.

The structure of alginates is determined using NMR. NMR can identify the relative fractions of G-blocks, M-blocks, MG-alternating regions, and chain length (of the partially degraded chains used for analysis).

Commercial alginate is often delivered as Na-alginate instead of the acidic form. Natural alginate is very long (300-500 kDa) and therefore behaves almost as a random coil, but somewhat stiffer than e.g. pullulan.

Alginate is often used for gelling applications, especially in food products. In medicine it can be used to encapsulate cells. Cells are mixed with an alginate solution, which is dripped into a CaCl2 solution. This causes gelling and cell encapsulation. The beads are then coated with a cationic polymer, and the alginate is dissolved using a chelating agent for Ca2+. Then one has a capsule containing cells. If one wishes homogenous gelation of alginate (not the case if one uses CaCl2) one can mix the alginate with insoluble CaCO3, then add a substance that slowly drops the pH so that the CaCO3 dissolves and starts the gelling.

Alginic acid can also form gels, probably by hydrogen bonds between nearby acidic groups. These show highest gel strength at low and high fractions of G (or M), and gelling increases with increasing molecular weight.

Adding free G-blocks can do one of two things, depending on the Ca2+ concentration: At low Ca2+ the G-blocks bind the calcium and "gel" with the alginate, reducing crosslinking between the alginate chains and thus reducing gel strength/viscosity. At high Ca2+ concentration the G-blocks seem to help crosslink the alginate chains and therefore increase the gel strength.

## Properties of biopolymers in solution

Biopolymers can adopt three idealized shapes in solution: The stiff rod, the compact sphere and the random coil.

### Random coil

There are many models for the random coil, ranging from simple to relatively complex. The worm-like chain model is the most accurate and can account for the transition from random coil to stiff chain. Starting simple, we have a chain without restrictions. The end-to-end vector is the sum over all the monomer vectors: $\vec{r}=\sum_{i=1}^n \vec{l_i}$, while the square length is $r^2=\vec{r} \cdot \vec{r}=\sum_{i=1}^n \sum_{j=1}^n \vec{l_i} \cdot \vec{l_j}$. If we assume all bond lengths are identical, we can take the average over the square length: $\langle r^2 \rangle=nl^2$, because the cross terms with $i \not= j$ average to zero for uncorrelated bond directions. This gives $\langle r^2 \rangle^{1/2} = n^{0.5}l$ for a random coil. For a stiff rod we naturally have $r=nl$.
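The result $\langle r^2 \rangle=nl^2$ for the unrestricted chain can be checked numerically. The following is a minimal Monte Carlo sketch, assuming a freely jointed chain in three dimensions with independent, uniformly distributed bond directions; the function names and the sample sizes are illustrative choices.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform direction on the unit sphere (z uniform, azimuth uniform)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return s * math.cos(phi), s * math.sin(phi), z


def mean_square_end_to_end(n_bonds, bond_length, n_chains, seed=0):
    """Average r^2 over n_chains freely jointed chains of n_bonds bonds."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_chains):
        x = y = z = 0.0
        for _ in range(n_bonds):
            dx, dy, dz = random_unit_vector(rng)
            x += bond_length * dx
            y += bond_length * dy
            z += bond_length * dz
        total += x * x + y * y + z * z
    return total / n_chains


if __name__ == "__main__":
    n, l = 50, 1.0
    print("simulated <r^2>:", mean_square_end_to_end(n, l, 4000))
    print("theory n*l^2   :", n * l * l)
```

With a few thousand chains the simulated average should agree with $nl^2$ to within a few percent; the scatter shrinks as the number of chains grows.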

If we extend the model so that we have hindered rotation around bonds, we get $\langle r^2 \rangle=n\beta^2$, where $\beta$ is a fictitious bond length that includes hindered rotation and will always be larger than $l$. This parameter is related to the stiffness parameter $C_\infty=\beta^2/l^2$, so we can write $\langle r^2 \rangle=n C_\infty l^2$. The last modification we can make to this simple model is adding a monomer excluded volume, which accounts for the volume of the monomers in the chain. This is described by an expansion factor $\alpha$, such that $\langle r^2 \rangle=n\beta^2\alpha^2$. $\alpha < 1$ in a bad solvent, because bad solvents cause contraction of the polymer. $\alpha=1$ is a special case called $\theta$-conditions, where a slightly bad solvent exactly balances the effect of the monomer volume. $\alpha > 1$ in a good solvent. In good solvents $\alpha$ has a small molecular weight dependence, such that $\alpha \propto n^{0-0.1} \Rightarrow r \propto n^{0.5-0.6} \propto M^{0.5-0.6}$.

There are other models for the random coil as well. Kuhn used equivalent statistical segments, called Kuhn segments, each of which averages over several real segments. We introduce the parameters $l_K = C_\infty l$, the equivalent Kuhn length, and $N_K=\frac{n}{C_\infty}$, the equivalent number of Kuhn segments. Another much-used parameter is the persistence length $a=\frac{1}{2}(C_\infty +1)l \Rightarrow l_K \approx 2a$. The persistence length is defined as the projection of the average end-to-end vector onto the first bond vector, but can also be interpreted as the distance one must travel along the chain before the bond direction becomes independent of the direction of the first bond. The persistence length is valid for all chains, while $C_\infty$ is only valid for flexible coils.

Random coils in nature are typically mostly water, often 90-95% of the hydrodynamic volume is really water. Thus they are rather open structures.

### Worm-like chain model

This model is used for a common type of conformation: somewhere between a random coil and a stiff rod (called a stiff coil). It models the chain as a randomly, continuously curving chain of uniform thickness, and has the stiff rod and random coil as limiting cases. The model parameters are the persistence length $a$ (above), the mass per unit length, the contour length and the diameter.
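The notes do not state the formula for this model. A commonly quoted result for the worm-like (Kratky-Porod) chain, assumed here to be the intended one, is $\langle r^2 \rangle = 2aL\left[1-\frac{a}{L}\left(1-e^{-L/a}\right)\right]$, where $L$ is the contour length and $a$ the persistence length. The sketch below evaluates it and illustrates the two limits; the function name is hypothetical.

```python
import math


def wlc_mean_square_end_to_end(contour_length, persistence_length):
    """Kratky-Porod <r^2> for contour length L and persistence length a."""
    a = persistence_length
    L = contour_length
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x
    return 2.0 * a * L * (1.0 - (a / L) * (-math.expm1(-L / a)))


if __name__ == "__main__":
    a = 100.0  # nm, illustrative value
    for L in (1.0, 100.0, 1.0e5):
        r2 = wlc_mean_square_end_to_end(L, a)
        print(L, r2, "rod limit L^2 =", L * L, "coil limit 2aL =", 2 * a * L)
```

For $L \ll a$ the value approaches $L^2$ (stiff rod), and for $L \gg a$ it approaches $2aL$, i.e. a random coil with Kuhn length $2a$.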

### Ionic strength

The ionic strength will affect the shape of the biopolymer. The ionic strength is given as $I=\frac{1}{2} \sum_i C_i z_i^2$, so it depends strongly on the valency of the ions. Increasing ionic strength causes contraction of polyelectrolytes, because the repulsion between charged groups is screened.
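As a small worked example of the formula for $I$, the sketch below compares a 1:1 salt with a 2:1 salt at the same molar concentration. The salts and concentrations are illustrative, and complete dissociation is assumed.

```python
def ionic_strength(ions):
    """ions: iterable of (molar concentration, charge number) pairs."""
    return 0.5 * sum(c * z ** 2 for c, z in ions)


# 0.1 M NaCl gives 0.1 M Na+ and 0.1 M Cl-
nacl = ionic_strength([(0.1, +1), (0.1, -1)])
# 0.1 M CaCl2 gives 0.1 M Ca2+ and 0.2 M Cl-
cacl2 = ionic_strength([(0.1, +2), (0.2, -1)])

if __name__ == "__main__":
    print("NaCl :", nacl)
    print("CaCl2:", cacl2)
```

The divalent salt gives three times the ionic strength of the monovalent salt at the same molarity, which is the valency dependence mentioned above.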

### Molecular weight averages

Three types of averages are used: the number average (each chain counts equally), the weight average (each chain counts in proportion to its mass) and the z-average (each chain counts in proportion to the square of its mass).

Mathematically: $\bar{M_n}=\frac{\sum_i N_i M_i}{\sum_i N_i}=\frac{\sum_i w_i}{\sum_i \frac{w_i}{M_i}}=\frac{\sum_i c_i}{\sum_i\frac{c_i}{M_i}}$

$\bar{M_w}=\frac{\sum_i N_i M_i^2}{\sum_i N_i M_i}=\frac{\sum_i c_i M_i}{\sum_i c_i}$

$\bar{M_z}=\frac{\sum_i N_i M_i^3}{\sum_i N_i M_i^2}$
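The three averages can be computed directly from the definitions above. This is a minimal sketch with a made-up two-component sample; the function name and the numbers are illustrative.

```python
def molecular_weight_averages(fractions):
    """fractions: list of (N_i, M_i) = (number of chains, molar mass).

    Returns (M_n, M_w, M_z) from the moment ratios in the text.
    """
    s0 = sum(n for n, m in fractions)
    s1 = sum(n * m for n, m in fractions)
    s2 = sum(n * m ** 2 for n, m in fractions)
    s3 = sum(n * m ** 3 for n, m in fractions)
    return s1 / s0, s2 / s1, s3 / s2


if __name__ == "__main__":
    # equal numbers of chains of 10 kDa and 100 kDa
    mn, mw, mz = molecular_weight_averages([(1, 10_000), (1, 100_000)])
    print("M_n =", mn, " M_w =", mw, " M_z =", mz, " M_w/M_n =", mw / mn)
```

For any polydisperse sample the ordering $\bar{M_n} \le \bar{M_w} \le \bar{M_z}$ holds, with equality only for a monodisperse sample.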

### Depolymerization

When degrading a polymer randomly we can talk about a degree of scission, $\alpha$. This is the probability that a randomly chosen bond is broken. The weight fraction of n-mers (where n is the length of the oligomer in question) is then given by $W_n=n\alpha^2 (1-\alpha)^{n-1}$: two factors of $\alpha$ for the broken bonds at the ends and $n-1$ factors of $(1-\alpha)$ for the intact bonds in between. Maximizing with respect to $\alpha$ gives $\alpha=\frac{2}{n+1}$ as the degree of scission with the highest yield of n-mers.

We further have $\alpha=\frac{1}{DP_n}=\frac{M_0}{M_n}$, where $M_0$ is the monomer mass. Assuming random depolymerization at constant rate, the number of intact bonds $n$ follows $\frac{d n}{dt}=-kn \Rightarrow \ln (\frac{n}{n_0})=-kt \Rightarrow \ln(1-\alpha)=-kt$. For small $\alpha$ we then have $\ln(1-\alpha) \approx -\alpha = -kt$, or $\alpha = \alpha_0 + kt$ if starting at a given $\alpha_0$. Using the relation from above this gives $\frac{1}{M_n}=\frac{1}{M_{n,0}}+\frac{kt}{M_0}$. For randomly degraded unbranched chains and a small degree of scission we have $PI=\frac{M_w}{M_n}\approx 2$, thus $\frac{1}{M_w}=\frac{1}{M_{w,0}}+\frac{kt}{2M_0}$.
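The expressions above can be explored numerically. The sketch below evaluates $W_n$, the degree of scission that maximizes it, and the linear relation for $1/M_n$. The rate constant, masses and times are made-up illustration values, and the function names are hypothetical.

```python
def weight_fraction(n, alpha):
    """W_n = n * alpha^2 * (1 - alpha)^(n - 1) for random scission."""
    return n * alpha ** 2 * (1.0 - alpha) ** (n - 1)


def alpha_for_max_yield(n):
    """Degree of scission that maximizes W_n."""
    return 2.0 / (n + 1)


def number_average_mass(t, k, m0, mn_start):
    """M_n after time t from 1/M_n = 1/M_n,0 + k*t/M_0."""
    return 1.0 / (1.0 / mn_start + k * t / m0)


if __name__ == "__main__":
    n = 4
    best = alpha_for_max_yield(n)
    print("best alpha for tetramers:", best, "yield:", weight_fraction(n, best))
    # illustrative values: M_0 = 200 g/mol, M_n,0 = 200 kDa, k = 1e-6 per hour
    for t in (0.0, 500.0, 1000.0):
        print(t, number_average_mass(t, 1e-6, 200.0, 200_000.0))
```

Plotting $1/M_n$ against time for measured data gives a straight line with slope $k/M_0$ as long as the degree of scission stays small.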

The following degradation mechanisms can occur.

The most common is acid hydrolysis. The linkage oxygen is protonated, and dissociates from C1. This forms a cation partially resonance-stabilized by the ring oxygen. Water is added and H+ eliminated, completing the cleavage. This occurs more rapidly in e.g. sialic acid sugars, where there is more resonance stabilization of the carbocation. Alginates are generally resistant, but at pH 1-5 they can self-hydrolyze. Some polysaccharides, such as chitosan, are very resistant to acid hydrolysis because the positively charged amine group shields the linkage oxygen from protonation.

In general sugars are resistant to alkaline degradation, but some 4-linked sugars that have acid groups at C6 are prone to alkaline degradation. The hydrogen on C5 is removed, causing C5 to form a double bond with C4 and breaking the glycosidic linkage ($\beta$-elimination). If the carboxyl group on C6 is esterified (no negative charge), this reaction is much faster.

Polymers can also be degraded by hydrolases (acid-catalyzed mechanism) or lyases (base-catalyzed mechanism), as well as by oxygen radicals, a process promoted by sunlight, alkaline solutions and redox-active substances.

### Radius of gyration

The radius of gyration is the radius at which one could concentrate all the mass into a thin spherical shell and maintain the same moment of inertia. Equivalently, the moment of inertia is the mass times the radius of gyration squared. It can be defined as $R_G^2=\frac{\sum_i m_i r_i^2}{\sum_i m_i}$, i.e. the radius of gyration is the square root of the moment of inertia divided by the total mass, with $r_i$ the distance of mass element $i$ from the centre of mass.

#### Spheres

Integrating the moment of inertia of a sphere and dividing by the total mass gives $R_G^2=3/5 R^2$. Since $R_G \propto R$ and $M \propto R^3$, this means $R_G(sphere) \propto M^{1/3}$.

#### Rods

Integrating the moment of inertia of a cylinder and dividing by the total mass gives $R_G^2=1/12 L^2$. Since $L \propto M$ for a rod, this means $R_G \propto M$ for a rod.
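The integrations behind the sphere and rod results can be written out as follows, assuming uniform density, distances measured from the centre of mass, and (for the rod) a thickness that is negligible compared to the length $L$:

$R_G^2(\text{sphere})=\frac{\int_0^R r^2 \, 4\pi r^2 \, dr}{\int_0^R 4\pi r^2 \, dr}=\frac{\frac{4\pi}{5}R^5}{\frac{4\pi}{3}R^3}=\frac{3}{5}R^2$

$R_G^2(\text{rod})=\frac{1}{L}\int_{-L/2}^{L/2} x^2 \, dx=\frac{1}{L}\cdot\frac{2}{3}\left(\frac{L}{2}\right)^3=\frac{1}{12}L^2$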

#### Random coils

Here we can only use averages. Calculations give $R_G = \sqrt{1/6 \langle r^2 \rangle}=\sqrt{1/6 n \beta^2 \alpha^2}=\sqrt{1/6 \beta^2} n^{0.5} \alpha$. Since $n \propto M$ and $\alpha \propto n^{0-0.1} \propto M^{0-0.1}$, $R_G \propto M^{0.5-0.6}$

Most biopolymers have intermediate forms, and these can be identified from how the radius of gyration scales with molecular weight, which indicates whether they are collapsed coils (between random coil and sphere) or stiff coils (between random coil and rod).

### Disorder and order

#### Bond angles

As mentioned above there can be hindered rotation around certain bonds. A Ramachandran plot can be used to gain an overview over allowed angles.

### Types of bonds

The main types of bonds are: covalent bonds (200-1000 kJ/mol), ion-ion bonds (40-400 kJ/mol), ion-induced dipole and hydrogen bonds (4-40 kJ/mol) and other types of bonds (0-4 kJ/mol). In comparison, the thermal energy is about 2.5 kJ/mol at 25 °C; the bond strength must be greater than this for the bond to be stable. A hydrogen bond can form when a hydrogen atom is shared between two highly electronegative atoms, such as oxygen or nitrogen. It is covalently bound to one of them at a distance of about 1 Å, and electrostatically bound to the other at a distance of about 1.8 Å. The angle also plays a role in the bond strength.

#### Thermodynamics

The shape of the biopolymer in solution will depend on the thermodynamics of the system. In general, a random coil is a state of high entropy for the biopolymer, while an ordered conformation such as an insoluble crystal or a specific protein fold leads to loss of entropy for the polymer. For water, the entropy depends on the solute. If the solute is small, there is favorable entropy in dissolution. The larger the solute is, the less entropy gain one gets from dissolution, because water molecules must form ordered structures encapsulating the biopolymer. Thus solubility increases for smaller solutes. Enthalpy is also sometimes a driving force. However, if the enthalpic contribution to a specific type of conformation comes from hydrogen bonds, this change is almost 0 since water can form the same hydrogen bonds. Thus the thing that causes e.g. certain folds in proteins is often the entropy gain in water from "releasing" the protein into a folded shape.

The contributions from entropy and enthalpy lead to these possible scenarios for dissolution of biopolymers or denaturation of proteins, for the total change in entropy and enthalpy for the system:

$\Delta H > 0, \Delta S > 0$: Increasing temperature will make the entropy contribution dominate, dissolution or denaturation at increasing temperature.

$\Delta H < 0, \Delta S > 0$: Always soluble.

$\Delta H > 0, \Delta S < 0$: Never soluble.

$\Delta H < 0, \Delta S < 0$: Can be soluble at low temperatures where the entropy contribution is low compared to the enthalpy contribution.
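The four scenarios follow from the sign of $\Delta G = \Delta H - T \Delta S$. A minimal sketch of that bookkeeping, assuming $\Delta H$ and $\Delta S$ are constant with temperature; the function name and the example values are hypothetical.

```python
def dissolution_behaviour(delta_h, delta_s):
    """Classify by the sign of dG = dH - T*dS.

    delta_h in J/mol, delta_s in J/(mol K), both for dissolution
    (or denaturation). Returns (description, transition temperature in K
    or None when there is no sign change).
    """
    if delta_h == 0 or delta_s == 0:
        raise ValueError("both delta_h and delta_s must be non-zero")
    if delta_h < 0 and delta_s > 0:
        return "always soluble", None
    if delta_h > 0 and delta_s < 0:
        return "never soluble", None
    t_transition = delta_h / delta_s  # temperature where dG = 0
    if delta_h > 0:
        return "soluble above transition temperature", t_transition
    return "soluble below transition temperature", t_transition


if __name__ == "__main__":
    # illustrative values only: dH = 200 kJ/mol, dS = 600 J/(mol K)
    print(dissolution_behaviour(200e3, 600.0))
    print(dissolution_behaviour(-50e3, -150.0))
```

The transition temperature $T = \Delta H / \Delta S$ is where the enthalpy and entropy contributions balance exactly.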

For polyelectrolytes the entropy gain from the dissolution of counter-ions is large. Since electroneutrality must be maintained, polyelectrolytes are mostly soluble, but this solubility decreases with increasing ionic strength.

#### Denaturation/solubility

In light of the above the following factors can influence the solubility/denaturation of biopolymers:

• Chaotropic agents: Bind to water, decreasing the entropy gain of water leaving the biopolymer, denatures proteins.
• Temperature: Denaturation/solvation according to the balance of enthalpy and entropy, stated above.
• Acid/base equilibrium: Can cause charge changes, repulsion between charged groups, etc. The effect is similar to that of ionic strength, and acids and bases also contribute to the ionic strength.
• Organic solvents: Changing the solvent for something other than water can cause denaturation or solvation.
• Mechanical treatment: Stress on the system can cause changes in conformation.
• Pressure: Denaturation/solvation often causes a decrease in net volume. A higher pressure will favor this transition.

### Thermodynamics of dilute solutions

In a dilute solution the concentration must be lower than a critical overlap concentration $c^*$. For spheres this concentration is $c^*=\frac{2.5}{[\eta]}$, but in practice values of $c^*[\eta]$ between 1 and 4 are observed.

An ideal solution has $\Delta H_{mix} = 0$ and $\Delta S_{mix} = -R \sum_i n_i \ln X_i$, which comes from applying Boltzmann's entropy definition and Stirling's formula to the system. The chemical potential of an ideal system is then given by $\Delta \mu_i = RT \ln X_i$. In athermal solutions the enthalpy is still 0 but the entropy is different. This is caused by a deviation from the assumption that all molecules are the same size, as is the case for biopolymer solutions. Irregular (real) solutions also have a contribution from the enthalpy, which can be negative (exothermic) or positive (endothermic).

If we look at a two-component system, we can write the chemical potential of the solvent as $\Delta \mu_1 = RT \ln X_1$. Since we can substitute $X_1=1-X_2$ and expand the logarithm in a series, we get $\Delta \mu_1 = -RT(X_2 +\frac{1}{2}X_2^2 + ...)$, which is approximately $\Delta \mu_1 = -RT[\frac{c_2 V_1^0}{M_2}+\frac{1}{2}(\frac{c_2 V_1^0}{M_2})^2 ]$ in dilute solutions. Extracting $V_1^0 c_2$ we get $\Delta \mu_1 = -RTV_1^0 c_2(A_1 + c_2 A_2 + ...)$, where $A_n$ are virial coefficients. The first is just the inverse of the molecular weight (number average), while the second describes the deviation from ideality in the solution (if $A_2 \not= \frac{1}{2} \frac{V_1^0}{M_2^2}$).

For random coil we have: $A_2=\frac{16 \pi}{3}\frac{R^3 N_A}{M_2^2}=\frac{16 \pi}{3}\frac{N_A}{M_2^2}\gamma^3 R_G^3$, while spheres have $A_2=4 \frac{\bar{v}}{M_2}$ and rods have $A_2=\frac{L \bar{v}}{d M_2}$

We can define an excluded solute volume as $N_A u$, where $u$ is the excluded volume per molecule, so that $A_2=\frac{N_A u}{2 M_2^2}$; in the ideal case $N_A u = V_1^0$, i.e. $u$ equals the volume of one solvent molecule. Returning to the expansion factor, if we are in a $\theta$-solvent, $\alpha=1$, so $u=0$ and $A_2=0$. If we are in a good solvent, $\alpha > 1$, so $u > 0$ and $A_2 > 0$, while a bad solvent gives $\alpha < 1, u < 0, A_2<0$.

#### Osmosis

Using the difference in chemical potential between two sides of a membrane, one with pure solvent and one with solvent and solute, one can determine $A_1$ and $A_2$. For the pressure on the solute side we get: $\frac{\Pi}{c_2}=RT \left(\frac{1}{M_2}+A_2 c_2\right)$. A plot of $\frac{\Pi}{c_2}$ against $c_2$ gives information both on the molecular weight (number average) and on the second virial coefficient. If we look at polyelectrolytes and do not add salt to the solvent, the counterions will totally dominate the osmotic pressure (remember that this is a number average, so size does not matter), so measurements cannot be done in this case.

In the case when salt is added measurements can be done on polyelectrolytes too. Using the fact that charge neutrality must be maintained as well as the chemical potential is equal at equilibrium, one finds that $A_2=\frac{z^2}{4 M_2^2 C_{BX}}$. At high ionic strengths this term is small, so then an accurate estimate of the molecular weight can be done.

Typical $A_2$ values $\left[\frac{ml\cdot mol}{g^2}\right]$:

• $10^{-3}$: Coils in good solvents, polyelectrolytes, rods
• $10^{-5}-10^{-4}$: Spheres
• 0: Coils in $\theta$-solvents
• Negative: Coils in bad solvents, or association phenomena
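The linear form of the osmotic pressure equation, $\frac{\Pi}{c_2}=RT \left(\frac{1}{M_2}+A_2 c_2\right)$, suggests a simple evaluation: fit a straight line to $\Pi/c_2$ against $c_2$ and read $M_n$ from the intercept and $A_2$ from the slope. A minimal sketch using an ordinary least-squares fit and SI units; the function names are hypothetical and the data in the usage example are synthetic.

```python
R_GAS = 8.314  # J/(mol K)


def linear_fit(xs, ys):
    """Ordinary least squares; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def osmometry(concs, pressures, temperature):
    """concs in kg/m^3, pressures in Pa, temperature in K.

    Returns (M_n in kg/mol, A_2 in m^3 mol / kg^2).
    """
    reduced = [p / c for p, c in zip(pressures, concs)]
    intercept, slope = linear_fit(concs, reduced)
    rt = R_GAS * temperature
    return rt / intercept, slope / rt


if __name__ == "__main__":
    # synthetic data generated from M_n = 100 kg/mol, A_2 = 1e-4 m^3 mol/kg^2
    T = 298.15
    concs = [1.0, 2.0, 3.0, 4.0]
    pressures = [c * R_GAS * T * (1.0 / 100.0 + 1e-4 * c) for c in concs]
    print(osmometry(concs, pressures, T))
```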

### Viscosity

Viscosity in a solution is caused by energy dissipation. Viscosity is defined as $\eta=\frac{\sigma}{\dot{\gamma}}$, where $\sigma$ is the applied shear stress and $\dot{\gamma}$ is the shear rate. For Newtonian fluids this ratio is constant, i.e. independent of the shear rate, and measurements must be done in this region.

For spherical molecules in solution the solution viscosity is given by $\eta_s=\eta_0(1+2.5 \Phi)$, i.e. only a function of the volume fraction, not the size. Required is that we are in a dilute solution.

There are some other definitions: $\eta_r=\frac{\eta_s}{\eta_0}$, the relative viscosity, and $\eta_{sp}=\eta_r-1$, the specific viscosity. Combined with the relationship above, we get $\eta_{sp}=\nu \Phi$, where the shape factor $\nu$ is 2.5 for spheres. We can alternatively express $\Phi=v_h c$ as a function of the hydrodynamic volume and the concentration, giving the formula $\frac{\eta_{sp}}{c}=\nu v_h + ... \Rightarrow [\eta]=\lim_{c \to 0} \lim_{\dot{\gamma} \to 0} \left(\frac{\eta_{sp}}{c}\right)=\nu v_h$ for low solution concentrations, where $[\eta]$ is the intrinsic viscosity, independent of the concentration, but dependent on the solvent/biopolymer system.

This is the origin of the definition of the critical concentration for dilute solutions, since $v_h c^* =1$ means overlap, and $\nu=2.5$ for spheres. Combining the definition of the intrinsic viscosity with the expanded expression for the specific viscosity, one obtains the important Huggins' equation:

$\eta_{sp}=[\eta]c+k[\eta]^2 c^2$

In practice one uses that $\ln(\eta_r)/c$, extrapolated to zero concentration, intercepts at $[\eta]$. Combined with $\eta_r \propto t$, where $t$ is the measured relative flow-through time, this gives the intrinsic viscosity in a measurable way.

An extremely important relation is the Mark-Houwink-Sakurada equation (MHS), which gives the relation between intrinsic viscosity and molecular weight, with the general formula $[\eta]=KM^a$. As mentioned earlier, for spheres the intrinsic viscosity is independent of the size of the spheres, and thus also of the molecular weight, so $a=0$ for spheres. For stiff rods we have $a=1.8$. For random coils we find that $[\eta] \propto \frac{R_G^3}{M} \Rightarrow [\eta] = K M^{0.5-0.8}$.
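
As an illustration of the extrapolation, the sketch below takes flow-through times at several concentrations, forms $\eta_r$ as the ratio to the solvent flow time, and extrapolates both $\eta_{sp}/c$ (Huggins) and $\ln(\eta_r)/c$ to zero concentration. The data are synthetic and the function names are hypothetical; it also assumes that density differences between solution and solvent can be ignored, so that $\eta_r = t/t_0$.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line through the points; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def intrinsic_viscosity(conc, flow_times, solvent_time):
    """Return ([eta] from Huggins plot, Huggins constant k, [eta] from ln(eta_r)/c)."""
    eta_r = [t / solvent_time for t in flow_times]
    huggins_y = [(er - 1.0) / c for er, c in zip(eta_r, conc)]
    log_y = [math.log(er) / c for er, c in zip(eta_r, conc)]
    eta_h, slope_h = linear_fit(conc, huggins_y)  # slope = k [eta]^2
    eta_l, _ = linear_fit(conc, log_y)
    return eta_h, slope_h / eta_h ** 2, eta_l


# Synthetic flow times generated from Huggins' equation (illustrative values).
true_eta, true_k, t0 = 500.0, 0.4, 100.0  # mL/g, dimensionless, seconds
conc = [0.0002, 0.0004, 0.0006, 0.0008, 0.0010]  # g/mL
flow_times = [t0 * (1 + true_eta * c + true_k * true_eta ** 2 * c ** 2) for c in conc]

eta_h, k_h, eta_l = intrinsic_viscosity(conc, flow_times, t0)
print(f"[eta] (Huggins) = {eta_h:.1f} mL/g, k = {k_h:.2f}")
print(f"[eta] (ln eta_r / c) = {eta_l:.1f} mL/g")
```

The two intercepts agree only approximately here, because $\ln(\eta_r)/c$ is slightly curved over this concentration range while the fit is a straight line.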

The average intrinsic viscosity in a polydisperse system is given by $\bar{[\eta]}=K \lim_{c \to 0} \frac{\sum_i M_i^a c_i}{\sum_i c_i}$. We see that if $a=1$ this is proportional to the weight average molecular weight, while other values of $a$ give other averages.
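
A short numerical sketch of this averaging follows. It evaluates the sum for a made-up two-component mixture and also computes the corresponding "viscosity average" molecular weight, $M_v = \left(\sum_i c_i M_i^a / \sum_i c_i\right)^{1/a}$, a standard quantity that these notes do not name explicitly. All function names and numbers are illustrative assumptions.

```python
def average_intrinsic_viscosity(k, a, masses, concs):
    """K * sum(c_i M_i^a) / sum(c_i) for a polydisperse sample."""
    weighted = sum(c * m ** a for m, c in zip(masses, concs))
    return k * weighted / sum(concs)


def viscosity_average_mass(a, masses, concs):
    """Molecular weight M_v that satisfies [eta] = K * M_v^a."""
    weighted = sum(c * m ** a for m, c in zip(masses, concs))
    return (weighted / sum(concs)) ** (1.0 / a)


def number_average(masses, concs):
    """M_n from weight concentrations c_i: sum(c_i) / sum(c_i / M_i)."""
    return sum(concs) / sum(c / m for m, c in zip(masses, concs))


def weight_average(masses, concs):
    """M_w from weight concentrations c_i: sum(c_i M_i) / sum(c_i)."""
    return sum(c * m for m, c in zip(masses, concs)) / sum(concs)


# Made-up mixture: equal weight concentrations of two molecular weights.
masses = [1.0e5, 2.0e5]  # g/mol
concs = [1.0, 1.0]       # g/L

m_n = number_average(masses, concs)
m_w = weight_average(masses, concs)
m_v = viscosity_average_mass(0.7, masses, concs)  # a = 0.7, random-coil range
print(f"M_n = {m_n:.0f}, M_v = {m_v:.0f}, M_w = {m_w:.0f}")
```

For $0<a<1$ the viscosity average lies between the number and weight averages, and for $a=1$ it coincides with the weight average.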

### Light scattering

Interestingly, light scattering (actually multi-angle laser light scattering, MALLS) involves the chemical potential of the molecules involved. This gives us the basic scattering equation for point scatterers:

$\frac{Kc}{R_\theta}=\frac{1}{M}+2A_2 c$,

where K is a constant that depends on the refractive index of the solution, the change in refractive index with changing concentration and the wavelength, while $R_\theta$ is the scattering intensity at a given angle in the specific system. Thus a plot of the function gives $\frac{1}{M}$ as the intercept and $2 A_2$ as the slope. The molecular weight average obtained here is the weight average molecular weight.

For radii of gyration between $\lambda/20$ and $\lambda/2$ a more advanced theory is needed, because then molecules no longer can be assumed to be point scatterers. This leads to an extra factor, and gives us the final light scattering equation:

$\frac{Kc}{R_\theta}=\left(1+\frac{16 \pi^2 R_G^2}{3 \lambda^2}\sin^2(\theta/2)\right)\left(\frac{1}{M}+2 A_2 c\right)$.

The radius of gyration determined here is the z-average.

To estimate the parameters one follows a 4-step method.

• Plot Kc/R as a function of $\sin^2(\theta/2)$ for each concentration, and then extrapolate to 0 angle.
• Plot the intercepts from above as a function of the concentration, this gives 1/M as the intercept and $2A_2$ as the slope.
• Plot Kc/R as a function of the concentration for each angle, then extrapolate to 0 concentration.
• Plot the intercepts as a function of $\sin^2(\theta/2)$. The intercept of this line gives 1/M again, while the slope gives $\frac{1}{M}\left(\frac{16 \pi^2 R_G^2}{3\lambda^2}\right)$, which allows determination of $R_G$.

Remember that the wavelength is the effective wavelength in the solution, i.e. $\lambda=\lambda_0/n$. The Zimm diagram automates this process and puts it all together into one diagram. For polyelectrolytes a high value of $A_2$ will influence the results, so measurements should be done at high ionic strength (0.01-0.1 M). Small amounts of dust and similar contaminants can also disturb the results.

Small-angle X-ray scattering (SAXS) can be used to get more accurate data, especially in the region $R_G=5-25$ nm.
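
The four-step procedure above can be sketched in code. This is only an illustration on synthetic, noise-free data generated from the final light scattering equation; the function name `zimm_analysis`, the data layout and all numerical values are assumptions made for the example.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line through the points; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def zimm_analysis(conc, angles_deg, kc_over_r, wavelength):
    """kc_over_r[i][j] is Kc/R at conc[i] and angles_deg[j].

    wavelength is the wavelength in the solution (lambda_0 / n).
    """
    s = [math.sin(math.radians(a) / 2) ** 2 for a in angles_deg]
    # Steps 1 and 2: extrapolate to zero angle, then fit against concentration.
    zero_angle = [linear_fit(s, row)[0] for row in kc_over_r]
    inv_m_conc, slope_conc = linear_fit(conc, zero_angle)
    # Steps 3 and 4: extrapolate to zero concentration, then fit against sin^2.
    zero_conc = [linear_fit(conc, [row[j] for row in kc_over_r])[0]
                 for j in range(len(s))]
    inv_m_angle, slope_angle = linear_fit(s, zero_conc)
    # slope_angle = (1/M) * 16 pi^2 R_G^2 / (3 lambda^2)
    r_g = math.sqrt(3 * wavelength ** 2 * slope_angle / inv_m_angle) / (4 * math.pi)
    return {"M_from_conc": 1 / inv_m_conc, "A2": slope_conc / 2,
            "M_from_angle": 1 / inv_m_angle, "R_G": r_g}


# Synthetic data (illustrative values, not measurements).
M, A2, RG = 2.0e5, 1.0e-3, 40.0   # g/mol, mL mol/g^2, nm
lam = 633.0 / 1.33                # nm, vacuum wavelength / refractive index
conc = [0.5e-3, 1.0e-3, 1.5e-3, 2.0e-3]  # g/mL
angles = [30.0, 60.0, 90.0, 120.0, 150.0]
q = 16 * math.pi ** 2 * RG ** 2 / (3 * lam ** 2)
data = [[(1 + q * math.sin(math.radians(a) / 2) ** 2) * (1 / M + 2 * A2 * c)
         for a in angles] for c in conc]

result = zimm_analysis(conc, angles, data, lam)
print(result)
```

Both extrapolation routes give the same molecular weight on ideal data; with real measurements a disagreement between them is a useful warning sign.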

### Size exclusion chromatography

Size exclusion chromatography (SEC) is a form of column chromatography. One sends a polydisperse solution through a column at a low flow rate. The column is packed with particles made of crosslinked gels of e.g. polyacrylamide and other synthetic polymers. They have pores that match the size of smaller molecules, but are too small for larger molecules to penetrate. The flow rate is low enough that a diffusion equilibrium is established for the small molecules. Since the liquid does not move in the pores, the small molecules elute more slowly than the larger molecules.

At the outlet of the column a concentration detector (refractive index, UV or fluorescence) detects the instantaneous concentration, and this is shown as a function of the elution volume. To convert the elution volume (which is related to the particle size) to molecular weight, one uses that log(M) is linearly related to the elution volume. For each biopolymer calibration curves must be made for accurate results.

In theory only the hydrodynamic volume will affect the elution volume at which a given molecular weight elutes. Since $v_h'=\frac{[\eta]M}{N_A}$, this can be calculated from viscosity data. This works well for synthetic polymers, but if adsorption effects play a role, as is the case for most biopolymers, one has to use standards for each polymer.

SEC can be combined with MALLS to obtain a great deal of data. The elution volume is divided into "slices" of e.g. 0.1 mL which are assumed to be monodisperse. MALLS is done on each slice, so one obtains the weight average molecular weight and radius of gyration for each slice. Since we only have one concentration, $A_2$ cannot be found, but if one works with low enough concentrations this factor can be neglected.

Since each slice is assumed to be monodisperse, the molecular weight can be plotted as a function of the elution volume, together with radius of gyration data and the concentration profile. The molecular weight averages one wishes (e.g. number average or weight average) can be calculated for the sample. Molecular weight fractions can also be plotted as a function of the molecular weight, to visualize the distribution. A cumulative distribution is often used as well. Since regular MALLS does not work for small particles (under about 30 nm), the data for $R_G$ becomes mostly noise under this limit. The radius of gyration can also be plotted as a function of molecular weight. Since we know that $R_G = KM^\alpha$, a plot of $\log R_G = \alpha \log M + \log K$ gives the exponent, allowing the determination of the conformation of the biopolymer. An example is data from chitosan, which shows $\alpha=0.57$, which means a random coil.

Adding a viscosity detector to the system can extend the range of analysis, because the viscosity is not limited by the wavelength of light like MALLS is. Then one can plot $\log [\eta]=a \log M + \log K$ and obtain information about the shape. If done for a wide range of molecular weights one is likely to obtain two different slopes, one more like the random coil (high molecular weight) and one more like a stiff rod (low molecular weight). Applying the worm-like chain model allows fitting to the entire dataset, and allows determination of the persistence length of the biopolymer.
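
A minimal sketch of the slice calculations described above: molecular weight averages from per-slice concentrations and molecular weights, and the exponent $\alpha$ from a log-log fit of $R_G$ against $M$. The slice values are invented for illustration (generated with $\alpha=0.57$), and the function names are hypothetical.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line through the points; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def slice_averages(concs, masses):
    """Return (M_n, M_w, M_w/M_n) from per-slice weight concentrations and M."""
    total = sum(concs)
    m_n = total / sum(c / m for c, m in zip(concs, masses))
    m_w = sum(c * m for c, m in zip(concs, masses)) / total
    return m_n, m_w, m_w / m_n


def conformation_exponent(masses, radii):
    """Fit log R_G = alpha log M + log K and return (alpha, K)."""
    log_m = [math.log10(m) for m in masses]
    log_r = [math.log10(r) for r in radii]
    log_k, alpha = linear_fit(log_m, log_r)
    return alpha, 10 ** log_k


# Invented slice data: concentration (arbitrary units), M (g/mol), R_G (nm).
concs = [1.0, 2.0, 1.0]
masses = [1.0e5, 2.0e5, 4.0e5]
radii = [0.05 * m ** 0.57 for m in masses]

m_n, m_w, pdi = slice_averages(concs, masses)
alpha, k = conformation_exponent(masses, radii)
print(f"M_n = {m_n:.0f}, M_w = {m_w:.0f}, M_w/M_n = {pdi:.3f}")
print(f"alpha = {alpha:.2f}")
```

In a real analysis, slices with $R_G$ below the MALLS limit would be excluded from the log-log fit, since their radii are mostly noise.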

## Concentrated solutions and rheology

If we are in the concentration range where $c$ exceeds roughly $\frac{1}{[\eta]}$ to $\frac{4}{[\eta]}$, we are in the semidilute region, where the polymers retain their original shape but are in continuous contact with each other. They can form gels in this region. If the concentration is even higher, the polymers start mixing with each other and form a continuous network, and one observes a uniform distribution of polymer segments. If one plots the viscosity vs. the concentration in a log-log plot, one gets linear regions with different exponents with increasing concentration, as one passes from dilute to semidilute to concentrated solutions.

When looking at viscosity one can extend the concepts introduced for dilute solutions above. For non-Newtonian fluids the viscosity is dependent on the rate of shear. If the viscosity increases with increasing rate of shear the solution is shear thickening. This is caused by ordering of particles, such as starch granules, into layers that don't move much. Shear thinning (low viscosity at high rate of shear) is typical of many polymer solutions with long rod-like molecules, such as xanthan solutions. Bingham plastics have a certain initial shear stress (yield stress) that must be exceeded to obtain a shear rate at all; an example is toothpaste.

These effects can also be time dependent, such as in thixotropic latex paints that thin after some stirring. The opposite is rheopexy, for which few cases are known. Viscosity must always be measured in the Newtonian range.

### Mechanical spectroscopy

Elasticity is in some ways the opposite of viscosity. While in viscosity energy is dissipated and lost, in elasticity the energy is merely stored and released entirely (for perfect elasticity). Here another modulus (than the viscosity) reigns: the shear modulus G, given by $G=\frac{\sigma}{\gamma}$. For compressive or tensile stresses one can define another modulus, Young's modulus $E$, from $\frac{F}{A}=E\frac{\Delta l}{l}$. Typical gels, and in fact most materials, are viscoelastic, which means they display both properties, depending on the observation time $t$ one uses. Each material has a typical relaxation time $\tau$, and the ratio $\tau/t$ is called the Deborah number of the material. A large Deborah number implies a solid material, while a low Deborah number implies a fluid material.

One common method of measuring the viscous and elastic components of a material is using oscillatory measurements. One applies a sinusoidal deformation, and separates the stress into an elastic and a viscous component. We call $G'$ the modulus for the elastic response and $G''=\eta \omega$ the modulus for the viscous response. The sinusoidal form of the stress gives that $G'=(\sigma_0/\gamma_0)\cos \delta$ and $G''=(\sigma_0/\gamma_0)\sin \delta$, so $G''/G'=\tan \delta$, where $\delta$ is the phase angle between stress and strain. This gives the following relations:

• If $\omega \tau<1$ (observation time $t>\tau$), then $G'<G''$ and $\delta > 45^\circ$: the material is a viscoelastic liquid (viscous part dominating)
• If $\omega \tau=1$ (observation time $t=\tau$), then $G'=G''$ and $\delta = 45^\circ$: the material is in transition
• If $\omega \tau>1$ (observation time $t<\tau$), then $G'>G''$ and $\delta < 45^\circ$: the material is a viscoelastic solid (elastic part dominating)

One can do frequency sweeps of materials to obtain this information for the sample in different types of uses.
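
The relations above can be illustrated with a short sketch. It splits a measured stress amplitude into the elastic and viscous moduli, and runs a small frequency sweep. For the sweep it assumes a single-relaxation-time Maxwell element, for which $\tan \delta = 1/(\omega \tau)$; this model is an assumption chosen because it reproduces the $\omega \tau = 1$ crossover, and is not named in these notes. All function names and values are illustrative.

```python
import math


def dynamic_moduli(stress_amplitude, strain_amplitude, delta_deg):
    """Return (elastic modulus G', viscous modulus G'') from amplitudes and phase angle."""
    magnitude = stress_amplitude / strain_amplitude
    delta = math.radians(delta_deg)
    return magnitude * math.cos(delta), magnitude * math.sin(delta)


def maxwell_phase_angle(omega, tau):
    """Phase angle in degrees for a Maxwell element: tan(delta) = 1 / (omega tau)."""
    return math.degrees(math.atan2(1.0, omega * tau))


def classify(delta_deg, tol=1e-9):
    """Classify the response by comparing the phase angle with 45 degrees."""
    if abs(delta_deg - 45.0) <= tol:
        return "transition"
    if delta_deg > 45.0:
        return "viscoelastic liquid"
    return "viscoelastic solid"


# Single measurement: 100 Pa stress amplitude, 10 % strain, 30 degree phase lag.
g_elastic, g_viscous = dynamic_moduli(100.0, 0.1, 30.0)
print(f"G' = {g_elastic:.0f} Pa, G'' = {g_viscous:.0f} Pa, {classify(30.0)}")

# Frequency sweep for a hypothetical material with relaxation time 1 s.
tau = 1.0
for omega in (0.1, 1.0, 10.0):
    delta = maxwell_phase_angle(omega, tau)
    print(f"omega = {omega:5.1f} rad/s: delta = {delta:5.1f} deg, {classify(delta)}")
```

Low frequencies correspond to long observation times, so the same material appears liquid-like at the slow end of the sweep and solid-like at the fast end.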

### Phase separations

There are four situations where phase separation can occur, and they can be treated separately.

If two polyelectrolytes are in solution together they are expected to phase separate associatively if they are oppositely charged, forming complexes (such as nanoparticles). There is no enthalpy penalty, because the association between opposite charges is similar whether it is between two polymers or between a polymer and its counterions. However, there is a large entropy gain in the release of counterions. Increasing the ionic strength reduces this driving force.

Segregative phase separation happens between two like-charged polymers if they repel one another. In this case the entropy of mixing is not much larger when the polymers are present in both phases, because the counterions will be distributed in the entire volume regardless. For the same reason, increasing the ionic strength causes only small changes. The situation is similar to that with two uncharged polymers.

In the case of two non-charged polymers there is a general increase in the entropy of water if there is phase separation. As in the situation above, the type of phase separation will depend on the short-range interaction between the polymers.

If there is one polyelectrolyte and one neutral polymer, the polyelectrolyte will spread in the whole solution, while the neutral polymer will be miscible in this system. At high ionic strengths one might observe segregative phase separation if the interaction is repulsive between the polymers.

If both components gel and there is only one phase present they can form a mixed, interpenetrating gel, or a coupled gel, depending on the interaction between the polymers. If one component gels one can have a filled or unfilled gel. They can also make two phases, either bi-continuous gel phases or emulsion-like.