module Bio::Protparam::Local

Public Instance Methods

aa_comp(aa_code=nil)

Calculate the percentage composition of an AA sequence and return it as a Hash. If aa_code is given, return only the percentage of that amino acid.

# File lib/bio/util/protparam.rb, line 744
def aa_comp(aa_code=nil)
  if aa_code.nil?
    aa_map = {}
    IUPAC_CODE.keys.each do |k|
      aa_map[k] = 0.0
    end
    aa_map.update(aa_comp_map){|k,_,v| round(v, 1) }
  else
    round(aa_comp_map[aa_code], 1)
  end
end
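
The composition calculation can be illustrated outside the module. The following is a minimal sketch, not the library's code: `percent_composition` is a hypothetical helper that counts residues and converts the counts to percentages rounded to one decimal place. Unlike `aa_comp`, it does not add 0.0 entries for amino acids that are absent from the sequence.

```ruby
# Hypothetical helper, not part of Bio::Protparam.
# Returns a Hash mapping each residue symbol to its percentage of the sequence.
def percent_composition(seq)
  counts = Hash.new(0)
  seq.each_char { |aa| counts[aa.to_sym] += 1 }
  total = seq.length.to_f
  counts.each_with_object({}) do |(aa, n), out|
    out[aa] = (n / total * 100).round(1)
  end
end

comp = percent_composition("AAVG")
# comp[:A] is the percentage of alanine in the sequence
```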
aliphatic_index()

Calculate aliphatic index of an AA sequence.

_The aliphatic index of a protein is defined as the relative volume occupied by aliphatic side chains (alanine, valine, isoleucine, and leucine). It may be regarded as a positive factor for the increase of thermostability of globular proteins._

# File lib/bio/util/protparam.rb, line 714
def aliphatic_index
  aa_map = aa_comp_map
  @aliphatic_index ||=  round(aa_map[:A]        +
                              2.9 * aa_map[:V]  +
                              (3.9 * (aa_map[:I] + aa_map[:L])), 2)
end
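
The formula used above is A + 2.9 * V + 3.9 * (I + L), where each letter stands for the percentage of that residue in the sequence. A minimal standalone sketch follows; `aliphatic_index_sketch` is a hypothetical name, and the percentages are computed directly from the string rather than through `aa_comp_map`.

```ruby
# Hypothetical helper, not part of Bio::Protparam.
# Applies the same weighting as the method above to percentages
# computed from a plain String.
def aliphatic_index_sketch(seq)
  total = seq.length.to_f
  pct = lambda { |aa| seq.count(aa) / total * 100 }
  index = pct.call("A") +
          2.9 * pct.call("V") +
          3.9 * (pct.call("I") + pct.call("L"))
  index.round(2)
end

ai = aliphatic_index_sketch("AVIL")
# Each residue is 25% of the sequence: 25 + 2.9 * 25 + 3.9 * 50
```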
amino_acid_number()

Return the number of residues in an AA sequence.

# File lib/bio/util/protparam.rb, line 530
def amino_acid_number
  @seq.length
end
gravy()

Calculate GRAVY score of an AA sequence.

_The GRAVY (Grand Average of Hydropathy) value for a peptide or protein is calculated as the sum of hydropathy values [9] of all the amino acids, divided by the number of residues in the sequence._

# File lib/bio/util/protparam.rb, line 729
def gravy
  @gravy ||= begin
               hydropathy_sum = 0.0
               each_aa do |aa|
                 hydropathy_sum += HYDROPATHY[aa]
               end
               round(hydropathy_sum / @seq.length.to_f, 3)
             end
end
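
As an illustration of the averaging step, here is a minimal sketch that assumes the Kyte-Doolittle hydropathy scale. The table name `KD_HYDROPATHY` and the helper `gravy_sketch` are hypothetical; the contents of the library's own `HYDROPATHY` constant are not shown on this page.

```ruby
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD_HYDROPATHY = {
  A: 1.8,  R: -4.5, N: -3.5, D: -3.5, C: 2.5,
  Q: -3.5, E: -3.5, G: -0.4, H: -3.2, I: 4.5,
  L: 3.8,  K: -3.9, M: 1.9,  F: 2.8,  P: -1.6,
  S: -0.8, T: -0.7, W: -0.9, Y: -1.3, V: 4.2
}.freeze

# Hypothetical helper: mean hydropathy of all residues, rounded to 3 places.
# Hash#fetch raises KeyError for a residue that is missing from the table.
def gravy_sketch(seq)
  sum = seq.each_char.sum { |aa| KD_HYDROPATHY.fetch(aa.to_sym) }
  (sum / seq.length.to_f).round(3)
end

score = gravy_sketch("AG")
# (1.8 + -0.4) / 2
```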
half_life(species=nil)

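Return the estimated half-life of an AA sequence. If species (:ecoli, :mammalian or :yeast) is given, return the value for that species only; otherwise return a Hash with the values for all three.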

_The half-life is a prediction of the time it takes for half of the amount of protein in a cell to disappear after its synthesis in the cell. ProtParam relies on the “N-end rule”, which relates the half-life of a protein to the identity of its N-terminal residue; the prediction is given for 3 model organisms (human, yeast and E.coli)._

# File lib/bio/util/protparam.rb, line 644
def half_life(species=nil)
  n_end = @seq[0].chr.to_sym
  if species
    HALFLIFE[species][n_end]
  else
    {
      :ecoli     => HALFLIFE[:ecoli][n_end],
      :mammalian => HALFLIFE[:mammalian][n_end],
      :yeast     => HALFLIFE[:yeast][n_end]
    }
  end
end
instability_index()

Calculate instability index of an AA sequence.

_The instability index provides an estimate of the stability of your protein in a test tube. Statistical analysis of 12 unstable and 32 stable proteins has revealed [7] that there are certain dipeptides, the occurrence of which is significantly different in the unstable proteins compared with those in the stable ones. The authors of this method have assigned a weight value of instability to each of the 400 different dipeptides (DIWV)._

# File lib/bio/util/protparam.rb, line 669
def instability_index
  @instability_index ||=
    begin
      instability_sum = 0.0
      i = 0
      while @seq[i+1] != nil
        aa, next_aa = [@seq[i].chr.to_sym, @seq[i+1].chr.to_sym]
        if DIWV.key?(aa) && DIWV[aa].key?(next_aa)
          instability_sum += DIWV[aa][next_aa]
        end
        i += 1
      end
      round((10.0/amino_acid_number.to_f) * instability_sum, 2)
    end
end
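
The loop above sums the weight of every adjacent residue pair and scales the sum by 10 / L, where L is the sequence length. The sketch below shows the same arithmetic with `each_cons(2)`. `TOY_DIWV` holds made-up weights purely for illustration; they are not the published DIWV values, and `instability_sketch` is a hypothetical helper.

```ruby
# Made-up dipeptide weights for illustration only (NOT the real DIWV table).
TOY_DIWV = {
  A: { A: 1.0, G: 2.0 },
  G: { A: 3.0 }
}.freeze

# Hypothetical helper: (10 / L) * sum of weights over adjacent residue pairs.
# Pairs missing from the table contribute 0.0, as in the method above.
def instability_sketch(seq, weights)
  sum = seq.chars.each_cons(2).sum do |aa, next_aa|
    weights.fetch(aa.to_sym, {}).fetch(next_aa.to_sym, 0.0)
  end
  (10.0 / seq.length * sum).round(2)
end

ii = instability_sketch("AAGA", TOY_DIWV)
# Pairs AA, AG, GA contribute 1.0 + 2.0 + 3.0; the result is 10 / 4 * 6.0
```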
molecular_weight()

Calculate molecular weight of an AA sequence.

_Protein Mw is calculated by the addition of average isotopic masses of amino acids in the protein and the average isotopic mass of one water molecule._

# File lib/bio/util/protparam.rb, line 609
def molecular_weight
  @mw ||= begin
            mass = WATER_MASS
            each_aa do |aa|
              mass += AVERAGE_MASS[aa.to_sym]
            end
            (mass * 10).floor().to_f / 10
          end
end
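
Note that the result is truncated to one decimal place with `floor`, not rounded. The following sketch shows the summation and the truncation on a dipeptide. The mass values are approximate average residue masses supplied for this example, and `TOY_MASS`, `TOY_WATER` and `mw_sketch` are hypothetical names; the library's `AVERAGE_MASS` and `WATER_MASS` constants are not shown on this page.

```ruby
# Approximate average residue masses in Da (assumed values for this sketch).
TOY_MASS  = { G: 57.052, A: 71.079 }.freeze
TOY_WATER = 18.015

# Hypothetical helper: residue masses plus one water, truncated to 0.1 Da.
def mw_sketch(seq)
  mass = seq.each_char.sum(TOY_WATER) { |aa| TOY_MASS.fetch(aa.to_sym) }
  (mass * 10).floor.to_f / 10
end

mw = mw_sketch("GA")
# 18.015 + 57.052 + 71.079 = 146.146, truncated to one decimal place
```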
num_carbon()

Return the number of carbons.

# File lib/bio/util/protparam.rb, line 569
def num_carbon
  @num_carbon ||= total_atoms :C
end
num_hydrogen()

Return the number of hydrogens.

# File lib/bio/util/protparam.rb, line 573
def num_hydrogen
  @num_hydrogen ||= total_atoms :H
end
num_neg()

Return the number of negative amino acids (D and E) in an AA sequence.

# File lib/bio/util/protparam.rb, line 514
def num_neg
  @num_neg ||= @seq.count("DE")
end
num_nitro()

Return the number of nitrogens.

# File lib/bio/util/protparam.rb, line 581
def num_nitro
  @num_nitro ||= total_atoms :N
end
num_oxygen()

Return the number of oxygens.

# File lib/bio/util/protparam.rb, line 589
def num_oxygen
  @num_oxygen ||= total_atoms :O
end
num_pos()

Return the number of positive amino acids (R and K) in an AA sequence.

# File lib/bio/util/protparam.rb, line 522
def num_pos
  @num_pos ||= @seq.count("RK")
end
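
Both charge counters rely on `String#count`, which treats its argument as a set of characters and returns how many characters of the receiver belong to that set. A short illustration, using a made-up sequence:

```ruby
seq = "MKRDEAK"          # made-up sequence for illustration

positive = seq.count("RK")  # counts every R and every K
negative = seq.count("DE")  # counts every D and every E
```

Because each counter is memoized in an instance variable, the two methods need distinct variable names; otherwise whichever is called first would determine the value returned by both.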
num_sulphur()

Return the number of sulphurs.

# File lib/bio/util/protparam.rb, line 597
def num_sulphur
  @num_sulphur ||= total_atoms :S
end
stability()

Return whether the sequence is stable or not as a String ("stable" or "unstable"). An instability index of exactly 40 is reported as stable.

_A protein whose instability index is smaller than 40 is predicted as stable; a value above 40 predicts that the protein may be unstable._

# File lib/bio/util/protparam.rb, line 693
def stability
  (instability_index <= 40) ? "stable" : "unstable"
end
stable?()

Return true if the sequence is stable.

# File lib/bio/util/protparam.rb, line 701
def stable?
  instability_index <= 40
end
theoretical_pI()

Calculate the theoretical pI of an AA sequence with a bisection algorithm. The pK values of Bjellqvist et al. are used to calculate the pI.

# File lib/bio/util/protparam.rb, line 624
def theoretical_pI
  charges = []
  residue_count().each do |residue|
    charges << charge_proc(residue[:positive],
                           residue[:pK],
                           residue[:num])
  end
  round(solve_pI(charges), 2)
end
total_atoms(type=nil)

Return the number of atoms in a sequence. If type is given, return the number of specific atoms in a sequence.

# File lib/bio/util/protparam.rb, line 539
def total_atoms(type=nil)
  if !type.nil?
    type = type.to_sym
    if /^(?:C|H|O|N|S){1}$/ !~ type.to_s
      raise ArgumentError, "type must be C/H/O/N/S/nil(all)"
    end
  end
  num_atom = {:C => 0,
              :H => 0,
              :O => 0,
              :N => 0,
              :S => 0}
  each_aa do |aa|
    ATOM[aa].each do |t, num|
      num_atom[t] += num
    end
  end
  num_atom[:H] = num_atom[:H] - 2 * (amino_acid_number - 1)
  num_atom[:O] = num_atom[:O] - (amino_acid_number - 1)
  if type.nil?
    num_atom.values.inject(0) { |sum, num| sum + num }
  else
    num_atom[type]
  end
end
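
The two subtractions above account for the water molecule (two hydrogens and one oxygen) lost at each of the L - 1 peptide bonds. This can be checked on a dipeptide. In the sketch below, `TOY_ATOM` lists the formulas of free glycine (C2H5NO2) and alanine (C3H7NO2) and is assumed to mirror the layout of the library's `ATOM` table, which is not shown on this page; `atom_counts` is a hypothetical helper.

```ruby
# Atom counts of the free amino acids glycine and alanine
# (assumed to match the structure of the library's ATOM table).
TOY_ATOM = {
  G: { C: 2, H: 5, N: 1, O: 2, S: 0 },
  A: { C: 3, H: 7, N: 1, O: 2, S: 0 }
}.freeze

# Hypothetical helper: sum per-residue atoms, then remove one H2O per bond.
def atom_counts(seq, table)
  totals = Hash.new(0)
  seq.each_char do |aa|
    table.fetch(aa.to_sym).each { |element, n| totals[element] += n }
  end
  bonds = seq.length - 1
  totals[:H] -= 2 * bonds
  totals[:O] -= bonds
  totals
end

counts = atom_counts("GA", TOY_ATOM)
# Glycylalanine has the formula C5H10N2O3
```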

Private Instance Methods

aa_comp_map()
# File lib/bio/util/protparam.rb, line 758
def aa_comp_map
  @aa_comp_map ||=
    begin
      aa_map  = {}
      aa_comp = {}
      sum = 0
      each_aa do |aa|
        if aa_map.key? aa
          aa_map[aa] += 1
        else
          aa_map[aa] = 1
        end
        sum += 1
      end
      aa_map.each {|aa, count| aa_comp[aa] = (Rational(count,sum) * 100).to_f }
      aa_comp
    end
end
charge_proc(positive, pK, num)

Return a lambda that takes a pH and returns the charge contributed by num residues of a group with the given pK.

# File lib/bio/util/protparam.rb, line 790
def charge_proc positive, pK, num
  if positive
    lambda {|ph|
      num.to_f / (1.0 + 10.0 ** (ph - pK))
    }
  else
    lambda {|ph|
      (-1.0 * num.to_f) / (1.0 + 10.0 ** (pK - ph))
    }
  end
end
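
Both branches are forms of the Henderson-Hasselbalch relation: at pH equal to pK, half of the groups are charged, so the contribution is +num/2 for a basic group and -num/2 for an acidic one. A minimal sketch that evaluates the same expressions directly (`fractional_charge` is a hypothetical helper, and the pK values are arbitrary):

```ruby
# Hypothetical helper: same expressions as above, evaluated at a given pH.
def fractional_charge(positive, pk, num, ph)
  if positive
    num / (1.0 + 10.0**(ph - pk))
  else
    -num / (1.0 + 10.0**(pk - ph))
  end
end

basic_half  = fractional_charge(true,  9.0, 2, 9.0)  # pH == pK, basic group
acidic_half = fractional_charge(false, 4.0, 2, 4.0)  # pH == pK, acidic group
```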
each_aa() { |chr.to_sym| ... }
# File lib/bio/util/protparam.rb, line 777
def each_aa
  @seq.each_byte do |x|
    yield x.chr.to_sym
  end
end
positive?(residue)
# File lib/bio/util/protparam.rb, line 783
def positive? residue
  (residue == "H" || residue == "R" || residue == "K")
end
residue_count()

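Transform an AA sequence into a list of charged-group counts. Each entry is a Hash with the keys :num, :residue, :pK and :positive; the N-terminal, internal and C-terminal residues are looked up in separate pK tables.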

# File lib/bio/util/protparam.rb, line 805
def residue_count
  counted = []
  # N-terminal
  n_term = @seq[0].chr
  if PK[:nterm].key? n_term.to_sym
    counted << {
      :num => 1,
      :residue => n_term.to_sym,
      :pK => PK[:nterm][n_term.to_sym],
      :positive => positive?(n_term)
    }
  elsif PK[:normal].key? n_term.to_sym
    counted << {
      :num => 1,
      :residue => n_term.to_sym,
      :pK => PK[:normal][n_term.to_sym],
      :positive => positive?(n_term)
    }
  end
  # Internal
  tmp_internal = {}
  @seq[1,(@seq.length-2)].each_byte do |x|
    aa = x.chr.to_sym
    if PK[:internal].key? aa
      if tmp_internal.key? aa
        tmp_internal[aa][:num] += 1
      else
        tmp_internal[aa] = {
          :num => 1,
          :residue => aa,
          :pK => PK[:internal][aa],
          :positive => positive?(aa.to_s)
        }
      end
    end
  end
  tmp_internal.each do |aa, val|
    counted << val
  end
  # C-terminal
  c_term = @seq[-1].chr
  if PK[:cterm].key? c_term.to_sym
    counted << {
      :num => 1,
      :residue => c_term.to_sym,
      :pK => PK[:cterm][c_term.to_sym],
      :positive => positive?(c_term)
    }
  end
  counted
end
round(num, ndigits=0)
# File lib/bio/util/protparam.rb, line 913
def round(num, ndigits=0)
  (num * (10 ** ndigits)).round().to_f / (10 ** ndigits).to_f
end
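
The helper scales the number up, rounds it to an integer and scales it back down. A standalone sketch of the same idea follows; `round_sketch` is a hypothetical name. In current Ruby versions `Float#round` accepts a digit count directly and gives the same result for the value shown here.

```ruby
# Hypothetical helper mirroring the scale-round-unscale approach above.
def round_sketch(num, ndigits = 0)
  scale = 10**ndigits
  (num * scale).round.to_f / scale
end

rounded  = round_sketch(33.3333, 1)
built_in = 33.3333.round(1)
```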
solve_pI(charges)

Solve for the pI with a bisection algorithm: the pH at which the net charge of all groups is zero is searched for between 0 and 14.

# File lib/bio/util/protparam.rb, line 860
def solve_pI charges
  state = {
    :ph => 0.0,
    :charges => charges,
    :pI => nil,
    :ph_prev => 0.0,
    :ph_next => 14.0,
    :net_charge => 0.0
  }
  error = false
  # epsilon is the precision of the result [pI = pH +/- epsilon]
  epsilon = 0.001

  loop do
    # Reset net charge
    state[:net_charge] = 0.0
    # Calculate net charge
    state[:charges].each do |charge_proc|
      state[:net_charge] += charge_proc.call state[:ph]
    end

    # Something is wrong - pH is higher than 14
    if state[:ph] >= 14.0
      error = true
      break
    end

    # Making decision
    temp_ph = 0.0
    if state[:net_charge] <= 0.0
      temp_ph    = state[:ph]
      state[:ph] = state[:ph] - ((state[:ph] - state[:ph_prev]) / 2.0)
      state[:ph_next] = temp_ph
    else
      temp_ph    = state[:ph]
      state[:ph] = state[:ph] + ((state[:ph_next] - state[:ph]) / 2.0)
      state[:ph_prev] = temp_ph
    end

    if (state[:ph] - state[:ph_prev] < epsilon) &&
      (state[:ph_next] - state[:ph] < epsilon)
      state[:pI] = state[:ph]
      break
    end
  end

  if !state[:pI].nil? && !error
    state[:pI]
  else
    raise "Failed to Calc pI: pH is higher than 14"
  end
end
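
The search can also be written as a conventional bracketed bisection. The sketch below is not the loop above: it keeps an explicit [low, high] interval and halves it until it is narrower than epsilon. `TOY_GROUPS` uses arbitrary pK values chosen so that the answer is easy to verify by symmetry; `net_charge` and `bisect_pi` are hypothetical helpers.

```ruby
# Arbitrary groups for illustration: one basic (pK 9) and one acidic (pK 3).
TOY_GROUPS = [
  { positive: true,  pk: 9.0, num: 1 },
  { positive: false, pk: 3.0, num: 1 }
].freeze

# Net charge of all groups at a given pH (Henderson-Hasselbalch).
def net_charge(groups, ph)
  groups.sum do |g|
    if g[:positive]
      g[:num] / (1.0 + 10.0**(ph - g[:pk]))
    else
      -g[:num] / (1.0 + 10.0**(g[:pk] - ph))
    end
  end
end

# Net charge falls as pH rises, so a positive charge means the pI is higher.
def bisect_pi(groups, epsilon = 0.001)
  low  = 0.0
  high = 14.0
  while high - low > epsilon
    mid = (low + high) / 2.0
    if net_charge(groups, mid) > 0.0
      low = mid
    else
      high = mid
    end
  end
  ((low + high) / 2.0).round(2)
end

pi_value = bisect_pi(TOY_GROUPS)
# With pK values of 9 and 3 the charges cancel midway, near pH 6
```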