class Demultiplexer
Class containing methods for demultiplexing MiSeq sequences.
Adding VERSION
constant to class.
Constants
- DEFAULT
- VERSION
Attributes
Public Class Methods
Internal: Constructor method for Demultiplexer
object.
fastq_files - Array with paths to FASTQ files. options - Options Hash.
:verbose - Verbose flag (default: false). :mismatches_max - Integer value indicating max mismatches (default: 0). :samples_file - String with path to samples file. :revcomp_index1 - Flag indicating that index1 should be reverse-complemented (default: false). :revcomp_index2 - Flag indicating that index2 should be reverse-complemented (default: false). :output_dir - String with output directory (optional). :scores_min - An Integer representing the Phred score minimum, such that a reads is dropped if a single position in the index contain a score below this value (default: 16). :scores_mean=> - An Integer representing the mean Phread score, such that a read is dropped if the mean quality score is below this value (default: 16).
Returns Demultiplexer
object
# File lib/demultiplexer.rb, line 104 def initialize(fastq_files, options) @options = options @samples = SampleReader.read(options[:samples_file], options[:revcomp_index1], options[:revcomp_index2]) @undetermined = @samples.size @index_hash = IndexBuilder.build(@samples, options[:mismatches_max]) @data_io = DataIO.new(@samples, fastq_files, options[:compress], options[:output_dir]) @status = Status.new(@samples) end
Public: Class method to run demultiplexing of MiSeq sequences.
fastq_files - Array with paths to FASTQ files. options - Options Hash.
:verbose - Verbose flag (default: false). :mismatches_max - Integer value indicating max mismatches (default: 0). :samples_file - String with path to samples file. :revcomp_index1 - Flag indicating that index1 should be reverse-complemented (default: false). :revcomp_index2 - Flag indicating that index2 should be reverse-complemented (default: false). :output_dir - String with output directory (optional). :scores_min - An Integer representing the Phred score minimum, such that a reads is dropped if a single position in the index contain a score below this value (default: 16). :scores_mean=> - An Integer representing the mean Phread score, such that a read is dropped if the mean quality score is below this value (default: 16).
Examples
Demultiplexer.run(['I1.fq', 'I2.fq', 'R1.fq', 'R2.fq'], \ samples_file: 'samples.txt') # => <Demultiplexer>
Returns Demultiplexer
object
# File lib/demultiplexer.rb, line 71 def self.run(fastq_files, options) options = DEFAULT.merge(options) log_file = File.join(options[:output_dir], 'Demultiplex.log') demultiplexer = new(fastq_files, options) Screen.clear if options[:verbose] demultiplexer.demultiplex puts demultiplexer.status if options[:verbose] demultiplexer.status.save(log_file) end
Public Instance Methods
Internal: Method to demultiplex reads according the index. This is done by simultaniously read-opening all input files (forward and reverse index files and forward and reverse read files) and read one entry from each. Such four entries we call a set of entries. If the quality scores from either index1 or index2 fails the criteria for mean and min required quality the set is skipped. In the combined indexes are found in the search index, then the reads are writting to files according to the sample information in the search index. If the combined indexes are not found, then the reads have their names appended with the index sequences and the reads are written to the Undertermined files.
Returns nothing.
# File lib/demultiplexer.rb, line 128 def demultiplex @data_io.open_input_files do |ios_in| @data_io.open_output_files do |ios_out| ios_in.each do |index1, index2, read1, read2| @status.count += 2 puts(@status) if @options[:verbose] && (@status.count % 1_000) == 0 next unless index_qual_ok?(index1, index2) match_index(ios_out, index1, index2, read1, read2) # break if @status.count == 100_000 end end end end
Private Instance Methods
Internal: Method to check the mean quality scores of the given indexes. If the mean score is higher than @options the indexes are OK.
index1 - Index1 Seq object. index2 - Index2 Seq object.
Returns true if quality mean OK, else false.
# File lib/demultiplexer.rb, line 229 def index_qual_mean_ok?(index1, index2) if index1.scores_mean < @options[:scores_mean] @status.index1_bad_mean += 2 return false elsif index2.scores_mean < @options[:scores_mean] @status.index2_bad_mean += 2 return false end true end
Internal: Method to check the min quality scores of the given indexes. If the min score is higher than @options the indexes are OK.
index1 - Index1 Seq object. index2 - Index2 Seq object.
Returns true if quality min OK, else false.
# File lib/demultiplexer.rb, line 249 def index_qual_min_ok?(index1, index2) if index1.scores_min < @options[:scores_min] @status.index1_bad_min += 2 return false elsif index2.scores_min < @options[:scores_min] @status.index2_bad_min += 2 return false end true end
Internal: Method to check the quality scores of the given indexes. If the mean score is higher than @options or if the min score is higher than @options then the indexes are OK.
index1 - Index1 Seq object. index2 - Index2 Seq object.
Returns true if quality OK, else false.
# File lib/demultiplexer.rb, line 216 def index_qual_ok?(index1, index2) index_qual_mean_ok?(index1, index2) && index_qual_min_ok?(index1, index2) end
Internal: Method that matches the combined index1 and index2 sequences against the search index. In case of a match the reads are written to file according to the information in the search index, otherwise the reads will have thier names appended with the index sequences and they will be written to the Undetermined files.
ios_out - DataIO
object with an accessor method for file output handles. index1 - Seq object with index1. index2 - Seq object with index2. read1 - Seq object with read1. read2 - Seq object with read2.
Returns nothing.
# File lib/demultiplexer.rb, line 161 def match_index(ios_out, index1, index2, read1, read2) key = "#{index1.seq.upcase}#{index2.seq.upcase}".hash if (sample_id = @index_hash[key]) write_match(ios_out, sample_id, read1, read2) else write_undetermined(ios_out, index1, index2, read1, read2) end end
Internal: Method that writes a index match to file according to the information in the search index.
ios_out - DataIO
object with an accessor method for file output handles. read1 - Seq object with read1. read2 - Seq object with read2.
Returns nothing.
# File lib/demultiplexer.rb, line 179 def write_match(ios_out, sample_id, read1, read2) @status.match += 2 io_forward, io_reverse = ios_out[sample_id] io_forward.puts read1.to_fastq io_reverse.puts read2.to_fastq end
Internal: Method that appends the read names with the index sequences and writes the reads to the Undetermined files.
ios_out - DataIO
object with an accessor method for file output handles. index1 - Seq object with index1. index2 - Seq object with index2. read1 - Seq object with read1. read2 - Seq object with read2.
Returns nothing.
# File lib/demultiplexer.rb, line 197 def write_undetermined(ios_out, index1, index2, read1, read2) @status.undetermined += 2 read1.seq_name = "#{read1.seq_name} #{index1.seq}" read2.seq_name = "#{read2.seq_name} #{index2.seq}" io_forward, io_reverse = ios_out[@undetermined] io_forward.puts read1.to_fastq io_reverse.puts read2.to_fastq end