class Subconv::Scc::Reader
SCC reader
Parses and renders an SCC file sequentially into a background and a foreground grid, as a TV set would, and stores the resulting closed captions as grid snapshots in an array whenever the foreground grid changes.
Only captions in data channel 1 are read. Invalid byte parity raises an error unless parity checking is disabled. The advanced recovery methods mentioned in CEA608 are not implemented, since the source is assumed to contain no errors (e.g. a DVD source).
Constants
- EXTENDED_CHARACTER_MAPS
Extended character maps for Western European languages and box drawing
- LINE_REGEXP
Regular expression for parsing one line of data
- PREAMBLE_ADDRESS_CODE_ROW_MAP
Map of preamble address code high bytes to their corresponding base row numbers (counted from 0)
- SPECIAL_CHARACTER_MAP
Map of special characters to Unicode codepoints. The transparent space (0x39) is handled separately since it is not a real character.
- STANDARD_CHARACTER_MAP
Map of standard characters that do not match the standard ASCII codes to their corresponding unicode characters
Attributes
Actual conversion result
Public Instance Methods
Read an SCC file from the IO object io for a video with the frame rate fps. Parity checking can be turned off by passing false for check_parity.
# File lib/subconv/scc/reader.rb, line 304
def read(io, fps, check_parity = true)
  # Initialize new grids for character storage
  @foreground_grid = Grid.new
  @background_grid = Grid.new
  # Initialize state
  @state = State.default
  @captions = []
  @now = Timecode.new(0, fps)
  @data_channel = 0
  update_active_grid
  magic = io.readline.chomp!
  fail InvalidFormatError, 'File does not start with "' + Scc::FILE_MAGIC + '"' unless Scc::FILE_MAGIC == magic
  io.each_line do |line|
    line.chomp!
    # Skip empty lines between the commands
    next if line.empty?
    line_data = LINE_REGEXP.match(line)
    fail InvalidFormatError, "Invalid line \"#{line}\"" if line_data.nil?
    # Parse timecode
    old_time = @now
    timecode = Timecode.new(line_data[:timecode], fps)
    @now = timecode
    fail InvalidFormatError, 'New timecode is behind last time' if @now < old_time
    # Parse data words
    parse_data(line_data[:data], check_parity)
  end
end
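The line-splitting step in read can be sketched in isolation. The real LINE_REGEXP is not reproduced on this page, so EXAMPLE_LINE_REGEXP and split_scc_line below are hypothetical stand-ins. They assume the usual SCC layout of a timecode, a tab, and space-separated four-digit hex words.

```ruby
# Hypothetical stand-in for LINE_REGEXP (the real constant is not shown here).
# Assumes "HH:MM:SS:FF" or "HH:MM:SS;FF", a tab, then 4-digit hex words.
EXAMPLE_LINE_REGEXP = /\A(?<timecode>\d{2}:\d{2}:\d{2}[:;]\d{2})\t(?<data>(?:\h{4} ?)+)\z/

# Split one SCC line into its timecode string and its list of hex words.
# Returns nil when the line does not match the assumed layout.
def split_scc_line(line)
  match = EXAMPLE_LINE_REGEXP.match(line.chomp)
  return nil if match.nil?

  { timecode: match[:timecode], words: match[:data].split(' ') }
end

parsed = split_scc_line("00:00:01;15\t9420 9420 94ae 94ae\n")
puts parsed[:timecode]
puts parsed[:words].inspect
```

Returning nil for a non-matching line corresponds to the point where read raises InvalidFormatError.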
Private Instance Methods
Check a byte for odd parity
# File lib/subconv/scc/reader.rb, line 593
def correct_parity?(byte)
  byte.ord.to_s(2).count('1').odd?
end
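As an illustration of the odd-parity rule, the following self-contained sketch works on plain integers rather than one-character strings. The helper names odd_parity? and strip_parity are made up for this example and are not part of the class.

```ruby
# Returns true when the byte value (0..255) has an odd number of set bits.
def odd_parity?(byte)
  byte.to_s(2).count('1').odd?
end

# Clears bit 7 (the parity bit), leaving the 7-bit payload.
def strip_parity(byte)
  byte & 0x7f
end

puts odd_parity?(0x94)  # 1001 0100 has three set bits
puts odd_parity?(0x14)  # 0001 0100 has two set bits
puts format('0x%02x', strip_parity(0x94))
```

Here 0x94 is the transmitted form of the 7-bit value 0x14: the parity bit is set so that the total number of set bits becomes odd.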
Insert a CEA608 character into the grid at the current position, converting it to its unicode representation
# File lib/subconv/scc/reader.rb, line 435
def handle_character(byte)
  # Ignore filler character
  return if byte.zero?
  char = STANDARD_CHARACTER_MAP[byte.chr]
  insert_character(char)
end
Process a miscellaneous control code
# File lib/subconv/scc/reader.rb, line 501
def handle_control_code(hi, lo)
  if hi == 0x14 && lo == 0x20
    # Resume caption loading
    @state.mode = :pop_on
    update_active_grid
  elsif hi == 0x14 && lo == 0x21
    # Backspace
    unless @state.column.zero?
      # Ignore in the first column
      @state.column -= 1
      # Delete character at cursor after moving one character back
      @active_grid[@state.row][@state.column] = nil
    end
  elsif hi == 0x14 && lo == 0x24
    # Delete to end of row
    (@state.column...GRID_COLUMNS).each do |column|
      @active_grid[@state.row][column] = nil
    end
  elsif hi == 0x14 && lo == 0x28
    # Flash on
    # Flash is a spacing character
    insert_character(' ')
    @state.style.flash = true
  elsif hi == 0x14 && lo == 0x29
    # Resume direct captioning
    @state.mode = :paint_on
    update_active_grid
  # elsif hi == 0x14 && lo == 0x2b
  #   Resume text display -> command not described in spec
  #   fail "RTD"
  elsif hi == 0x14 && lo == 0x2c
    # Erase displayed memory
    @foreground_grid = Grid.new
    post_frame
    update_active_grid
  elsif hi == 0x14 && lo == 0x2e
    # Erase non-displayed memory
    @background_grid = Grid.new
    update_active_grid
  elsif hi == 0x14 && lo == 0x2f
    # End of caption (flip memories)
    @foreground_grid, @background_grid = @background_grid, @foreground_grid
    # This also forces pop-on mode
    @state.mode = :pop_on
    post_frame
    update_active_grid
  elsif hi == 0x17 && lo >= 0x21 && lo <= 0x23
    # Tab offset
    # Bits 0 and 1 designate how many columns to go
    @state.column += (lo & 0x3)
  else
    puts "Ignoring unknown control code #{hi}/#{lo}"
  end
end
Insert an extended character into the grid at the current position, converting it to its Unicode representation
# File lib/subconv/scc/reader.rb, line 457
def handle_extended_character(map, byte)
  char = EXTENDED_CHARACTER_MAPS.fetch(map).fetch(byte)
  # Extended characters include automatic backspace+overwrite
  @state.column -= 1
  insert_character(char)
  @state.char_replaced = true
end
Process a mid-row code
# File lib/subconv/scc/reader.rb, line 557
def handle_mid_row_code(_hi, lo)
  # Mid-row codes are spacing characters
  insert_character(' ')
  # Low byte bit 0 indicates whether underlining is to be enabled
  @state.style.underline = ((lo & 1) == 1)
  # Low byte bits 1 to 3 are the color code
  color = (lo >> 1) & 0x7
  if color == 0x7
    @state.style.italics = true
  else
    # Color mid-row codes disable italics
    @state.style.italics = false
    @state.style.color = Color.for_value(color)
  end
  # All mid-row codes always disable flash
  @state.style.flash = false
end
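The bit layout of the mid-row low byte can be tried out without any reader state. In the sketch below, decode_mid_row is a hypothetical helper that returns a plain hash. The numeric color code is left undecoded because the mapping done by Color.for_value is not shown on this page.

```ruby
# Decode the low byte of a mid-row code into a plain hash.
# Bit 0 is the underline flag; bits 1 to 3 are the color code,
# where the value 7 means "italics" instead of a color.
def decode_mid_row(lo)
  color = (lo >> 1) & 0x7
  italics = (color == 0x7)
  {
    underline: (lo & 1) == 1,
    italics: italics,
    color_code: italics ? nil : color
  }
end

puts decode_mid_row(0x2e).inspect  # italics, no underline
puts decode_mid_row(0x23).inspect  # color code 1, underlined
```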
Set drawing position and style according to the information in a preamble address code
# File lib/subconv/scc/reader.rb, line 466
def handle_preamble_address_code(hi, lo)
  @state.row = PREAMBLE_ADDRESS_CODE_ROW_MAP.fetch(hi)
  # Low byte bit 5 adds 1 to the row number if set
  @state.row += 1 if lo & (1 << 5) != 0
  # Low byte bit 0 indicates whether underlining is to be enabled
  @state.style.underline = ((lo & 1) == 1)
  # Low byte bit 4 indicates whether it is an indent or a formatting code
  is_indent = (((lo >> 4) & 1) == 1)
  # Low byte bits 1 to 3 are the color or indent code, depending on is_indent
  color_or_indent = (lo >> 1) & 0x7
  # Reset style
  @state.style.flash = false
  @state.style.italics = false
  if is_indent
    # Indent code always sets white as color attribute
    @state.style.color = Color::WHITE
    # One indent equals 4 characters
    @state.column = color_or_indent * 4
  else
    # Style code always sets first column
    @state.column = 0
    if color_or_indent == 7
      # "color" 7 is white with italics
      @state.style.color = Color::WHITE
      @state.style.italics = true
    else
      @state.style.color = Color.for_value(color_or_indent)
    end
  end
end
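To make the bit fields of the preamble address code low byte easier to follow, here is a standalone sketch. decode_pac_low_byte is a hypothetical helper that only decodes the low byte. The base row lookup via PREAMBLE_ADDRESS_CODE_ROW_MAP is left out because the contents of that map are not listed here.

```ruby
# Decode the low byte of a preamble address code into a plain hash.
#   bit 5:     add one to the base row
#   bit 4:     1 = indent code, 0 = style code
#   bits 1..3: indent (in steps of 4 columns) or color code (7 = italics)
#   bit 0:     underline
def decode_pac_low_byte(lo)
  code = (lo >> 1) & 0x7
  result = {
    next_row: (lo & (1 << 5)) != 0,
    underline: (lo & 1) == 1
  }
  if ((lo >> 4) & 1) == 1
    result.merge(kind: :indent, column: code * 4)
  else
    result.merge(kind: :style, column: 0, italics: code == 7)
  end
end

puts decode_pac_low_byte(0x52).inspect  # indent to column 4
puts decode_pac_low_byte(0x6e).inspect  # italics style on the following row
```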
Insert a special character into the grid at the current position, or delete the current column in case of a transparent space.
# File lib/subconv/scc/reader.rb, line 445
def handle_special_character(byte)
  if byte == 0x39
    # Transparent space: Move cursor after deleting the current column to open up a hole
    @active_grid[@state.row][@state.column] = nil
    @state.column += 1
  else
    char = SPECIAL_CHARACTER_MAP.fetch(byte)
    insert_character(char)
  end
end
Insert one unicode character into the grid at the current position and with the current style, then advance the cursor one column
# File lib/subconv/scc/reader.rb, line 429
def insert_character(char)
  @active_grid[@state.row][@state.column] = Character.new(char, @state.style.dup)
  @state.column += 1
end
Parse one line of SCC data
# File lib/subconv/scc/reader.rb, line 341
def parse_data(data, check_parity)
  last_command = [0, 0]
  data.split(' ').each do |word_string|
    begin
      @state.start_new_frame
      # Decode hexadecimal word into two-byte string
      word = [word_string].pack('H*')
      # Check parity
      fail ParityError, "At least one byte in word #{word_string} has even parity, odd required" unless !check_parity || (correct_parity?(word[0]) && correct_parity?(word[1]))
      # Remove parity bit for further processing
      word = word.bytes.collect { |byte|
        # Unset 8th bit
        (byte & ~(1 << 7))
      }
      hi, lo = word
      # First check if the word contains characters only
      if hi >= 0x20 && hi <= 0x7f
        # Skip characters if last command was on different channel
        if @data_channel != 0
          puts 'Skipping characters on channel 2'
          next
        end
        [hi, lo].each do |byte|
          handle_character(byte)
        end
        # Reset last command
        last_command = [0, 0]
      else
        if word == last_command
          # Skip commands transmitted twice for redundancy
          # But don't skip the next time, too
          last_command = [0, 0]
          next
        end
        # Channel information is encoded in the 4th bit, read it out
        @data_channel = (hi >> 3) & 1
        if @data_channel != 0
          puts 'Skipping command on channel 2'
          next
          # If channel 2 processing is needed, parse the file two times and
          # change the above condition as needed, then unset the channel bit
          # for further processing.
        end
        # rubocop:disable Style/NumericPredicate
        if hi == 0x11 && (0x30..0x3f).cover?(lo)
          # Special character
          handle_special_character(lo)
        elsif (0x12..0x13).cover?(hi) && (0x20..0x3f).cover?(lo)
          # Extended character
          handle_extended_character(hi & 1, lo)
        elsif (0x10..0x17).cover?(hi) && (0x40..0xff).cover?(lo)
          # Preamble address code
          handle_preamble_address_code(hi, lo)
        elsif [0x14, 0x17].include?(hi) && (0x20..0x2f).cover?(lo)
          handle_control_code(hi, lo)
        elsif hi == 0x11 && (0x20..0x2f).cover?(lo)
          handle_mid_row_code(hi, lo)
        elsif hi == 0x00 && lo == 0x00
          # Ignore filler
        else
          puts "Ignoring unknown command #{hi}/#{lo}"
        end
        # rubocop:enable Style/NumericPredicate
        last_command = word
      end
      post_frame if @state.paint_on_mode?
    ensure
      # Advance one frame for each word read
      @now += 1
    end
  end
end
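Two steps of parse_data lend themselves to a small standalone sketch: decoding a hex word into two 7-bit values, and dropping the second transmission of a command that was sent twice. The helper names decode_word, channel_bit and drop_repeated_commands are invented for this example. The sketch leaves out parity checking, channel filtering and the actual command dispatch.

```ruby
# Decode a 4-digit hex word into [hi, lo] with the parity bits removed.
def decode_word(word_string)
  [word_string].pack('H*').bytes.map { |byte| byte & 0x7f }
end

# Bit 3 of the high byte of a command carries the data channel (0 or 1).
def channel_bit(hi)
  (hi >> 3) & 1
end

# Keep character pairs as they are, and drop the repeat of any command
# that immediately follows an identical command. After a repeat has been
# dropped, the memory is cleared so that a third copy is kept again.
def drop_repeated_commands(words)
  last_command = [0, 0]
  words.each_with_object([]) do |word, kept|
    hi, = word
    if hi >= 0x20
      last_command = [0, 0]
      kept << word
    elsif word == last_command
      last_command = [0, 0]
    else
      last_command = word
      kept << word
    end
  end
end

words = %w[9420 9420 c8e9 94ae 94ae 94ae].map { |w| decode_word(w) }
puts drop_repeated_commands(words).inspect
```

As in the method above, the initial value [0, 0] for last_command means that an all-zero filler word is always treated as a repeat and dropped.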
Insert the currently displayed foreground grid as a caption into the captions array. Must be called whenever the foreground grid is changed as a result of a command.
# File lib/subconv/scc/reader.rb, line 578
def post_frame
  # Only push a new caption if the grid has changed, but do not push out an empty grid initially
  return unless @foreground_grid != @last_grid && !(@last_grid.nil? && @foreground_grid.empty?)
  # Save space by not saving the grid if it is completely empty
  grid = @foreground_grid.empty? ? nil : @foreground_grid
  @captions.push(Caption.new(timecode: @now, grid: grid.clone, mode: @state.mode, char_replacement: @state.char_replaced))
  @last_grid = @foreground_grid.clone
end
# File lib/subconv/scc/reader.rb, line 588
def update_active_grid
  @active_grid = @state.paint_on_mode? ? @foreground_grid : @background_grid
end
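The snapshot rule used by post_frame (record only on change, never record an initial empty grid, store nil for an empty grid) can be modelled with plain nested arrays. SnapshotRecorder below is a hypothetical class written for this illustration. It does not use the Grid or Caption classes, whose interfaces are not documented on this page.

```ruby
# Records a snapshot of a grid (array of rows) only when it differs from
# the previously seen grid. An empty grid is stored as nil, and an empty
# grid seen before any content is not recorded at all.
class SnapshotRecorder
  attr_reader :snapshots

  def initialize
    @snapshots = []
    @last = nil
  end

  def post(time, grid)
    empty = grid.flatten.compact.empty?
    return if grid == @last || (@last.nil? && empty)

    # Deep copies, so later changes to the grid do not alter the snapshot
    @snapshots << { time: time, grid: empty ? nil : Marshal.load(Marshal.dump(grid)) }
    @last = Marshal.load(Marshal.dump(grid))
  end
end

recorder = SnapshotRecorder.new
recorder.post(0, [[nil, nil]])   # initial empty grid: not recorded
recorder.post(1, [['a', nil]])   # first content: recorded
recorder.post(2, [['a', nil]])   # unchanged: not recorded
recorder.post(3, [[nil, nil]])   # cleared: recorded with a nil grid
puts recorder.snapshots.inspect
```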